TW202407538A

TW202407538A - No-operation-compatible instruction

Info

Publication number: TW202407538A
Application number: TW112126839A
Authority: TW
Inventors: 約翰麥克霍利; 馬克賽爾林拉特蘭; 西蒙約翰克拉斯克; 馬杜蘇達那雷迪范吉雷迪
Original assignee: Arm Limited (UK)
Priority date: 2022-08-05
Filing date: 2023-07-19
Publication date: 2024-02-16
Also published as: WO2024028565A1

Abstract

An apparatus comprises an instruction decoder to decode instructions; processing circuitry to perform data processing in response to decoding of the instructions by the instruction decoder; and at least one control register to specify instruction-function-selecting information. In response to a no-operation-compatible instruction, the instruction decoder is configured to control the processing circuitry to: treat the no-operation-compatible instruction as a no-operation instruction, when the instruction-function-selecting information specified by the at least one control register is in a first state; perform both a first operation and a second operation, when the instruction-function-selecting information specified by the at least one control register is in a second state; and perform the first operation but not the second operation, when the instruction-function-selecting information specified by the at least one control register is in a third state.

Description

No-operation-compatible instruction

The present technique relates to the field of data processing.

A data processing apparatus has processing circuitry to perform data processing in response to instructions decoded by an instruction decoder. The format of the instruction encodings and the function represented by each instruction may be defined according to an instruction set architecture (ISA). The ISA represents the agreed framework between the hardware manufacturer, who builds the processing hardware for a given processor implementation, and the software developer, who writes code to execute on that hardware, so that code written according to the ISA will function correctly on hardware supporting the ISA. Choosing which instructions the ISA supports, and how they are encoded, can be a design challenge. The design decisions made by an ISA designer when defining the instructions of the ISA can have a significant impact on the real-world performance achieved by processing hardware when executing particular programs.

At least some examples provide an apparatus comprising: an instruction decoder to decode instructions; processing circuitry to perform data processing in response to decoding of the instructions by the instruction decoder; and at least one control register to specify instruction-function-selecting information; in which: in response to a no-operation-compatible instruction, the instruction decoder is configured to control the processing circuitry to: treat the no-operation-compatible instruction as a no-operation instruction, when the instruction-function-selecting information specified by the at least one control register is in a first state; perform both a first operation and a second operation, when the instruction-function-selecting information specified by the at least one control register is in a second state; and perform the first operation but not the second operation, when the instruction-function-selecting information specified by the at least one control register is in a third state.

At least some examples provide a method comprising: decoding instructions; and performing data processing in response to decoding of the instructions; in which: in response to decoding of a no-operation-compatible instruction, the method comprises: treating the no-operation-compatible instruction as a no-operation instruction, when instruction-function-selecting information specified by at least one control register is in a first state; performing both a first operation and a second operation, when the instruction-function-selecting information specified by the at least one control register is in a second state; and performing the first operation but not the second operation, when the instruction-function-selecting information specified by the at least one control register is in a third state.

At least some examples provide a computer program comprising instructions which, when executed by a host data processing apparatus, control the host data processing apparatus to provide an instruction execution environment for executing target code, the computer program comprising: instruction decoding program logic to decode instructions of the target code; and register emulating program logic to maintain data in storage circuitry of the host data processing apparatus to emulate at least one control register for specifying instruction-function-selecting information; in which: in response to a no-operation-compatible instruction, the instruction decoding program logic is configured to control the host data processing apparatus to: treat the no-operation-compatible instruction as a no-operation instruction, when the instruction-function-selecting information is in a first state; perform both a first operation and a second operation, when the instruction-function-selecting information is in a second state; and perform the first operation but not the second operation, when the instruction-function-selecting information is in a third state.

The computer program may be stored on a storage medium. The storage medium may be a non-transitory storage medium or a transitory storage medium.

An apparatus comprises: an instruction decoder to decode instructions; processing circuitry to perform data processing in response to decoding of the instructions by the instruction decoder; and at least one control register to specify instruction-function-selecting information. In response to a no-operation-compatible (NOP-compatible) instruction, the instruction decoder controls the processing circuitry to: treat the NOP-compatible instruction as a no-operation (NOP) instruction, when the instruction-function-selecting information specified by the at least one control register is in a first state; perform both a first operation and a second operation, when the instruction-function-selecting information specified by the at least one control register is in a second state; and perform the first operation but not the second operation, when the instruction-function-selecting information specified by the at least one control register is in a third state.
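The three-state behaviour described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `FuncSelect` and `execute_nop_compatible` are hypothetical, the numeric values of the states are arbitrary, and the two operations are passed in as plain callables rather than being tied to any particular hardware function.

```python
from enum import Enum


class FuncSelect(Enum):
    """Hypothetical model of the instruction-function-selecting information."""
    NOP = 1         # first state: behave as a no-operation instruction
    BOTH = 2        # second state: perform the first and the second operation
    FIRST_ONLY = 3  # third state: perform the first operation only


def execute_nop_compatible(select, first_op, second_op):
    """Model one NOP-compatible instruction; return the operations performed."""
    performed = []
    if select is FuncSelect.NOP:
        return performed  # no effect beyond advancing to the next instruction
    first_op()
    performed.append("first")
    if select is FuncSelect.BOTH:
        second_op()
        performed.append("second")
    return performed
```

The same call site (standing in for the same code binary) yields different behaviour purely as a function of the `select` value, which is the property the text relies on.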

NOP-compatible instructions can be used to implement optional operations that are useful if performed but are not critical to the correct functioning of the software that uses the instruction. For example, the first and second operations may provide optional security enhancements or performance hints, which can help improve security and/or performance even though they are not essential to obtaining the correct functional result of the program. Such enhancements are not always needed, so it is desirable to be able to control, for a given instance of executing a code sequence that includes an instruction for performing one or both of the first and second operations, whether those operations are actually performed. Not performing the operations can help save power, or allow analysis of how the same code would behave on legacy hardware that does not support the optional operations. By controlling the function of the instruction based on instruction-function-selecting information stored in at least one control register, the same program code can be executed in different usage scenarios, with the processing circuitry performing a different combination (if any) of the first and second operations in response to the instruction depending on the current value of the instruction-function-selecting information.

When the instruction-function-selecting information is in the first state, the instruction behaves as a NOP instruction. A NOP instruction may be an instruction whose execution causes no change to architectural state other than the change of program counter implicit in program flow advancing sequentially (without a branch) from the NOP instruction to the next instruction. The software developer therefore has the option of turning off the features represented by the first and second operations together. This can be useful, for example, where the first and second operations correspond to features introduced in a newer ISA version that may not be available on legacy apparatus supporting an older ISA version. Another use case is where the first and second operations are not always needed: for example, when executing code for a lower-security use case, the security enhancements represented by the first and/or second operations can be turned off to save power, by setting the instruction-function-selecting information to the first state before executing the code sequence that includes the NOP-compatible instruction.

When the instruction-function-selecting information is in the second state, the processing circuitry performs both the first operation and the second operation in response to the NOP-compatible instruction. When the instruction-function-selecting information is in the third state, the processing circuitry performs the first operation, but not the second operation, in response to the NOP-compatible instruction. Normally, to provide different combinations of whether the first and second operations are performed by a code sequence, one would expect each operation to be encoded in a separate instruction, so that performing both operations requires two different instructions, while only one of those instructions is included in the software if only one operation is needed. With the NOP-compatible instruction described above, however, both the first and second operations can be performed in response to the same instruction (with the further option that the same instruction causes only the first operation to be performed, and not the second). Hence exactly the same code binary, for the portion of code that includes the NOP-compatible instruction, can behave in different ways depending on how the instruction-function-selecting information was set before that portion of code is executed. This provides a more efficient ISA encoding than an implementation providing two different NOP-compatible instructions corresponding to the first and second operations respectively. First, it avoids the ISA spending two different instruction encodings on separate NOP-compatible instructions for the first and second operations, freeing an encoding that can be used for another type of instruction and so improving the performance of code that uses that other type of instruction (compared with having to split its operation into multiple simpler instructions). Second, software using the NOP-compatible instruction needs only a single instruction slot for the instruction in the fetch, decode, issue and execute stages of the processing pipeline (and in the instruction cache or other storage holding the code being executed). This frees an instruction slot for another instruction implementing a different operation, and so can improve performance as well as code storage density in caches and memory.

Hence, the NOP-compatible instruction described above can help improve the performance and the efficiency of the ISA encoding for use cases involving optional operations that are not always needed.

In some examples, the NOP-compatible instruction may not support the option of performing the second operation without the first operation. For example, any remaining encodings of the instruction-function-selecting information (other than the first, second and third states described above) may be used to control other aspects of how the first or second operation is performed (e.g. control parameters that adjust the behaviour of the first or second operation), rather than to indicate that the second operation should be performed and the first should not.

In other examples, however, when the instruction-function-selecting information is in a fourth state, the instruction decoder may, in response to the NOP-compatible instruction, control the processing circuitry to perform the second operation but not the first operation. This can be particularly useful, because the NOP-compatible instruction can then be used to perform the first operation but not the second, or the second operation but not the first, or both operations together, or can behave as a NOP so that neither operation is performed. The instruction-function-selecting information can thus be used to turn each of the first and second operations on and off independently. This gives the same flexibility in selecting which operations are performed as if each of the first and second operations were encoded as a separate NOP-compatible instruction, but with a more efficient instruction encoding.

In some examples, when the instruction-function-selecting information is in the second state, the processing circuitry may control, based on the instruction-function-selecting information, the relative order in which the first operation and the second operation are applied. For example, the first operation may be performed before the second operation, the second operation may be performed before the first operation, or the two operations may be performed in parallel. The instruction-function-selecting information may be used to select between two or more of these options. This can be useful because one order between the first and second operations may have advantages over another: for example, one order may provide greater security while another may be more efficient for performance. Providing control state that allows the relative order to be configured therefore allows the same code binary to be executed for different use cases, which may have different preferences for prioritising security and/or processing performance.

A wide range of processing operations could be implemented as the first operation and the second operation respectively.

However, the NOP-compatible instruction can be particularly useful for function prologue operations associated with a function call. Hence, for a function-prologue variant of the NOP-compatible instruction, the first operation and the second operation are function prologue operations associated with a function call. A function prologue operation may be a preliminary operation performed before, during or shortly after the function call is made, before the function body is entered. Because the same function may be called a large number of times during execution of a given software workload, even a relatively small performance saving on a single instance of calling the function can lead to a large improvement in overall workload performance, since the saving is seen each time the function is called. Hence, enabling the first and second operations associated with a function call to be performed in response to a single NOP-compatible instruction, rather than requiring multiple instructions, can provide a significant performance improvement for the workload as a whole.

For similar reasons, the NOP-compatible instruction can be used for function epilogue operations associated with a return of processing from a function. These may be operations performed after the main body of the function has completed, in preparation for returning to the background processing that called the function. Such epilogue operations may be performed before, during or after the return branch that actually returns processing to the background processing. Hence, for a function-epilogue variant of the NOP-compatible instruction, the first operation and the second operation are function epilogue operations associated with a return of processing from a function.

In some examples, for at least one variant of the NOP-compatible instruction, one of the first operation and the second operation comprises an authentication code generating operation to generate an authentication code based on an operand and associate the authentication code with the operand.

Such an authentication code generating operation could be applied to any operand, but is particularly useful when applied to a function return address, which is set on a function call to indicate the instruction address to which a subsequent return branch should return processing once the function body has completed. This provides a defence against return-oriented programming (ROP) attacks.

ROP attacks are a common class of attack on data processing systems. A ROP attack attempts to make a program behave in an unintended way by corrupting the return state information used to return from a function call or exception. Software commonly saves return state information to memory, for example to support nesting of function calls or exceptions. Before the return state information for an inner function call or exception (of a nested set of function calls or exceptions) overwrites, in a register, the return state information for an outer function call or exception, the return state information for the outer function call or exception can be saved to memory to preserve it. A ROP attack may attempt to tamper with the return state information while it is stored in memory, before it is restored to the register and used to control the function return or exception return. A successful ROP attack can cause the function return or exception return to direct program flow to an instruction other than the next instruction after the point at which the function was called or the exception was taken, which may allow the attacker to control the processing circuitry to perform arbitrary operations other than the sequence of operations intended by the programmer.

The authentication code generating operation can help protect against such ROP attacks by generating an authentication code corresponding to the operand (e.g. the function return address), so that a subsequent attempt to tamper with the operand while it is stored in memory can be detected from a mismatch between the tampered operand and its authentication code: if the operand has been tampered with, the authentication code is unlikely still to correspond to the operand. Although the authentication code generating operation is useful for security, it is not essential, and in some use cases it is preferable to omit it for performance reasons. It can therefore be useful to provide a NOP-compatible instruction that allows selection of whether the authentication code generating operation is performed in response to the instruction, so that the same code sequence can be executed in different scenarios with different outcomes as to whether the authentication code generating operation is performed. Hence, it can be useful for one of the first operation and the second operation to comprise the authentication code generating operation.

Although the operand for the authentication code generating operation could be any operand (e.g. an operand obtained from a register specified by the NOP-compatible instruction), in some examples the operand comprises a value obtained from a link register. In response to a function return branch instruction, the instruction decoder may control the processing circuitry to branch to an address specified in the link register. Hence, when the authentication code generating operation is applied to the operand in the link register, the operand will typically be a function return address, which is useful for providing a defence against ROP attacks. Where the authentication code generating operation is applied to the function return address, it can be an example of a function prologue operation as discussed above, since this operation would typically be performed in association with a function call.

In the authentication code generating operation, the generated authentication code can be associated with the operand in different ways. For example, the authentication code could be stored to a particular register having a known association with the register that provides the operand. However, this is not essential, and in some cases the authentication code may be embedded in part of the operand itself. Hence, associating the authentication code with the operand may comprise embedding the authentication code in a portion of more significant bits of the operand. This is useful because, with the authentication code embedded in the operand itself, any subsequent operation that moves the operand from one location to another (e.g. pushing the operand from a register onto a stack data structure in memory) implicitly transfers the authentication code along with the operand, with no separate operation needed to transfer the authentication code. The more significant bits of the operand can be used to represent the authentication code because, although the processor architecture may support addresses with a certain number of bits (e.g. 64 bits), real-world data processing devices are unlikely yet to provide memory storage that uses the whole 64-bit address space. So although an address may have 64 bits, in practice only a smaller number of bits are used, with a number of the most significant bits equal to zero (or some other fixed value). Since some upper bits are unused in practice, these bits can be replaced by the authentication code (the authentication code can be inserted into any subset of these unused bits at the upper end of the address).
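The bit-level arrangement described above can be sketched as follows. The split chosen here (48 used address bits, leaving 16 upper bits of a 64-bit value for the code) is an assumption made for illustration only, as are the names `embed_code` and `split_signed`; the text allows any subset of the unused upper bits to be used.

```python
ADDRESS_BITS = 48                        # assumed number of address bits in use
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1
CODE_BITS = 64 - ADDRESS_BITS            # unused upper bits available for the code


def embed_code(address, code):
    """Place an authentication code in the unused upper bits of an address."""
    if address >> ADDRESS_BITS:
        raise ValueError("address uses bits reserved for the authentication code")
    if code >> CODE_BITS:
        raise ValueError("authentication code does not fit in the unused bits")
    return (code << ADDRESS_BITS) | address


def split_signed(value):
    """Recover (address, code) from a value produced by embed_code."""
    return value & ADDRESS_MASK, value >> ADDRESS_BITS
```

Because the code travels inside the same 64-bit value, storing that value to a stack and loading it back moves the code with it, which is the benefit the paragraph above points out.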

The authentication code generating operation may comprise generating the authentication code according to a cryptographic function based on at least the operand and a key. By generating the authentication code according to a cryptographically secure function (such as QARMA-64, QARMA-128 or SHA256), for example based on a secret key, the authentication code can be generated in a way that makes it hard for an attacker to predict the authentication code corresponding to a given address.

In some examples, the authentication code may also depend on a modifier input to the cryptographic function. The modifier may, for example, be a value associated with the current point of processing, such as the current value of a stack pointer. This can help prevent reuse attacks, in which an attacker obtains a valid operand and authentication code pair used at one point of the program and tries to substitute it for a different operand used at a different point of the program.
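A keyed, modifier-dependent authentication code of the kind described above might be modelled as below. This is a sketch only: it uses HMAC-SHA256 from the Python standard library as a stand-in keyed function (the text names SHA256 among possible cryptographic functions but does not prescribe HMAC or any message layout), and the name `auth_code`, the 8-byte little-endian packing and the default 16-bit truncation are all assumptions.

```python
import hashlib
import hmac


def auth_code(key: bytes, operand: int, modifier: int, bits: int = 16) -> int:
    """Derive a truncated authentication code from a key, operand and modifier.

    operand and modifier are treated as unsigned 64-bit values; the modifier
    could be, for example, the stack pointer value at the point of signing.
    """
    message = operand.to_bytes(8, "little") + modifier.to_bytes(8, "little")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    # Keep only the top `bits` bits of the 256-bit digest.
    return int.from_bytes(digest, "big") >> (256 - bits)
```

Because the modifier is part of the input, a code computed for a return address at one stack depth will in general not validate for the same address presented at a different stack depth, which is the reuse-attack mitigation the paragraph describes. With a short truncated code, accidental matches remain possible with small probability.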

In some examples, for at least one variant of the NOP-compatible instruction, one of the first operation and the second operation comprises a guarded-control-stack (GCS) push operation to push the operand to a GCS data structure for protecting return state information. Such a GCS push operation is another example of a defence against ROP attacks. Rather than relying on assigning an authentication code to protect the return state against tampering, a protected GCS data structure can be established, with at least one defensive measure that restricts the ability to write data to the GCS data structure, providing some additional protection compared with normal memory regions. Again, the GCS push operation can be an example of a function prologue operation, since it can be useful to perform the GCS push operation on the function return address when a function is called. Although such a GCS push operation is useful for security, it has a performance cost, so some use cases with lower security requirements may prefer not to perform it. The GCS push operation is therefore another example of an operation that can be implemented efficiently using the NOP-compatible instruction, so that the same code sequence including the NOP-compatible instruction can be executed in different use cases, with the instruction-function-selecting information controlling whether the GCS push operation is actually carried out.
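The idea of a data structure that accepts dedicated pushes but rejects ordinary writes can be modelled in a few lines. This is a toy illustration under stated assumptions: the class `GuardedControlStack` and its method names are hypothetical, and the single restriction shown (rejecting ordinary stores) is just one possible reading of the "at least one defensive measure" mentioned above.

```python
class GuardedControlStack:
    """Toy model of a stack that only dedicated push/pop operations may change."""

    def __init__(self):
        self._entries = []

    def push(self, value):
        """Model of the dedicated GCS push operation."""
        self._entries.append(value)

    def pop(self):
        """Model of the dedicated GCS pop operation."""
        return self._entries.pop()

    def ordinary_store(self, index, value):
        """Model of a normal store targeting the GCS region: always rejected."""
        raise PermissionError("ordinary stores may not write the GCS region")
```

In this model, a return address pushed in a function prologue cannot be overwritten through `ordinary_store`, so the value later popped is the value that was pushed.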

The NOP-compatible instruction is particularly useful for a variant in which one of the first operation and the second operation comprises an authentication code generating operation to generate an authentication code based on an operand and associate the authentication code with the operand, and the other of the first operation and the second operation comprises a guarded-control-stack (GCS) push operation to push the operand to a GCS data structure for protecting return state information. Since the authentication code generating operation and the GCS push operation can be regarded as alternative techniques for protecting function return state against ROP attacks, they are typically needed at the same point of the program (associated with a function call), so it is useful to combine them into a single instruction while also providing the option of turning off one or both of them. Although the two operations nominally protect against the same class of attack, they can have different advantages and disadvantages, and some developers may wish to include both measures for "defence in depth", so it can be useful to support the option of performing both operations on a function call. By using the NOP-compatible instruction, only a single instruction needs to be executed to perform both types of operation (in the case where the instruction-function-selecting information is in the second state).

Where the first operation and the second operation are the authentication code generation operation and the GCS push operation respectively (or vice versa), different orderings between these operations are possible. Some implementations may therefore allow the ordering to be selected depending on the instruction function selection information.

In response to the NOP-compatible instruction, when the instruction function selection information is in a first sub-state of the second state, the processing circuitry may push both the operand and the authentication code to the GCS data structure (for example, this may correspond to performing the authentication code generation operation first, to embed the authentication code in the operand, and then performing the GCS push operation on the result of the authentication code generation operation). This approach can improve security by using the GCS data structure to protect the authentication code.

In response to the NOP-compatible instruction, when the instruction function selection information is in a second sub-state of the second state, the processing circuitry may push the operand, but not the authentication code, to the GCS data structure. In this case the authentication code generation operation and the GCS push operation can be independent of each other, and so can be performed in parallel or in either order. Supporting the option of performing them in parallel can improve performance, but means that the GCS data structure does not protect the authentication code.
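The two sub-states for the combined generation/push variant can be sketched in Python as follows. This is only an illustrative model under stated assumptions, not the circuitry described here: the names `sign` and `prologue`, the 48-bit address width, the 16-bit code held in the upper bits, and the use of HMAC-SHA256 as the cryptographic function are all hypothetical choices made for the example.

```python
import hashlib
import hmac

ADDR_BITS = 48                      # assumed address width (illustrative only)
ADDR_MASK = (1 << ADDR_BITS) - 1


def sign(addr: int, key: bytes, modifier: int) -> int:
    """Return addr with a 16-bit authentication code in its upper bits (assumed layout)."""
    addr &= ADDR_MASK
    msg = addr.to_bytes(8, "little") + modifier.to_bytes(8, "little")
    code = int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:2], "little")
    return (code << ADDR_BITS) | addr


def prologue(lr: int, gcs: list, key: bytes, sp: int, sub_state: int) -> int:
    """Model the second state: both operations are performed; returns the new LR value."""
    signed_lr = sign(lr, key, sp)   # authentication code generation operation
    if sub_state == 1:
        gcs.append(signed_lr)       # first sub-state: operand and code are both pushed
    else:
        gcs.append(lr)              # second sub-state: operand only; the push is independent
    return signed_lr
```

In the first sub-state the push depends on the result of the code generation, so the two steps are serialized; in the second sub-state the push uses only the original operand, which is what would allow hardware to overlap the two operations.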

In another example, for at least one variant of the NOP-compatible instruction, one of the first operation and the second operation comprises an authentication code checking operation, to check whether an associated authentication code associated with an operand matches an expected authentication code generated based on the operand, and to trigger an error handling response in response to detecting a mismatch between the associated authentication code and the expected authentication code. This operation can be used to check the validity of an authentication code generated by the authentication code generation operation described above. Although it can be performed on any operand, it may typically be performed as a function epilogue operation, to check whether a return address is safe to use (as a defense against ROP attacks, as mentioned above). Hence, the NOP-compatible instruction is useful for the authentication code checking operation for reasons corresponding to those given for the authentication code generation operation.

In some examples, the associated authentication code for the authentication code checking operation may be obtained from a portion of the more significant bits of the operand.

In a way corresponding to the authentication code generation operation, in the authentication code checking operation the expected authentication code may be generated according to a cryptographic function based on at least the operand and a key (and in some cases also based on a modifier, such as a stack pointer).
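The checking operation can be illustrated with a small Python sketch. Everything concrete in it is an assumption for the sake of the example: HMAC-SHA256 stands in for the unspecified cryptographic function, addresses are taken to be 48 bits wide with a 16-bit code in the upper bits, and the names `generate`, `check` and `AuthenticationFault` are hypothetical.

```python
import hashlib
import hmac

ADDR_BITS = 48                      # assumed address width (illustrative only)
ADDR_MASK = (1 << ADDR_BITS) - 1


class AuthenticationFault(Exception):
    """Stands in for the error handling response triggered on a mismatch."""


def expected_code(addr: int, key: bytes, modifier: int) -> int:
    """Derive a 16-bit code from the operand, a key and a modifier (e.g. the stack pointer)."""
    msg = addr.to_bytes(8, "little") + modifier.to_bytes(8, "little")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:2], "little")


def generate(addr: int, key: bytes, modifier: int) -> int:
    """Authentication code generation: embed the code in the upper bits of the operand."""
    addr &= ADDR_MASK
    return (expected_code(addr, key, modifier) << ADDR_BITS) | addr


def check(signed: int, key: bytes, modifier: int) -> int:
    """Authentication code checking: compare the associated code with the expected code."""
    addr = signed & ADDR_MASK
    associated = signed >> ADDR_BITS            # taken from the more significant bits
    if associated != expected_code(addr, key, modifier):
        raise AuthenticationFault("authentication code mismatch")
    return addr
```

A value produced by `generate` passes `check` when the same key and modifier are supplied, while a value whose upper bits have been altered raises `AuthenticationFault`.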

For at least one variant of the NOP-compatible instruction, one of the first operation and the second operation may comprise a guarded-control-stack (GCS) pop operation, to pop function return information from a GCS data structure used to protect the function return information. This can be used to obtain return state information previously pushed to the GCS data structure by an earlier GCS push operation. For reasons similar to those given for the GCS push operation, it is useful to implement the GCS pop operation (an example of a function epilogue operation) using the NOP-compatible instruction.

Again, some variants of the NOP-compatible instruction may support both the authentication code checking operation and the GCS pop operation. Hence, one of the first operation and the second operation may comprise an authentication code checking operation, to check whether an associated authentication code associated with an operand matches an expected authentication code generated based on the operand, and to trigger an error handling response in response to detecting a mismatch between the associated authentication code and the expected authentication code; and the other of the first operation and the second operation may comprise a guarded-control-stack (GCS) pop operation, to pop function return information from a GCS data structure used to protect the function return information. It is useful to combine these operations into the same NOP-compatible instruction because, if both are needed, they will typically be performed at the same point in the program, as function epilogue operations after completion of the main body of the function and before the function return is performed.

The order between the authentication code checking operation and the GCS pop operation can be controlled based on the instruction function selection information. In response to the NOP-compatible instruction, when the instruction function selection information is in a first sub-state of the second state, the processing circuitry may perform the GCS pop operation and perform the authentication code checking operation on a value popped from the GCS data structure by the GCS pop operation. In response to the NOP-compatible instruction, when the instruction function selection information is in a second sub-state of the second state, the processing circuitry may perform the authentication code checking operation on a value in a given register before performing the GCS pop operation, and perform the GCS pop operation to pop the function return information from the GCS data structure into the given register. Again, this provides different options for trading off performance against security.
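The two orderings for the epilogue variant can be sketched as follows. This is a hypothetical Python model: `epilogue` is an invented name, the GCS is modelled as a plain list, and `check` is assumed to be a callable that raises on a mismatch and otherwise returns the value to use.

```python
def epilogue(lr: int, gcs: list, check, sub_state: int) -> int:
    """Return the value left in the given register (here, LR) after the second-state epilogue."""
    if sub_state == 1:
        # First sub-state: pop first, then check the value that was popped.
        return check(gcs.pop())
    # Second sub-state: check the value already in the register first,
    # then pop the function return information into that register.
    check(lr)
    return gcs.pop()
```

In the first sub-state the check depends on the pop, so the value that was guarded by the GCS is the one being authenticated. In the second sub-state the check operates on the register contents and the pop then overwrites the register.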

Where the NOP-compatible instruction implements the GCS push operation or the GCS pop operation as one of the first and second operations, then in response to the NOP-compatible instruction, when the instruction function selection information is in a state indicating that an access to the GCS data structure is to be performed in response to the NOP-compatible instruction (that is, when one of the GCS push operation and the GCS pop operation is to be performed), the processing circuitry may reject a memory access triggered by the NOP-compatible instruction in response to detecting that a memory region corresponding to a target address of the NOP-compatible instruction is specified by memory attribute data as a memory region other than a GCS region for storing the GCS data structure. Hence, an access to a memory region not designated for the GCS data structure can be rejected if the access is triggered by a GCS-access-type instruction. This prevents GCS-access-type instructions (including the NOP-compatible instruction, when it performs the GCS push operation or the GCS pop operation) from being misused to access memory regions not intended for storing the GCS data structure, which can reduce the attack surface available to an attacker.

Similarly, the processing circuitry may reject a write memory access triggered by a non-GCS-access-type instruction in response to detecting that a memory region corresponding to a target address of the non-GCS-access-type instruction is specified by the memory attribute data as the GCS region. By restricting the ability to write to the GCS region to GCS-access-type instructions (including the NOP-compatible instruction when the instruction function selection information is in a state indicating that a GCS access is to be performed), other, more general memory access instructions cannot tamper with the contents of the GCS data structure, giving a stronger security guarantee for the protected return state information stored in the GCS data structure. Again, this reduces the attack surface available to an attacker attempting to mount a ROP attack.
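The two access rules above can be summarized in a short Python sketch. The function name and its boolean parameters are hypothetical, and the treatment of reads to the GCS region by non-GCS-access-type instructions is an assumption, since the text only specifies that such writes are rejected.

```python
def access_permitted(gcs_access_type: bool, is_write: bool, region_is_gcs: bool) -> bool:
    """Decide whether a memory access is allowed, given the region's memory attribute.

    gcs_access_type: the access comes from a GCS-access-type instruction, which
        includes a NOP-compatible instruction configured to perform a GCS push or pop.
    is_write:        the access is a write.
    region_is_gcs:   memory attribute data marks the target region as a GCS region.
    """
    if gcs_access_type:
        # GCS pushes and pops may only touch regions designated as GCS regions.
        return region_is_gcs
    if is_write and region_is_gcs:
        # Ordinary stores may not modify the GCS data structure.
        return False
    # Assumption: all other accesses are left to the normal permission checks.
    return True
```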

The instruction function selection information can be represented in different ways. In some examples, the first, second and third states mentioned earlier correspond to different (possibly arbitrarily chosen) encodings of the instruction function selection information. Hence, any mapping between the first, second and third states and different combinations of bit values of the instruction function selection information may be used.

However, in some examples it can be useful for the instruction function selection information to comprise a first operation indicator, indicating whether the first operation is to be performed in response to the NOP-compatible instruction, and a second operation indicator, indicating whether the second operation is to be performed in response to the NOP-compatible instruction. For example, the instruction function selection information may comprise a set of bits, where each bit corresponds to one of the operations that can be selected for execution in response to the NOP-compatible instruction and indicates whether that operation is to be performed. This encoding can be easy for software developers to understand and simpler for the instruction decoder or the hardware of the processing circuitry to decode, because the selection of each operation depends only on a single indicator (for example, a single bit) and does not require more complex decoding logic.

In some examples, the NOP-compatible instruction may support selecting from more than two operations based on the instruction function selection information. Hence, the instruction function selection information may further indicate whether the processing circuitry should perform a third operation in response to the NOP-compatible instruction. For example, where the instruction function selection information comprises a set of bits each indicating whether a respective operation should be performed in response to the NOP-compatible instruction, it can be relatively efficient to add support for any additional operations desired. Therefore, although the examples discussed below show two operations, the claims are not limited to this and additional operations can also be supported.

The techniques discussed above can be implemented within a data processing apparatus which has hardware circuitry provided for implementing the processing circuitry and the instruction decoder discussed above.

However, the same techniques can also be implemented within a computer program which executes on a host data processing apparatus to provide an instruction execution environment for execution of target code. Such a computer program may control the host data processing apparatus to simulate the architectural environment that would be provided on a hardware apparatus which actually supports target code according to a certain instruction set architecture, even if the host data processing apparatus itself does not support that architecture. The computer program may have instruction decoding program logic and register emulating program logic, which control the host data processing apparatus to emulate the features discussed above, including support for the NOP-compatible instruction as described above. The instruction decoding program logic decodes instructions of the target code and generates instructions of the native architecture supported by the host, to emulate the functionality represented by the decoded instructions of the target code. The register emulating program logic maintains data in storage circuitry of the host data processing apparatus to emulate the contents of the at least one control register, including the register or registers storing the instruction function selection information discussed above. Hence, when target code including the NOP-compatible instruction is executed within the instruction execution environment provided by the simulation computer program executing on the host data processing apparatus, the same functionality as discussed above can be achieved even if the host data processing apparatus itself does not support the NOP-compatible instruction.
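As a rough illustration of the simulation approach, the following Python sketch models a toy simulator. It is not a real emulator: the class name, the mnemonics `SET_SEL` and `NOPC`, the register name `SEL`, and the one-bit-per-operation encoding are all hypothetical, and the two operations are only recorded in a trace rather than actually performed.

```python
class Simulator:
    """Toy instruction execution environment for target code (illustrative only)."""

    def __init__(self):
        # Register emulation: host-side storage standing in for the control register.
        self.control_regs = {"SEL": 0b00}
        self.trace = []

    def step(self, instr):
        """Instruction decoding logic: map one target instruction to host behavior."""
        opcode, arg = instr
        if opcode == "SET_SEL":            # hypothetical control-register write
            self.control_regs["SEL"] = arg & 0b11
        elif opcode == "NOPC":             # hypothetical NOP-compatible instruction
            sel = self.control_regs["SEL"]
            if sel & 0b01:
                self.trace.append(("first_op", arg))
            if sel & 0b10:
                self.trace.append(("second_op", arg))
            # sel == 0b00: nothing is recorded, so the instruction behaves as a NOP
        else:
            raise ValueError(f"unsupported target instruction: {opcode}")

    def run(self, program):
        for instr in program:
            self.step(instr)
```

Running the same `NOPC` instruction before and after a `SET_SEL` write yields different traces, mirroring how identical target code behaves differently depending on the emulated control state.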

Such a simulation program can be useful, for example, when code written for one instruction set architecture is to be executed on a host processor which supports a different instruction set architecture. Also, because executing software in a simulated execution environment allows software testing to proceed in parallel with the ongoing development of hardware devices supporting a new architecture, simulation can allow software development for a newer version of the instruction set architecture to start before processing hardware supporting that version is ready. The simulation program may be stored on a storage medium, which may be a non-transitory storage medium.

Figure 1 schematically illustrates an example of a data processing apparatus 2. The data processing apparatus has a processing pipeline 4 which includes a number of pipeline stages. In this example, the pipeline stages include a fetch stage 6 for fetching instructions from an instruction cache 8; a decode stage 10 for decoding the fetched program instructions to generate micro-operations (decoded instructions) to be processed by the remaining stages of the pipeline; an issue stage 12 for checking whether the operands required for the micro-operations are available in a register file 14 and issuing micro-operations for execution once the required operands for a given micro-operation are available; an execute stage 16 for executing data processing operations corresponding to the micro-operations, by processing operands read from the register file 14 to generate result values; and a writeback stage 18 for writing the results of the processing back to the register file 14. This is only one example of a possible pipeline architecture, and other systems may have additional stages or a different configuration of stages. For example, an out-of-order processor may include a register renaming stage for mapping architectural registers specified by program instructions or micro-operations to physical register specifiers identifying physical registers in the register file 14. In some examples, there may be a one-to-one relationship between the program instructions decoded by the decode stage 10 and the corresponding micro-operations processed by the execute stage. There may also be a one-to-many or many-to-one relationship between program instructions and micro-operations, so that, for example, a single program instruction may be split into two or more micro-operations, or two or more program instructions may be fused to be processed as a single micro-operation.

The execute stage 16 (an example of processing circuitry) includes a number of processing units for executing different classes of processing operation. For example, the execution units may include a scalar arithmetic/logic unit (ALU) 20 for performing arithmetic or logical operations on scalar operands read from the registers 14; a floating-point unit 22 for performing operations on floating-point values; a branch unit 24 for evaluating the outcome of branch operations and adjusting accordingly the program counter which represents the current point of execution; and a load/store unit 26 for performing load/store operations to access data in a memory system 8, 30, 32, 34. A memory management unit (MMU) 28, which is an example of memory management circuitry, is provided to perform address translations between virtual addresses, specified by the load/store unit 26 based on operands of data access instructions, and physical addresses identifying storage locations of data in the memory system. The MMU has a translation lookaside buffer (TLB) 29 for caching address translation data from page tables stored in the memory system, where the page table entries of the page tables define address translation mappings and access permissions which, for example, govern whether a given process executing on the pipeline is allowed to read data from, write data to, or execute instructions from a given memory region. The MMU 28 may have circuitry for requesting memory accesses during a page table walk, when the page table structures are traversed to locate the page table entry corresponding to a required address.
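The interaction between the TLB and the page tables can be sketched with a simplified Python model. The 4 KiB page size, the single-level page table represented as a dictionary, and the names `MMU`, `translate` and `walks` are assumptions for illustration; real page tables are typically multi-level structures held in memory.

```python
PAGE_SHIFT = 12                          # assumed 4 KiB pages (illustrative only)
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class MMU:
    """Simplified model: a TLB caching entries from a single-level page table."""

    def __init__(self, page_table):
        # page_table maps virtual page number -> (physical frame number, permission set)
        self.page_table = page_table
        self.tlb = {}
        self.walks = 0                   # number of page table walks performed

    def translate(self, vaddr: int, access: str) -> int:
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK
        if vpn not in self.tlb:
            # TLB miss: walk the page table and cache the entry that is found.
            self.walks += 1
            if vpn not in self.page_table:
                raise LookupError("translation fault")
            self.tlb[vpn] = self.page_table[vpn]
        pfn, perms = self.tlb[vpn]
        if access not in perms:          # access is "r", "w" or "x"
            raise PermissionError("permission fault")
        return (pfn << PAGE_SHIFT) | offset
```

A second access to the same page hits in the TLB and does not trigger another walk, while an access type missing from the permission set is rejected.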

In this example, the memory system includes a level one data cache 30, a level one instruction cache 8, a shared level two cache 32 and main system memory 34. This is just one example of a possible memory hierarchy, and other arrangements of caches can be provided. The specific types of processing unit 20 to 26 shown in the execute stage 16 are just one example, and other implementations may have a different set of processing units or may include multiple instances of the same type of processing unit, so that multiple micro-operations of the same type can be handled in parallel. Figure 1 is merely a simplified representation of some components of a possible processor pipeline implementation, and the processor may include many other elements not illustrated for conciseness. Although Figure 1 shows a single processor core with access to memory 34, the apparatus 2 may also have one or more further processor cores sharing access to the memory 34, with each core having respective caches 8, 30, 32.

Figure 2 illustrates an example of some of the registers 14 of the apparatus 2. Figure 2 does not show all of the registers; the apparatus may include other registers as well. A set of general purpose registers 50 is provided for storing general purpose operands and results of processing operations. Some of these general purpose registers may also have a more specific function, such as a link register (LR) used to store a function return address, which may be addressable using one of the general purpose register identifiers (for example, register X30). The registers also include a stack pointer (SP) register 52 for storing a stack pointer. The apparatus also has a number of control registers 56 for storing control information used to control the operation of the processor. For example, the control registers 56 may include a guarded control stack (GCS) stack pointer register 58 used to control access to the GCS, as discussed further below with respect to Figure 6, and at least one register providing instruction function selection information 60 used to control the behavior of the NOP-compatible instruction, as discussed further below. Although Figure 2 shows the instruction function selection information 60 as a single register, this information could also be split across two or more registers.

Figure 3 shows an example of a NOP-compatible instruction 70. The instruction encoding includes an opcode 72 identifying the type of instruction, and one or more operand fields 74, 76 for specifying the operands of the instruction. The operands 74, 76 can be specified using immediate values, using register identifiers specifying the registers 14 that store the operands, or using a combination of at least one immediate value and at least one register identifier. In some cases the instruction may also specify an additional destination field identifying the register to which the result of the instruction is to be written, or alternatively the register specified by one of the operand fields may also serve as the destination register. Although Figure 3 shows two operands for the sake of example, other examples of the NOP-compatible instruction may have a greater or smaller number of operands.

Within a single instruction encoding having the same opcode 72 and the same definition of the operand fields 74, 76, the NOP-compatible instruction represents the option either of performing no operation at all (so that the instruction behaves as a NOP instruction) or of performing one or more of at least two different processing operations (including at least a first operation and a second operation). Which combination of operations is selected for execution in response to the NOP-compatible instruction (by the instruction decoder 10 and/or the execute stage 16 of the processing pipeline 4) depends on the instruction function selection information 60 stored in the control registers 56. This information can be set by instructions executed by the processor. For example, a system register modifying instruction (which may be restricted to execution in certain execution states or privilege levels) may be used to set the instruction function selection information 60. Alternatively, there may be a dedicated type of instruction for setting the instruction function selection information, distinct from the system register modifying instructions used to modify other control registers 56. For a use case executing a piece of code that includes the NOP-compatible instruction, some examples may use an earlier instruction of the same piece of code to set the instruction function selection information 60. In other examples, the instruction function selection information 60 may be set by supervisory code which manages execution of the code that includes the NOP-compatible instruction (for example, by an operating system managing execution of an application, or by a hypervisor managing execution of an operating system).

Figure 4 is a flow diagram illustrating a method of processing a NOP-compatible instruction. At step 100, the instruction decoding circuitry 10 determines whether a NOP-compatible instruction has been decoded. If not, the instruction decoding circuitry 10 decodes another type of instruction, controls the processor to perform the operation represented by that instruction, and continues to wait for a NOP-compatible instruction to be decoded.

When a NOP-compatible instruction is decoded, then at step 102 the instruction decoder 10 checks (or controls another part of the processor, such as the execute stage 16, to check) the state of the instruction function selection information 60 stored in the control registers 56.

If the instruction function selection information 60 is in the first state, then at step 104 the instruction decoder 10 controls the execute stage 16 to treat the NOP-compatible instruction as a NOP instruction. Hence, in response to the NOP-compatible instruction, the processing circuitry does not cause any change in architectural state (other than advancing the program counter to point to the next sequential instruction after the NOP-compatible instruction).

If the instruction function selection information 60 is in the second state, then at step 106 processing depends on whether the instruction function selection information is in a first sub-state or a second sub-state of the second state. If the instruction function selection information 60 is in the first sub-state of the second state, then at step 108 the processing circuitry 16 is controlled to perform both the first operation and the second operation, with a first order between the first operation and the second operation. If the instruction function selection information 60 is in the second sub-state of the second state, then at step 110 the processing circuitry 16 is controlled to perform both the first operation and the second operation, with a second order between the first operation and the second operation that differs from the first order. The first and second orders may differ in whether the first and second operations are performed sequentially or in parallel, or in which of the two operations is performed first and which second. The orders may also differ in whether there is any dependency between the first operation and the second operation (that is, whether one operation depends on the result of the other or the two operations are independent).

Support for controlling the order between the first operation and the second operation is optional. In some examples steps 106 and 110 can be omitted, so that when the instruction function selection information is in the second state the method proceeds to step 108 to perform the first operation and the second operation in a default first order.

Specific examples of controlling the order between the operations are described below with respect to Figures 9 to 12.

If, at step 102, the instruction function selection information 60 is in the third state, then at step 112 the processing circuitry 16 is controlled to perform the first operation but not the second operation.

If the instruction function selection information is in a fourth state, then at step 114 the processing circuitry 16 is controlled to perform the second operation but not the first operation. Support for step 114 may be optional, and in some examples it is not possible for the NOP-compatible instruction to perform the second operation without the first operation. For example, in some use cases, if there is only room for four different states of the instruction function selection information (for example, because only 2 bits are available for this information 60), some implementations may prefer to use the fourth encoding of the instruction function selection information 60 to allow two different sub-states of the second state, as shown at steps 108 and 110, so that the order of the operations can be controlled. Other examples may support all of the states and sub-states of Figure 4, and may therefore use instruction function selection information with 3 or more bits to allow for these additional encodings.
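The flow of steps 102 to 114 can be summarized as a dispatch on the state of the selection information. The Python sketch below is a hypothetical model: the enum and function names are invented, the operations are passed in as callables, and the second order is assumed to mean "second operation before first operation", which is only one of the possibilities mentioned above (the orders could instead differ in parallelism or dependency).

```python
from enum import Enum, auto


class SelState(Enum):
    FIRST = auto()         # treat the instruction as a NOP
    SECOND_SUB1 = auto()   # both operations, first order
    SECOND_SUB2 = auto()   # both operations, second order
    THIRD = auto()         # first operation only
    FOURTH = auto()        # second operation only (optional)


def execute_nop_compatible(state: SelState, first_op, second_op) -> None:
    """Dispatch modelled on steps 102 to 114 of the flow diagram."""
    if state is SelState.FIRST:
        return                     # step 104: no effect beyond advancing the PC
    if state is SelState.SECOND_SUB1:
        first_op()                 # step 108: first order
        second_op()
    elif state is SelState.SECOND_SUB2:
        second_op()                # step 110: second order (assumed to be the reverse)
        first_op()
    elif state is SelState.THIRD:
        first_op()                 # step 112
    elif state is SelState.FOURTH:
        second_op()                # step 114
```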

One possible encoding of the instruction-function-selecting information (without support for controlling the order between the first operation and the second operation) is as follows:
• 0b00 - first operation disabled, second operation disabled; the instruction behaves as a NOP (first state);
• 0b01 - first operation enabled, second operation disabled (third state);
• 0b10 - first operation disabled, second operation enabled (fourth state);
• 0b11 - first operation enabled, second operation enabled (second state).

Such an encoding provides a separate bit for each operation, with each bit independently indicating whether the corresponding operation is enabled or disabled. This can be extended to a third or further operation by providing an additional bit per operation (each enabling or disabling the respective additional operation).
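The one-bit-per-operation encoding above can be sketched as a small decoder. This is only an illustrative model: the function name `decode_enables` and the choice of a tuple of booleans as the return value are assumptions made for the sketch, not part of the described apparatus.

```python
def decode_enables(selection: int) -> tuple[bool, bool]:
    """Decode a 2-bit selection value into (first_enabled, second_enabled).

    Hypothetical helper: bit 0 enables the first operation and bit 1 enables
    the second operation, matching the 0b00/0b01/0b10/0b11 list above.
    """
    if not 0 <= selection <= 0b11:
        raise ValueError("selection must be a 2-bit value")
    first_enabled = bool(selection & 0b01)
    second_enabled = bool(selection & 0b10)
    return first_enabled, second_enabled


def is_nop(selection: int) -> bool:
    """The instruction behaves as a NOP when neither operation is enabled."""
    return decode_enables(selection) == (False, False)
```

Because each operation has its own bit, supporting a third operation in this model would only require testing one more bit, without disturbing the meaning of the existing bits.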

Another example encoding of the instruction-function-selecting information dispenses with the ability to perform the second operation without the first operation, and instead uses the fourth encoding of the instruction-function-selecting information 60 to indicate the desired order between the first operation and the second operation when both operations are performed:
• 0b00 - first operation disabled, second operation disabled; the instruction behaves as a NOP (first state);
• 0b01 - first operation enabled, second operation disabled (third state);
• 0b10 - first operation enabled, second operation enabled, first order between the first operation and the second operation (second state, first sub-state);
• 0b11 - first operation enabled, second operation enabled, second order between the first operation and the second operation (second state, second sub-state).

Another example encoding may use more than 2 bits and leave some encodings spare for future use, for example when support for additional operations or configuration options is added (X denotes a bit whose value does not affect the selected state):
• 0bX00 - first operation disabled, second operation disabled; the instruction behaves as a NOP (first state);
• 0bX01 - first operation enabled, second operation disabled (third state);
• 0bX10 - first operation disabled, second operation enabled (fourth state);
• 0b011 - first operation enabled, second operation enabled, first order between the first operation and the second operation (second state, first sub-state);
• 0b111 - first operation enabled, second operation enabled, second order between the first operation and the second operation (second state, second sub-state).
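The 3-bit encoding above can be modeled as follows. This is a sketch under stated assumptions: the text does not say what the "first order" and "second order" actually are, so the sketch assumes that the first order means "first operation, then second operation" and the second order is the reverse. The function name `decode_sequence` is hypothetical.

```python
def decode_sequence(selection: int) -> tuple[str, ...]:
    """Return the operations to perform, in order, for a 3-bit selection value.

    Bits [1:0] enable the first/second operation. Bit 2 only matters when
    both operations are enabled, where it picks the order (assumed mapping:
    0 means first-then-second, 1 means second-then-first).
    """
    if not 0 <= selection <= 0b111:
        raise ValueError("selection must be a 3-bit value")
    first_enabled = bool(selection & 0b001)
    second_enabled = bool(selection & 0b010)
    order_bit = (selection >> 2) & 1

    if first_enabled and second_enabled:
        return ("second", "first") if order_bit else ("first", "second")
    if first_enabled:
        return ("first",)       # third state, bit 2 ignored
    if second_enabled:
        return ("second",)      # fourth state, bit 2 ignored
    return ()                   # first state: behaves as a NOP
```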

It will be appreciated that all of these examples are just some of the ways in which the operations to be performed for the NOP-compatible instruction may be encoded by the instruction-function-selecting information 60.

A variety of processing operations may be used as the first and second operations. However, one particular use case is where function prologue or epilogue operations are performed on a function call or function return respectively. In particular, it can be useful for the first operation and the second operation to be alternative operations acting on function return state to protect against return-oriented programming (ROP) attacks. For example, the first operation and the second operation may be the pointer authentication/checking operations or the GCS push/pop operations described further below.

FIG. 5 illustrates an example of calling a function (labeled fn1 for ease of reference) and returning from the function. A function (also known as a procedure) is a sequence of instructions that can be called from another part of a program and which, when complete, returns processing to the part of the program flow from which the function was called. The same function can be called from a number of different locations in the program, so a function return address is saved when the function is called, allowing the function return to determine which address program flow should return to.

For example, as shown in FIG. 5, a branch-with-link instruction BLR may be executed at the point at which the function is to be called (represented by address #add1), causing program flow to branch to the instruction at a branch target address #add2 specified using an operand of the branch-with-link instruction. The branch-with-link instruction also causes the processing circuitry to set a link register (a designated register used to track the function return address, e.g., one of the general purpose registers mentioned above) to the address of the next instruction after the branch-with-link instruction (in this example, the function return address is #add1+4). After the branch has been taken, a number of instructions (e.g., LD, MUL, ADD, etc.) are executed within the function code, and when the function is complete a return branch instruction RET is executed, which causes a branch to the instruction indicated by the return address stored in the link register.

If no other function is called from within fn1, and no exception occurs before the return branch at the end of fn1 is reached, the address in the link register should still be the same as was set when fn1 was called.

Often, however, the first function fn1 called by the background code may itself call a further function (say fn2) in a nested manner. In this case the function call to fn2 overwrites the return address stored in the link register, so before calling the further function, the function code of fn1 should include an instruction to save the return address from the link register to a data structure in memory (e.g., a stack structure operated in a last-in-first-out (LIFO) manner), and after returning from fn2, the function code of fn1 should restore the return address to the link register before executing its return branch. The responsibility for saving and restoring function return state (such as the return address) typically lies with software (there may be no architecturally mandated hardware mechanism for saving the return address).
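The need to save the link register around a nested call can be illustrated with a toy model. Everything here is an assumption made for illustration: the `Core` class, the call-site addresses 0x1000 and 0x2000, and the fixed 4-byte instruction size are not taken from the source beyond the "next instruction" rule for branch-with-link.

```python
class Core:
    """Toy model of a core with a link register and a software-managed stack."""

    def __init__(self) -> None:
        self.link_register = 0
        self.software_stack: list[int] = []

    def branch_with_link(self, call_site: int) -> None:
        # The link register receives the address of the next instruction.
        self.link_register = call_site + 4


def fn1_return_target(save_and_restore: bool) -> int:
    """Return the address that fn1's RET would branch to."""
    core = Core()
    core.branch_with_link(0x1000)          # background code calls fn1
    if save_and_restore:
        core.software_stack.append(core.link_register)   # fn1 saves LR
    core.branch_with_link(0x2000)          # fn1 calls fn2, overwriting LR
    # ... fn2 runs and returns to 0x2004, inside fn1 ...
    if save_and_restore:
        core.link_register = core.software_stack.pop()   # fn1 restores LR
    return core.link_register
```

With saving and restoring, fn1 returns to the instruction after its own call site; without it, fn1's return branch would reuse the address left behind by the nested call.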

However, while the function return address is stored in memory, it may be vulnerable to an attacker modifying that data, for example using another thread executing on another processor core, or by interrupting the calling function and executing other code in the meantime which overwrites the return address stored in memory. Alternatively, the attacker could execute instructions intended to modify an address operand of the instruction that restores the return address from memory to the register, so that the data loaded from memory differs from the return address originally saved to memory before the nested function was called. If the attacker can cause the return branch to branch to a point of the program flow other than the instruction following the function call branch, the attacker may be able to cause the software to behave incorrectly, and may be able to bypass certain security protections or cause undesired operations to be performed.

A function call is one example of an operation that generates return state information, which provides information on the state to which the processing circuitry is later to be restored. Another scenario in which return state information may be captured is when an exception is taken, where exception handling circuitry provided in hardware, or a software exception handler, may capture exception return state information, such as an exception return address indicating the address of the instruction to be executed after returning from handling the exception, and/or saved processor state information indicating the mode or execution state in which the processor is to execute after returning from the exception. For example, the saved processor state information may indicate the exception level from which the exception was taken, as well as other information about the operating state of the processor at the time the exception was taken. As with function calls, exceptions can be nested, so when a further exception is taken, the exception return state captured for one exception may be saved to memory (either automatically in hardware, or by a software exception handler), and may therefore be vulnerable to tampering by an attacker while it is stored in memory. These types of attack may be referred to as return-oriented programming (ROP) attacks. It may be desirable to provide architectural countermeasures against such attacks.

FIG. 6 illustrates a method of protecting against ROP attacks using a protected data structure 120 in memory called a "guarded control stack (GCS)". The location of the GCS data structure within the memory address space can be selected by software, but the hardware provides architectural features designed to protect the GCS data structure against tampering by a malicious attacker.

As shown in FIG. 2, the registers 14 include control registers 56, which include one or more guarded control stack pointer (GCS pointer) registers 58 for storing a stack pointer indicating an address on the GCS data structure. In some examples, the GCS pointer register 58 may be a banked set of registers, with separate registers provided for at least two execution states (e.g., exception levels), so that software operating in different execution states can reference different GCS structures in memory without needing to reprogram a shared stack pointer register after each transition of execution state. Other examples may use a single GCS pointer register, with software updating the stack pointer stored in the GCS pointer register 58 on transitions between execution states.

As shown in FIG. 6, the GCS data structure 120 is stored in a region of memory designated as a GCS region by a memory attribute specified, directly or indirectly, by the associated page table entry of the page tables used by the memory management unit (MMU) 28 to control address translation and access permission checking. The GCS region attribute may be specified directly within the encoding of the corresponding page table entry, or may be specified indirectly in a register referenced by the page table entry.

When a memory region is identified as a GCS region, write access to that region is restricted to write requests triggered by the processing circuitry 16 when executing one of a restricted subset of GCS access instructions. General-purpose store instructions, which software uses for general store operations not intended to access the GCS structure, are not among the restricted subset of GCS access instructions. The MMU 28 may still allow the GCS structure to be read using general-purpose load instructions, which cause read requests to be issued that are not GCS memory access requests. When a memory access request targets a GCS region, the request is a write request, and the request is not a GCS memory access request triggered by one of the restricted subset of GCS access instructions, the memory access request is rejected and a fault is signaled. The subset of GCS access instructions may include at least one GCS push instruction, which causes return state information (such as a function return address from the link register, or an exception return address or saved processor state captured on taking an exception) to be pushed to a location on the GCS structure determined using the stack pointer indicated in the GCS pointer register 58. The GCS access instructions may also include at least one form of GCS pop instruction, which pops protected return information from the GCS structure.

Conversely, GCS access instructions may not be allowed to access memory regions that are not designated as the GCS region type by the page table attributes. Hence, a fault may be signaled if a GCS access is attempted when the memory region targeted by that access is not marked as the GCS region type. Prohibiting the use of GCS access instructions to access non-GCS regions discourages programmers from using GCS access instructions unless a GCS access is really intended, reducing the attack surface available to an attacker.
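The access rules described for GCS regions can be summarized as a small permission check. This is a simplified sketch: the function name and its three boolean parameters are hypothetical, and real permission checking in an MMU would involve many other attributes not modeled here.

```python
def access_permitted(region_is_gcs: bool, is_write: bool, is_gcs_access: bool) -> bool:
    """Return True if the access is allowed under the GCS rules sketched above.

    region_is_gcs: the target region is marked as the GCS region type.
    is_write:      the request is a write (otherwise it is a read).
    is_gcs_access: the request was triggered by one of the restricted
                   subset of GCS access instructions.
    """
    if is_gcs_access:
        # GCS access instructions may only target GCS regions.
        return region_is_gcs
    if region_is_gcs and is_write:
        # General-purpose stores may not write to a GCS region.
        return False
    # General-purpose loads of a GCS region, and ordinary accesses to
    # non-GCS regions, are not restricted by these rules.
    return True
```

A `False` result corresponds to the case where the request is rejected and a fault is signaled.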

The GCS structure is separate from whatever data structure software uses to maintain saved return state information in memory for handling nesting of function calls or exceptions. Hence, the GCS structure is not intended to remove the need for the software itself to track saving and restoring of return state information when function calls or exceptions are nested (software-triggered saving of return state can continue in the same way as on processors that do not support the architectural GCS protection measures discussed above). Instead, the GCS structure provides a region of protected memory, protected against tampering by compromised code, which can be used to provide information for verifying the return state information that software intends to use to return from a function call or from handling of an exception.

In some implementations, the GCS pop instruction that causes protected return state information to be popped from the GCS structure may also cause the processing circuitry 16 to compare the popped return state with the current return state information held in registers (e.g., the link register 54 for function returns, or an exception return address register and/or saved processor state register for exception returns), and to signal a fault if there is a mismatch between the return state information popped from the GCS structure 120 and the return state information that software intends to use for the function/exception return. Hence, software can be protected against tampering by including instances of GCS push and GCS pop instructions within the code executed on function call/return or exception entry/return.
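The push and compare-on-pop behavior described here can be sketched with a toy guarded stack. The class and exception names are hypothetical, and a Python list stands in for the protected memory region; the sketch only shows the comparison logic, not the memory protection itself.

```python
class GcsFault(Exception):
    """Raised when the popped return state does not match the intended one."""


class GuardedControlStack:
    """Toy model of a GCS that checks return addresses on pop."""

    def __init__(self) -> None:
        self._entries: list[int] = []

    def push(self, return_address: int) -> None:
        # Models a GCS push of the link register on function entry.
        self._entries.append(return_address)

    def pop_and_check(self, intended_return_address: int) -> None:
        # Models a GCS pop that compares the protected value with the
        # value software is about to use for the return.
        protected = self._entries.pop()
        if protected != intended_return_address:
            raise GcsFault(
                f"expected {protected:#x}, got {intended_return_address:#x}"
            )
```

In this model, a return address that was corrupted while held on the ordinary software stack would be caught at `pop_and_check`, because the protected copy still holds the original value.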

Other implementations may define a separate instruction for verifying whether the intended return state information is valid, distinct from the instruction which pops the return state information from the GCS structure 120.

Alternatively, the GCS pop instruction may pop the protected return state from the GCS directly into the one or more registers used to specify the return state for an exception return or function return (or may be combined with an exception/function return instruction, so as to pop the protected return state and use that state to control the exception/function return). In this case, there is no need to verify whether the intended return state information provided by software is valid, because in such an implementation the GCS-protected return state is used directly to control the exception/function return. For example, for GCS protection of function return addresses, the function return address may be popped directly into the link register 54, replacing any software-managed function return address that software may have placed there based on its own managed stack structure.

In addition, other types of GCS access instruction may also be supported. When a GCS mode is enabled (control state in the control registers 56 may control whether the GCS mode is enabled), some instructions (which have another function in a mode in which use of the GCS is disabled) may, when executed, cause the processing circuitry 16 to perform additional functions (such as additional GCS-mode-specific security checks).

In general, providing architectural support for defining a GCS memory region type for the GCS structure 120, and restricting write access to the GCS region type to a limited subset of GCS access instructions (which may not be allowed to access memory regions other than the GCS region type), reduces the attack surface available to an attacker attempting to tamper with the protected return state information stored on the GCS structure 120.

The GCS provides one defense against ROP attacks. Another option for protecting against ROP attacks is to use authentication codes associated with the return state information. FIG. 7 illustrates an example of an authentication code generating operation performed, in response to an authentication code generating instruction, based on a first source operand src1. The source operand could be any value, but it is particularly useful to apply the authentication code generating operation to address pointers, such as function return addresses. The source operand may specify (e.g., by reference to a source register, such as the link register) an address comprising a certain number of bits X, of which only a certain number of bits Y may actually be used for valid addresses (e.g., X may equal 64 and Y may equal 48 or 52). Hence, the upper X-Y bits of the address may by default be set to zero.

In the authentication code generating operation, the source operand may be passed to cryptographic circuitry of the processing circuitry 16 (e.g., the execute stage 16 may include a cryptographic functional unit similar to the other execution units 20, 22, 24, 26 shown in FIG. 1), which applies an authentication code generating function 140 to the first source value, based on a key from key storage and at least one modifier value. The resulting authentication code (PAC) is inserted into the unused upper bits of the pointer address to generate the result of the instruction. The result may, for example, be written back to the same register that stored the source operand. For example, if the source operand is the current function return address stored in the link register 54, the result is written back to the link register 54. The authentication code generating function 140 may use a cryptographic hash function (e.g., SHA256, SHA128, QARMA-128, QARMA-256, etc.) which makes it computationally infeasible to guess the authentication code associated with a particular address without knowing the key. The modifier value may be a value used to tie the particular instance of the authentication code generated by the operation to the particular point of execution reached in the code, reducing the risk of reuse attacks, in which a valid address/PAC pair from one part of the code is improperly substituted for use in another part of the code. For example, a stack pointer from the stack pointer register 52, or a call path indicator representing the history of function calls taken to reach the current point of processing, may be used as the modifier.
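The generating operation can be sketched in software. This is an illustration only, with several assumptions stated plainly: HMAC-SHA256 truncated to 16 bits is swapped in as the keyed function purely because it is available in the Python standard library (the source names other functions, and a hardware implementation would differ); X = 64 and Y = 48 are taken from the example values above; the names `compute_pac` and `sign_pointer` are hypothetical.

```python
import hashlib
import hmac

X_BITS = 64                      # total pointer width (example value from the text)
Y_BITS = 48                      # bits used for valid addresses (example value)
ADDRESS_MASK = (1 << Y_BITS) - 1


def compute_pac(address: int, modifier: int, key: bytes) -> int:
    """Compute an (X - Y)-bit authentication code over the address bits.

    Stand-in function: HMAC-SHA256 truncated to 16 bits, not the function
    a real implementation would use.
    """
    message = (address & ADDRESS_MASK).to_bytes(8, "little")
    message += modifier.to_bytes(8, "little")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "little")    # 16 bits = X_BITS - Y_BITS


def sign_pointer(pointer: int, modifier: int, key: bytes) -> int:
    """Insert the authentication code into the unused upper bits of the pointer."""
    if pointer >> Y_BITS:
        raise ValueError("upper bits are expected to be zero before signing")
    return (compute_pac(pointer, modifier, key) << Y_BITS) | pointer
```

The modifier is passed as an integer so that, for instance, a stack pointer value could be used to bind the code to the current call context.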

FIG. 8 shows the corresponding authentication code checking operation performed on a second source operand src2. The second source operand is expected to be a pointer address which has previously been authenticated by inserting the authentication code PAC into its upper bits in the authentication code generating operation shown in FIG. 7, but if an attacker has modified the pointer, the authentication code (PAC) may not be valid. In the authentication code checking operation, the processing circuitry 16 applies the same authentication code generating function 140 to the address bits of the second source operand (excluding the authentication code PAC), using the key and modifier value corresponding to those expected to have been used when the authentication code represented by the upper bits of the address was generated. The resulting expected authentication code is then compared with the associated authentication code PAC extracted from the upper bits of the second source operand src2, and it is determined whether the expected authentication code matches the associated authentication code. If so, processing is allowed to continue. If there is a mismatch between the expected and associated codes, an error handling response is triggered, for example triggering an exception, or setting the upper bits of the source register to a value corresponding to an invalid address, so that a subsequent access or instruction fetch to that address causes the MMU 28 to trigger a memory fault because an invalid address is accessed (this means that if an address with an incorrect PAC is used for a function return, the subsequent attempt to fetch instructions from that address after the return branch triggers a fault, preventing the processing circuitry from continuing to execute the program beyond the incorrect function return).
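The checking operation can be sketched in the same way as generation. As before, this is a hedged illustration: HMAC-SHA256 truncated to 16 bits is a stand-in keyed function chosen for convenience, X = 64 and Y = 48 are example values, and all names (`expected_pac`, `authenticate_pointer`, `AuthenticationFault`) are hypothetical. Of the two error responses mentioned above, the sketch models the one that triggers an exception.

```python
import hashlib
import hmac

Y_BITS = 48
ADDRESS_MASK = (1 << Y_BITS) - 1


class AuthenticationFault(Exception):
    """Models the error handling response on a PAC mismatch."""


def expected_pac(address: int, modifier: int, key: bytes) -> int:
    # Stand-in keyed function (assumption): HMAC-SHA256 truncated to 16 bits.
    message = (address & ADDRESS_MASK).to_bytes(8, "little")
    message += modifier.to_bytes(8, "little")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "little")


def authenticate_pointer(signed_pointer: int, modifier: int, key: bytes) -> int:
    """Check the PAC in the upper bits and return the plain address if valid."""
    address = signed_pointer & ADDRESS_MASK
    associated = signed_pointer >> Y_BITS            # PAC carried by the pointer
    if associated != expected_pac(address, modifier, key):
        raise AuthenticationFault("authentication code mismatch")
    return address
```

A pointer whose upper bits were altered after signing no longer carries the code that the check recomputes from its address bits, so the return branch is never reached with that pointer.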

Using the authentication code generating and checking operations of FIGS. 7 and 8 allows pointers to be authenticated, making it harder for an attacker to inject an unauthenticated pointer and successfully cause code to branch to the location identified by that pointer, thereby protecting against ROP attacks. Using a cryptographic function as the authentication code generating function 140 makes brute-force guessing of the authentication code associated with a particular address difficult. An instruction for performing the authentication code generating operation can be included in the program code at the point when the pointer address is generated (e.g., between setting the link register in response to a function call and saving the function return address from the link register to memory), and when the address is actually to be used later (e.g., before the function return branch), an authentication code checking instruction AUT can be included to check the authentication code before actually branching to that address.

The authentication code generating function 140 may vary from implementation to implementation, or may be configurable on a given implementation based on control state in the control registers 56. The modifier value used for the authentication code generating function may also be configurable, or may differ between variants of the instructions that perform the authentication code generating/checking operations.

The GCS operations and the authentication code operations described above with respect to FIGS. 6 to 8 can both be seen as defenses against ROP attacks, but some users may prefer one form of defense while other users prefer the other. Some users may prefer to use both in combination as a defense-in-depth approach. In addition, where a piece of code that is sometimes used in a use case needing the security of ROP defenses may also be executed in a use case that does not need that security, it may sometimes be desirable to omit these operations for the sake of performance. Hence, these operations can be useful examples for the first and second operations of the NOP-compatible instruction described above.

In some cases, it may be desirable to provide program binaries that are backwards compatible with older hardware that does not support the GCS and authentication code features, while supporting the new functionality when run on newer hardware that does support these features. Hence, the opcode 72 selected for the NOP-compatible instruction may be one that is treated as a NOP instruction on older hardware.

Typically, when a new architectural feature is added to an ISA, a control register may be used to indicate whether the feature is supported, and software may need to check whether the feature is implemented before using it. This is acceptable for many features, but for features expected to be executed very frequently, such as those used in function prologues and epilogues, it is not feasible for performance reasons. For such features, it can be useful to provide a set of NOP instruction encodings that perform no operation, at least until some functionality is assigned to those encodings on newer hardware supporting an updated architecture. As more features are added to the architecture, more NOP-compatible instructions could be added to the function prologue and epilogue, one for each additional feature, but this would make functions larger as more NOP-compatible instructions are added, incurring costs in cache structures and instruction throughput and therefore reducing performance.

Hence, using the NOP-compatible instruction described above, multiple operations can be overloaded onto a single NOP-compatible encoding, with the instruction-function-selecting information 60 provided in the one or more control registers 56 turning each feature on or off independently.

For example, as discussed above, consider the guarded control stack feature, where it is desired to push the contents of the link register (e.g., the function return address) onto the guarded stack on function entry, and the authentication code generating operation mentioned above (the PAC feature), which signs pointers such as the function return address. Function entry would normally involve two instructions, to perform the GCS push operation (GCSPUSH) and the signing operation (PACIASP). Similarly, on function return, there are corresponding instructions to perform the GCS pop operation (GCSPOP) and the authentication code checking operation (AUTIASP):

myfunc():
    PACIASP LR
    GCSPUSH LR
    ...             // body of the function
    GCSPOP LR
    AUTIASP LR
    RET

In contrast, using the NOP-compatible instruction described above to provide both pieces of functionality allows functions to be smaller, and also allows them to work on both old and new hardware (because old hardware can treat the instruction as a NOP, and even on newer hardware there is a configuration option to disable both operations by setting the instruction-function-selecting information to the first state).

For example, if the combined instruction (assumed in this example to have the PACIASP encoding) performs all of the operations, the function can be smaller:

myfunc():
    PACIASP LR       // also performs GCSPUSH
    ...              // my function body
    AUTIASP LR       // also performs GCSPOP
    RET

Note that AUTIASP is the complement of PACIASP and, in the above example, because it also performs the GCSPOP operation, it is the complement of GCSPUSH as well.

For example, two control bits could be provided to govern the behaviour of these encodings:
00 - PAC disabled, GCS disabled
01 - PAC enabled, GCS disabled
10 - PAC disabled, GCS enabled
11 - PAC enabled, GCS enabled
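The two-bit scheme above can be sketched in Python. This is a minimal illustration, not the patented decoder: the function names are hypothetical, bit 0 is assumed to be the PAC enable and bit 1 the GCS enable (which matches the table), and the order used when both bits are set is an arbitrary choice for this sketch.

```python
def enabled_operations(control_bits):
    """Decode the 2-bit control field into (pac_enabled, gcs_enabled)."""
    if not 0 <= control_bits <= 0b11:
        raise ValueError("control field is only two bits wide")
    pac_enabled = bool(control_bits & 0b01)  # assumed: bit 0 enables PAC
    gcs_enabled = bool(control_bits & 0b10)  # assumed: bit 1 enables GCS
    return pac_enabled, gcs_enabled


def prologue_operations(control_bits):
    """List the operations one prologue encoding performs in each state.

    An empty list means the instruction behaves as a NOP.
    """
    pac_enabled, gcs_enabled = enabled_operations(control_bits)
    performed = []
    if pac_enabled:
        performed.append("PAC")
    if gcs_enabled:
        performed.append("GCSPUSH")
    return performed
```

The same binary therefore behaves as a NOP, a PAC-only, a GCS-only or a combined instruction purely as a function of the control state.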

This can be extended in future with new features, adding control bits to turn functionality on or off, all performed by a single encoding. In addition, the control state could control the relative order in which the operations are performed.

Hence, multiple new features can be turned on and off for the same instruction encoding, without needing to rebuild the program binary.

The above example shows a case where the first operation and the second operation are either:
• for the function prologue variant, the GCS push operation and the authentication code generating operation respectively, or vice versa; or
• for the function epilogue variant, the GCS pop operation and the authentication code checking operation respectively, or vice versa.

Both variants can be supported in the same ISA, with different encodings of the opcode 72 for the function prologue variant and the function epilogue variant respectively.

However, in other examples, it is also possible to provide a NOP-compatible instruction which combines the GCS push operation with an operation other than the authentication code generating operation, or which combines the authentication code generating operation with an operation other than the GCS push operation. Similarly, it is possible to provide a NOP-compatible instruction which combines the GCS pop operation with an operation other than the authentication code checking operation, or which combines the authentication code checking operation with an operation other than the GCS pop operation.

Figures 9 to 12 show different examples of controlling the ordering between the PAC/AUT and GCS operations, in the case where both operations are performed in response to the NOP-compatible instruction.

Figure 9 shows an example in which the NOP-compatible instruction (executed as a function prologue) performs the authentication code generating operation (PAC) and the GCS push operation, with the GCS push operation depending on the result of the authentication code generating operation, so that both the function return address and its associated authentication code are pushed to the GCS. In other words, this is equivalent to performing the PAC operation followed by the GCS push operation, where the destination register of the PAC operation is the same as the source register used for the GCS push operation. In contrast, Figure 10 shows the PAC and GCS push operations performed in a different order, with the GCS push operation independent of the PAC operation. This allows the push of the function return address to the GCS structure to start in parallel with the calculation of the authentication code for the PAC operation. This can be useful because both the PAC generating function 140 and the memory access for the GCS push operation may be relatively slow, so performing them in parallel can improve performance compared to the ordering of Figure 9. On the other hand, the example of Figure 9 can improve security, because the generated authentication code is protected on the GCS as well as the function return address.
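The difference between the two prologue orderings can be sketched as follows. This is an illustrative model only: the 48-bit address width, the 16-bit code width, the use of HMAC-SHA-256 as a stand-in for the PAC generating function 140, the stack-pointer modifier and all function names are assumptions made for this sketch, and a Python list stands in for the GCS structure.

```python
import hashlib
import hmac

ADDR_BITS = 48                      # assumed address width
ADDR_MASK = (1 << ADDR_BITS) - 1


def sign(address, modifier, key):
    """Embed a 16-bit code in the bits above the address (stand-in for PAC)."""
    message = address.to_bytes(8, "little") + modifier.to_bytes(8, "little")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    code = int.from_bytes(digest[:2], "little")
    return (code << ADDR_BITS) | address


def prologue_dependent(link_register, modifier, key, gcs):
    """Figure 9 style: sign first, then push the signed value to the GCS."""
    signed = sign(link_register & ADDR_MASK, modifier, key)
    gcs.append(signed)              # the code is protected on the GCS too
    return signed                   # new link register value


def prologue_independent(link_register, modifier, key, gcs):
    """Figure 10 style: push the plain address; signing does not feed the push."""
    address = link_register & ADDR_MASK
    gcs.append(address)             # could start in parallel with signing
    return sign(address, modifier, key)
```

In both cases the link register ends up holding the same signed value; only the value stored on the GCS differs.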

Similarly, Figures 11 and 12 show alternative orderings for an example in which the NOP-compatible instruction (executed as a function epilogue) performs the GCS pop operation and the authentication code checking (AUT) operation as the first and second operations (either way round: the first operation could be the GCS pop operation and the second the AUT operation, or vice versa). Figure 11 shows an ordering in which the AUT operation depends on the result of the GCS pop operation, because the value popped from the GCS by the GCS pop operation is used as the input to the AUT operation. This approach can be used in an example where the corresponding NOP-compatible instruction executed as the function prologue uses the approach shown in Figure 9. Again, this has the advantage of greater security, because the authentication code is protected against tampering by the GCS memory region protection implemented by the MMU 28. On the other hand, Figure 12 shows an ordering in which the AUT operation and the GCS pop operation are independent, so that they can start in parallel. In this case, the authentication code check can be applied to the value in the link register, and separately the GCS pop operation can pop a value (with no associated authentication code) to a destination register. The destination register of the GCS pop operation can still be the same register that provides the source operand for the AUT operation, provided that the old value in the link register is read for the AUT operation before the value popped from the GCS structure is returned from memory. In this case, if the AUT operation detects a mismatch, a fault or other error response action is taken, and if the codes match, the GCS pop operation is allowed to complete and the address popped by the GCS pop operation can then be used for the function return. Alternatively, the GCS pop operation could pop the protected return address from the GCS structure in memory to a register other than the one checked by the AUT operation, and a comparison between the value checked by the AUT operation and the value popped from the GCS structure could then follow, to confirm whether the GCS-protected address matches the AUT-checked address, providing a further check of whether it is safe to perform the function return based on that address.
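The two epilogue orderings can be sketched in the same spirit. As before, this is a hedged model rather than the actual hardware behaviour: the 48-bit address width, the HMAC-based stand-in for the authentication function, the exception type and the function names are all assumptions. The independent variant shown is the alternative in which the popped value is compared with the authenticated value afterwards.

```python
import hashlib
import hmac

ADDR_BITS = 48                      # assumed address width
ADDR_MASK = (1 << ADDR_BITS) - 1


class AuthenticationFault(Exception):
    """Raised where the text describes a fault or other error response."""


def expected_code(address, key):
    digest = hmac.new(key, address.to_bytes(8, "little"), hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "little")


def sign(address, key):
    return (expected_code(address, key) << ADDR_BITS) | address


def authenticate(signed, key):
    """AUT: compare the embedded code with the expected one, return the address."""
    address = signed & ADDR_MASK
    if (signed >> ADDR_BITS) != expected_code(address, key):
        raise AuthenticationFault("authentication code mismatch")
    return address


def epilogue_dependent(gcs, key):
    """Figure 11 style: the value popped from the GCS is the AUT input."""
    return authenticate(gcs.pop(), key)


def epilogue_independent(link_register, gcs, key):
    """Figure 12 style: AUT checks the link register, the pop is separate."""
    checked = authenticate(link_register, key)
    popped = gcs.pop()              # plain address, no code attached
    if popped != checked:           # optional follow-up comparison
        raise AuthenticationFault("GCS address differs from checked address")
    return popped
```

A tampered code bit in either the popped value or the link register causes `AuthenticationFault` instead of a return address.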

Hence, it can be useful for the instruction decoder 10 and the execute stage 16 to support:
- a first NOP-compatible instruction for use in a function prologue, where one of the first/second operations is the authentication code generating operation (PAC) and the other of the first/second operations is the GCS push operation; and
- a second NOP-compatible instruction for use in a function epilogue (having a different encoding to the first NOP-compatible instruction), where one of the first/second operations is the authentication code checking operation (AUT) and the other of the first/second operations is the GCS pop operation.

The instruction-function-selecting information 60 can specify which (if any) of the first and second operations are to be performed in response to the instruction, and can also control the relative order of the operations as shown in Figures 9 to 12.

Figure 13 is a flow diagram showing steps performed by the MMU 28 to check memory access permissions for a memory access request. At step 200, an instruction which triggers a load/store operation is decoded. The NOP-compatible instruction described above which supports the GCS push or pop operation is treated as a load/store instruction if the instruction-function-selecting information 60 specifies that the GCS push or pop operation is to be performed.

At step 202, it is determined whether the instruction which triggered the load/store operation is a GCS access instruction (one of a restricted subset of instructions allowed to access GCS regions of memory). The NOP-compatible instruction is treated as a GCS access instruction when the instruction-function-selecting information indicates that the GCS push or pop operation is required. Other types of instruction can also be treated as GCS access instructions. In some cases, the NOP-compatible instruction is not treated as a GCS access instruction if the instruction-function-selecting information is in a state indicating that no GCS push or pop operation is required. Alternatively, if the GCS push or pop operation is the only operation capable of generating a load/store operation in response to the NOP-compatible instruction (e.g. the PAC/AUT operations described above cannot generate any load/store request), the NOP-compatible instruction could always be treated as a GCS access instruction, regardless of the value of the instruction-function-selecting information 60.

If the decoded instruction is a GCS access instruction, then at step 204 the MMU 28 checks whether the memory attribute data corresponding to the target address accessed by the load/store operation specifies that the region of memory address space corresponding to the target address is a GCS region. This memory attribute data can be derived from the page table entry corresponding to the target address, or from an indirection register specified by the page table entry (and can be cached in the translation lookaside buffer (TLB) of the MMU 28). If the instruction is a GCS access instruction but the memory attribute data specifies that the region corresponding to the target address is not a GCS region, then at step 206 the load/store operation is rejected and a fault is signalled. This prevents GCS access instructions being used to access non-GCS memory, which improves security by reducing the attack surface available to an attacker. Otherwise, if the GCS access instruction accesses a GCS region, then at step 212 the MMU 28 checks whether the load/store operation passes any other access permission checks. If these other checks fail, then at step 214 the load/store operation is rejected and a fault is signalled. If these other checks pass, then at step 216 the load/store operation is allowed. These other checks could, for example, check read/write permissions indicating whether reads or writes to the memory region are allowed, or check other security attributes unrelated to the GCS region check (e.g. attributes restricting which execution states of the processor are allowed to access the memory region).

If the instruction which triggered the load/store operation is not a GCS access instruction, then at step 208 the MMU 28 checks whether the memory attribute data corresponding to the target address accessed by the load/store operation specifies that the region of memory address space corresponding to the target address is a GCS region. In this case, however, the response is the opposite of that at step 204: the load/store request may be accepted if the non-GCS access instruction does not access a GCS region, and may be rejected if it does. More specifically, if the memory attribute data for the region corresponding to the target address specifies a GCS region, then at step 205 the MMU checks whether the memory access is a write access and, if so, at step 210 the load/store operation is rejected and a fault is signalled. By restricting write access to GCS regions to the dedicated class of GCS access instructions, this prevents the majority of regular store instructions in program code from tampering with the protected return state on the GCS data structure. This reduces the opportunity for an attacker to modify the operands of a regular store instruction in an attempt to corrupt the protected return state on the GCS.

On the other hand, if the non-GCS access instruction does not attempt to access a GCS region (no at step 208), or attempts to access a GCS region but is a read request (no at step 205), the method again proceeds to step 212 to apply any other access permission checks, and the load/store operation is then rejected or allowed depending on the outcome of those checks, as discussed above for steps 212, 214 and 216.
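The decision flow of Figure 13 can be summarised as a small function. This is a sketch of the described steps only: the function name, the boolean parameters and the string results are hypothetical, and the "other access permission checks" of step 212 are collapsed into a single flag.

```python
def check_access(is_gcs_access_instruction, region_is_gcs, is_write,
                 other_checks_pass):
    """Return "allow" or "fault" following steps 202 to 216 of Figure 13."""
    if is_gcs_access_instruction:            # step 202: yes
        if not region_is_gcs:                # step 204: not a GCS region
            return "fault"                   # step 206
    else:                                    # step 202: no
        if region_is_gcs and is_write:       # steps 208 and 205
            return "fault"                   # step 210
    # Step 212: remaining permission checks (read/write permissions, etc.).
    return "allow" if other_checks_pass else "fault"   # steps 216 / 214
```

Note the asymmetry this makes explicit: a regular load may still read a GCS region, but only GCS access instructions may write to one, and GCS access instructions may not touch non-GCS memory at all.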

Figure 14 illustrates a simulator implementation that may be used. While the embodiments described earlier implement the present invention in terms of apparatus and methods for operating specific processing hardware supporting the techniques concerned, it is also possible to provide an instruction execution environment in accordance with the embodiments described herein which is implemented through the use of a computer program. Such computer programs are often referred to as simulators, insofar as they provide a software-based implementation of a hardware architecture. Varieties of simulator computer programs include emulators, virtual machines, models, and binary translators (including dynamic binary translators). Typically, a simulator implementation may run on a host processor 1330, optionally running a host operating system 1320, supporting the simulator program 1310. In some arrangements, there may be multiple layers of simulation between the hardware and the provided instruction execution environment, and/or multiple distinct instruction execution environments provided on the same host processor. Historically, powerful processors have been required to provide simulator implementations which execute at a reasonable speed, but such an approach may be justified in certain circumstances, such as when there is a desire to run code native to another processor for compatibility or re-use reasons. For example, the simulator implementation may provide an instruction execution environment with additional functionality which is not supported by the host processor hardware, or provide an instruction execution environment typically associated with a different hardware architecture. An overview of simulation is given in "Some Efficient Architecture Simulation Techniques", Robert Bedichek, Winter 1990 USENIX Conference, pages 53 to 63.

To the extent that embodiments have previously been described with reference to particular hardware constructs or features, in a simulated embodiment equivalent functionality may be provided by suitable software constructs or features. For example, particular circuitry may be implemented in a simulated embodiment as computer program logic. Similarly, memory hardware, such as a register or cache, may be implemented in a simulated embodiment as a software data structure. In arrangements where one or more of the hardware elements referenced in the previously described embodiments are present on the host hardware (for example, the host processor 1330), some simulated embodiments may make use of the host hardware, where suitable.

The simulator program 1310 may be stored on a computer-readable storage medium (which may be a non-transitory medium), and provides a program interface (instruction execution environment) to the target code 1300 (which may include applications, operating systems and a hypervisor) which is the same as the interface of the hardware architecture being modelled by the simulator program 1310. Thus, the program instructions of the target code 1300, including the NOP-compatible instruction described above, may be executed from within the instruction execution environment using the simulator program 1310, so that a host computer 1330 which does not actually have the hardware features of the apparatus 2 discussed above can emulate those features. Similarly, the memory management checking functionality of Figure 13 may be emulated using the memory management program logic 1318 of the simulator program 1310.

Hence, the simulator program 1310 may have processing program logic 1312 which simulates the state of the processing described above for the hardware apparatus 2. For example, the processing program logic 1312 may control transitions of execution state (e.g. exception level) in response to events occurring during simulated execution of the target code 1300. Instruction decoding program logic 1314 emulates the behaviour of the instruction decoder 10: it decodes the instructions of the target code 1300 and maps them to corresponding sets of instructions in the native instruction set of the host apparatus 1330. Register emulating program logic 1316 maps register accesses requested by the target code onto accesses to corresponding data structures maintained on the host hardware of the host apparatus 1330, such as by accessing data in the registers or memory 1332 of the host apparatus 1330. Memory management program logic 1318 implements address translation, page table walks and access control checks in a manner corresponding to the MMU 28 described in the hardware embodiments above, but also has the additional function of mapping the simulated physical addresses (obtained by address translation based on the page tables defined for the target code 1300) to host virtual addresses used to access the host memory 1332. These host virtual addresses may themselves be translated into host physical addresses using the standard address translation mechanisms supported by the host (the translation of host virtual addresses to host physical addresses being outside the scope of what is controlled by the simulator program 1310).
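The division of labour described above can be illustrated with a toy simulator. This is a deliberately reduced sketch under several assumptions: the class name, the mnemonic `NOPC_PROLOGUE`, the register name `GCSPR`, the downward-growing stack and the identity address mapping are all hypothetical, and only the GCS push half of the instruction is modelled (the PAC half is omitted).

```python
class Simulator:
    """Toy model combining decode, register emulation and memory management."""

    def __init__(self, control_bits, memory_size=64):
        self.control_bits = control_bits
        # Register emulation: target registers live in a host data structure.
        self.registers = {"LR": 0, "GCSPR": memory_size}
        # Host memory backing the simulated physical address space.
        self.host_memory = bytearray(memory_size)

    def _translate(self, simulated_address):
        """Map a simulated physical address to a host buffer offset."""
        if not 0 <= simulated_address <= len(self.host_memory) - 8:
            raise MemoryError("simulated address outside modelled memory")
        return simulated_address            # identity mapping in this sketch

    def execute(self, mnemonic):
        """Decode logic: map one target instruction onto host actions."""
        if mnemonic != "NOPC_PROLOGUE":
            raise ValueError("unsupported instruction in this sketch")
        if self.control_bits & 0b10:        # assumed: bit 1 enables GCS push
            self.registers["GCSPR"] -= 8    # assumed: stack grows downwards
            offset = self._translate(self.registers["GCSPR"])
            value = self.registers["LR"].to_bytes(8, "little")
            self.host_memory[offset:offset + 8] = value
        # With the GCS bit clear, nothing happens: the instruction is a NOP.
```

A host without any GCS hardware can thus reproduce the architectural effect of the instruction purely through ordinary data structure updates.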

In the present application, the words "configured to..." are used to mean that an element of an apparatus has a configuration able to carry out the defined operation. In this context, a "configuration" means an arrangement or manner of interconnection of hardware or software. For example, the apparatus may have dedicated hardware which provides the defined operation, or a processor or other processing device may be programmed to perform the function. "Configured to" does not imply that the apparatus element needs to be changed in any way in order to provide the defined operation.

Although illustrative embodiments of the invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope of the invention as defined by the appended claims.

2: data processing apparatus
4: processing pipeline
6: fetch stage
8: instruction cache / memory system / cache
10: decode stage
12: issue stage
14: register file
16: execute stage
18: write back stage
20: arithmetic/logic unit (ALU) / processing unit / execution unit
22: floating-point unit / processing unit / execution unit
24: branch unit / processing unit / execution unit
26: load/store unit / processing unit / execution unit
28: memory management unit (MMU)
29: translation lookaside buffer (TLB)
30: memory system / cache
32: memory system / cache
34: memory system
50: general purpose registers
52: stack pointer (SP) register
54: link register
56: control register
58: guarded control stack (GCS) stack pointer register / GCS pointer register
60: instruction-function-selecting information
70: NOP-compatible instruction
72: opcode
74, 76: operand fields / operands
120: data structure / GCS structure / GCS data structure
100, 102, 104, 106, 108, 110, 112, 114, 140: steps
140: authentication code generating function
200, 202, 204, 205, 206, 208, 210, 212, 214, 216: steps
1300: target code
1310: simulator program
1312: processing program logic
1314: instruction decoding program logic
1316: register emulating program logic
1318: memory management program logic
1320: host operating system
1330: host processor / host apparatus / host computer
1332: registers or memory / host memory

Further aspects, features and advantages of the present technique will be apparent from the following description of examples, which is to be read in conjunction with the accompanying drawings, in which:
[Figure 1] schematically illustrates an example of a data processing apparatus;
[Figure 2] illustrates an example of registers of the apparatus;
[Figure 3] illustrates an example of a no-operation-compatible (NOP-compatible) instruction;
[Figure 4] illustrates a method of processing the NOP-compatible instruction;
[Figure 5] illustrates an example of function calls and function returns;
[Figure 6] illustrates an example of a guarded control stack (GCS) push operation and a GCS pop operation;
[Figure 7] illustrates an example of an authentication code generating operation;
[Figure 8] illustrates an example of an authentication code checking operation;
[Figure 9] and [Figure 10] illustrate examples of performing the GCS push operation and the authentication code generating operation in different orders;
[Figure 11] and [Figure 12] illustrate examples of performing the GCS pop operation and the authentication code checking operation in different orders;
[Figure 13] illustrates steps for checking whether a memory access is allowed based on memory attribute data; and
[Figure 14] illustrates a simulator example.

100, 102, 104, 106, 108, 110, 112, 114: steps

Claims

An apparatus comprising: an instruction decoder to decode instructions; processing circuitry to perform data processing in response to decoding of the instructions by the instruction decoder; and at least one control register to specify instruction-function-selecting information; in which: in response to a no-operation-compatible instruction, the instruction decoder is configured to control the processing circuitry to: treat the no-operation-compatible instruction as a no-operation instruction, when the instruction-function-selecting information specified by the at least one control register is in a first state; perform both a first operation and a second operation, when the instruction-function-selecting information specified by the at least one control register is in a second state; and perform the first operation but not the second operation, when the instruction-function-selecting information specified by the at least one control register is in a third state.

The apparatus of claim 1, in which, in response to the no-operation-compatible instruction when the instruction-function-selecting information is in a fourth state, the instruction decoder is configured to control the processing circuitry to perform the second operation but not the first operation.

The apparatus of any preceding claim, in which, when the instruction-function-selecting information is in the second state, the processing circuitry is configured to control a relative order in which the first operation and the second operation are applied, based on the instruction-function-selecting information.

The apparatus of any preceding claim, in which the first operation and the second operation are function prologue operations associated with a function call; or the first operation and the second operation are function epilogue operations associated with returning from processing of a function.

The apparatus of any preceding claim, in which, for at least one variant of the no-operation-compatible instruction, one of the first operation and the second operation comprises an authentication code generating operation to generate an authentication code based on an operand and to associate the authentication code with the operand.

The apparatus of claim 5, in which the operand comprises a value obtained from a link register; and in response to a function return branch instruction, the instruction decoder is configured to control the processing circuitry to branch to an address specified in the link register.

The apparatus of any of claims 5 and 6, in which associating the authentication code with the operand comprises embedding the authentication code in a portion of more significant bits of the operand.

The apparatus of any of claims 5 to 7, in which the authentication code generating operation comprises generating the authentication code according to a cryptographic function based on at least the operand and a key.

The apparatus of any preceding claim, in which, for at least one variant of the no-operation-compatible instruction, one of the first operation and the second operation comprises a guarded control stack (GCS) push operation to push an operand to a GCS data structure for protecting return state information.

The apparatus of any preceding claim, in which, for at least one variant of the no-operation-compatible instruction: one of the first operation and the second operation comprises an authentication code generating operation to generate an authentication code based on an operand and to associate the authentication code with the operand; and the other of the first operation and the second operation comprises a guarded control stack (GCS) push operation to push the operand to a GCS data structure for protecting return state information.

The apparatus of claim 10, in which: in response to the no-operation-compatible instruction when the instruction-function-selecting information is in a first sub-state of the second state, the processing circuitry is configured to push both the operand and the authentication code to the GCS data structure; and in response to the no-operation-compatible instruction when the instruction-function-selecting information is in a second sub-state of the second state, the processing circuitry is configured to push the operand, but not the authentication code, to the GCS data structure.

The apparatus of any preceding claim, in which, for at least one variant of the no-operation-compatible instruction, one of the first operation and the second operation comprises an authentication code checking operation to check whether an associated authentication code associated with an operand matches an expected authentication code generated based on the operand, and to trigger an error handling response in response to detecting a mismatch between the associated authentication code and the expected authentication code.

The device of claim 12, wherein the associated authentication code is obtained from a portion of the more significant bits of the operand.
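The authentication code generation and check operations recited above can be pictured with a small software model. The following is only a sketch under stated assumptions: the 48-bit address width, the 16-bit code width, the use of HMAC-SHA256 as the keyed cryptographic function, and the names `sign`, `check`, and `AuthenticationError` are all hypothetical choices made for illustration and are not taken from the claims.

```python
import hashlib
import hmac

ADDR_BITS = 48                    # assumed width of the address portion of the operand
ADDR_MASK = (1 << ADDR_BITS) - 1  # the code is assumed to occupy the bits above this


class AuthenticationError(Exception):
    """Stands in for the error handling response triggered on a mismatch."""


def expected_code(address: int, key: bytes) -> int:
    # Keyed cryptographic function of the operand; HMAC-SHA256 is a stand-in
    # chosen for this sketch, truncated to an assumed 16-bit code.
    digest = hmac.new(key, address.to_bytes(8, "little"), hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "little")


def sign(operand: int, key: bytes) -> int:
    # Generation: associate the code with the operand by placing it in the
    # more significant bits.
    address = operand & ADDR_MASK
    return (expected_code(address, key) << ADDR_BITS) | address


def check(signed_operand: int, key: bytes) -> int:
    # Check: compare the associated code (upper bits) with the expected code
    # regenerated from the operand.
    address = signed_operand & ADDR_MASK
    associated = signed_operand >> ADDR_BITS
    if associated != expected_code(address, key):
        raise AuthenticationError("authentication code mismatch")
    return address
```

In this model a value produced by `sign` passes `check` with the same key, while flipping any bit of the code field raises `AuthenticationError`.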

The device of any one of the preceding claims, wherein, for at least one variant of the no-operation-compatible instruction, one of the first operation and the second operation includes a guarded control stack (GCS) fetch operation to retrieve function return information from a GCS data structure used to protect the function return information.

The device of any one of the preceding claims, wherein, for at least one variant of the no-operation-compatible instruction: one of the first operation and the second operation includes an authentication code check operation to check whether an associated authentication code associated with an operand matches an expected authentication code generated based on the operand, and to trigger an error handling response in response to detecting a mismatch between the associated authentication code and the expected authentication code; and the other of the first operation and the second operation includes a guarded control stack (GCS) fetch operation to retrieve function return information from a GCS data structure used to protect the function return information.

The device of claim 15, wherein: in response to the no-operation-compatible instruction, when the instruction-function-selecting information is in a first sub-state of the second state, the processing circuitry is configured to perform the GCS fetch operation and to perform the authentication code check operation on a value retrieved from the GCS data structure by the GCS fetch operation; and in response to the no-operation-compatible instruction, when the instruction-function-selecting information is in a second sub-state of the second state, the processing circuitry is configured to perform the authentication code check operation on a value in a given register before performing the GCS fetch operation, and to perform the GCS fetch operation to retrieve the function return information from the GCS data structure to the given register.
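The difference between the two sub-states is one of ordering, which the following sketch tries to make concrete. It is an illustrative model only: the GCS is represented as a Python list, the registers as a dictionary, and `toy_sign` / `toy_check` are deliberately simple, non-cryptographic placeholders invented for this example. Writing the checked value back to the register in the first sub-state is also an assumption, since the claim does not say where the result goes.

```python
ADDR_BITS = 48
ADDR_MASK = (1 << ADDR_BITS) - 1
TOY_KEY = 0xA5A5  # hypothetical key for the placeholder check below


class AuthError(Exception):
    """Stands in for the error handling response."""


def toy_sign(address: int) -> int:
    # Placeholder (NOT cryptographic): code is the low 16 address bits XOR a key.
    return (((address ^ TOY_KEY) & 0xFFFF) << ADDR_BITS) | address


def toy_check(value: int) -> int:
    address, code = value & ADDR_MASK, value >> ADDR_BITS
    if code != (address ^ TOY_KEY) & 0xFFFF:
        raise AuthError("authentication code mismatch")
    return address


def fetch_then_check(gcs: list, regs: dict, reg: str) -> None:
    # First sub-state: the check is applied to the value fetched from the GCS.
    value = gcs.pop()
    regs[reg] = toy_check(value)  # assumed destination of the checked value


def check_then_fetch(gcs: list, regs: dict, reg: str) -> None:
    # Second sub-state: the register value is checked first, and only then
    # is the GCS value fetched into that same register.
    toy_check(regs[reg])
    regs[reg] = gcs.pop()
```

In the first case the GCS entry carries the authentication code; in the second the GCS entry is unsigned and the code travels with the register value, so a failed check leaves the GCS untouched.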

The device of any one of claims 9, 10, 11, 14, 15 and 16, wherein, in response to the no-operation-compatible instruction, when the instruction-function-selecting information is in a state indicating that an access to the GCS data structure is to be performed in response to the no-operation-compatible instruction, the processing circuitry is configured to reject a memory access triggered by the no-operation-compatible instruction in response to detecting that a memory region corresponding to a target address of the no-operation-compatible instruction is outside any memory region designated by memory attribute data as a GCS region for storing the GCS data structure.

The device of claim 17, wherein the processing circuitry is configured to reject a write memory access triggered by a non-GCS-access-type instruction in response to detecting that a memory region corresponding to a target address of the non-GCS-access-type instruction is designated by the memory attribute data as a GCS region.
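The two access rules recited above (GCS accesses must land in a GCS region; ordinary writes must not) might be modelled as follows. This is a hedged sketch: the `Region` record, the `is_gcs` flag standing in for the memory attribute data, the `AccessFault` exception, and the treatment of unmapped addresses as faults are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    is_gcs: bool  # models memory attribute data designating a GCS region


class AccessFault(Exception):
    """Stands in for rejecting the memory access."""


def find_region(regions, address: int) -> Region:
    for region in regions:
        if region.base <= address < region.base + region.size:
            return region
    raise AccessFault("unmapped address")  # assumption: not covered by the claims


def check_access(regions, address: int, *, gcs_access: bool, is_write: bool) -> None:
    region = find_region(regions, address)
    if gcs_access and not region.is_gcs:
        # Access by the no-operation-compatible instruction outside a GCS region.
        raise AccessFault("GCS access outside a GCS region")
    if not gcs_access and is_write and region.is_gcs:
        # Write by a non-GCS-access-type instruction into a GCS region.
        raise AccessFault("non-GCS write to a GCS region")
```

Non-GCS reads of a GCS region are allowed in this model only because the claims above do not mention them; a real implementation may treat them differently.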

The device of any one of the preceding claims, wherein the instruction-function-selecting information includes: a first operation indicator indicating whether the first operation is to be performed in response to the no-operation-compatible instruction; and a second operation indicator indicating whether the second operation is to be performed in response to the no-operation-compatible instruction.

The device of any one of the preceding claims, wherein the instruction-function-selecting information further indicates whether the processing circuitry is to perform a third operation in response to the no-operation-compatible instruction.

A method comprising: decoding instructions; and performing data processing in response to the decoding of the instructions; wherein, in response to the decoding of a no-operation-compatible instruction, the method comprises: treating the no-operation-compatible instruction as a no-operation instruction, when instruction-function-selecting information specified by at least one control register is in a first state; performing both a first operation and a second operation, when the instruction-function-selecting information specified by the at least one control register is in a second state; and performing the first operation but not the second operation, when the instruction-function-selecting information specified by the at least one control register is in a third state.
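The three-state behaviour of the method can be sketched in software, assuming (as one possible encoding, not one mandated by the method claim) that the control register carries a separate enable indicator for each operation. The bit positions and the function name `execute_nop_compatible` are hypothetical.

```python
FIRST_OP_ENABLE = 0b01   # hypothetical bit position of the first operation indicator
SECOND_OP_ENABLE = 0b10  # hypothetical bit position of the second operation indicator


def execute_nop_compatible(control_register: int, first_op, second_op) -> list:
    """Return the results of whichever operations the control state enables."""
    performed = []
    if control_register & FIRST_OP_ENABLE:
        performed.append(first_op())
    if control_register & SECOND_OP_ENABLE:
        performed.append(second_op())
    # With both indicators clear nothing runs, so the instruction behaves
    # as a no-operation instruction.
    return performed
```

Under this encoding, both indicators clear corresponds to the first state, both set to the second state, and only the first set to the third state. The remaining combination (only the second indicator set) is not addressed by the method as recited and is left unconstrained in the sketch.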

A computer program comprising instructions which, when executed by a host data processing device, control the host data processing device to provide an instruction execution environment for executing object code, the computer program comprising: instruction decoding program logic to decode instructions of the object code; and register emulating program logic to maintain data in storage circuitry of the host data processing device to emulate at least one control register for specifying instruction-function-selecting information; wherein, in response to a no-operation-compatible instruction, the instruction decoding program logic is configured to control the host data processing device to: treat the no-operation-compatible instruction as a no-operation instruction, when the instruction-function-selecting information is in a first state; perform both a first operation and a second operation, when the instruction-function-selecting information is in a second state; and perform the first operation but not the second operation, when the instruction-function-selecting information is in a third state.

A storage medium storing the computer program of claim 22.