TW201714114A - Microprocessor and method for securely executing instructions therein

Info

Publication number: TW201714114A
Application number: TW105105820A
Authority: TW
Inventors: G. Glenn Henry (Ｇ葛蘭亨利); Terry Parks (泰瑞派克斯); Brent Bean (布蘭特比恩); Thomas A. Crispin (湯姆士Ａ克理斯賓)
Original assignee: VIA Technologies, Inc. (威盛電子股份有限公司)
Priority date: 2015-10-15
Filing date: 2016-02-26
Publication date: 2017-04-16
Also published as: CN105843776B; TWI627556B; TW201715434A; TWI560575B; CN105843776A

Abstract

A microprocessor is provided in which an encrypted program can replace the decryption keys that are used to decrypt sections of the encrypted program. The microprocessor may be decrypting and executing a first section of the encrypted program when it encounters, decrypts, and executes an encrypted store-key instruction to store a new set of decryption keys. After executing the store-key instruction, the microprocessor decrypts and executes a subsequent section of the encrypted program using the new set of decryption keys. On-the-fly key switching may occur numerous times with successive encrypted store-key instructions and successive sets of encrypted instructions.

Description

Microprocessor and method for securely executing instructions therein

The present invention relates to the field of microprocessors, and in particular to increasing the security of programs executed by microprocessors.

Many software programs are vulnerable to attacks that compromise the security of a computer system. For example, an attacker can exploit a buffer overflow vulnerability in a running program to inject malicious code and transfer control to it, so that the injected code takes over the attacked program. One scheme for defending software against such attacks is instruction set randomization. Broadly, instruction set randomization first encrypts the program in some form and then decrypts it inside the processor after the processor fetches it from memory. This makes it difficult for an attacker to inject malicious instructions, because the injected instructions must be properly encrypted (e.g., using the same encryption key or algorithm as the attacked program) in order to execute correctly. See, for example, "Counter Code-Injection Attacks with Instruction-Set Randomization," by Gaurav S. Kc, Angelos D. Keromytis, and Vassilis Prevelakis, CCS '03, October 27-30, 2003, Washington, DC, USA, ACM 1-58113-738-9/03/0010, which describes a modified version of the Bochs-x86 Pentium emulator. The shortcomings of this approach have been widely discussed. See, for example, "Where's the FEEB? The Effectiveness of Instruction Set Randomization," by Ana Nora Sovarel, David Evans, and Nathanael Paul, http://www.cs.virginia.edu/feeb.

Aspects of the invention can be realized in various ways. One is a microprocessor comprising a secure memory and an instruction processing pipeline. The secure memory stores and supplies cryptographic keys used to decrypt encrypted instructions. The instruction processing pipeline fetches instructions from a cache and executes them. It includes a fetch unit, a decryption circuit, and one or more execution units. The fetch unit fetches both unencrypted and encrypted instructions of an instruction set architecture supported by the microprocessor. The instruction set architecture includes a store-key instruction that stores one or more cryptographic keys to the secure memory. The microprocessor supports encrypted store-key instructions.

The decryption circuit decrypts encrypted instructions using the cryptographic keys received from the secure memory. The one or more execution units execute the instructions, or the microinstructions into which the instructions are translated.

When the microprocessor encounters an encrypted store-key instruction, it decrypts that instruction using a first set of one or more cryptographic keys, executes the decrypted store-key instruction, and then uses a second set of one or more cryptographic keys, supplied by that store-key instruction, to decrypt a subsequent set of one or more encrypted instructions. The microprocessor thereby enables an encrypted program to supply successive sets of cryptographic keys for decrypting successive sets of its own instructions.

In one embodiment, the instruction set architecture includes a secure execution mode instruction that requests a switch from a normal execution mode to a secure execution mode. The microprocessor does not decrypt encrypted programs until it has entered the secure execution mode.

In one embodiment, the microprocessor conditionally grants a request to switch to the secure execution mode depending on whether the requesting instruction is in a format that carries an encrypted parameter, whether the instruction is part of a privileged program or process, and whether the encrypted parameter, once decrypted, meets predetermined requirements for running encrypted programs. In one embodiment, the encrypted parameter and the encrypted program are encrypted using different cryptographic mechanisms.

In one embodiment, the store-key instruction supplies the values of the one or more cryptographic keys in an immediate data field.

In one embodiment, the microprocessor executes the decrypted instructions, or the microinstructions translated from them, without exposing the decrypted instructions or microinstructions.

In one embodiment, the secure memory is not accessible via the processor bus and is not part of a cache memory hierarchy. The secure memory is also inaccessible to programs executing in a non-privileged execution mode. Furthermore, in one embodiment, an AES- or RSA-encrypted channel is used to write cryptographic key values into the secure memory.

Aspects of the invention can also be realized as a method for securely executing instructions in a microprocessor. The method first stores a first set of one or more cryptographic keys to a secure memory for use in decrypting encrypted instructions, caches a first set of encrypted instructions, and decrypts the first set of encrypted instructions using the first set of keys. At some point, the method caches an encrypted store-key instruction that stores a second set of one or more cryptographic keys to the secure memory for use in decrypting encrypted instructions. Before the second set of keys is stored, the encrypted store-key instruction is itself decrypted using the first set of keys. The decrypted store-key instruction is then executed, which stores the second set of keys to the secure memory. Afterwards (or concurrently, since the microprocessor is a pipelined processor), a second set of encrypted instructions is cached. Once the second set of keys is available, the method decrypts the second set of encrypted instructions using the second set of keys. Most of these steps can be repeated to process successive encrypted store-key instructions and successive sets of encrypted instructions.
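The method can be illustrated with a small software simulation. This is a minimal sketch under stated assumptions, not the patented hardware: the 5-byte instruction encoding, the `STORE_KEY` and `EMIT` opcodes, the repeating-key XOR cipher, and the function names are all invented for illustration. It shows only the ordering that matters: each store-key instruction is decrypted with the key currently in force, and the key it carries in its immediate field is used for everything after it.

```python
from itertools import cycle

STORE_KEY = 0x01   # hypothetical opcode: immediate field holds the next key
EMIT = 0x02        # hypothetical opcode: stand-in for ordinary work
INSN_LEN = 5       # 1 opcode byte + 4 immediate bytes (assumed encoding)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with key, repeating the key as needed (toy cipher)."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def encrypt_program(plain_insns, first_key: bytes) -> bytes:
    """Encrypt each instruction with the key in force at that point.

    A STORE_KEY instruction is encrypted with the old key and changes the
    key used for every instruction that follows it.
    """
    key, out = first_key, bytearray()
    for opcode, imm in plain_insns:
        out += xor_bytes(bytes([opcode]) + imm, key)
        if opcode == STORE_KEY:
            key = imm
    return bytes(out)


def run_encrypted(image: bytes, first_key: bytes):
    """Decrypt and 'execute' the image one instruction at a time."""
    secure_memory = {"key": first_key}   # stand-in for the secure memory
    emitted = []
    for off in range(0, len(image), INSN_LEN):
        insn = xor_bytes(image[off:off + INSN_LEN], secure_memory["key"])
        opcode, imm = insn[0], insn[1:]
        if opcode == STORE_KEY:
            secure_memory["key"] = imm   # later instructions use the new key
        elif opcode == EMIT:
            emitted.append(imm)
        else:
            raise ValueError("undecodable instruction (wrong key?)")
    return emitted


program = [
    (EMIT, b"sec1"),
    (STORE_KEY, b"\x10\x20\x30\x40"),
    (EMIT, b"sec2"),
    (STORE_KEY, b"\xaa\xbb\xcc\xdd"),
    (EMIT, b"sec3"),
]
image = encrypt_program(program, first_key=b"\x01\x02\x03\x04")
result = run_encrypted(image, first_key=b"\x01\x02\x03\x04")
```

Only the first key has to be provisioned separately; the later keys travel inside the encrypted program itself.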

The invention may be characterized and protected in many ways and is not intended to be limited to the description above. Alternative characterizations may include only a subset of what is summarized here, or a subset combined with other elements not mentioned. The actual scope of patent protection is to be interpreted in light of the description.

100‧‧‧microprocessor
102‧‧‧instruction cache
104‧‧‧fetch unit
106‧‧‧instruction data (may be encrypted)
108‧‧‧decode unit
112‧‧‧execution unit
114‧‧‧retire unit
118‧‧‧general purpose registers
122‧‧‧secure memory area
124‧‧‧key register file
128‧‧‧EFLAGS register
132‧‧‧microcode unit
134‧‧‧fetch address
142‧‧‧master key register
144‧‧‧control register
148‧‧‧E bit
152‧‧‧key expander
154‧‧‧multiplexer
156‧‧‧XOR logic
162‧‧‧plain text instruction data
164‧‧‧fetch address generator
172‧‧‧two keys
174‧‧‧decryption key
176‧‧‧multi-bit binary zeroes
178‧‧‧output of multiplexer 154
212‧‧‧multiplexer A
214‧‧‧multiplexer B
216‧‧‧rotator
218‧‧‧adder/subtractor
234‧‧‧first key
236‧‧‧second key
238‧‧‧output of the rotator
302-316‧‧‧step blocks
402‧‧‧E bit field
408‧‧‧standard x86 flag bits
500‧‧‧load key instruction
502‧‧‧opcode
504‧‧‧key register file destination address
506‧‧‧secure memory area source address
600‧‧‧switch key instruction
602‧‧‧opcode
604‧‧‧key register file index
702-708‧‧‧step blocks
800‧‧‧memory footprint
900‧‧‧branch and switch key instruction
902‧‧‧opcode
904‧‧‧key register file index
906‧‧‧branch information
1002-1018‧‧‧step blocks
1102-1106‧‧‧step blocks
1200‧‧‧branch and switch key instruction
1202‧‧‧opcode
1300‧‧‧block address range table
1302‧‧‧address range
1304‧‧‧key register file index
1402-1418‧‧‧step blocks
1500‧‧‧branch and switch key instruction
1502‧‧‧opcode
1600‧‧‧block address range table
1604‧‧‧secure memory area address
1714‧‧‧step block
1802-1806‧‧‧step blocks
1902-1944‧‧‧step blocks
2002-2008‧‧‧step blocks
2104‧‧‧index
2202-2216‧‧‧step blocks
2302-2316‧‧‧step blocks
2402‧‧‧kill bit
2506‧‧‧step block
2607, 2609‧‧‧step blocks
2702‧‧‧branch target address cache (BTAC)
2706‧‧‧target address
2708‧‧‧taken/not taken indicator
2712‧‧‧key switch logic
2714‧‧‧type indicator
2716‧‧‧key register file index
2802‧‧‧BTAC array
2808‧‧‧BTAC entry
2902‧‧‧valid bit
2904‧‧‧tag field
2906‧‧‧target address
2908‧‧‧taken/not taken field
2912‧‧‧key register file index
2914‧‧‧type field
3002-3004‧‧‧step blocks
3102-3116‧‧‧step blocks
3208-3222‧‧‧step blocks
ZEROES‧‧‧multi-bit binary zero value

FIG. 1 is a block diagram illustrating a microprocessor implemented according to the present invention;
FIG. 2 is a block diagram illustrating the fetch unit of FIG. 1 in detail;
FIG. 3 is a flowchart illustrating operation of the fetch unit of FIG. 2 according to the present invention;
FIG. 4 is a block diagram illustrating the fields of the EFLAGS register of FIG. 1 according to the present invention;
FIG. 5 is a block diagram illustrating the format of a load key instruction according to the present invention;
FIG. 6 is a block diagram illustrating the format of a switch key instruction according to the present invention;
FIG. 7 is a flowchart illustrating operation of the microprocessor of FIG. 1 when executing the switch key instruction of FIG. 6 according to the present invention;
FIG. 8 is a block diagram illustrating the memory footprint of an encrypted program that includes multiple switch key instructions of FIG. 6 according to the present invention;
FIG. 9 is a block diagram illustrating the format of a branch and switch key instruction according to the present invention;
FIG. 10 is a flowchart illustrating operation of the microprocessor of FIG. 1 when executing the branch and switch key instruction of FIG. 9 according to the present invention;
FIG. 11 is a flowchart illustrating operation of a post-processor, implemented as a software tool, that post-processes a program and encrypts it for execution by the microprocessor of FIG. 1 according to the present invention;
FIG. 12 is a block diagram illustrating the format of a branch and switch key instruction according to an alternate embodiment of the present invention;
FIG. 13 is a block diagram illustrating a block address range table according to the present invention;
FIG. 14 is a flowchart illustrating operation of the microprocessor of FIG. 1 when executing the branch and switch key instruction of FIG. 12 according to the present invention;
FIG. 15 is a block diagram illustrating the format of a branch and switch key instruction according to another alternate embodiment of the present invention;
FIG. 16 is a block diagram illustrating a block address range table according to the present invention;
FIG. 17 is a flowchart illustrating operation of the microprocessor of FIG. 1 when executing the branch and switch key instruction of FIG. 15 according to the present invention;
FIG. 18 is a flowchart illustrating, according to an alternate embodiment of the present invention, operation of a post-processor that post-processes a program and encrypts it for execution by the microprocessor of FIG. 1;
FIG. 19 is a flowchart illustrating operation of the microprocessor of FIG. 1 in handling a task switch between an encrypted program and a plain text program according to the present invention;
FIG. 20 is a flowchart illustrating operation of system software executed by the microprocessor of FIG. 1 according to the present invention;
FIG. 21 is a block diagram illustrating the fields of the EFLAGS register of FIG. 1 according to an alternate embodiment of the present invention;
FIG. 22 is a flowchart illustrating operation of the microprocessor of FIG. 1, using the EFLAGS register of FIG. 21, in handling a task switch among multiple encrypted programs according to the present invention;
FIG. 23 is a flowchart illustrating operation of the microprocessor of FIG. 1, using the EFLAGS register of FIG. 21, in handling a task switch among multiple encrypted programs according to the present invention;
FIG. 24 is a block diagram illustrating a single register of the key register file of FIG. 1 according to an alternate embodiment of the present invention;
FIG. 25 is a flowchart illustrating, according to an alternate embodiment of the present invention, operation of the microprocessor of FIG. 1, using the EFLAGS register of FIG. 21 and the key register file of FIG. 24, in handling a task switch among multiple encrypted programs;
FIG. 26 is a flowchart illustrating, according to an alternate embodiment of the present invention, operation of the microprocessor of FIG. 1, using the EFLAGS register of FIG. 21 and the key register file of FIG. 24, in handling a task switch among multiple encrypted programs;
FIG. 27 is a block diagram illustrating an alternate embodiment of a portion of the microprocessor 100 of FIG. 1;
FIG. 28 is a block diagram illustrating in detail the branch target address cache (BTAC) of FIG. 27 according to the present invention;
FIG. 29 is a block diagram illustrating in detail the contents of each entry of the BTAC of FIG. 28 according to the present invention;
FIG. 30 is a flowchart illustrating operation of the microprocessor of FIG. 27 using the BTAC of FIG. 28 according to the present invention;
FIG. 31 is a flowchart illustrating operation of the microprocessor of FIG. 27 using the BTAC of FIG. 28 according to the present invention; and
FIG. 32 is a flowchart illustrating operation of the microprocessor of FIG. 27 on a branch and switch key instruction according to the present invention.

Referring to FIG. 1, a block diagram illustrates a microprocessor 100 implemented according to the present invention. The microprocessor 100 includes a pipeline comprising an instruction cache 102, a fetch unit 104, a decode unit 108, an execution unit 112, and a retire unit 114. The microprocessor 100 also includes a microcode unit 132 that supplies microcode instructions to the execution unit 112. The microprocessor 100 further includes general purpose registers 118 and an EFLAGS register 128, which supply instruction operands to the execution unit 112 and which are updated with instruction results through the retire unit 114. In one embodiment, the EFLAGS register 128 is a modified version of the conventional x86 EFLAGS register, as described in detail below.

The fetch unit 104 fetches instruction data 106 from the instruction cache 102. The fetch unit 104 operates in one of two modes: a decryption mode and a plain text mode. An E bit 148 in a control register 144 of the fetch unit 104 determines whether the fetch unit 104 operates in decryption mode (E bit set) or plain text mode (E bit clear). In plain text mode, the fetch unit 104 treats the instruction data 106 fetched from the instruction cache 102 as unencrypted, or plain text, instruction data and therefore does not decrypt it. In decryption mode, the fetch unit 104 treats the instruction data 106 fetched from the instruction cache 102 as encrypted instruction data that must be decrypted into plain text instruction data using the decryption keys stored in a master key register 142 of the fetch unit 104, as discussed in detail with respect to FIG. 2 and FIG. 3.

The fetch unit 104 also includes a fetch address generator 164 that generates a fetch address 134 used to fetch the instruction data 106 from the instruction cache 102. The fetch address 134 is also supplied to a key expander 152 of the fetch unit 104. The key expander 152 selects two keys 172 from the master key register 142 and performs an operation on them to produce a decryption key 174, which is the first input of a multiplexer 154. The second input of the multiplexer 154 is multi-bit binary zeroes 176. The E bit 148 controls the multiplexer 154: if the E bit 148 is set, the multiplexer 154 selects the decryption key 174; if the E bit 148 is clear, the multiplexer 154 selects the binary zeroes 176. The output 178 of the multiplexer 154 is supplied as a first input to XOR logic 156. The XOR logic 156 performs a Boolean exclusive-OR (XOR) of the fetched instruction data 106 with the multiplexer output 178 to produce plain text instruction data 162. The encrypted instruction data 106 was previously produced by XORing the original plain text instruction data with an encryption key whose value equals that of the decryption key 174. The fetch unit 104 is described in more detail below with respect to FIG. 2 and FIG. 3.
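The data path just described (key expander, multiplexer, XOR logic) can be modeled in a few lines. This is a sketch under stated assumptions: the passage does not specify how the key expander picks or combines the two keys, so the selection by address bits and the rotate-then-add combination in `key_expander` are hypothetical, as are the 16-byte key width and all function names. What the sketch does take from the text is that the E bit selects between the derived key and zeroes, and that XOR with zeroes passes plain text through unchanged.

```python
KEY_BYTES = 16                       # assumed key / fetch-block width
KEY_BITS = 8 * KEY_BYTES
MASK = (1 << KEY_BITS) - 1


def rotate_right(value: int, amount: int) -> int:
    """Rotate a KEY_BITS-wide integer right by `amount` bits."""
    amount %= KEY_BITS
    return ((value >> amount) | (value << (KEY_BITS - amount))) & MASK


def key_expander(master_keys, fetch_address: int) -> bytes:
    """HYPOTHETICAL expansion: choose two master keys from the fetch
    address, rotate the first, and add the second (mod 2**128)."""
    block = fetch_address // KEY_BYTES
    first = master_keys[block % len(master_keys)]
    second = master_keys[(block // len(master_keys)) % len(master_keys)]
    rotated = rotate_right(int.from_bytes(first, "big"), 8 * (block % KEY_BYTES))
    combined = (rotated + int.from_bytes(second, "big")) & MASK
    return combined.to_bytes(KEY_BYTES, "big")


def fetch(instruction_data: bytes, fetch_address: int, master_keys, e_bit: bool) -> bytes:
    """Model of the multiplexer plus XOR logic in the fetch unit."""
    decryption_key = key_expander(master_keys, fetch_address)
    zeroes = bytes(KEY_BYTES)
    mux_output = decryption_key if e_bit else zeroes   # E bit drives the mux
    return bytes(d ^ m for d, m in zip(instruction_data, mux_output))


master_keys = [bytes([0x11 * (i + 1)] * KEY_BYTES) for i in range(4)]
plain = bytes(range(KEY_BYTES))

# XOR is its own inverse, so the same path encrypts and decrypts.
encrypted = fetch(plain, 0x40, master_keys, e_bit=True)
decrypted = fetch(encrypted, 0x40, master_keys, e_bit=True)
passthrough = fetch(plain, 0x40, master_keys, e_bit=False)
```

Because the key is derived from the fetch address, the same instruction bytes placed at a different address would be encrypted differently in this model.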

The plain text instruction data 162 is supplied to the decode unit 108. The decode unit 108 decodes the stream of plain text instruction data 162 and divides it into individual x86 instructions for execution by the execution unit 112. In one embodiment, the decode unit 108 includes buffers or queues that buffer the stream of plain text instruction data 162 before or during decoding. In one embodiment, the decode unit 108 includes an instruction translator that translates the x86 instructions into microinstructions, or micro-ops, for execution by the execution unit 112. For each instruction it outputs, the decode unit 108 also outputs a bit that travels with the instruction down the pipeline and indicates whether the instruction was an encrypted instruction. This bit controls the execution unit 112 and the retire unit 114, allowing them to make decisions and take actions based on whether the instruction was an encrypted instruction or a plain text instruction when it was fetched from the instruction cache 102. In one embodiment, plain text instructions are not permitted to perform certain operations designed specifically for the instruction decryption mode.
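As a rough illustration of the per-instruction bit, the sketch below tags each decoded instruction with whether it came from encrypted instruction data and lets a later stage refuse a decryption-mode-only operation for untagged instructions. The passage does not say which operations are restricted or how a violation is reported, so the `SWITCH_KEY` mnemonic in the restricted set, the `"fault"` result, and the class and function names are assumptions made for this example.

```python
from dataclasses import dataclass

# ASSUMPTION: the text does not list the restricted operations.
DECRYPTION_MODE_ONLY = {"SWITCH_KEY"}


@dataclass(frozen=True)
class DecodedInstruction:
    mnemonic: str
    was_encrypted: bool   # the bit that travels with the instruction


def decode(mnemonics, e_bit: bool):
    """Tag every decoded instruction with the fetch mode it arrived in."""
    return [DecodedInstruction(m, e_bit) for m in mnemonics]


def execute(insn: DecodedInstruction) -> str:
    """Refuse decryption-mode-only operations from plain text instructions."""
    if insn.mnemonic in DECRYPTION_MODE_ONLY and not insn.was_encrypted:
        return "fault"
    return "ok"


from_encrypted = [execute(i) for i in decode(["ADD", "SWITCH_KEY"], e_bit=True)]
from_plain = [execute(i) for i in decode(["ADD", "SWITCH_KEY"], e_bit=False)]
```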

In one embodiment, the microprocessor 100 is an x86 architecture processor, although other processor architectures may be used. A processor is considered an x86 architecture processor if it can correctly execute a majority of the application programs designed to run on an x86 processor. An application program is correctly executed if its expected results are obtained. In particular, the microprocessor 100 executes instructions of the x86 instruction set and has the x86 user-visible register set.

In one embodiment, the microprocessor 100 is designed to provide a comprehensive security architecture, referred to as secure execution mode (SEM), in which programs are executed. According to one embodiment, execution of SEM programs can be triggered by several kinds of processor events and cannot be blocked by normal (non-SEM) operation. In addition, a SEMENABLE instruction can trigger a transition from normal execution mode to secure execution mode. In one embodiment, the SEMENABLE instruction carries an encrypted parameter that has been encrypted with the private key of an authorizing party; this is a cryptographic mechanism different from the symmetric key encryption used to encrypt the encrypted program. Secure code interface logic within the microprocessor 100 uses a public key, stored during manufacturing, to decrypt and authenticate the parameter. Once the parameter has been decrypted, secure execution mode initialization logic initializes the secure mode.
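The private-key/public-key relationship described for the SEMENABLE parameter can be shown with textbook RSA on deliberately tiny numbers. This is an illustration only, not a secure or faithful implementation: the modulus, the exponents, the `MAGIC` marker used to "authenticate" the decrypted value, the 3-bit flags field, and both function names are assumptions, and a real design would use full-size keys and a proper padding or signature scheme.

```python
# Toy textbook RSA parameters (p = 61, q = 53); far too small for real use.
N = 3233          # public modulus
E = 17            # public exponent, assumed to be stored at manufacture
D = 2753          # private exponent, held only by the authorizing party

MAGIC = 0xA5      # HYPOTHETICAL marker proving the parameter decrypted sanely


def authority_encrypt(flags: int) -> int:
    """Authorizing party: wrap 3 bits of flags with its private key."""
    assert 0 <= flags < 8
    return pow((MAGIC << 3) | flags, D, N)


def processor_enable_sem(encrypted_param: int):
    """Processor side: decrypt with the public key and check the marker.

    Returns the flags if the parameter authenticates, otherwise None.
    """
    value = pow(encrypted_param, E, N)
    if value >> 3 != MAGIC:
        return None
    return value & 0b111


accepted = processor_enable_sem(authority_encrypt(5))
rejected = processor_enable_sem(0)   # 0**17 mod N is 0, so the marker is absent
```

The point of the asymmetric scheme is that the processor holds only the public half, so extracting it from a chip does not let anyone mint new parameters.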

In one embodiment, a secure non-volatile memory (not shown), such as flash memory, provided for secure execution mode data can be used to store decryption keys. The secure non-volatile memory is coupled to the microprocessor 100 via a private serial bus, and all data in it is AES-encrypted and signature-verified. In one embodiment, the microprocessor 100 includes a small non-volatile memory (not shown) that can be used to store decryption keys. In one embodiment, this non-volatile memory is a fuse-embodied non-volatile storage, described in detail in U.S. Patent No. 7,663,957, which is incorporated herein by reference in its entirety. An advantage of the instruction decryption feature described herein is that it extends the secure execution mode so that secure programs can be stored in memory outside the microprocessor 100 rather than having to reside entirely within the microprocessor 100. Secure code can therefore make use of the full size and functionality of the memory hierarchy. In one embodiment, some or all architectural exceptions/interrupts (e.g., page faults, debug breakpoints) are disabled while running in secure execution mode. In one embodiment, some or all architectural exceptions/interrupts are disabled while running in decryption mode (i.e., while the E bit 148 is set).

Programs running in secure execution mode can perform many kinds of functions, including critical security tasks such as validating certificates and encrypting data, monitoring system software activity, verifying the integrity of system software, tracking resource usage, controlling the installation of new software, and so on. Examples of the secure execution mode are described in detail in U.S. Patent No. 8,615,799, issued December 24, 2013, which claims priority to U.S. Provisional Application No. 61/055,980, filed May 24, 2008; both documents are incorporated herein by reference in their entirety.

In one embodiment, the microprocessor architecture provides for instruction execution in both the normal mode and the secure mode. When operating in normal mode, none of the resources associated with secure execution of a secure application are observable or operable. Watchdog logic monitors the veracity of the secure code and data, and of environmental and physical attributes, to gather evidence of tampering. The interrupt handling and exception logic provided for the secure execution mode is distinct from the interrupt handling and exception logic of the normal mode.

The microprocessor 100 further includes a key register file 124. The key register file 124 includes a plurality of registers whose stored keys may be loaded into the master key registers 142 of the fetch unit 104 by a switch key instruction (discussed below) in order to decrypt fetched encrypted instruction data 106.

The microprocessor 100 further includes a secure memory area (SMA) 122 for storing decryption keys that are to be loaded into the key register file 124 by the load key instruction 500 of FIG. 5. In one embodiment, access to the secure memory area 122 is restricted to SEM programs; that is, the secure memory area 122 cannot be accessed by programs executing in the normal (non-SEM) execution mode. Furthermore, the secure memory area 122 is not accessible via the processor bus and is not part of the cache memory hierarchy of the microprocessor 100. Thus, for example, a cache flush operation does not cause the contents of the secure memory area 122 to be written to memory. The instruction set architecture of the microprocessor 100 includes specific instructions for reading and writing the secure memory area 122. In one embodiment, the secure memory area 122 comprises a private RAM, as described in U.S. Patent Application 12/034,503, filed February 20, 2008 (published October 16, 2008 as Publication No. 2008/0256336), the contents of which may be applied to the present invention.

Initially, the operating system or another privileged program loads an initial set of keys into the secure memory area 122, the key register file 124, and the master key registers 142. The microprocessor 100 initially uses this initial set of keys to decrypt an encrypted program. In addition, the encrypted program itself may subsequently write new keys to the secure memory area 122, load keys from the secure memory area 122 into the key register file 124 (via the load key instruction), and load keys from the key register file 124 into the master key registers 142 (via the switch key instruction). An advantage of these operations is that the disclosed switch key instruction enables an encrypted program to switch sets of decryption keys on the fly while it is executing, as described in detail below. The new keys may be composed of immediate data within the encrypted program's own instructions. In one embodiment, a field in the header of the program file indicates whether the program's instructions are encrypted.

The technique described in FIG. 1 has several advantages. First, the plaintext instruction data decrypted from the encrypted instruction data 106 is not available outside the microprocessor 100.

Second, the time the fetch unit 104 requires to fetch encrypted instruction data is the same as the time it requires to fetch plaintext instruction data. This property matters for security: if there were a timing difference, an attacker could exploit it to break the encryption.

Third, compared with a conventional design, the instruction decryption technique disclosed herein does not increase the number of clock cycles consumed by the fetch unit 104. As discussed below, the key expander 152 increases the effective length of the decryption key used to decrypt an encrypted program, and does so without making the time required to fetch encrypted program data longer than the time required to fetch plaintext program data. In particular, because the key expander 152 completes its operation within the time required to look up the instruction cache 102 with the fetch address 134 and obtain the instruction data 106, it adds no time to the normal fetch process. Likewise, because the mux 154 and the key expander 152 together complete their operation within that same time, they add no time to the normal fetch process. The XOR logic 156 is the only logic added to the normal fetch path, and its propagation delay is small enough that it does not lengthen the cycle time. Therefore, the disclosed instruction decryption technique does not add clock cycles to the fetch unit 104. This contrasts with the complex decryption mechanisms, such as S-boxes, that conventional techniques apply to decrypt instruction data 106, which increase the cycle time and/or the number of clock cycles required to fetch and decode the instruction data 106.

Referring now to FIG. 2, a block diagram illustrates the fetch unit 104 of FIG. 1 in more detail; in particular, the key expander 152 of FIG. 1 is shown in detail. The advantages of using XOR logic to decrypt the encrypted instruction data 106 were discussed above. However, fast and small XOR logic has a drawback: if the encryption/decryption key is reused, XOR is a weak encryption method. If, on the other hand, the effective length of the key equals the length of the program being encrypted/decrypted, XOR encryption is a very strong form of encryption. The microprocessor 100 includes features that increase the effective length of the decryption key in order to reduce the need for key reuse. First, the values stored in the master key registers 142 are of moderately large size: in one embodiment, they are the size of a fetch quantum, or block, of instruction data 106 fetched from the instruction cache 102, which is 128 bits (16 bytes). Second, the key expander 152 increases the effective length of the decryption key, for example to 2048 bytes in one embodiment, as described in detail below. Third, the encrypted program may change the values in the master key registers 142 during operation by means of the switch key instruction (or variants thereof), as described in detail below.

The embodiment of FIG. 2 uses five master key registers 142, numbered 0 through 4. However, other embodiments may use fewer or more master key registers 142 to increase the decryption key length; for example, one embodiment employs twelve master key registers 142. The key expander 152 includes a first mux A 212 and a second mux B 214 that receive the keys supplied by the master key registers 142. A portion of the fetch address 134 controls the muxes 212/214. In the embodiment of FIG. 2, mux B 214 is a three-to-one mux and mux A 212 is a four-to-one mux. Table 1 shows how the muxes 212/214 select the master key registers 142 (identified by the numbers above) according to their respective select inputs. Table 2 shows how the select inputs are generated, and the resulting combinations of master key registers 142, based on bits [10:8] of the fetch address 134.

The output 236 of mux B 214 is provided to an adder/subtractor 218. The output 234 of mux A 212 is provided to a rotator 216. The rotator 216 receives bits [7:4] of the fetch address 134, which determine the number of bytes by which it rotates the mux output 234. In one embodiment, bits [7:4] of the fetch address 134 are incremented before being provided to the rotator 216 to control the number of bytes rotated, as shown in Table 3. The output 238 of the rotator 216 is provided to the adder/subtractor 218. The adder/subtractor 218 receives bit [7] of the fetch address 134. If bit [7] is clear, the adder/subtractor 218 subtracts the output 238 of the rotator 216 from the output 236 of mux B 214. If bit [7] is set, the adder/subtractor 218 adds the output 238 of the rotator 216 to the output 236 of mux B 214. The output of the adder/subtractor 218 is the decryption key 174 of FIG. 1, which is provided to the mux 154. The operation is described in detail below with reference to the flowchart of FIG. 3.

Referring now to FIG. 3, a flowchart illustrates operation of the fetch unit 104 of FIG. 2 according to the present invention. Flow begins at block 302.

At block 302, the fetch unit 104 applies the fetch address 134 to the instruction cache 102 to begin fetching a 16-byte block of instruction data 106. The instruction data 106 may be encrypted or plaintext, depending on whether it is part of an encrypted program or a plaintext program, as indicated by the E bit 148. Flow proceeds to block 304.

At block 304, based on upper bits of the fetch address 134, mux A 212 and mux B 214 respectively select a first key 234 and a second key 236 from the keys 172 supplied by the master key registers 142. In one embodiment, the fetch address 134 bits are applied to the muxes 212/214 to produce a particular combination of key pair 234/236. In the embodiment of FIG. 2, five master key registers 142 are provided, so there are 10 possible key pairs. To simplify the hardware design, only eight of them are used; this yields an effective key of 2048 bytes, as discussed in detail below. However, other embodiments may use other numbers of master key registers 142. For example, in an embodiment that provides 12 master key registers 142, there are 66 possible combinations of master key registers 142; if 64 of them are used, the resulting effective key is 16384 bytes. In general, if the number of key values is K (e.g., 5) and all combinations are used, and the decryption key and each of the key values are W bytes long (e.g., 16 bytes), the resulting effective key is W² * (K!/(2*(K-2)!)) bytes. Flow proceeds to block 306.
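
The effective key length arithmetic above can be checked with a short calculation. The following Python sketch is illustrative only; the function name and the optional `pairs_used` parameter are assumptions introduced here to cover the embodiments that use only a subset of the possible key pairs.

```python
from math import comb


def effective_key_bytes(num_keys, key_width, pairs_used=None):
    """Effective key length in bytes: W^2 bytes per key pair, times the pairs used.

    num_keys   -- K, the number of master key registers
    key_width  -- W, the width in bytes of each key (and of the decryption key)
    pairs_used -- number of key pairs actually selected; defaults to all of them
    """
    all_pairs = comb(num_keys, 2)  # K! / (2 * (K-2)!)
    if pairs_used is None:
        pairs_used = all_pairs
    if not 0 <= pairs_used <= all_pairs:
        raise ValueError("pairs_used must be between 0 and the number of possible pairs")
    return key_width ** 2 * pairs_used


print(effective_key_bytes(5, 16))       # 2560 (all 10 pairs used)
print(effective_key_bytes(5, 16, 8))    # 2048 (8 of 10 pairs, as in FIG. 2)
print(effective_key_bytes(12, 16, 64))  # 16384 (64 of 66 pairs)
```

Note that the 2048-byte figure of the FIG. 2 embodiment corresponds to eight of the ten possible pairs; the general formula with all pairs used gives 2560 bytes for K = 5 and W = 16.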

At block 306, the rotator 216 rotates the first key 234 by a number of bytes based on bits [7:4] of the fetch address 134. For example, if bits [7:4] of the fetch address 134 have the value 9, the rotator 216 rotates the first key 234 to the right by 9 bytes. Flow proceeds to block 308.

At block 308, the adder/subtractor 218 adds the rotated first key 238 to, or subtracts it from, the second key 236 to produce the decryption key 174 of FIG. 1. In one embodiment, if bit [7] of the fetch address 134 is 1, the adder/subtractor 218 adds the rotated first key 238 to the second key 236; if bit [7] is 0, it subtracts the rotated first key 238 from the second key 236. Flow proceeds to decision block 312.

At decision block 312, the mux 154 determines, based on its control input, which is the E bit 148 supplied by the control register 144, whether the fetched block of instruction data 106 is from an encrypted program or a plaintext program. If the instruction data 106 is encrypted, flow proceeds to block 314; otherwise, flow proceeds to block 316.

At block 314, the mux 154 selects the decryption key 174, and the XOR logic 156 performs a Boolean XOR of the encrypted instruction data 106 with the decryption key 174 to produce the plaintext instruction data 162 of FIG. 1. Flow ends at block 314.

At block 316, the mux 154 selects the 16 bytes of binary zeroes 176, and the XOR logic 156 performs a Boolean XOR of the instruction data 106 (which is plaintext) with the 16 bytes of binary zeroes to produce the same plaintext instruction data 162. Flow ends at block 316.
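
The flow of blocks 302 through 316 can be modeled in software. The following Python sketch is a minimal model under stated assumptions, not the hardware design: the `KEY_PAIRS` table is a hypothetical stand-in for Tables 1 and 2 (which are not reproduced here), the rotation is applied to the byte sequence in index order, the add/subtract is taken to be an independent eight-bit operation on each byte, and the embodiment that increments bits [7:4] before rotating is not modeled.

```python
BLOCK = 16  # bytes per fetch block and per master key

# Hypothetical mapping from fetch-address bits [10:8] to
# (first-key register, second-key register). The real mapping is given by
# Tables 1 and 2; this one only respects the mux widths (four first-key
# choices, three second-key choices, eight pairs in total).
KEY_PAIRS = [(0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 4), (3, 4)]


def rotate_right(key, count):
    """Rotate a byte string right by `count` byte positions."""
    count %= len(key)
    return key[-count:] + key[:-count] if count else key


def derive_key(master_keys, fetch_address):
    """Blocks 304-308: select a key pair, rotate the first key, add or subtract."""
    first_idx, second_idx = KEY_PAIRS[(fetch_address >> 8) & 0x7]
    rotated = rotate_right(master_keys[first_idx], (fetch_address >> 4) & 0xF)
    second = master_keys[second_idx]
    if (fetch_address >> 7) & 1:  # bit [7] set: add
        return bytes((s + r) & 0xFF for s, r in zip(second, rotated))
    return bytes((s - r) & 0xFF for s, r in zip(second, rotated))  # clear: subtract


def fetch_block(data, master_keys, fetch_address, e_bit):
    """Blocks 312-316: XOR with the derived key if E is set, else with zeroes."""
    key = derive_key(master_keys, fetch_address) if e_bit else bytes(BLOCK)
    return bytes(d ^ k for d, k in zip(data, key))


# Five illustrative 16-byte master keys (arbitrary values).
master_keys = [bytes(range(i * BLOCK, (i + 1) * BLOCK)) for i in range(5)]
plaintext = b"0123456789abcdef"
ciphertext = fetch_block(plaintext, master_keys, 0x1234, e_bit=True)
recovered = fetch_block(ciphertext, master_keys, 0x1234, e_bit=True)
```

Because the key depends only on the master keys and the fetch address, the same function serves for encryption and decryption, and the key for any address, including a predicted branch target, can be computed without reference to previously fetched blocks. In this model the key sequence repeats every 2048 bytes, since only address bits [10:4] take part in the derivation.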

As may be observed from FIGS. 2 and 3, the decryption key 174 that is XORed with a fetched block of instruction data 106 is a function of the selected master key pair 234/236 and the fetch address 134. This is quite different from a conventional decryption scheme, in which the decryption key is a function of its previous value and is continually modified to supply the new key for the next cycle. Deriving the decryption key 174 as a function of the master key pair 234/236 and the fetch address 134 has at least two advantages. First, as discussed above, fetching encrypted instruction data takes the same time as fetching plaintext instruction data 106, so the clock cycles required by the microprocessor 100 do not increase. Second, the time required to fetch instruction data 106 does not increase when a branch instruction is encountered in the program. In one embodiment, a branch predictor receives the fetch address 134, predicts whether a branch instruction is present in the block of instruction data 106 at the fetch address 134, and predicts its direction and target address. In the embodiment of FIG. 2, because the decryption key 174 is a function of the master key pair 234/236 and the fetch address 134, the appropriate decryption key 174 for the predicted target address is generated by the time the block of instruction data 106 at the target address arrives at the XOR logic 156. Unlike a conventional decryption key scheme, which would require multiple "rewind" steps to compute the decryption key for the target address, the disclosed technique incurs no additional delay when handling encrypted instruction data.

Furthermore, as shown in FIGS. 2 and 3, the combination of the rotator 216 and the adder/subtractor 218 of the key expander 152 effectively extends the decryption key length beyond the length of the master keys. For example, the selected master keys together contribute 32 bytes (2 * 16 bytes); yet from the standpoint of an attacker attempting to determine the decryption key 174, the rotator 216 and adder/subtractor 218 effectively expand the 32 bytes of master keys in the master key registers 142 into a 256-byte key sequence. More specifically, byte n of the effective expanded key sequence is k0_n ± k1_(n+x), where k0_n is byte n of the first master key 234 and k1_(n+x) is byte n+x of the second master key 236. As described above, the first eight sets of 16-byte decryption keys 174 produced by the key expander 152 are formed by subtraction and the last eight sets by addition. Specifically, Table 3 shows which bytes of each key of the selected master key pair 234/236 are used to produce the decryption key 174 byte for each byte of 16 consecutive 16-byte blocks of instruction data. For example, the notation "15-00" in row 1 of Table 3 means that byte 0 of the second master key 236 is subtracted, via an eight-bit arithmetic operation, from byte 15 of the first master key 234 to produce the effective decryption key 174 byte that is XORed with byte 15 of a 16-byte block of instruction data 106.

Given appropriate master key values, the expanded key produced by the key expander 152 is statistically effective against a common attack on XOR encryption, which involves shifting an encrypted block of the text by the key length and XORing the encrypted blocks together, as discussed in more detail below. The effect of the key expander 152 on a selected master key pair 234/236 is that, in the embodiment described, the span between two instruction data 106 bytes in the program that are encrypted with exactly the same key can be as large as 256 bytes. In other embodiments with different instruction data 106 block sizes and different master key lengths, the maximum span between two instruction data 106 bytes encrypted with the same key may be different.

The number of master key registers 142, and the muxes 212/214 within the key expander 152 that select the master key pair 234/236, also determine how far the effective key length is extended. As discussed above, the embodiment of FIG. 2 provides five master key registers 142, whose contents can therefore be combined in 10 ways, and the muxes 212/214 select eight of those 10 possible combinations. The 256-byte effective key length per key pair 234/236 shown in Table 3, combined with the eight master key pair 234/236 combinations, yields an effective key length of 2048 bytes. That is, the span between two instruction data 106 bytes in the program that are encrypted with exactly the same key can be as large as 2048 bytes.

To further illustrate the advantages provided by the key expander 152, a common attack on XOR encryption is briefly described here. If the key used for XOR encryption is shorter than the program instruction data being encrypted/decrypted, many bytes of the key must be reused, the number depending on the length of the program. This weakness allows an XOR instruction encryption scheme to be broken. First, the attacker attempts to determine the length of the repeated key, denoted n+1 in expressions (1) through (3) below. Second, the attacker assumes that each key-length block of the instruction data is encrypted with the same key. Two key-length blocks of data encrypted according to a conventional XOR encryption are shown here:

(1) b0_n ^ k_n, ..., b0_1 ^ k_1, b0_0 ^ k_0
(2) b1_n ^ k_n, ..., b1_1 ^ k_1, b1_0 ^ k_0

where b0_n is byte n of the first key-length block of data being encrypted, b1_n is byte n of the second key-length block of data being encrypted, k_n is byte n of the key, and ^ denotes XOR. Third, the attacker XORs the two blocks together, which cancels the key components and leaves only:

(3) b0_n ^ b1_n, ..., b0_1 ^ b1_1, b0_0 ^ b1_0

Finally, because the resulting bytes are a function of only two plaintext bytes, the attacker can apply statistical analysis of the frequency of occurrence of plaintext content to attempt to derive the values of the plaintext bytes.
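
The key cancellation that this attack relies on can be demonstrated in a few lines. The following Python sketch is illustrative only; the four-byte key and the sample data are arbitrary values chosen here, and the helper names are assumptions.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_encrypt(data, key):
    """Conventional XOR encryption with a short key that repeats over the data."""
    return bytes(d ^ key[i % len(key)] for i, d in enumerate(data))


key = bytes.fromhex("a55a3cc3")  # arbitrary 4-byte repeating key
data = b"ABCDEFGH"               # two key-length blocks of plaintext
cipher = xor_encrypt(data, key)

p0, p1 = data[:4], data[4:]
c0, c1 = cipher[:4], cipher[4:]

# XORing the two ciphertext blocks cancels the key and leaves only the
# XOR of the two plaintext blocks.
print(xor_bytes(c0, c1) == xor_bytes(p0, p1))  # True
```

The attacker never learns the key in this step; the point is that the key drops out, so what remains can be attacked with plaintext statistics alone.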

However, the pattern of encrypted instruction data 106 bytes produced according to the embodiment of FIGS. 2 and 3 is as shown in expressions (4) and (5):

where the terms of expressions (4) and (5) denote, respectively, byte n of the first 16-byte block of instruction data being encrypted, byte n of the second 16-byte block of instruction data being encrypted, byte n of master key x, and byte n of master key y. As stated above, master keys x and y are different keys. Assuming an embodiment in which five master key registers 142 provide eight master key pair 234/236 combinations, each byte of the 2048-byte sequence is XORed with a combination of two independent master key bytes. Consequently, when the encrypted data is shifted in any manner within a 256-byte block and XORed together, the resulting bytes always contain a complex component of the two master keys; thus, unlike expression (3), the result is not simply a function of plaintext bytes. For example, suppose the attacker chooses to align 16-byte blocks within the same 256-byte block and XOR them together so that the same byte 0 of a key is used in each term. The result for byte 0 is shown in expression (6), where n is not 1; the resulting byte contains a complex combination of the two master keys.

Furthermore, if the attacker instead aligns 16-byte blocks selected from different 256-byte blocks and XORs them together, the result for byte 0 is shown in expression (7), in which at least one of master keys u and v is different from master keys x and y. Simulations of XORing the effective key bytes generated from random master key values show that the results have a fairly smooth distribution.

Of course, if the attacker chooses to align 16-byte blocks from different 2048-byte blocks and XOR them together, the attacker may obtain results similar to expression (3). However, consider the following. First, some programs, such as security-related programs, may be shorter than 2048 bytes. Second, the statistical correlation between instruction bytes that are 2048 bytes apart is likely to be very small, making the encryption difficult to break. Third, as described above, embodiments may provide a larger number of master key registers 142 to extend the effective length of the decryption key; for example, 12 master key registers 142 provide a decryption key with an effective length of 16384 bytes, and even longer keys are possible. Fourth, the load key instruction 500 and switch key instruction 600 discussed below further enable the programmer to load new values into the master key registers 142 to effectively extend the key length beyond 2048 bytes or, if necessary, to the full length of the program.

Referring now to FIG. 4, a block diagram illustrates the flags register 128 of FIG. 1 according to the present invention. In the embodiment of FIG. 4, the flags register 128 includes the bits 408 of the standard x86 flags register; however, for the new features described herein, the embodiment of FIG. 4 uses a bit that is normally RESERVED in the x86 architecture. Specifically, the flags register 128 includes an E bit field 402. The E bit field 402 is used to restore the value of the E bit 148 of the control register 144 in order to switch between encrypted and plaintext programs and/or between different encrypted programs, as discussed in detail below. The E bit field 402 indicates whether the currently executing program is encrypted: it is set if the program is encrypted and clear otherwise. When an interrupting event occurs that switches control to another program (e.g., an interrupt, an exception such as a page fault, or a task switch), the flags register 128 is saved. Conversely, when control returns to the program that was interrupted, the flags register 128 is restored. The microprocessor 100 is designed to update the E bit 148 of the control register 144 with the value of the E bit field 402 of the flags register 128 when the flags register 128 is restored, as discussed in detail below. Therefore, if an encrypted program was executing when the interrupting event occurred (i.e., the fetch unit 104 was in decryption mode), then when control returns to the encrypted program, the restored E bit field 402 sets the E bit 148, returning the fetch unit 104 to decryption mode. In one embodiment, the E bit 148 and the E bit field 402 are the same physical hardware bit, so that saving the value of the E bit field 402 of the flags register 128 saves the E bit 148, and restoring the value of the E bit field 402 restores the E bit 148.
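
The save and restore behavior of the E bit can be sketched with a small software model. The following Python sketch is illustrative only and models the embodiment in which the E bit 148 and the E bit field 402 are the same bit. The bit position `E_BIT_MASK`, the class and method names, and the choice to clear the E bit on entry to the handler (that is, the assumption that the handler is a plaintext program) are all assumptions made here for illustration.

```python
E_BIT_MASK = 1 << 31  # hypothetical position; the text says only "a normally reserved bit"


class FlagsModel:
    """Toy model of flags save/restore around an interrupting event."""

    def __init__(self):
        self.flags = 0
        self.saved = []  # stands in for the stack on which the flags are saved

    @property
    def e_bit(self):
        """True when the fetch unit would be in decryption mode."""
        return bool(self.flags & E_BIT_MASK)

    def set_e_bit(self, value):
        if value:
            self.flags |= E_BIT_MASK
        else:
            self.flags &= ~E_BIT_MASK

    def interrupt(self):
        """Save the flags, then run the handler as plaintext (assumption)."""
        self.saved.append(self.flags)
        self.set_e_bit(False)

    def interrupt_return(self):
        """Restore the flags, which also restores the E bit."""
        self.flags = self.saved.pop()


cpu = FlagsModel()
cpu.set_e_bit(True)     # an encrypted program is running
cpu.interrupt()         # control passes to a handler
in_handler = cpu.e_bit  # False in this model
cpu.interrupt_return()  # control returns to the encrypted program
after_return = cpu.e_bit
```

The model shows why no separate mechanism is needed to re-enter decryption mode: restoring the flags is sufficient.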

Referring to Figure 5, a block diagram illustrates the format of a key load instruction 500 according to the present invention. The key load instruction 500 includes an opcode field 502 that uniquely identifies it as the key load instruction 500 within the instruction set of the microprocessor 100. In one embodiment, the value of the opcode field 502 is 0FA6/4 (in x86 notation). The key load instruction 500 has two operands: a key register file destination address 504 and a secure storage area source address 506. The secure storage area source address 506 is an address in the secure storage area 122 at which a 16-byte master key is stored. The key register file destination address 504 identifies the register in the key register file 124 into which the 16-byte master key from the secure storage area 122 is loaded. In one embodiment, if a program attempts to execute the key load instruction 500 when the microprocessor 100 is not in the secure operating mode, an invalid instruction exception is raised; in addition, if the secure storage area source address 506 lies outside the valid secure storage area 122, a general protection exception is raised. In one embodiment, if a program attempts to execute the key load instruction 500 when the microprocessor 100 is not at the highest privilege level (e.g., x86 ring 0), an invalid instruction exception is raised. In some cases, the 16-byte master key may be assembled from the immediate data fields of encrypted instructions: the immediate data can be moved piece by piece into the secure storage area 122 to build up the 16-byte key.
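The checks described for the key load instruction 500 can be summarized in a small software model. This is only an illustrative sketch: the class and exception names, the number of key registers, and the size of the secure storage area are assumptions made for the example, and the secure-mode and privilege-level checks, which the text presents as separate embodiments, are combined here for brevity.

```python
KEY_BYTES = 16  # a master key is 16 bytes


class InvalidInstruction(Exception):
    """Stands in for the invalid instruction exception."""


class GeneralProtection(Exception):
    """Stands in for the general protection exception."""


class KeyLoadModel:
    def __init__(self, secure_storage, num_key_registers=8):
        # The register count and storage size are assumptions for illustration.
        self.secure_storage = bytes(secure_storage)
        self.key_register_file = [bytes(KEY_BYTES) for _ in range(num_key_registers)]
        self.secure_mode = True
        self.privilege_level = 0  # 0 models the highest privilege level (x86 ring 0)

    def load_key(self, dest_index, src_addr):
        # Not in secure mode, or not at the highest privilege level.
        if not self.secure_mode or self.privilege_level != 0:
            raise InvalidInstruction("key load not permitted in this mode")
        # Source address outside the valid secure storage area.
        if src_addr < 0 or src_addr + KEY_BYTES > len(self.secure_storage):
            raise GeneralProtection("source address outside secure storage area")
        # Copy one 16-byte master key into the selected key register.
        self.key_register_file[dest_index] = self.secure_storage[src_addr:src_addr + KEY_BYTES]
```

In this model, a call such as `KeyLoadModel(storage).load_key(2, 16)` copies bytes 16 through 31 of the modeled secure storage area into key register 2.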

Referring now to Figure 6, a block diagram illustrates the format of a key switch instruction 600 according to the present invention. The key switch instruction 600 includes an opcode field 602 that uniquely identifies it as the key switch instruction 600 within the instruction set of the microprocessor 100. The key switch instruction 600 also includes a key register file index field 604 that specifies the first of a sequence of registers in the key register file 124 from which keys are loaded into the master key registers 142. In one embodiment, if a program attempts to execute a key switch instruction 600 when the microprocessor 100 is not in the secure operating mode, an invalid instruction exception is raised. In one embodiment, if a program attempts to execute a key switch instruction 600 when the microprocessor 100 is not at the highest privilege level (e.g., x86 ring 0), an invalid instruction exception is raised. In one embodiment, the key switch instruction 600 is atomic, i.e., non-interruptible; the same is true of the other instructions discussed herein that load keys into the master key registers 142, such as the branch and switch key instructions discussed below.

Referring now to Figure 7, a flowchart illustrates operation of the microprocessor 100 of Figure 1 to execute the key switch instruction 600 of Figure 6 according to the present invention. Flow begins at block 702.

At block 702, the decoding unit 108 decodes a key switch instruction 600 and traps to the microcode routine in the microcode unit 132 that implements the key switch instruction 600. Flow proceeds to block 704.

At block 704, the microcode loads the master key registers 142 from the key register file 124 according to the key register file index field 604. Preferably, starting with the key register specified by the key register file index field 604, the microcode loads the contents of n consecutive registers of the key register file 124 into the master key registers 142 as n keys, where n is the number of master key registers 142. In one embodiment, n may be specified in an additional field of the key switch instruction 600 and may be less than the number of master key registers 142. Flow proceeds to block 706.

At block 706, the microcode causes the microprocessor 100 to branch to the next sequential x86 instruction (i.e., the instruction following the key switch instruction 600). This flushes all x86 instructions in the microprocessor 100 that are newer than the key switch instruction 600, and likewise flushes all micro-operations in the microprocessor 100 that are newer than the micro-operation that branches to the next sequential x86 instruction. The flushed instructions include all instruction bytes 106 that have been fetched from the instruction cache 102 and are buffered in the fetch unit 104 and the decoding unit 108 awaiting decryption and decoding. Flow proceeds to block 708.

At block 708, as a result of the branch to the next sequential instruction at block 706, the fetch unit 104 begins fetching instruction data 106 from the instruction cache 102 and decrypting it using the new set of key values loaded into the master key registers 142 at block 704. Flow ends at block 708.
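The flow of blocks 702 through 708 can be sketched as a simple software model. The names `FetchModel`, `switch_key`, and `pending_bytes` are hypothetical and exist only for this illustration; real hardware performs the flush as a side effect of the microcode branch rather than by clearing a list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FetchModel:
    key_register_file: List[bytes]   # models key register file 124
    master_keys: List[bytes]         # models master key registers 142
    pending_bytes: List[int] = field(default_factory=list)  # fetched, not yet executed

    def switch_key(self, index: int, n: Optional[int] = None) -> None:
        # Block 704: load n consecutive key registers starting at `index`.
        if n is None:
            n = len(self.master_keys)
        if n > len(self.master_keys) or index + n > len(self.key_register_file):
            raise ValueError("key range out of bounds")
        self.master_keys[:n] = self.key_register_file[index:index + n]
        # Block 706: the branch to the next instruction flushes everything
        # fetched after the key switch instruction.
        self.pending_bytes.clear()
        # Block 708: later fetches are decrypted with the new master keys.
```

Passing a smaller `n` models the embodiment in which the instruction carries an additional field that replaces only some of the master keys.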

As shown in Figure 7, the key switch instruction 600 allows an encrypted program, while it is being fetched from the instruction cache 102 and executed, to change the contents of the master key registers 142 that are used to decrypt it. Updating the master key registers 142 on the fly in this way makes the effective key length used to encrypt the program greater than the length the fetch unit 104 natively supports (e.g., the 2048 bytes provided by the embodiment of Figure 2). When a program such as the one shown in Figure 8 runs on the microprocessor 100 of Figure 1, it becomes harder for an attacker to defeat the security of the computer system.

Referring now to Figure 8, a block diagram illustrates a memory footprint 800 of an encrypted program that employs the key switch instruction 600 of Figure 6 according to the present invention. The encrypted program memory footprint 800 of Figure 8 comprises a number of consecutive "chunks" of instruction data bytes. Each chunk is a sequence of instruction data bytes (encrypted in advance), and the instruction data bytes belonging to the same chunk are decrypted with the same set of master key register 142 values. The boundary between two different chunks is therefore defined by a key switch instruction 600. That is, the upper and lower bounds of each chunk are marked by the locations of key switch instructions 600 (except that the upper bound of the first chunk of a program is the beginning of the program, and the lower bound of the last chunk is the end of the program). Each chunk of instruction data bytes is thus decrypted by the fetch unit 104 with a different set of master key register 142 values; that is, each chunk is decrypted using the master key register 142 values loaded by the key switch instruction 600 in the preceding chunk. The post-processor that encrypts a program knows the memory address at which each key switch instruction 600 resides and uses this information (that is, the relevant bits of the fetch address) together with the key values of the key switch instruction 600 to generate the encryption key bytes with which it encrypts the program. Some object file formats allow the programmer to specify where in memory the program is to be loaded, or at least to specify an alignment of a particular size (e.g., a page boundary), which provides enough address information to encrypt the program. In addition, some operating systems load programs on page boundaries by default.
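To make the chunk boundaries concrete, the following minimal sketch derives chunk ranges from the positions of the key switch instructions. It assumes, for illustration only, that a chunk ends immediately after a key switch instruction and that offsets are measured in bytes from the start of the program; the function name `chunk_ranges` is hypothetical.

```python
def chunk_ranges(program_len, switch_key_ends):
    """Return half-open (start, end) byte ranges, one per chunk.

    switch_key_ends holds the offset just past each key switch instruction.
    The first chunk starts at the beginning of the program and the last
    chunk ends at the end of the program.
    """
    bounds = [0] + sorted(switch_key_ends) + [program_len]
    return [(lo, hi) for lo, hi in zip(bounds, bounds[1:]) if lo < hi]
```

Each range after the first would be decrypted with the keys loaded by the key switch instruction that ends the preceding range.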

The key switch instruction 600 may be placed anywhere in the program. However, if each key switch instruction 600 loads specific values into the master key registers 142 for decrypting the next chunk of instruction data bytes, and the key switch instructions 600 (and even the key load instructions 500) are positioned so that the length of each chunk is less than or equal to the effective key length the fetch unit 104 can handle (e.g., the 2048 bytes of the embodiment of Figure 2), then the program can be encrypted with a key whose effective length equals the length of the entire program, which is very strong encryption. Moreover, even if the key switch instructions 600 are used in such a way that the effective key length is still shorter than the encrypted program (i.e., the same set of master key register 142 values is used to encrypt more than one chunk of the program), varying the chunk size (e.g., not making every chunk 2048 bytes) makes the system harder to break, because the attacker must first determine where the chunks encrypted with the same set of master key register 142 values are located, and must also determine the size of each of these variable-length chunks.

It should be noted that the dynamic key switching performed by the key switch instruction 600 takes a relatively large number of clock cycles, mainly because the pipeline must be flushed. In addition, in one embodiment the key switch instruction 600 is implemented primarily in microcode, which is generally slower than instructions not implemented in microcode. Programmers must therefore weigh the performance impact of key switch instructions and strike a balance between execution speed and the security requirements of the particular application.

Referring now to Figure 9, a block diagram illustrates the format of a branch and switch key instruction 900 according to the present invention. The need for the branch and switch key instruction 900 is explained first.

According to the embodiments described above, each 16-byte block of instruction data of the encrypted program that the fetch unit 104 fetches has been encrypted in advance (by XOR) with an encryption key identical to the 16-byte decryption key 174 that the fetch unit 104 uses to decrypt (by XOR) that fetched block of instruction data 106. As described above, the fetch unit 104 computes the byte values of the decryption key 174 from two inputs: the master key byte values stored in the master key registers 142, and certain bits of the fetch address 134 of the fetched 16-byte block of instruction data 106 (bits [10:4] in the embodiment of Figure 2). Therefore, a post-processor that encrypts a program for execution by the microprocessor 100 knows the master key byte values that will be stored in the master key registers 142 and the address (or, more precisely, the relevant bits of the address) at which the encrypted program will be loaded into memory and from which the microprocessor 100 will fetch the successive blocks of instruction data of the encrypted program. From this information the post-processor can generate the appropriate decryption key 174 values with which to encrypt each 16-byte block of instruction data of the program.
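Because XOR is its own inverse, the same routine can serve as both the post-processor's encryption step and a model of the fetch unit's decryption. The sketch below illustrates this symmetry. The function `derive_key` is a deliberately simple placeholder and is not the key computation of the fetch unit 104; it only preserves the two stated inputs (master key bytes and fetch address bits [10:4]). The load address is assumed to be 16-byte aligned.

```python
BLOCK = 16  # instruction data is fetched in 16-byte blocks


def derive_key(master_keys, fetch_addr):
    """Placeholder key computation (an assumption, not the real derivation)."""
    sel = (fetch_addr >> 4) & 0x7F            # fetch address bits [10:4]
    base = master_keys[sel % len(master_keys)]
    return bytes((b + sel) & 0xFF for b in base)


def xor_block(data, key):
    return bytes(d ^ k for d, k in zip(data, key))


def crypt(image, load_addr, master_keys):
    """Encrypt or decrypt: XOR with the same key stream undoes itself."""
    out = bytearray()
    for off in range(0, len(image), BLOCK):
        key = derive_key(master_keys, load_addr + off)
        out += xor_block(image[off:off + BLOCK], key)
    return bytes(out)
```

Since only seven address bits take part, any key stream of this shape repeats every 128 blocks of 16 bytes, that is, every 2048 bytes, which matches the figure quoted for the embodiment of Figure 2.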

As discussed above, when a branch instruction is predicted and/or executed, the fetch unit 104 updates the fetch address 134 with the branch target address. As long as the encrypted program never changes the master key values stored in the master key registers 142 (via a key switch instruction 600), branch instructions are handled transparently by the fetch unit 104. That is, the fetch unit 104 uses the same master key register 142 values to compute the decryption key 174 both for decrypting the block of instruction data 106 that contains the branch instruction and for decrypting the instructions in the block of instruction data 106 at the target address of the branch instruction. However, the ability of a program to change the master key register 142 values (via a key switch instruction 600) means that the fetch unit 104 might compute the decryption key 174 from one set of master key register 142 values when decrypting the block of instruction data 106 that contains the branch instruction, and from a different set of master key register 142 values when decrypting the instructions in the block of instruction data 106 at the branch target address. One way to solve this problem is to restrict branch target addresses to the same chunk of the program. Another solution is to use the branch and switch key instruction 900 of Figure 9.

Referring again to Figure 9, a block diagram illustrates the format of a branch and switch key instruction 900 according to the present invention. The branch and switch key instruction 900 includes an opcode field 902 that identifies it as the branch and switch key instruction 900 within the instruction set of the microprocessor 100. The branch and switch key instruction 900 also includes a key register file index field 904 that specifies the first of a sequence of registers in the key register file 124 from which keys are loaded into the master key registers 142. The branch and switch key instruction 900 further includes a branch information field 906 that holds the information typical of a branch instruction, such as the information for computing the target address and the branch condition. In one embodiment, if a program attempts to execute a branch and switch key instruction 900 when the microprocessor 100 is not in the secure execution mode, an invalid instruction exception is raised. In one embodiment, if a program attempts to execute the branch and switch key instruction 900 when the microprocessor 100 is not at the highest privilege level (e.g., x86 ring 0), an invalid instruction exception is raised. In one embodiment, the branch and switch key instruction 900 is atomic.

Referring to Figure 10, a flowchart illustrates operation of the microprocessor 100 of Figure 1 to execute the branch and switch key instruction 900 of Figure 9 according to the present invention. Flow begins at block 1002.

At block 1002, the decoding unit 108 decodes a branch and switch key instruction 900 and traps to the microcode routine in the microcode unit 132 that implements the branch and switch key instruction 900. Flow proceeds to block 1006.

At block 1006, the microcode resolves the branch direction (taken or not taken) and the target address. Note that for an unconditional branch instruction the direction is always taken. Flow proceeds to decision block 1008.

At decision block 1008, the microcode determines whether the direction resolved at block 1006 is taken. If so, flow proceeds to block 1014; otherwise, flow proceeds to block 1012.

At block 1012, the microcode neither switches keys nor jumps to the target address, because the branch is not taken. Flow ends at block 1012.

At block 1014, the microcode loads keys from the key register file 124 into the master key registers 142 according to the key register file index field 904. Preferably, starting at the location specified by the key register file index field 904, the microcode loads the n keys held in n adjacent registers of the key register file 124 into the master key registers 142, where n is the number of master key registers 142. In one embodiment, n may be specified in an additional field of the branch and switch key instruction 900 and may be less than the number of master key registers 142. Flow proceeds to block 1016.

At block 1016, the microcode causes the microprocessor 100 to jump to the target address resolved at block 1006. This flushes all x86 instructions in the microprocessor 100 that are newer than the branch and switch key instruction 900, and likewise flushes all micro-operations in the microprocessor 100 that are newer than the micro-operation that branches to the target address. The flushed instructions include all instruction bytes 106 that have been fetched from the instruction cache 102 and are buffered in the fetch unit 104 and the decoding unit 108 awaiting decryption and decoding. Flow proceeds to block 1018.

At block 1018, as a result of the branch to the target address at block 1016, the fetch unit 104 begins fetching instruction data 106 from the instruction cache 102 and decrypting it using the new set of key values loaded into the master key registers 142 at block 1014. Flow ends at block 1018.
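The decision structure of blocks 1002 through 1018 can be sketched as follows. The `Core` class, its fields, and the `next_addr` parameter are hypothetical names introduced for this illustration; in particular, continuing at the next sequential address when the branch is not taken is an assumption about ordinary fall-through behavior, which the flow above does not spell out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Core:
    key_register_file: List[bytes]   # models key register file 124
    master_keys: List[bytes]         # models master key registers 142
    fetch_addr: int = 0              # models fetch address 134
    pipeline: List[int] = field(default_factory=list)  # instructions in flight


def branch_and_switch_key(core: Core, taken: bool, target: int,
                          next_addr: int, krf_index: int,
                          n: Optional[int] = None) -> None:
    if not taken:
        # Block 1012: no key switch and no jump.
        core.fetch_addr = next_addr
        return
    # Block 1014: load n adjacent key registers starting at krf_index.
    n = len(core.master_keys) if n is None else n
    core.master_keys[:n] = core.key_register_file[krf_index:krf_index + n]
    # Block 1016: the jump flushes all newer instructions.
    core.pipeline.clear()
    # Block 1018: fetching resumes at the target with the new keys.
    core.fetch_addr = target
```

The keys and the fetch address change together only on the taken path, which is what keeps the target chunk decryptable.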

Referring now to Figure 11, a flowchart illustrates operation of a post-processor according to the present invention. The post-processor is a software tool that post-processes a program and encrypts it for execution by the microprocessor 100 of Figure 1. Flow begins at block 1102.

At block 1102, the post-processor receives an object file of a program. According to one embodiment, the target addresses of the branch instructions in the object file can be determined before the program runs; an example is a branch instruction that specifies a fixed target address. Another kind of branch instruction whose target address is determined before run time is a relative branch instruction, which contains an offset that is added to the memory address of the branch instruction to obtain the branch target address. By contrast, an example of a branch instruction whose target address cannot be determined before run time is one whose target address is computed from operands held in registers or memory, whose values may change during program execution. Flow proceeds to block 1104.

At block 1104, the post-processor replaces each inter-chunk branch instruction with a branch and switch key instruction 900 whose key register file index field 904 holds the appropriate value, which is determined by the chunk in which the target address of the branch instruction lies. As described with respect to Figure 8, a chunk is a sequence of instruction data bytes that are decrypted with the same set of master key register 142 values. An inter-chunk branch instruction is thus one whose target address lies in a different chunk from the branch instruction itself. Note that intra-chunk branches, i.e., branch instructions whose target address lies in the same chunk as the branch itself, need not be replaced. Note also that the programmer and/or compiler that produces the source file from which the object file is generated may explicitly include branch and switch key instructions 900 where needed, reducing the amount of replacement the post-processor must perform. Flow proceeds to block 1106.

At block 1106, the post-processor encrypts the program. The post-processor knows the memory location of each chunk and its master key register 142 values and uses them to encrypt the program. Flow ends at block 1106.
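The replacement step of block 1104 can be sketched on a toy instruction list. The dictionary-based instruction format, the operation names `"branch"` and `"branch_switch_key"`, and the chunk tuples `(start, end, krf_index)` are all assumptions made for this illustration. The sketch also ignores that a real replacement may change instruction lengths and therefore addresses.

```python
def chunk_index(addr, chunks):
    """Return the index of the chunk (start, end, krf_index) containing addr."""
    for i, (lo, hi, _) in enumerate(chunks):
        if lo <= addr < hi:
            return i
    raise ValueError(f"address {addr:#x} is not inside any chunk")


def replace_inter_chunk_branches(instructions, chunks):
    """Return a new instruction list with inter-chunk branches replaced."""
    out = []
    for ins in instructions:
        if ins["op"] == "branch":
            src = chunk_index(ins["addr"], chunks)
            dst = chunk_index(ins["target"], chunks)
            if src != dst:
                # Inter-chunk branch: record the key index of the target chunk.
                ins = dict(ins, op="branch_switch_key", krf_index=chunks[dst][2])
        out.append(ins)
    return out
```

Intra-chunk branches pass through unchanged, mirroring the note that they need not be replaced.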

Referring now to Figure 12, a block diagram illustrates the format of a branch and switch key instruction 1200 according to an alternate embodiment of the present invention. The branch and switch key instruction 1200 of Figure 12 accommodates branches whose target address is not known before the program runs, as discussed in detail below. The branch and switch key instruction 1200 includes an opcode field 1202 that identifies it as the branch and switch key instruction 1200 within the instruction set of the microprocessor 100. The branch and switch key instruction 1200 also includes a branch information field 906 that serves a purpose similar to the corresponding field of the branch and switch key instruction 900 of Figure 9. In one embodiment, if a program attempts to execute the branch and switch key instruction 1200 when the microprocessor 100 is not in the secure execution mode, an invalid instruction exception is raised. In one embodiment, if a program attempts to execute a branch and switch key instruction 1200 when the microprocessor 100 is not at the highest privilege level (e.g., x86 ring 0), an invalid instruction exception is raised. In one embodiment, the branch and switch key instruction 1200 is atomic.

Referring now to Figure 13, a block diagram illustrates a chunk address range table 1300 according to the present invention. The table 1300 includes a plurality of entries, each associated with one chunk of the encrypted program. Each entry includes an address range field 1302 and a key register file index field 1304. The address range field 1302 specifies the memory address range of the corresponding chunk. The key register file index field 1304 specifies the register in the key register file 124 whose stored key values the branch and switch key instruction 1200 loads into the master key registers 142 for the fetch unit 104 to use in decrypting the chunk. As discussed below with respect to Figure 18, the table 1300 is loaded into the microprocessor 100 before any branch and switch key instruction 1200 that needs to access the table 1300 is executed.

Referring now to Figure 14, a flowchart illustrates operation of the microprocessor 100 of Figure 1 to execute the branch and switch key instruction 1200 of Figure 12 according to the present invention. Flow begins at block 1402.

At block 1402, the decoding unit 108 decodes a branch and switch key instruction 1200 and traps to the microcode routine in the microcode unit 132 that implements the branch and switch key instruction 1200. Flow proceeds to block 1406.

At block 1406, the microcode resolves the branch direction (taken or not taken) and determines the target address. Flow proceeds to decision block 1408.

At decision block 1408, the microcode determines whether the branch direction resolved at block 1406 is taken. If so, flow proceeds to block 1414; otherwise, flow proceeds to block 1412.

At block 1412, the microcode neither switches keys nor jumps to the target address, because the branch is not taken. Flow ends at block 1412.

At block 1414, the microcode looks up the target address resolved at block 1406 in the table 1300 of Figure 13 to obtain the value of the key register file index field 1304 of the chunk in which the target address lies. The microcode then loads key values from the key register file 124 into the master key registers 142 according to the index in the key register file index field 1304. Preferably, starting at the index stored in the key register file index field 1304, the microcode loads the n key values stored in n adjacent registers of the key register file 124 into the master key registers 142, where n is the number of master key registers 142. In one embodiment, n may be specified in an additional field of the branch and switch key instruction 1200 and may be less than the number of master key registers 142. Flow proceeds to block 1416.

At block 1416, the microcode causes the microprocessor 100 to branch to the target address resolved at block 1406. This flushes all x86 instructions in the microprocessor 100 that are newer than the branch and switch key instruction 1200, and likewise flushes all micro-operations in the microprocessor 100 that are newer than the micro-operation that branches to the target address. The flushed instructions include all instruction bytes 106 that have been fetched from the instruction cache 102 and are buffered in the fetch unit 104 and the decoding unit 108 awaiting decryption and decoding. Flow proceeds to block 1418.

At block 1418, as a result of the branch to the target address at block 1416, the fetch unit 104 begins fetching instruction data 106 from the instruction cache 102 and decrypting it using the new set of key values loaded into the master key registers 142 at block 1414. Flow ends at block 1418.
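The table lookup that distinguishes this flow from that of Figure 10 can be sketched as below. The table is modeled as a list of `((start, end), krf_index)` pairs with half-open ranges, and a miss raises `LookupError`; both choices, like the function names, are assumptions for illustration, since the text does not state how the address range field 1302 is encoded or what happens when no entry matches.

```python
def lookup_key_index(table, target):
    """Model of block 1414: find the key register file index for a target."""
    for (lo, hi), krf_index in table:
        if lo <= target < hi:
            return krf_index
    raise LookupError(f"no chunk covers target address {target:#x}")


def resolve_master_keys(table, key_register_file, master_keys,
                        taken, target, n=None):
    """Return the master keys in effect after the instruction executes."""
    if not taken:
        # Block 1412: keys are left unchanged.
        return list(master_keys)
    index = lookup_key_index(table, target)
    n = len(master_keys) if n is None else n
    new_keys = list(master_keys)
    # Load n adjacent key registers starting at the looked-up index.
    new_keys[:n] = key_register_file[index:index + n]
    return new_keys
```

Because the index comes from the run-time target address rather than from a field of the instruction, the same instruction can serve branches whose targets are computed from registers or memory.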

Referring now to Figure 15, a block diagram illustrates the format of a branch and switch key instruction 1500 according to another alternate embodiment of the present invention. The branch and switch key instruction 1500 of Figure 15 and its operation are similar to those of the branch and switch key instruction 1200 of Figure 12. However, instead of loading keys into the master key registers 142 from the key register file 124, the branch and switch key instruction 1500 loads keys into the master key registers 142 from the secure storage area 122, as discussed below.

Referring now to Figure 16, a block diagram illustrates a chunk address range table 1600 according to the present invention. The table 1600 of Figure 16 is similar to the table 1300 of Figure 13. However, instead of a key register file index field 1304, the table 1600 includes a secure storage area address field 1604. The secure storage area address field 1604 holds an address in the secure storage area 122 at which are stored the key values that the branch and switch key instruction 1500 must load into the master key registers 142 for the fetch unit 104 to use in decrypting the chunk. As discussed below with respect to Figure 18, the table 1600 is loaded into the microprocessor 100 before any branch and switch key instruction 1500 that needs to consult the table 1600 is executed. In one embodiment, the lower bits of the secure storage area 122 address need not be stored in the secure storage area address field 1604, in particular because the amount of storage that a set of keys occupies in the secure storage area 122 is relatively large (e.g., 16 bytes x 5) and the set of keys can be aligned on a boundary of a set size.

Referring now to Figure 17, a flowchart illustrates operation of the microprocessor 100 of Figure 1 to execute the branch and switch key instruction 1500 of Figure 15 according to the present invention. Flow begins at block 1702. Many blocks of the flowchart of Figure 17 are similar to blocks of Figure 14 and are therefore numbered the same. However, block 1414 is replaced by block 1714, in which the microcode looks up the target address computed at block 1406 in table 1600 of Figure 16 to obtain the secure storage area address field 1604 value of the "block" in which the target address lies. The microcode then loads key values from the secure storage area 122 into the master key registers 142 based on the secure storage area address field 1604 value. Preferably, starting at the address given by the secure storage area address field 1604 value, the microcode loads the n key values stored in n adjacent 16-byte locations of the secure storage area 122 into the master key registers 142, where n is the total number of master key registers 142. In one embodiment, the value n may be specified in an additional field of the branch and switch key instruction 1500 and set to fewer than the total number of master key registers 142.
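
The lookup and key load of block 1714 can be sketched as follows. This is a minimal illustration in Python, not the microcode itself. The table layout (sorted start/end ranges), the example addresses, and the names `TABLE_1600`, `lookup_secure_address`, and `load_master_keys` are assumptions introduced here; only the 16-byte key size and the "n adjacent locations" rule come from the text.

```python
from bisect import bisect_right

KEY_BYTES = 16  # each key occupies one 16-byte location, per the description

# Hypothetical model of table 1600: (start, end_exclusive, secure_storage_address).
# Rows are sorted by start address; the addresses are invented for illustration.
TABLE_1600 = [
    (0x0000, 0x1000, 0x000),
    (0x1000, 0x3000, 0x080),
    (0x3000, 0x4000, 0x100),
]


def lookup_secure_address(table, target):
    """Return the field 1604 value of the "block" that contains target."""
    starts = [row[0] for row in table]
    i = bisect_right(starts, target) - 1
    if i < 0 or target >= table[i][1]:
        raise LookupError(f"target {target:#x} is not inside any block")
    return table[i][2]


def load_master_keys(secure_storage, address, n):
    """Read n keys from n adjacent 16-byte locations starting at address."""
    return [
        bytes(secure_storage[address + k * KEY_BYTES:address + (k + 1) * KEY_BYTES])
        for k in range(n)
    ]
```

With five master key registers, `load_master_keys(storage, lookup_secure_address(TABLE_1600, target), 5)` reads 80 contiguous bytes. Because the example sets are aligned on 0x80 boundaries, the low seven address bits are always zero, which is the property that lets an implementation omit them from field 1604.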

Referring now to Figure 18, a flowchart illustrates operation of a post-processor according to an alternative embodiment of the present invention. The post-processor may be used to post-process a program and encrypt it for execution by the microprocessor 100 of Figure 1. Flow begins at block 1802.

At block 1802, the post-processor receives an object file of a program. According to one embodiment, the object file contains both branch instructions whose target addresses can be determined before run time and branch instructions whose target addresses cannot be determined before run time. Flow proceeds to block 1803.

At block 1803, the post-processor creates the "block" address range table 1300 or 1600 of Figure 13 or Figure 16 for inclusion in the object file. In one embodiment, the operating system loads table 1300/1600 into the microprocessor 100 before loading and executing an encrypted program, so that the branch and switch key instructions 1200/1500 can access it. In one embodiment, the post-processor inserts instructions into the program that load table 1300/1600 into the microprocessor 100 before any branch and switch key instruction 1200/1500 is executed. Flow proceeds to block 1804.

At block 1804, similar to the operation discussed above with respect to block 1104 of Figure 11, the post-processor replaces each cross-"block" branch instruction whose target address can be determined before run time with a branch and switch key instruction 900 of Figure 9. Each instruction 900 carries the key register file index field 904 value appropriate to the "block" in which the branch target address lies. Flow proceeds to block 1805.

At block 1805, the post-processor replaces each branch instruction whose target address can be determined only at run time with a branch and switch key instruction 1200 of Figure 12 or 1500 of Figure 15, according to the type of table (1300 or 1600) created at block 1803. Flow proceeds to block 1806.

At block 1806, the post-processor encrypts the program. The post-processor knows the memory location of each "block" and the master key register 142 values associated with it, and uses them to encrypt the program. Flow ends at block 1806.
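
The replacement decisions of blocks 1804 and 1805 can be summarized in a short sketch. It is a hedged illustration only: the `Branch` record, the `block_of` and `choose_replacement` helpers, and the representation of an unknown target as `None` are assumptions made for this example, and leaving intra-"block" branches untouched is inferred from the text, which names only cross-"block" branches at block 1804.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Branch:
    address: int           # where the branch instruction itself is located
    target: Optional[int]  # None when the target is known only at run time


def block_of(ranges, addr):
    """Return the index of the (start, end_exclusive) range containing addr."""
    for index, (start, end) in enumerate(ranges):
        if start <= addr < end:
            return index
    raise LookupError(f"address {addr:#x} is not inside any block")


def choose_replacement(branch, ranges, table_kind):
    """Return the instruction number that replaces branch, or None to keep it.

    table_kind is "1300" or "1600", the table type created at block 1803.
    """
    if branch.target is None:
        # Block 1805: run-time target, so the key is found by table lookup.
        return "1200" if table_kind == "1300" else "1500"
    if block_of(ranges, branch.target) != block_of(ranges, branch.address):
        # Block 1804: static target in another block, index encoded in field 904.
        return "900"
    return None  # same block, same keys: assumed to need no replacement
```

Encryption at block 1806 happens only after these substitutions, since the inserted instructions must themselves be encrypted with the keys of the "block" they sit in.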

Referring now to Figure 19, a flowchart illustrates operation of the microprocessor 100 of Figure 1 to handle a task switch between an encrypted program and a plain text program according to the present invention. Flow begins at block 1902.

At block 1902, the E bit of the E bit field 402 of the flag register 128 and the E bit 148 of the control register 144 of Figure 1 are cleared by a reset of the microprocessor 100. Flow proceeds to block 1904.

At block 1904, after executing its reset microcode to initialize itself, the microprocessor 100 begins fetching and executing user program instructions (e.g., system firmware), which are plain text program instructions. In particular, because the E bit 148 is clear, the fetch unit 104 treats the fetched instruction data 106 as plain text instructions, as described above. Flow proceeds to block 1906.

At block 1906, system software (e.g., an operating system, firmware, or a basic input/output system (BIOS)) receives a request to execute an encrypted program. In one embodiment, the request to execute an encrypted program accompanies, or is indicated by, a switch to a secure execution mode of the microprocessor 100, as discussed above. In one embodiment, the microprocessor 100 is allowed to operate in decryption mode (i.e., with the E bit 148 set) only when it is in the secure execution mode. In one embodiment, the microprocessor 100 is allowed to operate in decryption mode only when it is in a system management mode (e.g., the SMM of the x86 architecture). Flow proceeds to block 1908.

At block 1908, the system software loads the master key registers 142 with their initial values, which are associated with the first "block" of the program to be executed. In one embodiment, the system software executes a key switch instruction 600 to load keys into the master key registers 142. Before the keys are loaded into the master key registers 142, the key register file 124 may be loaded by one or more key load instructions 500. In one embodiment, before the master key registers 142 and the key register file 124 are loaded, key values may first be written to the secure storage area 122 over a secure channel established by a well-known technique, such as an AES- or RSA-encrypted channel, to keep an attacker from snooping the values. As discussed above, the key values may be stored in a secure non-volatile memory (e.g., flash memory) coupled to the microprocessor 100 by a private serial bus, or in a non-volatile write-once memory of the microprocessor 100. As also discussed above, the program may consist of a single "block". That is, the program may contain no key switch instruction 600, and the entire program may be decrypted with a single set of master key register 142 values. Flow proceeds to block 1916.

At block 1916, as control transfers to the encrypted program, the microprocessor 100 sets the E bit field 402 of the flag register 128 to indicate that the currently executing program is encrypted, and sets the E bit 148 of the control register 144 to place the fetch unit 104 in decryption mode. The microprocessor 100 also flushes the instructions in the pipeline, similar to the flush performed at block 706 of Figure 7. Flow proceeds to block 1918.

At block 1918, the fetch unit 104 fetches the instructions 106 of the encrypted program, decrypts them in decryption mode as described with respect to Figures 1 through 3, and the microprocessor 100 executes them. Flow proceeds to block 1922.

At block 1922, while the microprocessor 100 is fetching and executing the encrypted program, it receives an interrupting event. The interrupting event may be, for example, an interrupt, an exception (such as a page fault), or a task switch. When an interrupting event occurs, all pending instructions in the pipeline of the microprocessor 100 are flushed. Thus, any previously fetched encrypted instructions in the pipeline are flushed. In addition, all instruction bytes that were fetched from the instruction cache 102 and may be buffered in the fetch unit 104 and decode unit 108 awaiting decryption and decoding are flushed. In one embodiment, microcode is invoked in response to the interrupting event. Flow proceeds to block 1924.

At block 1924, the microprocessor 100 saves the flag register 128 (along with other architectural state of the microprocessor 100, including the current instruction pointer value of the interrupted encrypted program) to stack memory. Saving the E bit field 402 value of the encrypted program allows it to be restored later (at block 1934). Flow proceeds to block 1926.

At block 1926, as control transfers to the new program (e.g., an interrupt handler, an exception handler, or a new task), the microprocessor 100 clears the E bit field 402 of the flag register 128 and the E bit 148 of the control register 144, because the new program is plain text. That is, the embodiment of Figure 19 assumes that the microprocessor 100 allows only one encrypted program to run at a time and that one encrypted program is already running (though interrupted). Figures 22 through 26 disclose other embodiments. Flow proceeds to block 1928.

At block 1928, the fetch unit 104 fetches the instructions 106 of the new program in plain text mode, as described with respect to Figures 1 through 3. In particular, the clear state of the E bit 148 in the control register 144 causes the multiplexer 154 to select the multi-bit binary zero value 176 to be XORed with the instruction data 106, so that the instruction data 106 is not decrypted. Flow proceeds to block 1932.
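
Why XORing with zeros leaves plain text untouched can be seen in a few lines. The sketch below is a simplified software model, assuming a per-fetch key of the same length as the instruction data; the function name `fetch_path` and the byte values used with it are hypothetical, and the way the real key is derived (Figures 1 through 3) is not modeled.

```python
def fetch_path(instruction_data: bytes, key: bytes, e_bit: bool) -> bytes:
    """Model the mux-then-XOR step: select the key when E is set, zeros otherwise."""
    if e_bit and len(key) != len(instruction_data):
        raise ValueError("this sketch assumes a key as long as the fetched data")
    zeros = bytes(len(instruction_data))     # stands in for zero value 176
    operand = key if e_bit else zeros        # stands in for multiplexer 154
    return bytes(a ^ b for a, b in zip(instruction_data, operand))
```

Since x XOR 0 = x, the same data path serves both modes, and since XOR is its own inverse, applying the same key twice returns the original bytes.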

At block 1932, the new program executes a return from interrupt instruction (e.g., x86 IRET) or a similar instruction, which returns control to the encrypted program. In one embodiment, the return from interrupt instruction is implemented in microcode. Flow proceeds to block 1934.

At block 1934, in response to the return from interrupt instruction, and because control is transferring back to the encrypted program, the microprocessor 100 restores the flag register 128, so that the E bit field 402 of the flag register 128 returns to the set state saved at block 1924. Flow proceeds to block 1938.

At block 1938, because control is transferring back to the encrypted program, the microprocessor 100 updates the E bit 148 of the control register 144 with the E bit field 402 value of the flag register 128, so that the fetch unit 104 resumes fetching and decrypting the instruction data 106 of the encrypted program. Flow proceeds to block 1942.

At block 1942, the microcode causes the microprocessor 100 to branch to the instruction pointer value saved to stack memory at block 1924, which flushes all x86 instructions and all micro-operations in the microprocessor 100. The flushed content includes all instruction bytes 106 that were fetched from the instruction cache 102 and buffered in the fetch unit 104 and decode unit 108 awaiting decryption and decoding. Flow proceeds to block 1944.

At block 1944, the fetch unit 104 resumes fetching the instructions 106 of the encrypted program, decrypts them in decryption mode as described with respect to Figures 1 through 3, and the microprocessor 100 executes them. Flow ends at block 1944.
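
The save, clear, and restore sequence of Figure 19 can be modeled as a small state machine. This is a sketch under simplifying assumptions: the `Cpu` class and its method names are invented for illustration, only the E bits and the instruction pointer are saved, and a pipeline flush is reduced to a counter.

```python
class Cpu:
    """Toy model of the E-bit handling in blocks 1902 through 1944."""

    def __init__(self):
        self.flags_e = False  # E bit field 402 of flag register 128 (clear at reset)
        self.ctrl_e = False   # E bit 148 of control register 144 (clear at reset)
        self.ip = 0
        self.stack = []
        self.flushes = 0

    def flush(self):
        self.flushes += 1     # stands in for discarding fetched and decoded bytes

    def enter_encrypted(self, entry_ip):
        # Block 1916: mark the program as encrypted and switch to decryption mode.
        self.flags_e = True
        self.ctrl_e = True
        self.ip = entry_ip
        self.flush()

    def interrupt(self, handler_ip):
        # Blocks 1922 to 1926: flush, save state, then run the handler as plain text.
        self.flush()
        self.stack.append((self.flags_e, self.ip))
        self.flags_e = False
        self.ctrl_e = False
        self.ip = handler_ip

    def return_from_interrupt(self):
        # Blocks 1934 to 1942: restore flags, copy E into the control register,
        # then branch to the saved instruction pointer, which flushes the pipeline.
        self.flags_e, saved_ip = self.stack.pop()
        self.ctrl_e = self.flags_e
        self.ip = saved_ip
        self.flush()
```

The point of keeping two E bits is visible here: the copy in the flag register travels with the saved task state, while the copy in the control register is what actually steers the fetch unit at any moment.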

Referring now to Figure 20, a flowchart illustrates operation of system software executed by the microprocessor 100 of Figure 1 according to the present invention. The flow of Figure 20 may be performed in conjunction with that of Figure 19. Flow begins at block 2002.

At block 2002, the system software receives a request to execute a new encrypted program. Flow proceeds to decision block 2004.

At decision block 2004, the system software determines whether an encrypted program is already one of the programs running in the system. In one embodiment, the system software maintains a flag that indicates whether an encrypted program is among the programs already running. If an encrypted program is already running, flow proceeds to block 2006; otherwise, flow proceeds to block 2008.

At block 2006, the system software waits until the encrypted program has finished and is no longer one of the programs running in the system. Flow proceeds to block 2008.

At block 2008, the microprocessor 100 allows the new encrypted program to begin executing. Flow ends at block 2008.

Referring now to Figure 21, a block diagram illustrates fields of the flag register 128 of Figure 1 according to an alternative embodiment of the present invention. The flag register 128 of Figure 21 is similar to the embodiment of Figure 4, but additionally includes an index field 2104 (index bits). According to one embodiment, the index field 2104, like the E bit 402, occupies bits that are ordinarily reserved by the x86 architecture. The index field 2104 is used to handle switching among multiple encrypted programs, as discussed in detail below. Preferably, the key switch instruction 600 and the branch and switch key instructions 900/1200 update the index field 2104 of the flag register 128 with their respective key register file index fields 604/904/1304.

Referring now to Figure 22, a flowchart illustrates operation of the microprocessor 100 of Figure 1 to perform task switching among multiple encrypted programs using the flag register 128 of Figure 21 according to the present invention. Flow begins at block 2202.

At block 2202, a request is made to the system software to execute a new encrypted program. Flow proceeds to decision block 2204.

At decision block 2204, the system software determines whether the key register file 124 has room for a new encrypted program. In one embodiment, the request made at block 2202 specifies how much space in the key register file 124 is needed. If the key register file 124 has room for the new encrypted program, flow proceeds to block 2208; otherwise, flow proceeds to block 2206.

At block 2206, the system software waits for one or more encrypted programs to complete, which frees enough space in the key register file 124 for the new encrypted program. Flow proceeds to block 2208.

At block 2208, the system software allocates space in the key register file 124 to the new encrypted program and populates the index field 2104 of the flag register 128 accordingly, to indicate the newly allocated space in the key register file 124. Flow proceeds to block 2212.

At block 2212, the system software loads the key values for the new program into the key register file 124 locations allocated at block 2208. As discussed above, the key values may be loaded from the secure storage area 122 using key load instructions 500 or, if necessary, obtained from a location outside the microprocessor 100 over a secure channel. Flow proceeds to block 2214.

At block 2214, the system software loads keys from the key register file 124 into the master key registers 142 based on the key register file index field 604/904/1304. In one embodiment, the system software executes a key switch instruction 600 to load the keys into the master key registers 142. Flow proceeds to block 2216.

At block 2216, as control transfers to the encrypted program, the microprocessor 100 sets the E bit field 402 of the flag register 128 to indicate that the currently executing program is encrypted, and sets the E bit 148 of the control register 144 to place the fetch unit 104 in decryption mode. Flow ends at block 2216.
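
The space check and allocation of blocks 2204 through 2212 might be modeled as below. The sketch rests on assumptions not stated in the text: that a program's keys occupy a contiguous run of registers, that the first fit is chosen, and that system software tracks an owner per register. The class `KeyRegisterFile` and its methods are hypothetical names.

```python
class KeyRegisterFile:
    """Toy model of key register file 124 with per-register ownership."""

    def __init__(self, size):
        self.keys = [None] * size
        self.owner = [None] * size  # bookkeeping assumed to live in system software

    def find_free(self, count):
        """Return the first index of count adjacent free registers, or None."""
        run = 0
        for i, owner in enumerate(self.owner):
            run = run + 1 if owner is None else 0
            if run == count:
                return i - count + 1
        return None

    def allocate(self, program, keys):
        """Blocks 2204, 2208 and 2212: returns the index for field 2104, or None."""
        index = self.find_free(len(keys))
        if index is None:
            return None  # block 2206: the caller must wait for space
        for k, key in enumerate(keys):
            self.keys[index + k] = key
            self.owner[index + k] = program
        return index

    def release(self, program):
        """Free every register held by a program that has completed."""
        for i, owner in enumerate(self.owner):
            if owner == program:
                self.owner[i] = None
                self.keys[i] = None
```

The returned index plays the role of the value written to the index field 2104, which is later used to find the program's keys again when it is resumed.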

Referring now to Figure 23, a flowchart illustrates operation of the microprocessor 100 of Figure 1 to handle task switching among multiple encrypted programs using the flag register 128 of Figure 21 according to the present invention. Flow begins at block 2302.

At block 2302, the currently running program executes a return from interrupt instruction, which causes a task switch to a new program. The new program was executing previously but was switched out, and its architectural state (e.g., the flag register 128, the instruction pointer register, and the general purpose registers) was saved to stack memory. As mentioned above, in one embodiment the return from interrupt instruction is implemented in microcode. Both the currently running program and the new program may be either an encrypted program or a plain text program. Flow proceeds to block 2304.

At block 2304, the microprocessor 100 restores the flag register 128 from stack memory for the program being returned to. That is, the microprocessor 100 reloads the flag register 128 with the flag register 128 value that was saved to stack memory when the program now being switched back in was previously switched out. Flow proceeds to decision block 2306.

At decision block 2306, the microprocessor 100 determines whether the E bit 402 of the restored flag register 128 is set. If so, flow proceeds to block 2308; otherwise, flow proceeds to block 2312.

At block 2308, the microprocessor 100 loads keys from the key register file 124 into the master key registers 142 based on the index field 2104 value of the flag (EFLAGS) register 128 restored at block 2304. Flow proceeds to block 2312.

At block 2312, the microprocessor 100 updates the E bit 148 of the control register 144 with the E bit field 402 value of the flag register 128 restored at block 2304. Thus, if the program being returned to is an encrypted program, the fetch unit 104 is placed in decryption mode; otherwise, it is placed in plain text mode. Flow proceeds to block 2314.

At block 2314, the microprocessor 100 restores the instruction pointer register from stack memory and branches to the location indicated by the instruction pointer, which flushes all x86 instructions and all micro-operations in the microprocessor 100. The flushed content includes all instruction bytes 106 that were fetched from the instruction cache 102 and buffered in the fetch unit 104 and decode unit 108 awaiting decryption and decoding. Flow proceeds to block 2316.

At block 2316, the fetch unit 104 resumes fetching instructions 106 of the program being returned to, as described with respect to Figures 1 through 3, operating in decryption mode or plain text mode according to the E bit 148 value of the control register 144 restored at block 2312. Flow ends at block 2316.
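
The decision made on return (blocks 2304 through 2312) can be sketched as a single function. This is an illustration under assumptions: the `Flags` record and `resume` function are hypothetical, the key register file is modeled as a Python list, and the sketch reads block 2308 as copying as many adjacent keys as there are master key registers, starting at the restored index.

```python
from dataclasses import dataclass


@dataclass
class Flags:
    e_bit: bool  # E bit field 402
    index: int   # index field 2104


def resume(saved: Flags, key_register_file, master_keys):
    """Reload the master keys if the resumed program is encrypted.

    Returns the value to place in the control register's E bit (block 2312).
    master_keys is updated in place; its length stands in for the number of
    master key registers.
    """
    if saved.e_bit:
        n = len(master_keys)
        # Block 2308, as read here: copy n keys starting at the saved index.
        master_keys[:] = key_register_file[saved.index:saved.index + n]
    return saved.e_bit
```

Because the index travels in the saved flag register, each encrypted program finds its own keys again without the operating system having to pass them explicitly on every switch.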

Referring now to Figure 24, a block diagram illustrates a single register of the key register file 124 of Figure 1 according to an alternative embodiment of the present invention. In the embodiment of Figure 24, each register of the key register file 124 additionally includes one bit, the kill bit 2402 (hereinafter the K bit). The K bit 2402 is used to handle multitasking by the microprocessor 100 of multiple encrypted programs that together require more key storage than the key register file 124 provides, as described in detail below.

Referring now to Figure 25, a flowchart illustrates operation of the microprocessor 100 of Figure 1 to perform task switching among multiple encrypted programs using the flag register 128 of Figure 21 and the key register file 124 of Figure 24 according to an alternative embodiment of the present invention. The flow of Figure 25 is similar to that of Figure 22. The difference is that when decision block 2204 determines that the key register file 124 does not have enough free space, the flow of Figure 25 proceeds to block 2506 rather than to block 2206, which is absent from Figure 25. If decision block 2204 determines that the key register file 124 does have enough free space, the flow of Figure 25 proceeds through blocks 2208 to 2216 just as in Figure 22.

At block 2506, the system software allocates space (i.e., registers) in the key register file 124 that is already in use by (i.e., already allocated to) another encrypted program, sets the K bit 2402 of each register so allocated, and sets the index field 2104 of the flag register 128 to indicate the location of the newly allocated space in the key register file 124. A set K bit 2402 indicates that the other encrypted program's key values in that register will be overwritten with the new encrypted program's key values by the operation at block 2212. However, as described with respect to Figure 26 below, the other encrypted program's key values are reloaded at block 2609 when that program is returned to. From block 2506, the flow of Figure 25 proceeds to block 2212 of Figure 22 and ends at block 2216.

Referring now to Figure 26, a flowchart illustrates operation of the microprocessor 100 of Figure 1 to perform task switching among multiple encrypted programs using the flag register 128 of Figure 21 and the key register file 124 of Figure 24 according to an alternative embodiment of the present invention. The flow of Figure 26 is similar to that of Figure 23. The difference is that if decision block 2306 determines that the E bit 402 of the flag register 128 is set, the flow of Figure 26 proceeds to decision block 2607 rather than to block 2308.

At decision block 2607, the microprocessor 100 determines whether the K bit 2402 is set in any register of the key register file 124 indicated by the index field 2104 value of the flag register 128 (restored at block 2304). If so, flow proceeds to block 2609; if not, flow proceeds to block 2308.

At block 2609, the microprocessor 100 generates an exception, which is handled by an exception handler. In one embodiment, the exception handler is part of the system software. In one embodiment, the exception handler is provided by the secure execution mode architecture. Based on the index field 2104 value of the flag register 128 restored at block 2304, the exception handler reloads the key register file 124 with the keys of the encrypted program now being restored (i.e., the encrypted program being returned to). The exception handler may operate similarly to block 1908 of Figure 19 to load the keys of the restored encrypted program into the key register file 124 or, if necessary, to load keys into the secure storage area 122 from outside the microprocessor 100. Likewise, if the registers of the key register file 124 being reloaded are in use by another encrypted program, the system software sets the K bit 2402 of those registers. Flow proceeds from block 2609 to block 2308, and blocks 2308 through 2316 are as described with respect to Figure 23.

As taught by Figures 24 through 26, the embodiments described here enable the microprocessor 100 to multitask multiple encrypted programs even when the encrypted programs together require more key storage than the key register file 124 provides.
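
The kill-bit mechanism of Figures 24 through 26 can be illustrated with one register and two competing programs. The sketch is a simplification built on assumptions: the `owner` field is bookkeeping imagined to live in system software (the hardware described here sees only the K bit), the function names are hypothetical, and no policy for clearing the K bit is modeled because the text does not state one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyRegister:
    key: Optional[bytes] = None
    owner: Optional[str] = None  # assumed system-software bookkeeping
    kill: bool = False           # K bit 2402


def write_key(register, program, key):
    """Blocks 2506/2212 and the reload of block 2609.

    Overwrites the register and sets K if another program's key was in it.
    """
    if register.owner is not None and register.owner != program:
        register.kill = True
    register.key = key
    register.owner = program


def on_return(register, program, saved_key):
    """Decision block 2607: a set K bit forces the exception path of block 2609."""
    if register.kill:
        write_key(register, program, saved_key)
        return "reloaded"
    return "intact"
```

In effect the K bit turns the key register file into a small cache of keys: a register may be taken from a suspended program, at the cost of an exception and a reload when that program resumes.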

Referring now to Figure 27, a block diagram illustrates an alternative embodiment of the present invention modified from the microprocessor 100 of Figure 1. Elements similar to those of Figure 1 are numbered the same, for example the instruction cache 102, the fetch unit 104, and the key register file 124. However, the fetch unit 104 here is modified to additionally include key switch logic 2712, which is coupled to the master key registers 142 and the key register file 124 introduced in Figure 1. The microprocessor 100 of Figure 27 also includes a branch target address cache (BTAC) 2702. The BTAC 2702 receives the fetch address 134 of Figure 1 and is accessed in parallel with the instruction cache 102, both based on the fetch address 134. In response to the fetch address 134, the BTAC 2702 supplies a branch target address 2706 to the fetch address generator 164 of Figure 1, supplies a taken/not taken (T/NT) indicator 2708 and a type indicator 2714 to the key switch logic 2712, and supplies a key register file (KRF) index 2716 to the key register file 124.

Referring now to Figure 28, a block diagram illustrates the BTAC 2702 of Figure 27 in more detail. The BTAC 2702 includes a BTAC array 2802 having a plurality of BTAC entries 2808; Figure 29 illustrates the contents of a BTAC entry 2808. The BTAC 2802 stores history information about previously executed branch instructions in order to predict the direction and target address of those branch instructions when they are executed again. Specifically, the BTAC 2802 uses the stored history to predict, based on the fetch address 134, a subsequent fetch of a previously executed branch instruction. The operation of branch target address caches is well known in the art of branch prediction. However, the BTAC 2802 described here is further modified to record history information about previously executed branch and switch key instructions 900/1200 so that it can make predictions about them. In particular, the stored history enables the BTAC 2802 to predict, at fetch time, the set of values that a fetched branch and switch key instruction 900/1200 will load into the master key register 142. This allows the key switch logic 2712 to load the key values before the branch and switch key instruction 900/1200 is actually executed, which avoids having to flush the pipeline of the microprocessor 100 when the branch and switch key instruction 900/1200 executes, as discussed in detail below. In addition, according to one embodiment, the BTAC 2802 is further modified to store history information about previously executed switch key instructions 600, to the same effect.

Referring now to Figure 29, a block diagram illustrates the contents of a BTAC entry 2808 of Figure 28 in more detail. Each entry 2808 includes a valid bit 2902 that indicates whether the entry 2808 is valid. Each entry 2808 also includes a tag field 2904 that is compared with a portion of the fetch address 134. If the tag portion of the fetch address 134 matches the valid tag 2904 of the entry 2808 selected by the index portion of the fetch address 134, the fetch address 134 hits in the BTAC 2802. Each entry 2808 also includes a target address field 2906 that stores the target address of a previously executed branch instruction, including a branch and switch key instruction 900/1200. Each entry 2808 also includes a taken/not-taken field 2908 that stores the direction (taken or not taken) history of a previously executed branch instruction, including a branch and switch key instruction 900/1200. Each entry 2808 also includes a key register file index field 2912 that stores the key register file index 904/1304 of a previously executed branch and switch key instruction 900/1200, as discussed in detail below. According to one embodiment, the BTAC 2802 stores the key register file index 604 of a previously executed switch key instruction 600 in the key register file index field 2912. Each entry 2808 also includes a type field 2914 that indicates the type of the recorded instruction. For example, the type field 2914 may indicate that the recorded instruction is a call, a return, a conditional jump, an unconditional jump, a branch and switch key instruction 900/1200, or a switch key instruction 600.
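
The entry layout and hit test described above can be sketched in software. The following Python model is illustrative only and is not part of the disclosed embodiments. The direct-mapped organization, the 4-bit index width, and all identifiers (BranchType, BtacEntry, split_address, btac_lookup) are assumptions made for the example; the disclosure specifies only the fields 2902 through 2914 and the index/tag comparison.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional, Tuple


class BranchType(Enum):
    """Instruction types the type field 2914 may record."""
    CALL = auto()
    RETURN = auto()
    CONDITIONAL_JUMP = auto()
    UNCONDITIONAL_JUMP = auto()
    BRANCH_AND_SWITCH_KEY = auto()
    SWITCH_KEY = auto()


@dataclass
class BtacEntry:
    valid: bool = False          # valid bit 2902
    tag: int = 0                 # tag field 2904
    target_address: int = 0      # target address field 2906
    taken: bool = False          # taken/not-taken field 2908
    krf_index: int = 0           # key register file index field 2912
    kind: BranchType = BranchType.UNCONDITIONAL_JUMP  # type field 2914


INDEX_BITS = 4  # assumed width; the disclosure does not give the array size


def split_address(fetch_address: int) -> Tuple[int, int]:
    """Split a fetch address into (index, tag) portions."""
    index = fetch_address & ((1 << INDEX_BITS) - 1)
    tag = fetch_address >> INDEX_BITS
    return index, tag


def btac_lookup(array: List[BtacEntry], fetch_address: int) -> Optional[BtacEntry]:
    """Return the entry on a hit (valid entry whose tag matches), else None."""
    index, tag = split_address(fetch_address)
    entry = array[index]
    if entry.valid and entry.tag == tag:
        return entry
    return None
```

In this sketch, a lookup that selects an invalid entry, or a valid entry with a different tag, is a miss and yields no prediction.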

Referring now to Figure 30, a flowchart illustrates operation of the microprocessor 100 of Figure 27, which includes the BTAC 2802 of Figure 28. Flow begins at block 3002.

At block 3002, the microprocessor 100 executes a branch and switch key instruction 900/1200, as described in detail below with respect to Figure 32. Flow proceeds to block 3004.

At block 3004, the microprocessor 100 allocates an entry 2808 in the BTAC 2802 for the executed branch and switch key instruction 900/1200. It records the resolved direction, target address, key register file index 904/1304, and instruction type in the taken/not-taken field 2908, target address field 2906, key register file index field 2912, and type field 2914 of the allocated entry 2808, respectively, as history information for the branch and switch key instruction 900/1200. Flow ends at block 3004.

Referring now to Figure 31, a flowchart illustrates operation of the microprocessor 100 of Figure 27, which includes the BTAC 2802 of Figure 28. Flow begins at block 3102.

At block 3102, the fetch address 134 is supplied to the instruction cache 102 and to the BTAC 2802. Flow proceeds to block 3104.

At block 3104, the fetch address 134 hits in the BTAC 2802, and the BTAC 2802 outputs the contents of the target address field 2906, taken/not-taken field 2908, key register file index field 2912, and type field 2914 of the corresponding entry 2808 as the target address 2706, taken/not-taken indicator 2708, key register file index 2716, and type indicator 2714, respectively. In particular, the type field 2914 indicates that the stored instruction is a branch and switch key instruction 900/1200. Flow proceeds to decision block 3106.

At decision block 3106, the key switch logic 2712 examines the taken/not-taken output 2708 to determine whether the BTAC 2802 predicts that the branch and switch key instruction 900/1200 will be taken. If so, flow proceeds to block 3112; otherwise, flow proceeds to block 3108.

At block 3108, the microprocessor 100 passes down the pipeline, along with the branch and switch key instruction 900/1200, an indication that the BTAC 2802 predicted it not taken. (Correspondingly, if the taken/not-taken output 2708 indicates that the branch and switch key instruction is predicted taken, the microprocessor 100 at block 3112 passes down the pipeline, along with the branch and switch key instruction 900/1200, an indication that the BTAC 2802 predicted it taken.) Flow ends at block 3108.

At block 3112, the fetch address generator 164 updates the fetch address 134 with the target address 2706 predicted by the BTAC 2802 at block 3104. Flow proceeds to block 3114.

At block 3114, the key switch logic 2712 updates the key values in the master key register 142 from the location of the key register file 124 specified by the key register file index 2716 that the BTAC 2802 predicted at block 3104. In one embodiment, the key switch logic 2712 stalls the fetch unit 104 from fetching blocks of instruction data 106, if necessary, until the master key register 142 has been updated. Flow proceeds to block 3116.

At block 3116, the fetch unit 104 continues to fetch and decrypt instruction data 106 using the new contents of the master key register 142 loaded at block 3114. Flow ends at block 3116.
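
The fetch-time behavior of Figure 31 can be summarized in a short software model. The Python sketch below is illustrative only and is not part of the disclosed embodiments. The names Prediction, FetchState, and apply_prediction are hypothetical, keys are modeled as tuples of integers, the sequential step of 16 bytes reflects the 16-byte instruction data blocks mentioned elsewhere in this description, and treating a BTAC miss like a not-taken prediction is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Prediction:
    target_address: int             # target address 2706
    taken: bool                     # taken/not-taken indicator 2708
    krf_index: int                  # key register file index 2716
    is_branch_and_switch_key: bool  # derived from type indicator 2714


@dataclass(frozen=True)
class FetchState:
    fetch_address: int              # fetch address 134
    master_keys: Tuple[int, ...]    # contents of master key register 142


def apply_prediction(state: FetchState,
                     prediction: Optional[Prediction],
                     key_register_file: Sequence[Tuple[int, ...]],
                     sequential_step: int = 16) -> FetchState:
    """Return the fetch state that follows one BTAC lookup."""
    if (prediction is not None
            and prediction.is_branch_and_switch_key
            and prediction.taken):
        # Blocks 3112 and 3114: redirect the fetch and load the predicted keys.
        new_keys = key_register_file[prediction.krf_index]
        return FetchState(prediction.target_address, new_keys)
    # Block 3108 (or, by assumption, a miss): keep the keys, fetch sequentially.
    return FetchState(state.fetch_address + sequential_step, state.master_keys)
```

Because the keys are switched at fetch time in this model, the instruction data at the predicted target is decrypted with the new keys without waiting for the instruction to execute, which is the benefit the text attributes to the modified BTAC.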

Referring now to Figure 32, a flowchart illustrates operation of the microprocessor 100 of Figure 27 to execute a branch and switch key instruction 900/1200. The flow of Figure 32 is similar in some respects to that of Figure 10, and similar blocks carry the same numbers. Although Figure 32 is discussed with reference to Figure 10, it also applies to the operation of the branch and switch key instruction 1200 described with respect to Figure 14. Flow begins at block 1002.

At block 1002, the decode unit 108 decodes a branch and switch key instruction 900/1200 and traps to the microcode routine in the microcode unit 132 that implements the branch and switch key instruction 900/1200. Flow proceeds to block 1006.

At block 1006, the microcode resolves the branch direction (taken or not taken) and the target address. Flow proceeds to block 3208.

At block 3208, the microcode determines whether the BTAC 2802 provided a prediction for the branch and switch key instruction 900/1200. If so, flow proceeds to decision block 3214; otherwise, flow proceeds to block 1008 of Figure 10.

At decision block 3214, the microcode determines whether the prediction made by the BTAC 2802 was correct by comparing the taken/not-taken indicator 2708 and target address 2706 passed down from the BTAC 2802 with the direction and target address resolved at block 1006. If the prediction was correct, flow ends; otherwise, flow proceeds to decision block 3216.

At decision block 3216, the microcode determines whether the incorrect BTAC 2802 prediction was a taken prediction. If so, flow proceeds to block 3222; otherwise, flow proceeds to block 1014 of Figure 10.

At block 3222, the microcode restores the contents of the master key register 142, because the BTAC 2802 mispredicted the branch and switch key instruction 900/1200 as taken, which caused block 3114 of Figure 31 to load incorrect key values into it. In one embodiment, the key switch logic 2712 includes the storage elements and logic needed to restore the master key register 142. In one embodiment, the microcode raises an exception whose handler restores the master key register 142. In addition, the microcode causes the microprocessor 100 to branch to the x86 instruction that sequentially follows the branch and switch key instruction 900/1200. This flushes all x86 instructions in the microprocessor 100 that are newer than the branch and switch key instruction 900/1200, and all microcode that is newer than the microcode that branches to the target address. The flushed content includes all instruction bytes 106 that were read from the instruction cache 102 and are buffered in the fetch unit 104 and the decode unit 108 awaiting decoding. Following the branch to the next sequential instruction, the fetch unit 104 begins fetching and decrypting instruction data 106 from the instruction cache 102 using the restored set of key values in the master key register 142. Flow ends at block 3222.
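
The decisions of blocks 3208, 3214, and 3216 reduce to a small classification, sketched below in Python. The sketch is illustrative only and is not part of the disclosed embodiments. The names Outcome, BtacPrediction, and resolve are hypothetical, and ignoring the target address when both the predicted and the resolved direction are not taken is an assumption; the text says only that direction and target address are compared.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    NO_PREDICTION = "continue at block 1008 of Figure 10"
    CORRECT = "prediction correct; flow ends"
    WRONG_NOT_TAKEN = "continue at block 1014 of Figure 10"
    WRONG_TAKEN = "block 3222: restore master keys and flush"


@dataclass(frozen=True)
class BtacPrediction:
    taken: bool          # taken/not-taken indicator 2708
    target_address: int  # target address 2706


def resolve(prediction: Optional[BtacPrediction],
            resolved_taken: bool,
            resolved_target: int) -> Outcome:
    """Classify a BTAC prediction against the result resolved at block 1006."""
    # Block 3208: did the BTAC provide a prediction at all?
    if prediction is None:
        return Outcome.NO_PREDICTION
    # Block 3214: compare direction and (assumed: only when taken) target.
    same_direction = prediction.taken == resolved_taken
    same_target = prediction.target_address == resolved_target
    if same_direction and (same_target or not resolved_taken):
        return Outcome.CORRECT
    # Block 3216: was the incorrect prediction a taken prediction?
    if prediction.taken:
        return Outcome.WRONG_TAKEN
    return Outcome.WRONG_NOT_TAKEN
```

Only the WRONG_TAKEN outcome requires restoring the master key register 142, because only a taken prediction causes block 3114 to load new key values at fetch time.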

In addition to the security advantages afforded by the instruction decryption embodiments of the microprocessor 100 described above, the inventors have developed recommended coding guidelines. Used in conjunction with those embodiments, the guidelines weaken statistical attacks on encrypted x86 code that rely on analysis of actual x86 instruction usage.

First, because an attacker will typically assume that all 16 bytes of fetched instruction data 106 are x86 instructions, the code should contain "holes" within the 16-byte blocks relative to the program execution flow. That is, the code should include instructions that jump over some of the instruction bytes, creating "holes" of skipped bytes that can be filled with suitable values to increase the entropy of the plaintext bytes. Furthermore, the code should use immediate data values wherever doing so increases the entropy of the plaintext bytes. The immediate data values can also serve as false clues that point to incorrect instruction opcode locations.
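
The entropy that this guideline seeks to raise can be estimated with the Shannon formula over byte frequencies. The Python sketch below is illustrative only and is not part of the disclosed embodiments; the function name byte_entropy is an assumption, and the measure is a generic one rather than a metric defined by this description.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of -p * log2(p) over the observed byte values.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())
```

A block filled with one repeated byte scores 0 bits per byte, while a 16-byte block in which every byte differs scores 4 bits per byte, so filling the "holes" with varied values moves a block toward the higher figure.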

Second, the code may include special NOP instructions that contain "don't care" fields filled with suitable values to increase the entropy. For example, the x86 instruction 0x0F0D05xxxxxxxx is a 7-byte NOP in which the last four bytes may be any value. Other variations in the opcode form of the NOP instruction and in the number of "don't care" bytes are also possible.
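
A filler of this kind could be generated as follows. The Python sketch below is illustrative only and is not part of the disclosed embodiments. It takes the byte pattern 0x0F0D05xxxxxxxx from the text as given; the names NOP_PREFIX and make_filler_nop are hypothetical, and the use of a software pseudo-random generator to choose the "don't care" bytes is an assumption.

```python
import random

# Fixed leading bytes of the 7-byte NOP form named in the text.
NOP_PREFIX = bytes([0x0F, 0x0D, 0x05])


def make_filler_nop(rng: random.Random) -> bytes:
    """Build one 7-byte NOP whose last four "don't care" bytes are random."""
    dont_care = bytes(rng.randrange(256) for _ in range(4))
    return NOP_PREFIX + dont_care
```

Passing in the generator lets a build tool reproduce the same filler bytes from a fixed seed or vary them from build to build.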

Third, many x86 instructions have the same basic function as other x86 instructions. Where functionally equivalent instructions exist, the code can avoid reusing the same instruction and instead use multiple forms and/or the form that increases the plaintext entropy. For example, the instructions 0xC10107 and 0xC10025 do the same thing. Some equivalent instructions even come in versions of different lengths, for example 0xEB22 and 0xE90022; the code can therefore use instructions of varying lengths that have the same effect.

Fourth, the x86 architecture permits redundant and meaningless opcode prefixes, which the code can apply judiciously to further increase the entropy. For example, the instructions 0x40 and 0x2627646567F2F340 do exactly the same thing. Because there are only eight safe x86 prefixes, they must be sprinkled into the code carefully to avoid appearing too frequently.
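
Prefix padding of this kind can be sketched as a small helper. The Python code below is illustrative only and is not part of the disclosed embodiments. The text does not list the eight safe prefixes, so the pool used here (the six x86 segment-override prefix bytes) is an assumption, as are the names ASSUMED_PREFIX_POOL and add_redundant_prefixes. Whether a given prefix is harmless depends on the instruction it precedes. The 15-byte cap is the x86 architectural limit on instruction length.

```python
import random
from typing import Sequence

# Assumed pool: the six x86 segment-override prefix bytes (ES, CS, SS, DS, FS, GS).
# The text does not enumerate its "safe" prefixes; this choice is for illustration.
ASSUMED_PREFIX_POOL = (0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65)
MAX_X86_INSTRUCTION_LENGTH = 15  # architectural limit, in bytes


def add_redundant_prefixes(instruction: bytes,
                           count: int,
                           rng: random.Random,
                           pool: Sequence[int] = ASSUMED_PREFIX_POOL) -> bytes:
    """Prepend `count` prefix bytes drawn at random from `pool`."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if len(instruction) + count > MAX_X86_INSTRUCTION_LENGTH:
        raise ValueError("padded instruction would exceed 15 bytes")
    prefixes = bytes(rng.choice(pool) for _ in range(count))
    return prefixes + instruction
```

Keeping count small and varying it between call sites is one way to respect the caution above that the prefixes should not appear too frequently.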

Although embodiments have been described in which the key expander performs rotate and add/subtract operations on a pair of master key register values, other embodiments are contemplated in which the key expander operates on more than two master key register values and in which the operations differ from rotation and addition/subtraction. In addition, other embodiments of the switch key instruction 600 of Figure 6 and the branch and switch key instruction 900 of Figure 9 are contemplated in which, for example, the new key values are loaded into the master key register 142 from the secure storage area 122 rather than from the key register file 124. Other embodiments of the branch and switch key instruction 1500 of Figure 15 are contemplated in which the index field 2104 stores an address in the secure storage area 122. Furthermore, although embodiments have been described in which the BTAC 2702 is adapted to store a KRF index for use with the branch and switch key instruction 900/1200, other embodiments are contemplated in which the BTAC 2702 is adapted to store a secure storage area address for use with the branch and switch key instruction 1500.

In particular, because the decryption key 174 is derived from the first and second keys 234 and 236, the master keys 172 (including the first and second keys 234 and 236 that make up any particular key pair) may alternatively be referred to as decryption key primitives. "Primitive" is used here as the antonym of "derivative".

The embodiments of the present invention described above are presented by way of example only and are not intended to limit the scope of the invention. Persons skilled in the relevant computer arts can make various changes in form and detail without departing from the scope of the invention. For example, software can enable the function, fabrication, modeling, simulation, description, and/or testing of the apparatus and methods described herein. This can be accomplished with general programming languages (for example, C and C++), hardware description languages (HDL) including Verilog HDL and VHDL, or other available programs. Such software can be disposed in any known computer-readable medium, such as magnetic tape, semiconductor, magnetic disk, or optical disc (for example, CD-ROM or DVD-ROM), or in a network, wire line, wireless, or other communications medium. Embodiments of the apparatus and methods may be included in a semiconductor intellectual property core, such as a microprocessor core (for example, embodied in HDL), and transformed to hardware in the production of integrated circuits. The apparatus and methods may also be implemented as a combination of hardware and software. Thus, the scope of the present invention should not be limited by any of the described embodiments but should be defined by the following claims and their equivalents. In particular, the present invention may be implemented in a microprocessor used in a general-purpose computer. Finally, those skilled in the art can use the disclosed concepts and specific embodiments as a basis for designing or modifying other structures that achieve the same purposes as the present invention without departing from the scope of the invention as defined by the claims.

100‧‧‧Microprocessor

102‧‧‧Instruction cache

104‧‧‧Fetch unit

108‧‧‧Decode unit

112‧‧‧Execution units

114‧‧‧Retire unit

118‧‧‧General purpose registers

122‧‧‧Secure storage area

124‧‧‧Key register file

128‧‧‧Flag register

132‧‧‧Microcode unit

134‧‧‧Fetch address

142‧‧‧Master key register

144‧‧‧Control register

148‧‧‧E bit

152‧‧‧Key expander

154‧‧‧Multiplexer

156‧‧‧XOR logic

162‧‧‧Plaintext instruction data

164‧‧‧Fetch address generator

172‧‧‧Two sets of keys

174‧‧‧Decryption key

178‧‧‧Output of multiplexer 154

Claims

A microprocessor comprising: a secure memory configured to store and provide cryptographic keys for decrypting encrypted instructions; and an instruction processing pipeline that fetches and executes instructions from a cache memory, the instruction processing pipeline comprising: a fetch unit that fetches unencrypted and encrypted instructions of an instruction set architecture supported by the microprocessor; a decryption circuit that decrypts encrypted instructions using cryptographic keys received from the secure memory; and one or more execution units that execute the instructions or microinstructions translated from the instructions; wherein the instruction set architecture includes a store-key instruction for storing one or more cryptographic keys to the secure memory; wherein the microprocessor supports encrypted store-key instructions; wherein, in response to an encrypted store-key instruction, the microprocessor decrypts the encrypted store-key instruction using a first set of one or more cryptographic keys, then executes the decrypted store-key instruction, and then decrypts a subsequent set of one or more encrypted instructions using a second set of one or more cryptographic keys provided by the encrypted store-key instruction; whereby the microprocessor enables an encrypted program to provide multiple sets of cryptographic keys for the successive decryption of multiple sets of program instructions.

The microprocessor of claim 1, wherein the instruction set architecture includes a secure execution mode instruction that requests a switch from a normal execution mode to a secure execution mode, and wherein the microprocessor restricts decryption of the encrypted program until the microprocessor has entered the secure execution mode.

The microprocessor of claim 2, wherein the microprocessor conditionally grants the request depending on whether the format of the instruction requesting the switch to the secure execution mode has an encrypted parameter, and on whether the requesting instruction is part of a privileged program or of a program whose encrypted parameter, when decrypted, satisfies a preset requirement for running the encrypted program.

The microprocessor of claim 3, wherein the encrypted parameter and the program are encrypted with different cryptographic keys.

The microprocessor of claim 1, wherein the store-key instruction provides the contents of the one or more cryptographic keys in an immediate data field.

The microprocessor of claim 1, wherein the decrypted instructions, or the microinstructions translated from them, are executed without the decrypted instructions or microinstructions being exposed.

The microprocessor of claim 1, further comprising a processor bus, wherein the secure memory is not accessible by the processor bus.

The microprocessor of claim 1, further comprising a cache memory hierarchy, wherein the secure memory is isolated from the cache memory hierarchy.

The microprocessor of claim 1, further comprising an AES or RSA encrypted channel through which the values of the cryptographic keys are written to the secure memory.

The microprocessor of claim 1, wherein the secure memory is not accessible by a program executing in a non-privileged execution mode.

A method for securely executing instructions in a microprocessor, comprising: storing a first set of one or more cryptographic keys to a secure memory for decrypting encrypted instructions; caching a first set of encrypted instructions; decrypting the first set of encrypted instructions using the first set of one or more cryptographic keys; caching an encrypted store-key instruction, the store-key instruction being for storing a second set of one or more cryptographic keys to the secure memory for decrypting encrypted instructions; decrypting the encrypted store-key instruction using the first set of one or more cryptographic keys; storing the second set of one or more cryptographic keys to the secure memory as directed by the decrypted store-key instruction; caching a second set of encrypted instructions; and decrypting the second set of encrypted instructions using the second set of one or more cryptographic keys.

The method of claim 11, further comprising performing a secure execution mode switch request that requests a switch from a normal execution mode to a secure execution mode, wherein the microprocessor restricts decryption of encrypted instructions until the microprocessor has entered the secure execution mode.

The method of claim 12, wherein the request to switch to the secure execution mode is conditionally granted depending on whether the format of the instruction requesting the switch to the secure execution mode has an encrypted parameter, and on whether the requesting instruction is part of a privileged program or of a program whose encrypted parameter, when decrypted, satisfies a preset requirement for running the encrypted program.

The method of claim 13, wherein the encrypted parameter and the program are encrypted with different cryptographic keys.

The method of claim 11, wherein the store-key instruction provides the contents of the one or more cryptographic keys in an immediate data field.

The method of claim 11, wherein the values of the cryptographic keys are written to the secure memory via an AES or RSA encrypted channel.

A computer program product encoded in at least one non-transitory computer medium for use with a computing device, the computer program product comprising: computer program code embodied in the non-transitory computer medium and describing a microprocessor, the computer program code comprising: first program code describing a secure memory that stores and provides cryptographic keys for decrypting encrypted instructions; second program code describing an instruction processing pipeline that fetches and executes instructions from a cache memory, the instruction processing pipeline comprising: third program code describing a fetch unit that fetches unencrypted and encrypted instructions of an instruction set architecture supported by the microprocessor, wherein the instruction set architecture includes a store-key instruction for storing one or more cryptographic keys to the secure memory, and the microprocessor supports encrypted store-key instructions; fourth program code describing a decryption circuit that decrypts encrypted instructions using cryptographic keys received from the secure memory; and fifth program code describing one or more execution units that execute the instructions or microinstructions translated from the instructions; and sixth program code describing the microprocessor architecture by which, in response to an encrypted store-key instruction, the microprocessor decrypts the encrypted store-key instruction using a first set of one or more cryptographic keys, then executes the decrypted store-key instruction, and then decrypts a subsequent set of one or more encrypted instructions using a second set of one or more cryptographic keys provided by the encrypted store-key instruction.