TW201203108A - Microprocessors and operating methods thereof and encryption/decryption methods - Google Patents
- Publication number
- TW201203108A (application TW100118074A)
- Authority
- TW
- Taiwan
- Prior art keywords
- instruction
- key
- microprocessor
- branch
- block
Landscapes
- Storage Device Security (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Description
VI. Description of the Invention

[Technical Field]

The present invention relates to the field of microprocessors, and in particular to increasing the security of programs executed by a microprocessor.

[Prior Art]
Many software programs are vulnerable to attacks that compromise the security of a computer system. For example, an attacker can exploit a buffer overflow vulnerability in a running program to inject malicious code and transfer control to it, so that the injected code takes over the attacked program. One defense against such attacks is instruction set randomization. Broadly speaking, instruction set randomization first encrypts the program in some form and then decrypts it inside the processor after the processor fetches it from memory. This makes it difficult for an attacker to inject malicious instructions, because injected instructions execute correctly only if they are properly encrypted (for example, with the same encryption key or algorithm as the attacked program). See, for example, "Countering Code-Injection Attacks with Instruction-Set Randomization," by Gaurav S. Kc, Angelos D. Keromytis, and Vassilis Prevelakis, CCS '03, October 27-30, 2003, Washington, DC, USA, ACM 1-58113-738-9/03/0010, which describes a modified version of the Bochs-x86 Pentium emulator. The shortcomings of this related art have been widely discussed; see, for example, "Where's the FEEB? The Effectiveness of Instruction Set Randomization," by Ana Nora Sovarel, David Evans, and Nathanael Paul, http://www.cs.virginia.edu/feeh.

CNTR2449100-TW/0608-A43129-TW/Final

[Summary of the Invention]

One embodiment of the present invention discloses a microprocessor. The microprocessor includes an instruction cache, an instruction decode unit, and a fetch unit. The fetch unit is configured to: (a) fetch a block of instruction data from the instruction cache; (b) perform a Boolean exclusive-OR (XOR) operation on the block with a data entity to generate plaintext instruction data; and (c) provide the plaintext instruction data to the instruction decode unit.
In a first case, the block includes encrypted instruction data and the data entity is a decryption key. In a second case, the block includes unencrypted instruction data and the data entity is a multi-bit binary zero value. Whether the block's instruction data is encrypted or unencrypted, the time required to perform (a), (b), and (c) is the same in the first case and in the second case.

Another embodiment of the present invention discloses a method for operating a microprocessor having an instruction cache. The method includes: (a) fetching a block of instruction data from the instruction cache; (b) performing a Boolean XOR operation on the block with a data entity to generate plaintext instruction data; and (c) providing the plaintext instruction data to an instruction decode unit. In a first case, the block includes encrypted instruction data and the data entity is a decryption key. In a second case, the block includes unencrypted instruction data and the data entity is a multi-bit binary zero value. Whether the block's instruction data is encrypted or unencrypted, the time required to perform (a), (b), and (c) is the same in the first case and in the second case.

One embodiment of the present invention provides a microprocessor. The microprocessor includes an instruction cache and a fetch unit. The fetch unit fetches a sequence of blocks of encrypted instructions of an encrypted program from a sequence of fetch addresses of the instruction cache. As each block of the sequence is fetched, the fetch unit generates a decryption key as a function of a plurality of key values and a portion of the fetch address of the fetched block. For each fetched block of the sequence, the fetch unit decrypts the encrypted instructions in the block with the corresponding decryption key. The microprocessor further includes a key switch instruction that causes the microprocessor to update the key values in the fetch unit while the fetch unit is fetching the sequence of blocks from the instruction cache.

Another embodiment discloses a method for operating a microprocessor having an instruction cache. The method includes fetching a plurality of first encrypted instructions of a program from the instruction cache and decrypting them with a first decryption key into a plurality of first unencrypted instructions. The method further includes replacing the first decryption key with a second decryption key in response to a key switch instruction among the first unencrypted instructions. The method further includes fetching a plurality of second encrypted instructions of the program from the instruction cache and decrypting them with the second decryption key into a plurality of second unencrypted instructions.

Another embodiment of the present invention discloses a method for operating a microprocessor. The method includes fetching a sequence of blocks of encrypted instructions of an encrypted program from a sequence of fetch addresses of an instruction cache. The method further includes, as each block of the sequence is fetched, generating a decryption key as a function of a plurality of key values and a portion of the fetch address of the fetched block. The method further includes, for each block in the sequence, decrypting the encrypted instructions in the block with the corresponding decryption key. The method further includes executing a key switch instruction while the sequence of blocks is being fetched. Executing the key switch instruction includes updating the key values used to generate the decryption keys.

One embodiment of the present invention discloses a microprocessor. The microprocessor includes a fetch unit that fetches and decrypts a branch and switch key instruction using first decryption key data. The microprocessor further includes microcode.
If the branch and switch key instruction is not taken, the microcode causes the fetch unit to fetch and decrypt the instructions that sequentially follow the branch and switch key instruction using the first decryption key data. If the branch and switch key instruction is taken, the microcode causes the fetch unit to fetch and decrypt a target instruction of the branch and switch key instruction using second decryption key data that differs from the first decryption key data.

Another embodiment of the present invention discloses a method for processing an encrypted program with a microprocessor. The method includes fetching and decrypting a branch and switch key instruction using first decryption key data. The method further includes, if the branch and switch key instruction is not taken, fetching and decrypting the instructions that sequentially follow it using the first decryption key data. The method further includes, if the branch and switch key instruction is taken, fetching and decrypting a target instruction of the branch and switch key instruction using second decryption key data that differs from the first decryption key data.

Another embodiment of the present invention discloses a method for encrypting a program for later execution by a microprocessor that decrypts and executes encrypted programs. The method includes receiving an object file of an unencrypted program that includes conventional branch instructions whose target addresses can be determined before the microprocessor executes the program. The method further includes analyzing the program to obtain chunk information. The chunk information divides the program into a sequence of chunks (blocks of the program).
Each chunk includes a sequence of instructions. The chunk information further includes encryption key data associated with each chunk, and the encryption key data differs from chunk to chunk. The method further includes replacing each conventional branch instruction whose target address lies in a different chunk from the branch instruction itself with a branch and switch key instruction. The method further includes encrypting the program based on the chunk information.

Another embodiment of the present invention also discloses a method for encrypting a program for later execution by a microprocessor that decrypts and executes encrypted programs. The method includes receiving an object file of an unencrypted program that includes conventional branch instructions whose target addresses can be determined only when the microprocessor executes the program. The method further includes analyzing the program to obtain chunk information. The chunk information divides the program into a sequence of chunks. Each chunk includes a sequence of instructions. The chunk information further includes encryption key data associated with each chunk, and the encryption key data differs from chunk to chunk. The method further includes replacing each of the conventional branch instructions with a branch and switch key instruction. The method further includes encrypting the program based on the chunk information.

One embodiment of the present invention discloses a microprocessor. The microprocessor includes an architectural register that includes a bit, and the microprocessor sets the bit. The microprocessor further includes a fetch unit. In response to the microprocessor setting the bit, the fetch unit fetches encrypted instructions from an instruction cache and decrypts them before they are executed.
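Returning to the first encryption method described earlier in this summary (branches whose targets are known before execution), the branch replacement step can be sketched as follows. This is only an illustrative sketch: the `Insn` record, the `chunk_of` rule (fixed-size address ranges), and the mnemonic strings are assumptions made here, not details given in the text. The per-chunk encryption step and any change in instruction length caused by the replacement are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Insn:
    addr: int                     # address of the instruction
    op: str                       # mnemonic, e.g. "add" or "jmp" (hypothetical)
    target: Optional[int] = None  # statically known branch target, if any


def chunk_of(addr: int, chunk_size: int) -> int:
    """Assumed chunking rule: chunks are fixed-size address ranges."""
    return addr // chunk_size


def rewrite_branches(program: List[Insn], chunk_size: int) -> List[Insn]:
    """Replace each branch whose target lies in a different chunk."""
    rewritten = []
    for insn in program:
        crosses_chunk = (
            insn.target is not None
            and chunk_of(insn.target, chunk_size) != chunk_of(insn.addr, chunk_size)
        )
        # Branches that stay inside their own chunk keep the same key,
        # so they are left as conventional branches.
        op = "branch_and_switch_key" if crosses_chunk else insn.op
        rewritten.append(Insn(insn.addr, op, insn.target))
    return rewritten


program = [Insn(0, "add"), Insn(4, "jmp", 8), Insn(8, "jmp", 64), Insn(64, "ret")]
rewritten_ops = [insn.op for insn in rewrite_branches(program, chunk_size=32)]
```

With 32-byte chunks, only the jump from address 8 to address 64 crosses a chunk boundary, so it is the only branch that would need to switch keys.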
If an interrupt is received, the microprocessor saves the value of the bit to stack memory and then clears the bit. After the microprocessor clears the bit, the fetch unit fetches unencrypted instructions from the instruction cache and executes them without decrypting them. In response to a return from interrupt instruction, the microprocessor restores the architectural register with the value previously saved in stack memory. If the restored value of the bit is set, the fetch unit resumes fetching and decrypting encrypted instructions.

Another embodiment of the present invention discloses a method for operating a microprocessor having an instruction cache and an architectural register. The method includes setting a bit in the architectural register and subsequently fetching encrypted instructions from the instruction cache and decrypting them before executing them. When an interrupt is received, the method further includes saving the value of the bit of the architectural register and then clearing the bit. After the bit is cleared, the method further includes fetching unencrypted instructions from the instruction cache and executing them without decryption. The method further includes restoring the architectural register with the previously saved value in response to a return from interrupt instruction. If the restored value of the bit is set, the method further includes resuming fetching, decrypting, and executing encrypted instructions.

One embodiment of the present invention discloses a microprocessor. The microprocessor includes an architectural register and a fetch unit, and the architectural register includes a bit. The microprocessor saves the value of the bit in response to a request to interrupt a running program. The bit indicates whether the running program is encrypted or unencrypted.
In response to a return from interrupt instruction, the microprocessor restores the bit with the previously saved value and resumes fetching the interrupted program as the running program. If the restored value of the bit is set, the microprocessor restores the decryption key values before it resumes fetching the interrupted program, so that the fetched instructions are decrypted with the restored decryption key values. If the restored value of the bit is clear, the microprocessor neither restores decryption key values nor decrypts the fetched instructions.

Another embodiment of the present invention discloses a method for operating a microprocessor. The method includes saving the value of a bit of the microprocessor in response to a request to interrupt a running program. The bit indicates whether the running program is encrypted or unencrypted. In response to a return from interrupt instruction, the method further includes restoring the bit with the previously saved value and resuming fetching the interrupted program as the running program. If the restored value of the bit is set, the method further includes restoring the decryption key values before resuming fetching the interrupted program, and decrypting the fetched instructions with the restored decryption key values. If the restored value of the bit is clear, the method neither restores decryption key values nor decrypts the fetched instructions.

One embodiment of the present invention discloses a microprocessor. The microprocessor includes a storage element having a plurality of locations, each of which stores decryption key data for one encrypted program. The microprocessor further includes a control register with a field that indicates which of the plurality of locations of the storage element is associated with the encrypted program being executed.
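The interrupt handling of the encryption bit described in the preceding embodiments can be modeled with a small state sketch. It is a minimal illustration under stated assumptions: the `FetchState` class and its method names are hypothetical, and saving the key values on the same stack as the bit is an assumption made here, since the summary does not say where the key values are kept while the program is interrupted.

```python
from dataclasses import dataclass, field


@dataclass
class FetchState:
    e_bit: bool = False       # set: fetched instructions are decrypted
    key_values: tuple = ()    # decryption key values currently in use
    stack: list = field(default_factory=list)

    def take_interrupt(self) -> None:
        # Save the bit (and, by assumption, the key values), then clear the
        # bit so the interrupt handler is fetched as plaintext.
        self.stack.append((self.e_bit, self.key_values))
        self.e_bit = False

    def return_from_interrupt(self) -> None:
        saved_e_bit, saved_keys = self.stack.pop()
        self.e_bit = saved_e_bit
        if self.e_bit:
            # Only an encrypted program needs its key values back.
            self.key_values = saved_keys
        # If the bit is clear, key values are left alone and no decryption
        # is applied to subsequently fetched instructions.


cpu = FetchState(e_bit=True, key_values=(0x11, 0x22))
cpu.take_interrupt()
e_bit_in_handler = cpu.e_bit
cpu.key_values = (0x99,)  # hypothetical: other code changed the keys meanwhile
cpu.return_from_interrupt()
```

The example shows the interrupted encrypted program getting both its bit and its key values back, even though the keys were overwritten while the handler ran.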
In response to a return from interrupt instruction, the microprocessor restores the control register with a previously saved value of the field from memory. The microprocessor further includes a fetch unit that fetches encrypted instructions of the encrypted program being executed and decrypts them with the decryption key data stored at the location of the storage element indicated by the restored value of the field.

Another embodiment of the present invention discloses a method for operating a microprocessor having a control register and a storage element, the storage element having a plurality of locations, each of which stores decryption key data for one encrypted program. The method includes restoring a field in the control register with a previously saved value of the field from memory, in response to a return from interrupt instruction, where the value of the field indicates which of the plurality of locations of the storage element is associated with the encrypted program being executed. The method further includes fetching encrypted instructions of the encrypted program being executed. The method further includes decrypting the fetched encrypted instructions with the decryption key data stored at the location of the storage element indicated by the restored value of the field.

One embodiment of the present invention discloses a microprocessor. The microprocessor includes a branch target address cache (BTAC) that records history information of previously executed branch and switch key instructions. The history information includes a target address and an identifier for each recorded branch and switch key instruction. The identifier identifies a plurality of key values associated with the branch and switch key instruction to which it belongs. The microprocessor further includes a fetch unit coupled to the BTAC.
When the fetch unit fetches a previously executed branch and switch key instruction, it receives a prediction made by the BTAC and receives from the BTAC the target address and the identifier for the fetched branch and switch key instruction. In response to the received prediction, the fetch unit fetches encrypted instruction data at the received target address and decrypts the fetched encrypted instruction data with the key values identified by the received identifier.

Another embodiment of the present invention discloses a method for operating a microprocessor. The method includes recording, in a branch target address cache (BTAC), history information of previously executed branch and switch key instructions. The history information includes a target address and an identifier for each recorded branch and switch key instruction. The identifier identifies a plurality of key values associated with the branch and switch key instruction to which it belongs. The method further includes, when a previously executed branch and switch key instruction is fetched, receiving a prediction made by the BTAC and receiving from the BTAC the target address and the identifier for the fetched branch and switch key instruction. The method further includes, in response to the received prediction, fetching encrypted instruction data at the received target address and decrypting the fetched encrypted instruction data with the key values identified by the received identifier.

[Embodiments]

Referring to FIG. 1, a block diagram illustrates a microprocessor 100 implemented according to the present invention.
The microprocessor 100 includes a pipeline comprising an instruction cache 102, a fetch unit 104, a decode unit 108, an execution unit 112, and a retire unit 114. The microprocessor 100 further includes a microcode unit 132 that provides microcode instructions to the execution unit 112. The microprocessor 100 further includes general purpose registers 118 and an EFLAGS register 128, which provide instruction operands to the execution unit 112 and which are updated with instruction results through the retire unit 114. In one embodiment, the EFLAGS register 128 is a modified version of the conventional x86 EFLAGS register, as described in detail later.

The fetch unit 104 fetches instruction data 106 from the instruction cache 102. The fetch unit 104 operates in two modes: a decryption mode and a plaintext mode. An E bit 148 in a control register 144 of the fetch unit 104 determines whether the fetch unit 104 operates in decryption mode (E bit set) or in plaintext mode (E bit clear). In plaintext mode, the fetch unit 104 treats the instruction data 106 fetched from the instruction cache 102 as unencrypted, or plaintext, instruction data and therefore does not decrypt it. In decryption mode, the fetch unit 104 treats the instruction data 106 fetched from the instruction cache 102 as encrypted instruction data and therefore decrypts it into plaintext instruction data using decryption keys stored in a master key register 142 of the fetch unit 104, as discussed in detail with reference to FIG. 2 and FIG. 3.
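As a rough illustration of the two modes of fetch unit 104, the sketch below models steps (a) to (c) of the summary: a fetched block is always XORed with something, either a decryption key (E bit set) or all-zero bits (E bit clear), so both modes follow the same path. The function name `fetch_and_xor`, the 16-byte block width, and the key bytes are assumptions chosen for illustration; this section does not fix them.

```python
BLOCK_BYTES = 16  # assumed block width; not specified in this section


def fetch_and_xor(block: bytes, decryption_key: bytes, e_bit: bool) -> bytes:
    """XOR a fetched block with the key (E bit set) or with zeros (E bit clear)."""
    if len(block) != BLOCK_BYTES or len(decryption_key) != BLOCK_BYTES:
        raise ValueError("block and key must both be BLOCK_BYTES long")
    zeros = bytes(BLOCK_BYTES)
    selected = decryption_key if e_bit else zeros
    # XOR with zeros leaves plaintext unchanged; XOR with the key undoes
    # an encryption that was itself an XOR with the same key.
    return bytes(b ^ k for b, k in zip(block, selected))


plaintext = bytes(range(BLOCK_BYTES))
key = bytes([0xA5] * BLOCK_BYTES)               # hypothetical key value
encrypted = fetch_and_xor(plaintext, key, e_bit=True)
```

Because XOR is its own inverse, the same function serves to produce `encrypted` here and to decrypt it again. The Python model only shows that both modes perform the same operation; it says nothing about actual hardware timing.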
The fetch unit 104 also includes a fetch address generator 164 that generates a fetch address 134 used to fetch the instruction data 106 from the instruction cache 102. The fetch address 134 is also supplied to a key expander 152 of the fetch unit 104. The key expander 152 selects two keys 172 from the master key register 142 and performs an operation on them to generate a decryption key 174, which is the first input to a multiplexer 154. The second input to the multiplexer 154 is a multi-bit binary zero value 176. The E bit 148 controls the multiplexer 154: if the E bit 148 is set, the multiplexer 154 selects the decryption key 174; if the E bit 148 is clear, the multiplexer 154 selects the multi-bit binary zero value 176. The output 178 of the multiplexer 154 is supplied as the first input to XOR logic 156. The XOR logic 156 performs a Boolean exclusive-OR (XOR) of the fetched instruction data 106 with the multiplexer output 178 to generate plaintext instruction data 162. The encrypted instruction data 106 was produced beforehand by XORing its original plaintext instruction data with an encryption key whose value is the same as that of the decryption key 174. The fetch unit 104 is described in more detail below with reference to FIG. 2 and FIG. 3.

The plaintext instruction data 162 is supplied to the decode unit 108. The decode unit 108 decodes the stream of plaintext instruction data 162 and splits it into individual x86 instructions for execution by the execution unit 112. In one embodiment, the decode unit 108 includes buffers or queues that buffer the stream of plaintext instruction data 162 before or during decoding.
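The role of the key expander 152 in the fetch path described above can be sketched as follows. The text says only that the key expander picks two keys from the master key register 142 and performs an operation on them, with details deferred to FIG. 2 and FIG. 3. Everything specific in this sketch is therefore a placeholder: the function `expand_key`, the way address bits choose the two keys, the rotate amount, and the byte-wise addition used to combine them are assumptions for illustration, not the patent's actual operation.

```python
from typing import List

KEY_BYTES = 16  # assumed key and block width


def expand_key(master_keys: List[bytes], fetch_address: int) -> bytes:
    """Derive a per-block key from two master keys chosen by the fetch address."""
    block_index = fetch_address // KEY_BYTES
    n = len(master_keys)
    first = master_keys[block_index % n]          # assumed selection rule
    second = master_keys[(block_index // n) % n]  # assumed selection rule
    rotation = block_index % KEY_BYTES            # assumed rotate amount
    rotated = first[rotation:] + first[:rotation]
    # Placeholder combining step: byte-wise addition modulo 256.
    return bytes((a + b) & 0xFF for a, b in zip(rotated, second))


def xor_block(block: bytes, key: bytes) -> bytes:
    """XOR logic applied to one fetched block."""
    return bytes(b ^ k for b, k in zip(block, key))


# Hypothetical master key register contents: four 16-byte keys.
master_keys = [bytes([i * 17 + j for j in range(KEY_BYTES)]) for i in range(4)]
key_a = expand_key(master_keys, 0x1000)
key_b = expand_key(master_keys, 0x1010)
block = bytes([0x90] * KEY_BYTES)
round_trip = xor_block(xor_block(block, key_a), key_a)
```

The point of the sketch is the data flow: because the fetch address feeds the key derivation, adjacent blocks are decrypted with different keys even though the master keys stay the same.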
In one embodiment, the decode unit 108 includes an instruction translator that translates the x86 instructions into microinstructions, or micro-ops, for execution by the execution unit 112. When the decode unit 108 outputs an instruction, it also outputs a bit for that instruction; the bit accompanies the instruction down the pipeline and indicates whether the instruction is an encrypted instruction. The bit allows the execution unit 112 and the retire unit 114 to make decisions and take actions based on whether the instruction was an encrypted instruction or a plain text instruction when it was fetched from the instruction cache 102. In one embodiment, plain text instructions are not allowed to perform certain operations designed specifically for the instruction decryption mode.
In one embodiment, the microprocessor 100 is an x86 architecture processor; however, the microprocessor 100 may also be implemented as a processor of another architecture. A processor is considered an x86 architecture processor if it can correctly execute most applications designed to run on an x86 processor, that is, if those applications produce their expected results. In particular, the microprocessor 100 executes instructions of the x86 instruction set and has the x86 user-visible register set.
In one embodiment, the microprocessor 100 provides a comprehensive security architecture, called secure execution mode (SEM), in which programs execute. According to one embodiment, execution of an SEM program can be initiated by several processor events and cannot be blocked by normal (non-SEM) operation.
The following are examples of functions performed by programs executing in SEM, including critical security tasks such as credential checking and data encryption, monitoring of system software activity, verification of system software integrity, tracking of resource usage, and control of new software installation. For embodiments of SEM, refer to U.S. Patent Application No. 12/263,131 filed by the applicant (U.S. Patent Publication No. 2009-0292893, November 20, 2009), which claims priority to the U.S. provisional patent application filed May 24, 2008 (No. 61/055,980); the relevant technical content of the present application may refer to that application. In one embodiment, the SEM data is stored in a secure non-volatile memory (not shown), such as a flash memory, which can be used to store decryption keys and which is coupled to the microprocessor 100 by an isolated serial bus; all data in it is AES-encrypted and signature-verified. In one embodiment, the microprocessor 100 includes a small amount of non-volatile write-once memory (not shown) for storing decryption keys; one such embodiment may use the fuse-type non-volatile memory disclosed in U.S. Patent No. 7,663,957. One advantage of the instruction decryption feature disclosed here is that it extends the applicability of secure execution mode (SEM): a secure program can be stored in memory outside the microprocessor 100 and need not be stored entirely inside the microprocessor 100. The secure program can therefore take advantage of the full capacity and functionality of the memory hierarchy. In one embodiment, some or all architectural exceptions/interrupts (for example, page faults and debug breakpoints) are disabled in SEM.
In one embodiment, some or all architectural exceptions/interrupts are disabled in decryption mode (that is, when the E bit 148 is set).
The microprocessor 100 further includes a key register file 124. The key register file 124 includes a plurality of registers whose stored keys can be loaded into the master key registers 142 of the fetch unit 104 by a switch key instruction (discussed later) in order to decrypt fetched encrypted instruction data 106.
The microprocessor 100 further includes a secure memory area (SMA) 122 that stores decryption keys waiting to be loaded into the key register file 124 by a load key instruction 500, shown in FIG. 5. In one embodiment, the secure memory area 122 is accessible only by SEM programs; that is, it is not accessible by programs executing in normal (non-SEM) execution mode. In addition, the secure memory area 122 is not accessible via the processor bus and is not part of the cache memory hierarchy of the microprocessor 100. Consequently, a cache flush operation, for example, does not cause the contents of the secure memory area 122 to be written to memory. The instruction set architecture of the microprocessor 100 includes specific instructions for reading and writing the secure memory area 122. In one embodiment, the secure memory area 122 is implemented as a private RAM; for related art, refer to U.S. Patent Application No. 12/034,503, filed February 20, 2008 (published October 16, 2008 as Publication No. 2008/0256336), whose content is applicable to the present invention.
Initially, the operating system or another privileged program loads an initial set of keys into the secure memory area 122, the key register file 124, and the master key registers 142.
The microprocessor 100 first uses the initially loaded keys to decrypt an encrypted program. In addition, the encrypted program itself can subsequently write new keys into the secure memory area 122, load keys from the secure memory area 122 into the key register file 124 (by the load key instruction), and load keys from the key register file 124 into the master key registers 142 (by the switch key instruction). An advantage of this is that the disclosed switch key instruction allows an encrypted program to switch its decryption keys on the fly while it runs, as described in detail below. The new keys may be composed of immediate data of the encrypted program's own instructions. In one embodiment, a field in the header of the program file indicates whether the program's instructions are encrypted.
The technique described with reference to FIG. 1 has several advantages. First, the plain text instruction data decrypted from the encrypted instruction data 106 is not available outside the microprocessor 100.
Second, the fetch unit 104 takes the same amount of time to fetch encrypted instruction data as it does to fetch plain text instruction data. This property matters for security: if a timing difference existed, an attacker could exploit it to break the encryption.
Third, compared with conventional designs, the instruction decryption technique disclosed here
does not add to the number of clock cycles consumed by the fetch unit 104. As discussed below, the key expander 152 increases the effective length of the decryption key used to decrypt an encrypted program, and it does so without making the fetch of encrypted program data take longer than the fetch of plain text program data. In particular, because the key expander 152 completes its operation within the time taken to look up the instruction cache 102 with the fetch address 134 and obtain the instruction data 106, the key expander 152 adds no time to the normal fetch process. Likewise, because the multiplexer 154 and the key expander 152 together complete within the time taken to look up the instruction cache 102 with the fetch address 134 and obtain the instruction data 106, they add no time to the normal fetch process.
The XOR logic 156 is the only logic added to the normal fetch path, and the propagation delay of the XOR operation 156 is small enough that it does not lengthen the cycle time. The instruction decryption technique disclosed here therefore does not increase the number of clock cycles consumed by the fetch unit 104. By contrast, the complex decryption mechanisms, such as S-boxes, that conventional techniques apply to decrypt instruction data 106 increase the cycle time and/or the number of clock cycles needed to fetch and decode the instruction data 106.
Referring now to FIG. 2, a block diagram illustrates the fetch unit 104 of FIG. 1 in detail; in particular, the key expander 152 of FIG. 1 is shown in detail. The advantages of using XOR logic to decrypt the encrypted instruction data 106 were discussed above. However, fast and small XOR logic has a drawback: if the encryption/decryption key is reused, XOR is a weak encryption method. If, on the other hand, the effective length of the key equals the length of the program being encrypted/decrypted, XOR encryption is a very strong encryption technique. A feature of the microprocessor 100 is that it increases the effective length of the decryption key in order to reduce the need to reuse the key. First, the values stored in the master key registers 142 are of moderately large size: in one embodiment, their size equals the fetch quantum, or block size, of the instruction data 106 fetched from the instruction cache 102, which is 128 bits (16 bytes). Second, the key expander 152 increases the effective length of the decryption key, for example to 2048 bytes in one disclosed embodiment, as described in detail later.
Third, a program can change the values in the master key registers 142 during operation by means of a switch key instruction (or a variant of it), as described in detail later.
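The weakness of XOR encryption under key reuse, noted above, can be illustrated with a small Python sketch. The key and the two plain text blocks are made-up values chosen for this example; the point is only that XORing two blocks encrypted with the same key cancels the key.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings (hypothetical helper)."""
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical 16-byte key, reused for two consecutive 16-byte blocks.
key = bytes.fromhex("00112233445566778899aabbccddeeff")
block0 = b"first plain blk!"
block1 = b"second plain blk"

cipher0 = xor_bytes(block0, key)
cipher1 = xor_bytes(block1, key)

# An attacker who XORs the two ciphertext blocks sees no key material at all,
# only the XOR of the two plain text blocks.
leaked = xor_bytes(cipher0, cipher1)
```

This is why lengthening the effective key, so that the same key bytes recur far apart or not at all, strengthens an otherwise simple XOR scheme.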
In the embodiment of FIG. 2, five master key registers 142 are used. However, other embodiments may use fewer or more master key registers 142 to extend the decryption key length; for example, one embodiment uses 12 master key registers 142. The key expander 152 includes a first multiplexer A 212 and a second multiplexer B 214 that receive the keys provided by the master key registers 142. A portion of the fetch address 134 controls the multiplexers 212/214. In the embodiment of FIG. 2, multiplexer B 214 is a three-to-one multiplexer and multiplexer A 212 is a four-to-one multiplexer. Table 1 shows how the multiplexers 212/214 select the master key registers 142 (identified by the numbers 0-4) according to their respective select inputs. Table 2 shows how the select inputs are generated, and the resulting combination of master key registers 142, based on bits [10:8] of the fetch address 134.

Mux B select   Master key register   Mux A select   Master key register
00             0                     00             1
01             1                     01             2
10             2                     10             3
                                     11             4
Table 1

Fetch address bits [10:8]   Mux B-Mux A register pair   Mux B select   Mux A select
000                         0-1                         00             00
001                         0-2                         00             01
010                         0-3                         00             10
011                         0-4                         00             11
100                         1-2                         01             01
101                         1-3                         01             10
110                         1-4                         01             11
111                         2-3                         10             10
Table 2

The output 236 of multiplexer B 214 is provided to an adder/subtractor 218. The output 234 of multiplexer A 212 is provided to a rotator 216.
The rotator 216 receives bits [7:4] of the fetch address 134 and rotates the multiplexer A output 234 by a number of bytes determined by those bits. In one embodiment, bits [7:4] of the fetch address 134 are incremented before being provided to the rotator 216 to control the number of bytes rotated, as shown in Table 3. The output 238 of the rotator 216 is provided to the adder/subtractor 218. The adder/subtractor 218 receives bit [7] of the fetch address 134. If bit [7] is clear, the adder/subtractor 218 subtracts the output 238 of the rotator 216 from the output 236 of multiplexer B 214. If bit [7] is set, the adder/subtractor 218 adds the output 238 of the rotator 216 to the output 236 of multiplexer B 214. The output of the adder/subtractor 218 is the decryption key 174 of FIG. 1, which is provided to the multiplexer 154. The operation is described in detail below with reference to the flowchart of FIG. 3.
Referring now to FIG. 3, a flowchart illustrates operation of the fetch unit 104 of FIG. 2 according to the present invention. Flow begins at block 302.
At block 302, the fetch unit 104 reads the instruction cache 102 at the fetch address 134 to begin fetching a 16-byte block of instruction data 106. The instruction data 106 may be encrypted or plain text, depending on whether it is part of an encrypted program or a plain text program, as indicated by the E bit 148. Flow proceeds to block 304.
At block 304, based on upper bits of the fetch address 134, multiplexer A 212 and multiplexer B 214 select a first key 234 and a second key 236, respectively, from the keys 172 provided by the master key registers 142. In one embodiment, the bits of the fetch address 134 are applied to the multiplexers 212/214 to produce a particular combination of key pair 234/236. In the embodiment of FIG. 2, five master key registers 142 are provided, so there are 10 possible key pairs.
To simplify the hardware, only 8 of them are used; this design provides an effective key of 2048 bytes, as discussed in detail below. Other embodiments, however, may use a different number of master key registers 142. For example, in an embodiment with 12 master key registers 142, there are 66 possible combinations of master key registers 142; if 64 of them are used, the resulting effective key is 16384 bytes. In general, if the number of key values is K (for example, 5, with all combinations used) and the decryption key and each of the key values are W bytes long (for example, 16 bytes), the resulting effective key is $W^2 \cdot K! / (2 \cdot (K-2)!)$ bytes. Flow proceeds to block 306.
At block 306, based on bits [7:4] of the fetch address 134, the rotator 216 rotates the first key 234 by a corresponding number of bytes. For example, if bits [7:4] of the fetch address 134 have the value 9, the rotator 216 rotates the first key 234 to the right by 9 bytes. Flow proceeds to block 308.
At block 308, the adder/subtractor 218 adds the rotated first key 238 to, or subtracts it from, the second key 236 to generate the decryption key 174 of FIG. 1. In one embodiment, if bit [7] of the fetch address 134 is 1, the adder/subtractor 218 adds the rotated first key 234 to the second key 236; if bit [7] of the fetch address 134 is 0, the adder/subtractor 218 subtracts the rotated first key 234 from the second key 236. Flow proceeds to decision block 312.
At decision block 312, the multiplexer 154 determines from its control signal, which is the E bit 148 provided by the control register 144, whether the fetched block of instruction data 106 is from an encrypted program or a plain text program.
If the instruction data 106 is encrypted, flow proceeds to block 314; otherwise, flow proceeds to block 316.
At block 314, the multiplexer 154 selects the decryption key 174, and the XOR logic 156 performs a Boolean XOR of the encrypted instruction data 106 with the decryption key 174 to produce the plain text instruction data 162 of FIG. 1. Flow ends at block 314.
At block 316, the multiplexer 154 selects the 16-byte binary zero value 176, and the XOR logic 156 performs a Boolean XOR of the instruction data 106 (which is plain text) with the 16 bytes of binary zeros to produce the same plain text instruction data 162. Flow ends at block 316.
As described with reference to FIG. 2 and FIG. 3, the decryption key 174 that is XORed with the fetched block of instruction data 106 is a function of the selected master key pair 234/236 and the fetch address 134. This is quite different from conventional decryption procedures in which the decryption key is a function of previous key values, with the key being continually modified to supply the new key used in the next cycle. Deriving the decryption key 174 as a function of the master key pair 234/236 and the fetch address 134 has at least the following two advantages. First, as discussed above, fetching encrypted instruction data takes the same time as fetching plain text instruction data 106, so the number of clock cycles required by the microprocessor 100 does not increase. Second, the time required to fetch instruction data 106 does not increase when a branch instruction is encountered in the program.
In one embodiment, a branch predictor receives the fetch address 134, predicts whether a branch instruction is present in the block of instruction data 106 specified by the fetch address 134, and predicts its direction and target address. In the embodiment of FIG. 2, because the decryption key 174 is a function of the master key pair 234/236 and the fetch address 134, the appropriate decryption key 174 for the predicted target address is produced at the same time that the block of instruction data 106 at the target address arrives at the XOR logic 156. In contrast to conventional decryption key schemes, which require multiple "rewind" steps to compute the decryption key for the target address, the technique disclosed here incurs no additional delay when processing encrypted instruction data.
In addition, as shown in FIG. 2 and FIG. 3, the combination of the rotator 216 and the adder/subtractor 218 of the key expander 152 effectively extends the decryption key length beyond the length of the master keys. For example, the master keys together contribute 32 bytes (2 * 16 bytes); yet, from the point of view of an attacker trying to determine the decryption key 174, the rotator 216 and the adder/subtractor 218 effectively expand the 32 bytes of master keys in the master key registers 142 into a key sequence of 256 bytes. More specifically, byte n of the effectively expanded key sequence is:
$k_{0_n} \pm k_{1_{n+x}}$
where $k_{0_n}$ is byte n of the first master key 234 and $k_{1_{n+x}}$ is byte n+x of the second master key 236. As described above, the first eight 16-byte decryption keys 174 generated by the key expander 152 are produced by subtraction, and the last eight are produced by addition.
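A minimal Python sketch of the key derivation may help make the expansion concrete. It is an assumption-laden model, not the patented circuit: the register pairing follows Table 2, the rotate amount is taken as bits [7:4] plus one, and the operand order (byte n of the first key combined with a rotated byte of the second key) follows the Table 3 example, although the flowchart text describes rotating the first key. The names expand_key and PAIR_SELECT and the sample master keys are hypothetical.

```python
from typing import Sequence

W = 16  # key and block width in bytes

# Table 2: fetch address bits [10:8] -> (mux B register, mux A register)
PAIR_SELECT = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 3)]


def expand_key(master_keys: Sequence[bytes], fetch_address: int) -> bytes:
    """Sketch of key expander 152: derive a 16-byte key from the fetch address."""
    reg_b, reg_a = PAIR_SELECT[(fetch_address >> 8) & 0x7]
    first, second = master_keys[reg_a], master_keys[reg_b]  # keys 234 and 236
    block = (fetch_address >> 4) & 0xF       # bits [7:4]: which 16-byte block
    offset = (block + 1) % W                 # assumed incremented rotate amount
    add = bool(fetch_address & 0x80)         # bit [7]: set -> add, clear -> subtract
    out = bytearray(W)
    for n in range(W):
        other = second[(n + offset) % W]
        # Eight-bit arithmetic: results wrap modulo 256.
        out[n] = (first[n] + other) & 0xFF if add else (first[n] - other) & 0xFF
    return bytes(out)


# Hypothetical master key values for five registers.
master_keys = [bytes((31 * r + 7 * i + 1) & 0xFF for i in range(W)) for r in range(5)]
key_block0 = expand_key(master_keys, 0x000)
key_block15 = expand_key(master_keys, 0x0F0)
```

Because the key depends only on the master keys and the fetch address, the key for any block, including a predicted branch target, can be computed directly without stepping through earlier blocks.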
Specifically, the bytes provided by each key of the selected master key pair 234/236 are used to generate the decryption key 174 bytes for each byte of the instruction data in 16 consecutive 16-byte blocks, as shown in Table 3. For example, the notation "15-00" in row 1 of Table 3 indicates that the content of byte 0 of the second master key 236 is subtracted, by an eight-bit arithmetic operation, from byte 15 of the first master key 234 to obtain the one byte of effective decryption key 174 that is XORed with byte 15 of a 16-byte block of instruction data 106.

15-00 14-15 13-14 12-13 11-12 10-11 09-10 08-09 07-08 06-07 05-06 04-05 03-04 02-03 01-02 00-01
15-01 14-00 13-15 12-14 11-13 10-12 09-11 08-10 07-09 06-08 05-07 04-06 03-05 02-04 01-03 00-02
15-02 14-01 13-00 12-15 11-14 10-13 09-12 08-11 07-10 06-09 05-08 04-07 03-06 02-05 01-04 00-03
15-03 14-02 13-01 12-00 11-15 10-14 09-13 08-12 07-11 06-10 05-09 04-08 03-07 02-06 01-05 00-04
15-04 14-03 13-02 12-01 11-00 10-15 09-14 08-13 07-12 06-11 05-10 04-09 03-08 02-07 01-06 00-05
15-05 14-04 13-03 12-02 11-01 10-00 09-15 08-14 07-13 06-12 05-11 04-10 03-09 02-08 01-07 00-06
15-06 14-05 13-04 12-03 11-02 10-01 09-00 08-15 07-14 06-13 05-12 04-11 03-10 02-09 01-08 00-07
15-07 14-06 13-05 12-04 11-03 10-02 09-01 08-00 07-15 06-14 05-13 04-12 03-11 02-10 01-09 00-08
15+08 14+07 13+06 12+05 11+04 10+03 09+02 08+01 07+00 06+15 05+14 04+13 03+12 02+11 01+10 00+09
15+09 14+08 13+07 12+06 11+05 10+04 09+03 08+02 07+01 06+00 05+15 04+14 03+13 02+12 01+11 00+10
15+10 14+09 13+08 12+07 11+06 10+05 09+04 08+03 07+02 06+01 05+00 04+15 03+14 02+13 01+12 00+11
15+11 14+10 13+09 12+08 11+07 10+06 09+05 08+04 07+03 06+02 05+01 04+00 03+15 02+14 01+13 00+12
15+12 14+11 13+10 12+09 11+08 10+07 09+06 08+05 07+04 06+03 05+02 04+01 03+00 02+15 01+14 00+13
15+13 14+12 13+11 12+10 11+09 10+08 09+07 08+06 07+05 06+04 05+03 04+02 03+01 02+00 01+15 00+14
15+14 14+13 13+12 12+11 11+10 10+09 09+08 08+07 07+06 06+05 05+04 04+03 03+02 02+01 01+00 00+15
15+15 14+14 13+13 12+12 11+11 10+10 09+09 08+08 07+07 06+06 05+05 04+04 03+03 02+02 01+01 00+00
Table 3

Given appropriate master key values, the expanded keys generated by the key expander 152 are statistically effective at resisting common attacks on XOR encryption, including shifting an encrypted block of the file by the key length and XORing the encrypted blocks together, as discussed in more detail below. The effect of the key expander 152 on a selected master key pair 234/236 is that, in the described embodiment, two bytes of instruction data 106 in a program that are encrypted with exactly the same key can be as far as 256 bytes apart. In other embodiments with different block sizes of instruction data 106 and different master key lengths, the maximum distance between two bytes of instruction data 106 encrypted with the same key may differ.
The master key registers 142 and the multiplexers 212/214 of the key expander 152 that select the master key pair 234/236 also determine how far the effective key length is extended. As discussed above, the embodiment of FIG. 2 provides five master key registers 142, whose contents can therefore be combined in 10 ways, and the multiplexers 212/214 select eight of those 10 possible combinations. The 256-byte effective key length associated with each key pair 234/236, shown in Table 3, combined with the eight master key pair 234/236 combinations, yields an effective key length of 2048 bytes. That is, two bytes of instruction data 106 in a program that are encrypted with exactly the same key can be as far as 2048 bytes apart.
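The effective key lengths quoted above follow from simple arithmetic, sketched here in Python. The helper name effective_key_bytes is hypothetical; math.comb(K, 2) counts the possible register pairs, which equals K!/(2*(K-2)!).

```python
from math import comb


def effective_key_bytes(width: int, pairs_used: int) -> int:
    """Effective key length in bytes (hypothetical helper).

    Each key pair yields `width` different keys of `width` bytes each,
    that is width**2 bytes, and every pair used adds another such span.
    """
    return width * width * pairs_used


pairs_from_5 = comb(5, 2)     # 10 possible pairs from five registers
pairs_from_12 = comb(12, 2)   # 66 possible pairs from twelve registers

fig2_length = effective_key_bytes(16, 8)              # 2048: eight of the 10 pairs used
twelve_reg_length = effective_key_bytes(16, 64)       # 16384: 64 of the 66 pairs used
all_pairs_length = effective_key_bytes(16, pairs_from_5)  # 2560 if all 10 pairs were used
```

Using a power-of-two number of pairs (8 or 64) lets address bits select the pair directly, which is consistent with the stated aim of simplifying the hardware.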
To further illustrate the advantages provided by the key expander 152, common attacks on XOR encryption are briefly described here. If the key used for XOR encryption is shorter than the program instruction data being encrypted/decrypted, many bytes of the key must be reused, and the number of reused bytes depends on the length of the program. This weakness allows XOR instruction encryption to be broken. First, the attacker attempts to determine the length of the repeating key, which is n+1 in expressions (1) through (3) below. Second, the attacker assumes that each key-length block of the instruction data is encrypted with the same key. The data of two key-length blocks encrypted by conventional XOR encryption is as follows:
(1) $b_{0_n} \oplus k_n, \ldots, b_{0_1} \oplus k_1, b_{0_0} \oplus k_0$
(2) $b_{1_n} \oplus k_n, \ldots, b_{1_1} \oplus k_1, b_{1_0} \oplus k_0$
where $b_{0_n}$ is byte n of the data of the first key-length block being encrypted, $b_{1_n}$ is byte n of the data of the second key-length block being encrypted, and $k_n$ is byte n of the key. Third, the attacker XORs the two blocks together so that the key components cancel, leaving only the following:
(3) $b_{0_n} \oplus b_{1_n}, \ldots, b_{0_1} \oplus b_{1_1}, b_{0_0} \oplus b_{1_0}$
Finally, because each resulting byte is a function of only two plain text bytes, the attacker can apply statistical analysis of the frequency of occurrence of plain text content to try to derive the values of the plain text bytes.
However, the pattern of the encrypted instruction data 106 bytes computed as described with reference to FIG. 2 and FIG. 3 is as shown in expressions (4) and (5) below:
(4) bn 0 (5) 、 其中心〇標不所加密之第-16位元組區塊之指令數據的位 元組η’、標示所加密之第二16位元組區塊之指令數據的 CNTR2449100-TW/0608-A43129-TW/ Final 201203108 位元組η,&標示主密錄χ的位元組η,且&標示主密 X γ 鑰y的位元組η。如前述,主密鑰X與y為不同密鑰。假 定一種實施方式以五個主密鑰暫存器142提供八種主密鑰 對234/236組合,2048位元組序列中各位元組是與兩個獨 立的主密鑰位元組的一組合進行互斥運算。因此,當加密 數據以任何方式於256位元組的區塊中移位並且彼此作互 斥運算,所求得的位元組都會存在兩個主密鑰的複雜成 分,因此,不若說明(3)的内容,此處所得的運算結果不單 純只是純文字位元組。例如,假設駭客選擇使同一 256位 元組區塊中的16位元組區塊對齊並彼此進行互斥操作使 同樣的密鑰零位元組在各段中被使用,位元組〇之運算結 果如說明(6)所示,所獲得的位元組存在兩個主密鑰的複雜 組合: (6) b Ό0(4) bn 0 (5), the byte η' of the instruction data of the 16th byte block not encrypted by the center mark, and the instruction data indicating the encrypted second 16-bit block block CNTR2449100-TW/0608-A43129-TW/ Final 201203108 The byte η, & indicates the byte η of the primary cipher, and & indicates the byte η of the primary dense X γ key y. As mentioned above, the master keys X and y are different keys. Assume that one embodiment provides eight master key pairs 234/236 combinations in five master key registers 142, each of which is a combination of two independent master key bytes in a 2048 byte sequence Perform a mutually exclusive operation. Therefore, when the encrypted data is shifted in any way in the 256-bit block and mutually exclusive, the obtained byte has a complex component of the two master keys, and therefore, it is not explained ( 3) The content of the calculations obtained here is not simply a plain text byte. For example, suppose the hacker chooses to align 16-bit tuple blocks in the same 256-bit tuple block and mutually exclusive operations so that the same key zeros are used in each segment, the byte is The result of the operation is as shown in the description (6). The obtained byte has a complex combination of two master keys: (6) b Ό0
其中η不為1。 再者,若駭客換成將選自不同256位元組區塊内的16 位元組區塊對齊、且彼此作互斥運算,運算結果的位元組 0如說明(7)所示: f \ kr, ±k k〇 ±k\ ly CNTR2449100-TW/0608-A43129-TW/ Final 28 201203108 .其中主密鑰讀v中至少一者不同於主密餘x以及y。模 擬隨機主密鑰數值所產生之有效密鑰位元組之互斥運算,' 可發現運算結果L +灸) 口相4上 ^ ^ \ V 土免〜J王現相當平滑的分 布。 當然,若駭客選擇將不同的2048位元組長度區塊内的 16位元組區塊對齊、並且彼此進行互斥操作,駭客可能會 獲得與說明(3)類似的結果。然而,請參照以下内容。第一, 某些程式一例如,安全性相關程式—可能短於2〇48位元 組。第二,相距2048位元組的指令位元組之統計相關性 (statistical correlation)很可能非常小,導致很難破解。第 二,如前述内容,所述技術之實施方式可以較多數量實現 主密鑰暫存器142,使解密密鑰之有效長度擴展;例如, 以12個主岔錄暫存器142供應16384位元組長度的解密密 鑰,甚至其他更長的解密密鑰。第四,以下將討論的密鑰 下載指令500以及密鑰切換指令6〇〇更使程式設計師得以 載入新的數值至主密鑰暫存器142,以有效擴展密鑰長度 超過2048位元組,或者,如果必要,也可擴展密鑰長度至 程式的完整長度。 現在,參考第4圖,一方塊圖根據本發明技術圖解第1 圖的標誌暫存器128。根據第4圖所示之實施方式,標誌 暫存器128包括標準χ86暫存器的複數個位元4〇8;不過, 為了此處敘述的新功能’第4圖所示實施方式會動用χ86 木構中一般為預留(RESERVED)的一位元。特別說明之,才票 誌暫存器128包括一 E位元攔位402。E位元欄位402用 CNTR2449100-TW/0608-A43129-TW/ Final 29 201203108 於修復控制暫存ϋ 144的E位元148触,用以於加密以 及純文子程式間切換以及/或於不同加密程式間切換,以下 將詳細討論之。E位元攔位4〇2標示目前所執行的程式是 否有加密。若目前所執行的程式有加密,E位元襴位402 為設定狀態,否則,為清除狀態。當中斷事件發生,控制 權切換給其他程式(例如,中斷imerrupt、異常 exception 如頁錯》吳page fault、或任務切換仏认switch),儲存標諸暫 存益128。反之,若控制權重回先前因中斷事件中斷的程 式,則修復標誌暫存器128。微處理器1〇〇之設計會在標 誌暫存器128修復時以標誌暫存器128之E位元4〇2攔位 數值更新控制暫存器144之E位元148數值,以下將詳細 討論之。因此,若中斷事件發生時一加密程式正在執行(即 提取單元104處於解密模式),當控制權交還給該加密程式 時,以修復的E位元攔位402 位元148為設定狀態, 以修復提取單元104為解密模式。在一種實施方式中,E 位元148以及E位元攔位402為同一個具體硬體位元,因 此,儲存標誌暫存器128的E位元欄位4〇2中數值即是儲 存E位元148,且修復標誌暫存器128的E位元欄位4〇2 的數值即是修復E位元148。 參閱第5圖,一方塊圖圖解根據本發明技術所實現的 一密鑰下載指令500之格式。密鑰下載指令5〇〇包括一操 作碼(opcode)502攔位,特地標示其為微處理器ι〇〇指令集 内的密鑰下載指令500。在一種實施方式中,操作碼欄位 502數值為〇FA6/4(x86領域)。密鑰下載指令5〇〇包括兩個 運异元.一雄、鑰暫存器檔案目標位址504以及一安全存儲 CNTR2449100-TW/0608-A43129-TW/ Final 30 201203108 Ϊ=、:506。該安全存儲區來源位址506為安全存儲 案位址爾標示密鑰暫存|^幸H立址。密鎗暫存器檔 密鑰。在-種實施方式中若3八 * 效ρ显!! 
订錢載入指令_,則視之為無 有效安又;^此外,右*全存儲區來源位址慨數值位於 之外’則視之為-般保護異常。在-、右一程式試圖在微處理1 1〇〇不為最高權 , X86if0#p,/x86 ring 7 50二’則視之為無效指令異常。在某些狀況下,μ位元 組主密鑰之構成可能包括在加密指令的㈣數據字段内。 所述即時數據可被-塊-塊移至安全存舰122、组成⑹立 元組的密餘。 現在’參閱第6圖’ -方塊圖圖解根據本發明技術所 實現的-密仙換指令_之格式。密仙換指令_包 括一操作碼602攔位,特地其為微處理器1〇〇指令集内的 密鑰切換指令600。密鑰切換指令600更包括一密鑰暫存 器檔案索引攔位604,標示密鑰暫存器檔案124 一序列暫 存器中的開端,以自此將密鑰載入主密鑰暫存器142。在 一種實施方式中,若一程式嘗試在微處理器1〇〇不為安全 操作模式時執行一密鑰切換指令600,則視之為無效指令 異帛。在一種貫施方式中,若一程式意圖在微處理器1〇〇 不為最高權限級別(例如,x86環〇權限)時執行一密錄切換 指令600,則視之為無效指令異常。在一種實施方式中, CNTR2449100-TW/0608-A43129-TW/ Final 31 201203108 密鑰切換指令600為原子操作型式(at〇mic),即不可中斷; 此處所相,用於載人密鑰至主⑽暫存器142的其他指 令也是如此—例如’以下將討論的分支與切換密鍮指令。 現在’參閱第7圖,一流程圖圖解第1圖之微處理器 1〇〇之操作,其中,根據本發明技術執行第6圖介紹的密 錄切換指令600。流程始於方塊7〇2。 在方塊702解碼單元1〇8將一密鑰切換指令刪解 碼,且將解碼結果代人微代料元132㈣現密錄切換指 令600的微代碼程序。流程接著進入方塊7〇4。 在方塊704 ’微代碼會根據密鑰暫存器槽案索引爛位 604自—密鍮暫存器檔案124下載主密餘暫存器⑷的内容。 較佳實施方式是:微代碼以料暫存諸案索引欄位刪 所標示的密錄暫存器為起始,自⑽暫存器檑案124下載 連續的η個暫存器内容作為n個密鑰存入主密鑰暫存器 142’其中η為主密鑰暫存器142的總數。在—種實施方式 中,數值η可標示於密鑰切換指令6〇〇的一額外空間,設 定為少於主密錄暫存器M2的總數。流裎接著=入方塊 706。 在方塊706’微代碼使微處理器1〇〇分支至接續的χ86 指令(即該密鑰切換指令600之後的指令),將導致微處理 器100中較密鑰切換指令600新的所有χ86指令被清空, 致使微處理器100内、較切換至接續χ86指令的微操作新 的所有微操作被清空。上述被清空的指令包括自指令快取 記憶體102提取出、緩衝暫存於提取單元1〇4以及解碼單 元108内等待解密與解碼的所有指令位元組1〇6。流程接 CNTR2449100-TW/0608-A43129-TW/ Final 32 201203108 著進入方塊708。 刀支至接續指令的操作,提 載入主密鑰暫存器142的新 102提取並且解密指令數據 在方塊708,基於方塊706 取單元104開始利用方塊7〇4 一組密鑰值自指令快取記憶體 106。流程結束於方塊7〇8。 如第7圖所示,密鑰切換指令㈣令正在執行中的加 密程式在自指令快取記憶體1G2提取出來的同時得以改變 主密鑰暫存H 142内所儲存、供解密該加密程式使用的内 容。所述主密鑰暫存ϋ 142動_整技術使得加密該程式 的有效密錄長度超越提取單元1G4先天支援的長度(例如, 第2圖實施方式所提供的綱位元組);如第8圖所示程 式,若將之以第1圖微處理器1〇〇操作,駭客會更不易攻 破電腦系統的安全防護。 現在’參閱第8圖,_方塊圖圖解根據本發明技術所 貫現的一加密程式的一記憶體用量(mem〇ry fo〇tprint)800,其中採用第6圖所示之密鑰切換指令6〇〇。 第8圖所示之加密程式記憶體用4 8⑼包括連續數「塊 chunk」指令數據位元組。每一「塊」的内容為一序列多個 才曰令數據位元組(其中為預先加密的數據),且屬於同一 「塊」的指令數據位7〇組是由同樣的一套主密鑰暫存器142 數值解密。因此,不同兩「塊」的界線是由密鑰切換指令 600定義。也就是說,各「塊」的上、下界是由密输切換 指令600之位置區分(或者,以一程式的第一「塊」為例, 其上界為邊程式的起始處;此外,以該程式的最後一「塊」 為例,其下界為該程式的結束處)。因此,各「塊」指令數 CNTR2449100-TW/0608-A43129-TW/ Final ,, 201203108 據位元組是由提取單元1〇4基於不同套主密鑰暫存器l42 數值解密,意即各「塊」指令數據位元組的解密是根據前 一「塊」所供應的一密鑰切換指令600所載入主密鑰暫存 
器142數值。加密一程式的後處理器(post-processor)會知曉 各密鑰切換指令600所在之記憶體位址,並且會利用此資 §fl—即提取位址的相關位址位元—配合密鑰切換指令6〇〇 密鑰數值產生加密密鑰位元組,以加密該程式。一些目的 檔格式(object file f0rmat)允許程式設計者標示程式載入記 憶體何處,或至少載明特定大小的對齊形式(例如,頁面邊 界page boundary),以提供足夠的位址資訊加密該程式。此 外,些作業糸統預設值是將程式載入頁面邊界上。 密鑰切換指令600可安置於程式的任何地方。然而, 若密鑰切換指令600載入特定值至主密鑰暫存器142供下 一「塊」指令數據位元組解密使用、且密鑰切換指令6〇〇(或 甚至密鑰載入指令500)之位置導致每一「塊」之長度短於、 或等於提取單元1〇4所能應付的有效密鑰長度(例如,第2 圖貫施方式所揭露的2048位元組)’則程式可被以有效長 度等同整體程式長度的密鑰加密,此為相當強健的加密方 式。此外,即使密鑰切換指令6〇〇的使用使得有效密鑰長 度仍短於加密程式的長度(即,同樣一套主密鑰暫存器142 數值被用於加密一程式的多個「塊」),改變「塊」尺寸(例 如,不限定全為2G48位元組)可增加駭客破解系統的困難 度,因為,駭客必須先判斷以同一套主密鑰暫存器142數 值加密的「塊」位於何處,並且必須判斷該些長度不一的 「塊」各自的尺寸。 CNTR2449100-TW/0608-A43 丨 29-T W/ Final 34 201203108 值得注意的是,以密鑰切換指令_ 切換耗費相當大量的時脈數目,主要θ 、現的動態密鑰 空。此外’在一種實施方式中, =因為管線必須清 以微代石馬(microcede)實現,通常在=換指令_主要是 慢。因此,程式碼開發二指令 響,在執行速度以及特定應用之安全性^對效能的影 點。 里之間尋求平衡 現在’參閱第9圖,一方塊圖圖解根 現的一分支與切換密鑰指令900的格式。本發明技術實 與切換密鑰指令900的必要性。。"。首先敘述該分支 根據以上實施例所揭露内容,加 ⑽提取的各個16位元組區塊的指令提取單元Where η is not 1. Furthermore, if the hacker is replaced by aligning 16-bit tuple blocks selected from different 256-bit tuple blocks and mutually exclusive operations, the byte 0 of the operation result is as shown in the description (7): f \ kr, ±kk〇±k\ ly CNTR2449100-TW/0608-A43129-TW/ Final 28 201203108 . wherein at least one of the master key reads v is different from the primary secret x and y. The mutual exclusion operation of the effective key byte generated by the simulation of the random master key value, 'can find the operation result L + moxibustion' on the mouth 4 ^ ^ \ V soil free ~ J king is now a fairly smooth distribution. Of course, if the hacker chooses to align the 16-bit tuple blocks in different 2048-byte length blocks and mutually exclusive operations, the hacker may obtain similar results as the description (3). However, please refer to the following. First, some programs, such as security-related programs, may be shorter than 2〇48 bytes. 
Second, the statistical correlation of the instruction bits of the 2048-bit tuple is likely to be very small, making it difficult to crack. Second, as described above, the implementation of the technology can implement the master key register 142 in a larger number, so that the effective length of the decryption key is extended; for example, 16384 bits are supplied by the 12 main buffer registers 142. The decryption key of the tuple length, and even other longer decryption keys. Fourth, the key download instruction 500 and the key switch instruction 6 discussed below enable the programmer to load new values into the master key register 142 to effectively extend the key length by more than 2048 bits. Group, or, if necessary, extend the key length to the full length of the program. Referring now to Figure 4, a block diagram illustrates the flag register 128 of Figure 1 in accordance with the teachings of the present invention. According to the embodiment shown in FIG. 4, the flag register 128 includes a plurality of bits 4〇8 of the standard χ86 register; however, for the new function described herein, the embodiment shown in FIG. 4 will use χ86. The wood structure is generally a one-bit reserved (RESERVED). Specifically, the ticket register 128 includes an E bit block 402. The E-bit field 402 uses CNTR2449100-TW/0608-A43129-TW/ Final 29 201203108 in the E-bit 148 of the repair control temporary storage 144 for switching between encryption and plain text subprograms and/or different encryption. Switch between programs, which will be discussed in detail below. The E bit block 4〇2 indicates whether the currently executed program is encrypted. If the currently executed program is encrypted, E bit clamp 402 is set, otherwise it is cleared. When an interrupt event occurs, control is transferred to another program (for example, interrupt imerrupt, exception exception such as page fault, page fault, or task switch), storing the temporary benefit 128. 
Conversely, when control returns to the program that was interrupted by the interrupting event, the flags register 128 is restored. The microprocessor 100 is designed to update the E bit 148 of the control register 144 with the value of the E bit field 402 when the flags register 128 is restored, as discussed in detail below. Thus, if an encrypted program was executing when the interrupting event occurred (that is, the fetch unit 104 was in decryption mode), then when control returns to the encrypted program the restored E bit field 402 sets the E bit 148, which restores the fetch unit 104 to decryption mode. In one embodiment, the E bit 148 and the E bit field 402 are the same physical hardware bit, so that saving the value of the E bit field 402 of the flags register 128 saves the E bit 148, and restoring the value of the E bit field 402 restores the E bit 148.

Referring to Fig. 5, a block diagram illustrates the format of a key load instruction 500 according to the present invention. The key load instruction 500 includes an opcode field 502 that uniquely identifies it as the key load instruction 500 within the instruction set of the microprocessor 100. In one embodiment, the value of the opcode field 502 is 0FA6/4 (in x86 terms). The key load instruction 500 includes two operands: a key register file destination address 504 and a secure memory source address 506. The secure memory source address 506 is an address within the secure memory area 122 from which the key is read. The key register file destination address 504 specifies the register in the key register file 124 into which the key is loaded. In one embodiment, if a program attempts to execute a key load instruction 500 when the microprocessor 100 is not in secure execution mode, an invalid instruction exception is taken. In addition, if the value of the secure memory source address 506 lies outside the valid secure memory area 122, a general protection exception is taken.
In one embodiment, if a program attempts to execute a key load instruction 500 when the microprocessor 100 is not at the highest privilege level (for example, x86 ring 0), an invalid instruction exception is taken. In some cases, the constituent parts of a 16-byte master key may be included in the immediate data fields of encrypted instructions. The immediate data may be moved piece by piece into the secure memory area 122 to assemble the 16-byte key.

Referring now to Fig. 6, a block diagram illustrates the format of a key switch instruction 600 according to the present invention. The key switch instruction 600 includes an opcode field 602 that uniquely identifies it as the key switch instruction 600 within the instruction set of the microprocessor 100. The key switch instruction 600 also includes a key register file index field 604 that specifies the start of a sequence of registers in the key register file 124 from which keys are loaded into the master key registers 142. In one embodiment, if a program attempts to execute a key switch instruction 600 when the microprocessor 100 is not in secure execution mode, an invalid instruction exception is taken. In one embodiment, if a program attempts to execute a key switch instruction 600 when the microprocessor 100 is not at the highest privilege level (for example, x86 ring 0), an invalid instruction exception is taken. In one embodiment, the key switch instruction 600 is atomic, that is, non-interruptible; the same is true of the other instructions described herein that load keys into the master key registers 142, such as the branch and switch key instructions discussed below.

Referring now to Fig. 7, a flowchart illustrates operation of the microprocessor 100 of Fig. 1 to execute the key switch instruction 600 of Fig. 6 according to the present invention. Flow begins at block 702.
At block 702, the decode unit 108 decodes a key switch instruction 600 and traps to the microcode routine in the microcode unit 132 that implements the key switch instruction 600. Flow proceeds to block 704.

At block 704, the microcode loads the master key registers 142 from the key register file 124 according to the key register file index field 604. Preferably, the microcode loads n keys into the master key registers 142 from n consecutive registers of the key register file 124, starting with the key register specified by the key register file index field 604, where n is the total number of master key registers 142. In one embodiment, the value n may be specified in an additional field of the key switch instruction 600 to be less than the total number of master key registers 142. Flow proceeds to block 706.

At block 706, the microcode causes the microprocessor 100 to branch to the next sequential x86 instruction (that is, the instruction following the key switch instruction 600). This causes all x86 instructions in the microprocessor 100 that are newer than the key switch instruction 600 to be flushed, and all micro-operations in the microprocessor 100 that are newer than the micro-operation that branches to the next sequential x86 instruction to be flushed. The flushed instructions include all instruction bytes 106 that were fetched from the instruction cache 102 and are buffered in the fetch unit 104 and the decode unit 108 awaiting decryption and decoding. Flow proceeds to block 708.

At block 708, in response to the branch to the next sequential instruction at block 706, the fetch unit 104 begins fetching and decrypting instruction data 106 from the instruction cache 102 using the new set of key values loaded into the master key registers 142 at block 704. Flow ends at block 708.

As shown in Fig. 7, the key switch instruction 600 enables an executing encrypted program to change the contents of the master key registers 142 that are used to decrypt the program as it is fetched from the instruction cache 102. This dynamic adjustment of the master key registers 142 allows the effective key length used to encrypt the program to exceed the length natively supported by the fetch unit 104 (for example, the 2048 bytes provided by the embodiment of Fig. 2). Consequently, when a program such as the one shown in Fig. 8 runs on the microprocessor 100 of Fig. 1, it is more difficult for an attacker to defeat the security of the computer system.

Referring now to Fig. 8, a block diagram illustrates the memory footprint 800 of an encrypted program that employs the key switch instruction 600 of Fig. 6 according to the present invention. The memory footprint 800 of the encrypted program includes consecutive "chunks" of instruction data bytes. Each chunk is a sequence of instruction data bytes (previously encrypted), and the instruction data bytes that belong to the same chunk are decrypted with the same set of master key register 142 values. The boundary between two different chunks is therefore defined by a key switch instruction 600. That is, the upper and lower boundaries of each chunk are delimited by the locations of key switch instructions 600 (or, for the first chunk of a program, the upper boundary is the start of the program, and for the last chunk of the program, the lower boundary is the end of the program).
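The chunk boundaries described above can be sketched as a simple partition of the program's address range. This is a minimal illustration, not part of the described hardware: `chunk_ranges`, `chunk_index`, and the list of `switch_points` (addresses at which a newly loaded key set takes effect) are hypothetical names, and the sample addresses are made up.

```python
from bisect import bisect_right
from typing import List, Tuple

EFFECTIVE_KEY_LENGTH = 2048  # bytes, the figure quoted for the Fig. 2 embodiment


def chunk_ranges(program_len: int, switch_points: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges of the chunks; end is exclusive.

    switch_points is assumed to be a sorted list of addresses at which a new
    set of master key values takes effect.
    """
    edges = [0, *switch_points, program_len]
    return list(zip(edges[:-1], edges[1:]))


def chunk_index(address: int, switch_points: List[int]) -> int:
    """Index of the chunk (and therefore of the key set) covering an address."""
    return bisect_right(switch_points, address)


switch_points = [1800, 3800]
ranges = chunk_ranges(5000, switch_points)

# Every chunk in this example is no longer than the effective key length.
fits = all(end - start <= EFFECTIVE_KEY_LENGTH for start, end in ranges)
```

In this example the program is split into chunks of 1800, 2000, and 1200 bytes, each of which would be decrypted with its own set of master key values.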
Thus, each chunk of instruction data bytes is decrypted by the fetch unit 104 using a different set of master key register 142 values; that is, each chunk is decrypted using the master key register 142 values loaded by a key switch instruction 600 located in the preceding chunk. The post-processor that encrypts a program knows the memory address at which each key switch instruction 600 resides and uses this information, namely the relevant bits of the fetch address, together with the key values of the key switch instructions 600 to generate the encryption key bytes with which to encrypt the program. Some object file formats allow the programmer to specify where in memory the program is loaded, or at least to specify an alignment of a particular size (for example, a page boundary), which provides sufficient address information to encrypt the program. In addition, some operating systems load programs on page boundaries by default.

The key switch instruction 600 may be placed anywhere in the program. However, if each key switch instruction 600 loads specific values into the master key registers 142 for decrypting the next chunk of instruction data bytes, and the key switch instructions 600 (or even key load instructions 500) are placed so that the length of each chunk is shorter than or equal to the effective key length that the fetch unit 104 supports (for example, the 2048 bytes of the embodiment of Fig. 2), then the program can be encrypted with a key whose effective length equals the length of the entire program, which is a very strong form of encryption. Furthermore, even if the key switch instructions 600 are used such that the effective key length is still shorter than the length of the encrypted program (that is, the same set of master key register 142 values is used to encrypt multiple chunks of a program), varying the chunk size (for example, not making every chunk 2048 bytes) increases the difficulty for an attacker, because the attacker must first determine where the chunks encrypted with the same set of master key register 142 values are located and must also determine the size of each of the variable-length chunks.

It should be noted that dynamic key switching by means of the key switch instruction 600 consumes a relatively large number of clock cycles, mainly because the pipeline must be flushed. In addition, in one embodiment the key switch instruction 600 is implemented primarily in microcode, which is generally slower than instructions that are not implemented in microcode. Code developers should therefore consider the performance impact of key switch instructions 600 and seek a balance between execution speed and security for the particular application.

Referring now to Fig. 9, a block diagram illustrates the format of a branch and switch key instruction 900 according to the present invention. The need for the branch and switch key instruction 900 is described first.
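According to the embodiments described above, each 16-byte block of instruction data of an encrypted program that the fetch unit 104 fetches has been encrypted (that is, XORed) with an encryption key equal to the 16-byte decryption key 174 that the fetch unit 104 uses to decrypt (XOR) each fetched block of instruction data 106. As described above, the byte values of the decryption key 174 are computed by the fetch unit 104 from two inputs: the master key byte values stored in the master key registers 142, and certain bits of the fetch address 134 of the 16-byte block of instruction data 106 being fetched (as in the embodiment of Fig. 2). Therefore, a post-processor that encrypts a program for execution by the microprocessor 100 knows the master key byte values that will be stored in the master key registers 142, and knows an address (or, more precisely, the relevant bits of the address) that indicates where the encrypted program will be loaded into memory and from which the microprocessor 100 will fetch the successive blocks of instruction data of the encrypted program. From this information the post-processor can generate the appropriate decryption key 174 values with which to encrypt each 16-byte block of instruction data of the program.

As discussed above, when a branch instruction is predicted and/or executed, the fetch unit 104 updates the fetch address 134 with the branch target address. As long as the encrypted program never changes (by means of a key switch instruction 600) the master key values stored in the master key registers 142, branch instructions are handled transparently by the fetch unit 104. That is, the fetch unit 104 uses the same master key register 142 values to compute the decryption key 174 that decrypts the block of instruction data 106 containing the branch instruction and the decryption key 174 that decrypts the instructions in the block of instruction data 106 at the target address of the branch instruction. However, the ability of a program to change the master key register 142 values (by means of a key switch instruction 600) means that the fetch unit 104 could use one set of master key register 142 values to compute the decryption key 174 that decrypts the block of instruction data 106 containing the branch instruction, and a different set of master key register 142 values to compute the decryption key 174 that decrypts the instructions in the block of instruction data 106 at the target address of the branch instruction. One way to solve this problem is to restrict branch target addresses to the same chunk of the program. Another solution is the branch and switch key instruction 900 of Fig. 9.

Referring again to Fig. 9, a block diagram illustrates the format of a branch and switch key instruction 900 according to the present invention. The branch and switch key instruction 900 includes an opcode field 902 that uniquely identifies it as the branch and switch key instruction 900 within the instruction set of the microprocessor 100. The branch and switch key instruction 900 also includes a key register file index field 904 that specifies the start of a sequence of registers in the key register file 124 from which keys are loaded into the master key registers 142. The branch and switch key instruction 900 also includes a branch information field 906 that holds information typical of branch instructions, such as information for computing the target address and the branch condition. In one embodiment, if a program attempts to execute a branch and switch key instruction 900 when the microprocessor 100 is not in secure execution mode, an invalid instruction exception is taken. In one embodiment, if a program attempts to execute a branch and switch key instruction 900 when the microprocessor 100 is not at the highest privilege level (for example, x86 ring 0), an invalid instruction exception is taken. In one embodiment, the branch and switch key instruction 900 is atomic.

Referring to Fig. 10, a flowchart illustrates operation of the microprocessor 100 of Fig. 1 to execute the branch and switch key instruction 900 of Fig. 9 according to the present invention. Flow begins at block 1002.

At block 1002, the decode unit 108 decodes a branch and switch key instruction 900 and traps to the microcode routine in the microcode unit 132 that implements the branch and switch key instruction 900. Flow proceeds to block 1006.

At block 1006, the microcode resolves the branch direction (taken or not taken) and the target address. Note that for an unconditional branch instruction the direction is always taken. Flow proceeds to decision block 1008.

At decision block 1008, the microcode determines whether the direction resolved at block 1006 is taken. If so, flow proceeds to block 1014; otherwise, flow proceeds to block 1012.

At block 1012, the microcode does not switch keys or jump to the target address, because the branch is not taken. Flow ends at block 1012.

At block 1014, the microcode loads keys from the key register file 124 into the master key registers 142 according to the key register file index field 904. Preferably, the microcode loads the n keys held in n adjacent registers of the key register file 124, starting at the location specified by the key register file index field 904, into the master key registers 142, where n is the total number of master key registers 142. In one embodiment, the value n may be specified in an additional field of the branch and switch key instruction 900 to be less than the total number of master key registers 142. Flow proceeds to block 1016.

At block 1016, the microcode causes the microprocessor 100 to jump to the target address resolved at block 1006. This causes all x86 instructions in the microprocessor 100 that are newer than the branch and switch key instruction 900 to be flushed, and all micro-operations in the microprocessor 100 that are newer than the micro-operation that branches to the target address to be flushed. The flushed instructions include all instruction bytes 106 that were fetched from the instruction cache 102 and are buffered in the fetch unit 104 and the decode unit 108 awaiting decryption and decoding. Flow proceeds to block 1018.

At block 1018, in response to the branch to the target address at block 1016, the fetch unit 104 begins fetching and decrypting instruction data 106 from the instruction cache 102 using the new set of key values loaded into the master key registers 142 at block 1014. Flow ends at block 1018.

Referring now to Fig. 11, a flowchart illustrates operation of a post-processor according to the present invention. The post-processor is a software tool that may be used to post-process a program and encrypt it for execution by the microprocessor 100 of Fig. 1. Flow begins at block 1102.

At block 1102, the post-processor receives an object file of a program. According to one embodiment, the target addresses of the branch instructions in the object file can be determined before the program executes; an example is a branch instruction that specifies a fixed target address. Another form of branch instruction whose target address is determined before the program runs is a relative branch instruction, which includes an offset that is added to the memory address of the branch instruction to compute the branch target address. In contrast, an example of a branch instruction whose target address is not determined before the program executes is one whose target address is computed from operands stored in registers or memory, so that its value may change during program execution. Flow proceeds to block 1104.

At block 1104, the post-processor replaces each inter-chunk branch instruction with a branch and switch key instruction 900 whose key register file index field 904 holds an appropriate value based on the chunk in which the target address of the branch instruction resides. As described with respect to Fig. 8, a chunk is a sequence of instruction data bytes that is decrypted with the same set of master key register 142 values. The target address of an inter-chunk branch instruction therefore lies in a chunk different from the chunk that contains the branch instruction itself. Note that intra-chunk branches, that is, branch instructions whose target address lies in the same chunk as the branch instruction, need not be replaced. Note also that the programmer and/or compiler that generates the source file from which the object file is produced may explicitly include branch and switch key instructions 900 as needed, which reduces the replacement work of the post-processor. Flow proceeds to block 1106.

At block 1106, the post-processor encrypts the program. The post-processor knows the memory location of each chunk and the master key register 142 values, and uses them to encrypt the program. Flow ends at block 1106.

Referring now to Fig. 12, a block diagram illustrates the format of a branch and switch key instruction 1200 according to an alternate embodiment of the present invention. The branch and switch key instruction 1200 of Fig. 12 is suited to branches whose target address is unknown before the program executes, as discussed in detail below. The branch and switch key instruction 1200 includes an opcode field 1202 that uniquely identifies it as the branch and switch key instruction 1200 within the instruction set of the microprocessor 100. The branch and switch key instruction 1200 also includes a branch information field 906, which serves a function similar to that of the same field in the branch and switch key instruction 900 of Fig. 9. In one embodiment, if a program attempts to execute a branch and switch key instruction 1200 when the microprocessor 100 is not in secure execution mode, an invalid instruction exception is taken. In one embodiment, if a program attempts to execute a branch and switch key instruction 1200 when the microprocessor 100 is not at the highest privilege level (for example, x86 ring 0), an invalid instruction exception is taken. In one embodiment, the branch and switch key instruction 1200 is atomic.

Referring now to Fig. 13, a block diagram illustrates a chunk address range table 1300 according to the present invention. The table 1300 includes a plurality of entries. Each entry is associated with a chunk of the encrypted program. Each entry includes an address range field 1302 and a key register file index field 1304. The address range field 1302 specifies the memory address range of the corresponding chunk. The key register file index field 1304 specifies the register in the key register file 124 whose stored key values the branch and switch key instruction 1200 loads into the master key registers 142 for the fetch unit 104 to use in decrypting the chunk. As discussed below with reference to Fig. 18, the table 1300 is loaded into the microprocessor 100 before a branch and switch key instruction 1200 that needs to access the table 1300 is executed.

Referring now to Fig. 14, a flowchart illustrates operation of the microprocessor 100 of Fig. 1 to execute the branch and switch key instruction 1200 of Fig. 12 according to the present invention. Flow begins at block 1402.

At block 1402, the decode unit 108 decodes a branch and switch key instruction 1200 and traps to the microcode routine in the microcode unit 132 that implements the branch and switch key instruction 1200. Flow proceeds to block 1406.

At block 1406, the microcode resolves the branch direction (taken or not taken) and determines the target address. Flow proceeds to decision block 1408.

At decision block 1408, the microcode determines whether the branch direction resolved at block 1406 is taken. If so, flow proceeds to block 1414; otherwise, flow proceeds to block 1412.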
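The chunk address range table 1300 of Fig. 13, which is consulted when a branch of this kind is taken, can be pictured as a small lookup structure. The following is a minimal sketch under stated assumptions rather than the microcode itself: `ChunkEntry`, `lookup_key_index`, and `load_master_keys` are hypothetical names, each address range is assumed to be stored as an inclusive start and an exclusive end, and raising an error when no entry matches is an assumption, since the text does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChunkEntry:
    start: int      # first address of the chunk (inclusive)
    end: int        # one past the last address of the chunk (assumed layout)
    key_index: int  # starting register in the key register file


def lookup_key_index(table: List[ChunkEntry], target: int) -> int:
    """Find the key register file index for the chunk containing target."""
    for entry in table:
        if entry.start <= target < entry.end:
            return entry.key_index
    # Assumption: the behavior on a miss is not specified in the text.
    raise LookupError(f"no chunk covers address {target:#x}")


def load_master_keys(key_register_file: List[bytes], index: int, n: int) -> List[bytes]:
    """Copy n keys from consecutive key register file entries starting at index."""
    return key_register_file[index:index + n]


# Hypothetical two-chunk program and a 12-entry key register file.
table = [ChunkEntry(0x1000, 0x1800, 0), ChunkEntry(0x1800, 0x2000, 5)]
key_register_file = [bytes([i]) * 16 for i in range(12)]

index = lookup_key_index(table, 0x1804)  # branch target in the second chunk
master_keys = load_master_keys(key_register_file, index, n=5)
```

Here a branch to address 0x1804 selects the second entry, so five keys are copied from key register file entries 5 through 9 before fetching resumes at the target.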
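At block 1412, the microcode does not switch keys or jump to the target address, because the branch is not taken. Flow ends at block 1412.

At block 1414, the microcode looks up the target address resolved at block 1406 in the table 1300 of Fig. 13 to obtain the contents of the key register file index field 1304 for the chunk in which the target address resides. The microcode then loads key values from the key register file 124 into the master key registers 142 based on the index held in the key register file index field 1304. Preferably, the microcode loads the n key values held in n adjacent registers of the key register file 124, starting at the index held in the key register file index field 1304, into the master key registers 142, where n is the total number of master key registers 142. In one embodiment, the value n may be specified in an additional field of the branch and switch key instruction 1200 to be less than the total number of master key registers 142. Flow proceeds to block 1416.

At block 1416, the microcode causes the microprocessor 100 to branch to the target address resolved at block 1406. This causes all x86 instructions in the microprocessor 100 that are newer than the branch and switch key instruction 1200 to be flushed, and all micro-operations in the microprocessor 100 that are newer than the micro-operation that branches to the target address to be flushed. The flushed instructions include all instruction bytes 106 that were fetched from the instruction cache 102 and are buffered in the fetch unit 104 and the decode unit 108 awaiting decryption and decoding. Flow proceeds to block 1418.

At block 1418, in response to the branch to the target address at block 1416, the fetch unit 104 begins fetching and decrypting instruction data 106 from the instruction cache 102 using the new set of key values loaded into the master key registers 142 at block 1414. Flow ends at block 1418.

Referring now to Fig. 15, a block diagram illustrates the format of a branch and switch key instruction 1500 according to another alternate embodiment of the present invention. The branch and switch key instruction 1500 of Fig. 15 and its operation are similar to the branch and switch key instruction 1200 of Fig. 12. However, instead of loading keys into the master key registers 142 from the key register file 124, the branch and switch key instruction 1500 loads keys into the master key registers 142 from the secure memory area 122, as discussed below.

Referring now to Fig. 16, a block diagram illustrates a chunk address range table 1600 according to the present invention. The table 1600 of Fig. 16 is similar to the table 1300 of Fig. 13. However, instead of a key register file index field 1304, the table 1600 includes a secure memory address field 1604. The secure memory address field 1604 specifies an address within the secure memory area 122 that holds the key values that the branch and switch key instruction 1500 must load into the master key registers 142 for the fetch unit 104 to use in decrypting the chunk. As discussed below with reference to Fig. 18, the table 1600 is loaded into the microprocessor 100 before a branch and switch key instruction 1500 that needs to consult the table 1600 is executed. In one embodiment, the lower bits of the secure memory area 122 address need not be stored in the secure memory address field 1604, particularly because the location that stores a set of keys in the secure memory area 122 is relatively large (for example, 16 bytes x 5) and the set of keys may be aligned on a boundary of a set size.

Referring now to Fig. 17, a flowchart illustrates operation of the microprocessor 100 of Fig. 1 to execute the branch and switch key instruction 1500 of Fig. 15 according to the present invention. Flow begins at block 1702. Many blocks of the flowchart of Fig. 17 are similar to blocks of Fig. 14 and are numbered the same. However, block 1414 is replaced by block 1714, in which the microcode looks up the target address resolved at block 1406 in the table 1600 of Fig. 16 to obtain the value of the secure memory address field 1604 for the chunk in which the target address resides. The microcode then loads key values from the secure memory area 122 into the master key registers 142 based on the value of the secure memory address field 1604. Preferably, the microcode loads the n key values held in n adjacent 16-byte locations of the secure memory area 122, starting at the address given by the secure memory address field 1604, into the master key registers 142, where n is the total number of master key registers 142. In one embodiment, the value n may be specified in an additional field of the branch and switch key instruction 1500 to be less than the total number of master key registers 142.

Referring now to Fig. 18, a flowchart illustrates operation of a post-processor according to an alternate embodiment of the present invention. The post-processor may be used to post-process a program and encrypt it for execution by the microprocessor 100 of Fig. 1. Flow begins at block 1802.

At block 1802, the post-processor receives an object file of a program. According to one embodiment, the branch instructions in the object file may be ones whose target address can be determined before the program executes or ones whose target address cannot be determined before the program executes. Flow proceeds to block 1803.

At block 1803, the post-processor builds the chunk address range table 1300 or 1600 of Fig. 13 or Fig. 16 for inclusion in the object file. In one embodiment, the operating system loads the table 1300/1600 into the microprocessor 100 before loading and executing the encrypted program, so that the branch and switch key instructions 1200/1500 can access it. In one embodiment, the post-processor inserts instructions into the program that load the table 1300/1600 into the microprocessor 100 before any branch and switch key instruction 1200/1500 is executed. Flow proceeds to block 1804.

At block 1804, similar to the operation discussed above with respect to block 1104 of Fig. 11, the post-processor replaces each inter-chunk branch instruction whose target address can be determined before execution with a branch and switch key instruction 900 of Fig. 9 whose key register file index field 904 holds an appropriate value based on the chunk in which the target address of the branch instruction resides. Flow proceeds to block 1805.

At block 1805, the post-processor replaces each branch instruction whose target address is determined only during execution with a branch and switch key instruction 1200 or 1500 of Fig. 12 or Fig. 15, according to the type of table (1300/1600) built at block 1803. Flow proceeds to block 1806.

At block 1806, the post-processor encrypts the program. The post-processor knows the memory location of each chunk and the master key register 142 values, and uses them to encrypt the program. Flow ends at block 1806.

Referring now to Fig. 19, a flowchart illustrates operation of the microprocessor 100 of Fig. 1 to handle task switches between encrypted programs and plain text programs according to the present invention. Flow begins at block 1902.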
That is, mention: single two arithmetic decryption surface 174, secret two = ^ ^ instruction number 1〇6, and the instruction in the instruction data 106 is decrypted by the branch. However, the program switching instruction 6GG) the ability of the master key to temporarily store the value of 11142 is not limited to a set of master keys. The value 撼"m/, money 174 of the memory 142 decrypts the "block instruction data 襄" of the branch instruction and estimates the value of the different privileged register 142 with different sets of keys (4) cipher m decrypts the branch instruction The instruction in the data block 106 of the block indicated by the target address. One way to solve this problem is to limit the branch target address to the same block - "block". Another solution is to use the 9th figure. The disclosed branch and switch key instruction_. Referring again to Figure 9, a block diagram illustrates the format of the branch and switch secret command 9(9) implemented in accordance with the teachings of the present invention. Branch and switch secret entry 7 900 includes opcode 9 〇 2 ,, which indicates that it is the 7 支 branch and the switch 8 input finger in the microprocessor 1 〇〇 々 。 。 。 。 。 。 。 。 。 。 。 。 。 。 。 。 分支 。 分支 分支 分支 分支 分支 分支 分支 分支 分支 904 904 904 904 904 , marked in the secret register slot 124 - serial register At the beginning, the secret record is loaded into the main secret register 142. The branch and switch key instruction 9GG further includes a branch information field 9G6, which records a typical f signal of the branch instruction - for example, the calculation target CNTR2449100-TW/ 0608-A43129-TW/ Final 201203108 Information on the address and branch conditions. In one embodiment, if a program attempts to execute a branch and switch key instruction 900 when the microprocessor 100 is not in the secure execution mode, then Treated as an invalid instruction exception. 
In one embodiment, if a program attempts to perform a branch and switch key command when the microprocessor 100 is not at the highest privilege level (eg, 'x86 〇 〇 〇 ) )) It is treated as an invalid instruction exception. In one embodiment, the branch and switch key instruction 900 is atomic. Referring to Figure 10, a flowchart illustrates the operation of the microprocessor of Figure 1. The branch and switch key instructions 900 disclosed in Figure 9 are performed in accordance with the teachings of the present invention. The process begins at block 1〇〇2. At block 1002, the decode early element 108 decodes a branch and switch cipher command 900 and substitutes it into the microcode unit 132 to implement the microcode program of the branch and switch key instruction 900. The flow then proceeds to block 1〇〇6. At block 1006, the microcode resolves the branch direction (with or without) and the target address. It is worth noting that for unconditional branch instructions, the direction of the household is used. The flow then proceeds to decision block 1008. At decision block 1008, the direction in which the microcode decision block 1006 is resolved is taken. If the process is used, block 1014 is entered. Otherwise, the flow proceeds to block 1012. At block 1012' the microcode does not switch the secret record, or jumps to the target address because the branch operation is not taken. Flow ends at block 1012. At block 1014', the microcode loads the secret iron self-secret register broadcast 124 into the master key temporary cry 142 based on the key register file index slot 904'. In a preferred embodiment, the microcode starts with the location indicated by the key register file index block CNTR2449100-TW/0608-A43129-TW/ Final 37 201203108 904, and the key register is stored in the file 124. The n cryptographically described n cryptographically loaded primary cipher registers 142, where n is the total number of primary key registers 142. 
In one embodiment, the value of η may be recorded in an extra space of the branch and switch cryptographic instructions 9〇〇, set to a value less than the total number of master key registers 142. Flow then proceeds to block 1016. At block 1016, the microcode causes the microprocessor 1 to jump to the target address solved by block 1〇〇6, which will result in all new χ86 instructions in the microprocessor 1 较 branch and switch key instruction 900 new. It is emptied, causing all micro-ops in the micro-operation of the microprocessor 1 to branch to the target address to be emptied. The above emptied instructions include all instruction octets 1 〇 6 that are extracted from the instruction cache 1 〇 2, buffered in the extraction unit 104, and decoded in the decoding unit 〇 8 for decryption and decoding. The flow then proceeds to block 1-8. At block 1018, as block 1 〇 16 branches to the target address, extraction unit 104 begins with block 1 〇 14 loading a new set of key values for master key register 142 from the instruction cache. 1〇2 extracts and decrypts the instruction data 106. The flow ends at block 1 〇 18. - Now, referring to the 11th ffi' - flowchart illustrates the operation of a post processor implemented in accordance with the teachings of the present invention. A software tool that can be used to post-process a program and encrypt it for execution by the first Figure processor. The process begins at block 1102. At block 1102, the post processor receives a destination file of a program. According to one embodiment The target address of the branch instruction in the destination file can be determined before the execution of the program; for example, a branch instruction pointing to a fixed target address. There is another form of branch instruction in the spear-type operation to determine the target address. 
An example is a relative branch instruction, which specifies an offset that is added to the memory address of the branch instruction to obtain the branch target address. In contrast, an example of a branch instruction whose target address cannot be determined before the program runs is one that computes the target address from an operand in a register or in memory, so the value may change during program execution. Flow proceeds to block 1104.

At block 1104, the post-processor replaces each inter-chunk branch instruction with a branch and switch key instruction 900. The instruction 900 holds an appropriate value in its key register file index field 904, set according to the chunk in which the target address of the branch instruction lies. As disclosed with respect to Figure 8, a "chunk" consists of a sequence of instruction data bytes that are all decrypted with the same set of master key register 142 values. An inter-chunk branch instruction is therefore one whose target address lies in a different chunk from the branch instruction itself. Note that an intra-chunk branch, whose target address lies in the same chunk as the branch instruction itself, need not be replaced. Note also that the source file from which the object file is generated may already include branch and switch key instructions 900 explicitly, which reduces the replacement work the post-processor must do. Flow proceeds to block 1106.

At block 1106, the post-processor encrypts the program. The post-processor knows the memory location and the master key register 142 values of each chunk and uses them to encrypt the program. Flow ends at block 1106.

Referring now to Figure 12, a block diagram illustrates the format of a branch and switch key instruction 1200 according to another embodiment of the present invention.
The branch and switch key instruction 1200 of Figure 12 is suited to branches whose target address is not known before the program runs, as discussed in more detail below. The branch and switch key instruction 1200 includes an opcode field 1202 that identifies it as a branch and switch key instruction 1200 within the instruction set of the microprocessor 100. The instruction 1200 also includes a branch information field 906, similar to the corresponding field of the branch and switch key instruction 900 of Figure 9. In one embodiment, if a program attempts to execute a branch and switch key instruction 1200 when the microprocessor 100 is not in the secure execution mode, the attempt is treated as an invalid instruction exception. In one embodiment, if a program attempts to execute a branch and switch key instruction 1200 when the microprocessor 100 is not at the highest privilege level (e.g., x86 ring 0), the attempt is treated as an invalid instruction exception. In one embodiment, the branch and switch key instruction 1200 is atomic.

Referring now to Figure 13, a block diagram illustrates a chunk address range table 1300 according to the present invention. The table 1300 includes multiple entries. Each entry is associated with one chunk of the encrypted program. Each entry includes an address range field 1302 and a key register file index field 1304. The address range field 1302 specifies the memory address range of the corresponding chunk. The key register file index field 1304 identifies the register of the key register file 124 whose key value the branch and switch key instruction 1200 loads into the master key registers 142, for the fetch unit 104 to use when decrypting the chunk.
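The table 1300 just described is essentially a mapping from an address range to a key register file index. A minimal sketch of such a lookup follows; the names `ChunkEntry` and `key_index_for`, the half-open address ranges, the linear search, and the example addresses are assumptions made for illustration and are not taken from the patent.

```python
from typing import NamedTuple, Optional, Sequence


class ChunkEntry(NamedTuple):
    start: int      # first address of the chunk (address range field 1302)
    end: int        # one past the last address; half-open range is an assumption
    key_index: int  # key register file index field 1304


def key_index_for(table: Sequence[ChunkEntry], target: int) -> Optional[int]:
    """Return the key register file index of the chunk containing `target`."""
    for entry in table:
        if entry.start <= target < entry.end:
            return entry.key_index
    return None  # target is not inside any chunk listed in the table


# Two made-up chunks that use keys starting at file registers 0 and 2.
table_1300 = [ChunkEntry(0x1000, 0x2000, 0), ChunkEntry(0x2000, 0x3000, 2)]
```

A real implementation would probably search a small hardware structure rather than a list; the linear scan is used here only because it is the simplest correct form of the lookup.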
As discussed below with respect to Figure 18, the table 1300 is loaded into the microprocessor 100 before any branch and switch key instruction 1200 that needs to access it is executed.

Referring now to Figure 14, a flowchart illustrates the operation of the microprocessor 100 of Figure 1 to execute the branch and switch key instruction 1200 of Figure 12 according to the present invention. Flow begins at block 1402.

At block 1402, the decode unit 108 decodes a branch and switch key instruction 1200 and traps to the microcode routine in the microcode unit 132 that implements the branch and switch key instruction 1200. Flow proceeds to block 1406.

At block 1406, the microcode resolves the branch direction (taken or not taken) and the target address. Flow proceeds to decision block 1408.

At decision block 1408, the microcode determines whether the branch direction resolved at block 1406 is taken. If so, flow proceeds to block 1414. Otherwise, flow proceeds to block 1412.

At block 1412, the microcode neither switches the keys nor jumps to the target address, because the branch is not taken. Flow ends at block 1412.

At block 1414, the microcode looks up the target address resolved at block 1406 in the table 1300 of Figure 13 and obtains the key register file index field 1304 value of the chunk in which the target address lies. The microcode then loads key values from the key register file 124 into the master key registers 142 based on the index held in the key register file index field 1304. In a preferred embodiment, the microcode loads n key values from n adjacent registers of the key register file 124, starting at the index held in the key register file index field 1304, into the master key registers 142, where n is the total number of master key registers 142.
In one embodiment, the value of n may be specified in an additional field of the branch and switch key instruction 1200 and set to a value less than the total number of master key registers 142. Flow proceeds to block 1416.

At block 1416, the microcode causes the microprocessor 100 to branch to the target address resolved at block 1406. This flushes all x86 instructions in the microprocessor 100 that are newer than the branch and switch key instruction 1200, and all micro-operations in the microprocessor 100 that are newer than the micro-operation that branches to the target address. The flushed instructions include all instruction bytes 106 fetched from the instruction cache 102 and buffered in the fetch unit 104 and decode unit 108 awaiting decryption and decoding. Flow proceeds to block 1418.

At block 1418, as a result of the branch to the target address at block 1416, the fetch unit 104 begins fetching and decrypting instruction data 106 from the instruction cache 102 using the new set of key values loaded into the master key registers 142 at block 1414. Flow ends at block 1418.

Referring now to Figure 15, a block diagram illustrates the format of a branch and switch key instruction 1500 according to another embodiment of the present invention. The branch and switch key instruction 1500 of Figure 15 and its operation are similar to the branch and switch key instruction 1200 of Figure 12. However, instead of loading keys from the key register file 124 into the master key registers 142, the branch and switch key instruction 1500 loads keys from the secure storage area 122 into the master key registers 142, as discussed below.

Referring now to Figure 16, a block diagram illustrates a chunk address range table 1600 according to the present invention. The table 1600 of Figure 16 is similar to the table 1300 of Figure 13.
However, instead of a key register file index field 1304, the table 1600 includes a secure storage area address field 1604. The secure storage area address field 1604 specifies an address within the secure storage area 122; the key values stored at that address are loaded into the master key registers 142 by the branch and switch key instruction 1500 for the fetch unit 104 to use when decrypting the chunk. As discussed below with respect to Figure 18, the table 1600 is loaded into the microprocessor 100 before any branch and switch key instruction 1500 that needs to consult it is executed. In one embodiment, the lower bits of the secure storage area 122 address need not be stored in the secure storage area address field 1604, in particular because the number of locations in the secure storage area 122 that hold one set of keys is fairly large (for example, 16 bytes x 5) and each set of keys can be aligned on a set-size boundary.

Referring now to Figure 17, a flowchart illustrates the operation of the microprocessor 100 of Figure 1 to execute the branch and switch key instruction 1500 of Figure 15 according to the present invention. Flow begins at block 1702. Many blocks of the flowchart of Figure 17 are similar to blocks of Figure 14 and are therefore numbered the same. However, block 1414 is replaced by block 1714, at which the microcode looks up the target address resolved at block 1406 in the table 1600 of Figure 16 to obtain the secure storage area address field 1604 value of the chunk in which the target address lies. The microcode then loads key values from the secure storage area 122 into the master key registers 142 based on the secure storage area address field 1604 value.
In a preferred embodiment, the microcode loads n key values from n adjacent 16-byte locations of the secure storage area 122, starting at the address held in the secure storage area address field 1604, into the master key registers 142, where n is the total number of master key registers 142. In one embodiment, the value of n may be specified in an additional field of the branch and switch key instruction 1500 and set to a value less than the total number of master key registers 142.

Referring now to Figure 18, a flowchart illustrates the operation of a post-processor according to another embodiment of the present invention. The post-processor can be used to post-process a program and encrypt it for execution by the microprocessor 100 of Figure 1. Flow begins at block 1802.

At block 1802, the post-processor receives an object file of a program. According to one embodiment, the object file contains both branch instructions whose target address can be determined before the program runs and branch instructions whose target address cannot be determined before the program runs. Flow proceeds to block 1803.

At block 1803, the post-processor creates the chunk address range table 1300 or 1600 of Figure 13 or Figure 16 for inclusion in the object file. In one embodiment, the operating system loads the table 1300/1600 into the microprocessor 100 before it loads and executes an encrypted program, so that the branch and switch key instructions 1200/1500 can access it. In one embodiment, the post-processor inserts instructions into the program that load the table 1300/1600 into the microprocessor 100 before any branch and switch key instruction 1200/1500 is executed. Flow proceeds to block 1804.
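The table created at block 1803 also gives a post-processor what it needs to apply the inter-chunk rule described for block 1104. The following is a minimal sketch of that decision, assuming each chunk is given as a (start, end, key_ref) tuple with a half-open address range; the function names and the tuple layout are hypothetical and do not describe the patent's actual tool.

```python
def build_chunk_table(chunks):
    """Block 1803 (sketch): return chunk records sorted by start address.

    Each record is (start, end, key_ref). key_ref stands for either a key
    register file index (table 1300) or a secure storage address (table 1600).
    """
    table = sorted(chunks)
    for previous, following in zip(table, table[1:]):
        if previous[1] > following[0]:
            raise ValueError("chunks overlap")
    return table


def chunk_of(table, address):
    """Find the record whose address range contains `address`."""
    for record in table:
        if record[0] <= address < record[1]:
            return record
    raise LookupError(f"address {address:#x} is outside every chunk")


def plan_static_branch(table, branch_address, target_address):
    """Return None for an intra-chunk branch, else the key_ref to embed."""
    source = chunk_of(table, branch_address)
    target = chunk_of(table, target_address)
    return None if source == target else target[2]


# Made-up layout: two chunks whose keys start at file registers 0 and 4.
table = build_chunk_table([(0x2000, 0x3000, 4), (0x1000, 0x2000, 0)])
```

Here `plan_static_branch` returning `None` corresponds to leaving the branch untouched, and any other return value is the key reference a replacement instruction would carry.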
At block 1804, similar to the operation discussed above with respect to block 1104 of Figure 11, the post-processor replaces each inter-chunk branch instruction whose target address can be determined before the program runs with the branch and switch key instruction 900 of Figure 9. The instruction 900 holds the appropriate key register file index field 904 value, based on the chunk in which the branch target address lies. Flow proceeds to block 1805.

At block 1805, the post-processor replaces each branch instruction whose target address is determined only during execution with the branch and switch key instruction 1200 of Figure 12 or 1500 of Figure 15, according to the type of table (1300/1600) created at block 1803. Flow proceeds to block 1806.

At block 1806, the post-processor encrypts the program. The post-processor knows the memory location and the master key register 142 values of each chunk and uses them to encrypt the program. Flow ends at block 1806.

Referring now to Figure 19, a flowchart illustrates the operation of the microprocessor 100 of Figure 1 to handle task switching between an encrypted program and a plain text program according to the present invention. Flow begins at block 1902.
At block 1902, the E bit of the E bit field 402 of the flag register 128 and the E bit 148 of the control register 144 of Figure 1 are cleared by a reset of the microprocessor 100. Flow proceeds to block 1904.

At block 1904, after the microprocessor 100 has run its reset microcode to initialize itself, it begins fetching and executing user program instructions (for example, system firmware), which are plain text program instructions. In particular, because the E bit 148 is clear, the fetch unit 104 treats the fetched instruction data 106 as plain text instructions, as described above. Flow proceeds to block 1906.

At block 1906, system software (for example, the operating system, firmware, or BIOS) receives a request to execute an encrypted program. In one embodiment, the request to execute an encrypted program is accompanied by, or indicated by, a switch to a secure execution mode of the microprocessor 100, as discussed above.
In one embodiment, the microprocessor 100 allows operation in decryption mode (that is, with the E bit 148 set) only when it is in the secure execution mode. In one embodiment, the microprocessor 100 allows operation in decryption mode only in system management mode (for example, SMM, as commonly found in the x86 architecture). Flow proceeds to block 1908.

At block 1908, the system software loads the master key registers 142 with their initial values, which are associated with the first chunk of the program to be executed. In one embodiment, the system software executes a key switch instruction 600 to load keys into the master key registers 142. Before keys are loaded into the master key registers 142, the contents of the key register file 124 may be loaded by one or more key load instructions 500. In one embodiment, before keys are loaded into the master key registers 142 and the key register file 124, the secure storage area 122 may first be written with the key values, where the write is made over a well-known secure channel technique, for example an AES- or RSA-encrypted channel, to prevent a hacker from snooping the values. As discussed above, the key values may be stored in a secure non-volatile memory (such as flash memory) coupled to the microprocessor 100 by an isolated serial bus (private serial bus), or in a non-volatile write-once memory of the microprocessor 100. As discussed above, the program may be contained in a single chunk. That is, the program may include no key switch instructions 600, and the entire program can be decrypted with a single set of master key register 142 values. Flow proceeds to block 1916.

At block 1916, as control transfers to the encrypted program, the microprocessor 100 sets the E bit field 402 of the flag register 128 to indicate that the currently executing program is encrypted, and sets the E bit 148 of the control register 144 to place the fetch unit 104 in decryption mode.
The microprocessor 100 also causes the instructions in the pipeline to be flushed, similar to the flush performed at block 706 of Figure 7. Flow proceeds to block 1918.

At block 1918, the fetch unit 104 fetches the instructions 106 of the encrypted program, and decrypts and executes them in decryption mode as described with respect to Figures 1 through 3. Flow proceeds to block 1922.

At block 1922, while the microprocessor 100 is fetching and executing the encrypted program, the microprocessor 100 receives an interrupting event. For example, the interrupting event may be an interrupt, an exception (such as a page fault), or a task switch. When an interrupting event occurs, all pending instructions in the pipeline of the microprocessor 100 are flushed. Thus, any previously fetched encrypted instructions in the pipeline are flushed. In addition, all instruction bytes fetched from the instruction cache 102 that may be buffered in the fetch unit 104 and decode unit 108 awaiting decryption and decoding are flushed. In one embodiment, microcode is invoked in response to the interrupting event. Flow proceeds to block 1924.

At block 1924, the microprocessor 100 saves the flag register 128 (and other architectural state of the microprocessor 100, including the current instruction pointer value of the interrupted encrypted program) to a stack memory. Saving the E bit field 402 value of the encrypted program allows it to be restored later (at block 1934). Flow proceeds to block 1926.

At block 1926, as control transfers to the new program (for example, an interrupt handler, an exception handler, or a new task), the microprocessor 100 clears the E bit field 402 of the flag register 128.
It also clears the E bit 148 of the control register 144, because the new program is a plain text program. That is, the embodiment of Figure 19 assumes that the microprocessor 100 allows only one encrypted program to run at a time, and that one encrypted program is already executing (but has been interrupted). Figures 22 through 26 disclose other embodiments. Flow proceeds to block 1928.

At block 1928, the fetch unit 104 fetches the instructions 106 of the new program in plain text mode as described with respect to Figures 1 through 3. In particular, the clear state of the E bit 148 in the control register 144 causes the instruction data 106 to be XORed with the multi-bit binary zero value 176 selected by the multiplexer 154, so that the instruction data 106 is not decrypted. Flow proceeds to block 1932.

At block 1932, the new program executes a return from interrupt instruction (for example, x86 IRET) or a similar instruction, which returns control to the encrypted program. In one embodiment, the return from interrupt is implemented in microcode. Flow proceeds to block 1934.

At block 1934, in response to the return from interrupt, as control transfers back to the encrypted program, the microprocessor 100 restores the flag register 128, so that the E bit field 402 of the flag register 128 returns to the set state saved at block 1924. Flow proceeds to block 1938.

At block 1938, as control transfers back to the encrypted program, the microprocessor 100 updates the E bit 148 of the control register 144 with the value of the E bit field 402 of the flag register 128, so that the fetch unit 104 resumes fetching and decrypting the instruction data 106 of the encrypted program. Flow proceeds to block 1942.
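The plain text handling at block 1928 works because XOR with zero leaves data unchanged, which is also why the fetch path can treat both modes identically. A minimal sketch follows, assuming a 16-byte block and a key of the same length; the function name `fetch_stage_xor` and the example bytes are hypothetical, and the derivation of the decryption key from the master key registers and the fetch address is not modelled here.

```python
def fetch_stage_xor(block: bytes, key: bytes, e_bit: bool) -> bytes:
    """XOR a fetched block with the key (E bit set) or with zeros (E bit clear)."""
    if len(key) != len(block):
        raise ValueError("key and block must be the same length")
    # An all-zero operand stands in for the binary zero value 176.
    operand = key if e_bit else bytes(len(block))
    return bytes(b ^ k for b, k in zip(block, operand))


plain = bytes.fromhex("909090c3") * 4          # 16 bytes of example instruction data
key = bytes(range(1, 17))                      # made-up 16-byte key
encrypted = fetch_stage_xor(plain, key, True)  # XOR is its own inverse
```

Because the same XOR is performed either way, the model needs no separate code path for plain text, which mirrors the equal-timing property stated in the summary of the invention.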
At block 1942, the microcode causes the microprocessor 100 to branch to the instruction pointer value saved to the stack memory at block 1924, which flushes all x86 instructions in the microprocessor 100 and all micro-operations in the microprocessor 100. The flushed content includes all instruction bytes 106 fetched from the instruction cache 102 and buffered in the fetch unit 104 and decode unit 108 awaiting decryption and decoding. Flow proceeds to block 1944.

At block 1944, the fetch unit 104 resumes fetching the instructions 106 of the encrypted program, and decrypts and executes them in decryption mode as described with respect to Figures 1 through 3. Flow ends at block 1944.

Referring now to Figure 20, a flowchart illustrates the operation of system software, executed by the microprocessor 100 of Figure 1, according to the present invention. The flow of Figure 20 may be performed in conjunction with that of Figure 19. Flow begins at block 2002.

At block 2002, the system software receives a request to execute a new encrypted program. Flow proceeds to decision block 2004.

At decision block 2004, the system software determines whether an encrypted program is already among the programs the system is running. In one embodiment, the system software maintains a flag that indicates whether an encrypted program is among the running programs. If an encrypted program is already running, flow proceeds to block 2006; otherwise, flow proceeds to block 2008.

At block 2006, the system software waits until the encrypted program completes and is no longer among the programs the system is running. Flow proceeds to block 2008.

At block 2008, the microprocessor 100 allows the new encrypted program to begin executing. Flow ends at block 2008.
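The handling of the two E bits across an interrupt in Figure 19 amounts to a save, clear, and restore sequence. The sketch below models only that bookkeeping, under the assumption that the saved state is a simple stack of (E bit, instruction pointer) pairs; the class `CpuModel` and its method names are hypothetical, and pipeline flushing and key handling are left out.

```python
class CpuModel:
    """Toy model of the E bit bookkeeping in Figure 19 (not the real hardware)."""

    def __init__(self):
        self.flags_e = False    # E bit field 402 of flag register 128
        self.control_e = False  # E bit 148 of control register 144
        self.stack = []         # stands in for the stack memory

    def start_encrypted_program(self):
        # Block 1916: mark the running program as encrypted, enable decryption.
        self.flags_e = True
        self.control_e = True

    def interrupt(self, return_ip):
        # Blocks 1924 and 1926: save state, then run the handler as plain text.
        self.stack.append((self.flags_e, return_ip))
        self.flags_e = False
        self.control_e = False

    def return_from_interrupt(self):
        # Blocks 1934 to 1942: restore the flags, mirror E into the control
        # register, and hand back the saved instruction pointer.
        self.flags_e, return_ip = self.stack.pop()
        self.control_e = self.flags_e
        return return_ip


cpu = CpuModel()
cpu.start_encrypted_program()
cpu.interrupt(return_ip=0x1234)
in_handler = (cpu.flags_e, cpu.control_e)
resumed_at = cpu.return_from_interrupt()
```

The same restore path also covers returning to a plain text program, since the saved E bit is then clear and the control register bit stays clear.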
Referring now to Figure 21, a block diagram illustrates the fields of the flag register 128 of Figure 1 according to another embodiment of the present invention. The flag register 128 of Figure 21 is similar to the embodiment of Figure 4, but additionally includes an index field (index bits) 2104. According to one embodiment, the index field 2104, like the E bit 402, consists of bits that the x86 architecture normally reserves. The index field 2104 is used to handle switching among multiple encrypted programs, as discussed in detail below. Preferably, the key switch instruction 600 and the branch and switch key instructions 900/1200 update the index field 2104 of the flag register 128 with their own key register file index field 604/904/1304.

Referring now to Figure 22, a flowchart illustrates the operation of the microprocessor 100 of Figure 1 to perform task switching among multiple encrypted programs using the flag register 128 of Figure 21 according to the present invention. Flow begins at block 2202.

At block 2202, a request is made to the system software to execute a new encrypted program. Flow proceeds to decision block 2204.

At decision block 2204, the system software determines whether the key register file 124 has space to accommodate a new encrypted program. In one embodiment, the request made at block 2202 specifies how much space in the key register file 124 is needed. If the key register file 124 has space for the new encrypted program, flow proceeds to block 2208; otherwise, flow proceeds to block 2206.

At block 2206, the system software waits for one or more encrypted programs to complete, which frees space in the key register file 124 for the new encrypted program. Flow proceeds to block 2208.
At block 2208, the system software allocates space in the key register file 124 to the new encrypted program and accordingly populates the index field 2104 of the flag register 128 to indicate the newly allocated space in the key register file 124. Flow proceeds to block 2212.

At block 2212, the system software loads the key values for the new program into the key register file 124 locations allocated at block 2208. As discussed above, the key values may be loaded from the secure storage area 122 using key load instructions 500 or, if necessary, obtained over a secure channel from a location outside the microprocessor 100. Flow proceeds to block 2214.

At block 2214, the system software loads keys from the key register file 124 into the master key registers 142 based on the key register file index field 604/904/1304. In one embodiment, the system software executes a key switch instruction 600 to load the keys into the master key registers 142. Flow proceeds to block 2216.

At block 2216, as control transfers to the encrypted program, the microprocessor 100 sets the E bit field 402 of the flag register 128 to indicate that the currently executing program is encrypted, and sets the E bit 148 of the control register 144 to place the fetch unit 104 in decryption mode. Flow ends at block 2216.

Referring now to Figure 23, a flowchart illustrates the operation of the microprocessor 100 of Figure 1 to handle task switching among multiple encrypted programs using the flag register 128 of Figure 21 according to the present invention. Flow begins at block 2302.
At block 2302, the currently executing program executes a return from interrupt instruction, which causes a task switch to a new program. The new program was executing previously but was switched out, and its architectural state (for example, the flag register 128, the instruction pointer register, and the general purpose registers) was saved to the stack memory. As mentioned above, in one embodiment the return from interrupt is implemented in microcode. The currently executing program and the new program may each be an encrypted program or a plain text program. Flow proceeds to block 2304.

At block 2304, the microprocessor 100 restores the flag register 128 from the stack memory for the program being returned to. That is, the microprocessor 100 reloads the flag register 128 with the value that was saved to the stack memory when the returning program (the program now being switched back in) was previously switched out. Flow proceeds to decision block 2306.

At decision block 2306, the microprocessor 100 determines whether the E bit 402 of the restored flag register 128 is set. If so, flow proceeds to block 2308; otherwise, flow proceeds to block 2312.

At block 2308, the microprocessor 100 loads keys from the key register file 124 into the master key registers 142 based on the index field 2104 value of the flag register 128 restored at block 2304. Flow proceeds to block 2312.

At block 2312, the microprocessor 100 updates the E bit 148 of the control register 144 with the value of the E bit field 402 of the flag register 128 restored at block 2304. Thus, if the returning program is an encrypted program, the fetch unit 104 is placed in decryption mode; otherwise, it is placed in plain text mode. Flow proceeds to block 2314.
At block 2314, the microprocessor 100 restores the instruction pointer register from the contents of the stack memory and branches to the instruction pointer, which flushes all x86 instructions and all micro-operations in the microprocessor 100. The flushed content includes all instruction bytes 106 buffered in the fetch unit 104 and the decode unit 108 awaiting decryption and decoding. Flow proceeds to block 2316.
At block 2316, the fetch unit 104 begins fetching instructions 106 of the resuming program, operating in decryption mode or plain text mode according to the value of the E bit 148 of the control register 144 set at block 2312, as described above with respect to FIGS. 1 through 3. Flow ends at block 2316.
Referring now to FIG. 24, a block diagram illustrates a single register of the key register file 124 of FIG. 1 according to an alternate embodiment of the present invention. In the embodiment of FIG. 24, each register of the key register file 124 further includes a bit 2402, a kill bit (hereinafter the K bit). The K bit 2402 supports multitasking by the microprocessor 100 among multiple encrypted programs that together require more key storage space than the key register file 124 provides, as described in detail below.
Referring now to FIG. 25, a flowchart illustrates operation of the microprocessor 100 of FIG. 1 in which, according to an alternate embodiment of the present invention, the flag register 128 of FIG. 21 and the key register file 124 of FIG. 24 are used to handle a task switch among multiple encrypted programs. The flow of FIG. 25 is similar to the flow of FIG. 22. The difference is that when decision block 2204 determines that the key register file 124 does not have enough free space, the flow of FIG. 25 proceeds to block 2506, which is not present in FIG. 22.
Otherwise, if decision block 2204 determines that the key register file 124 has enough free space, the flow of FIG. 25 proceeds through blocks 2208 to 2216 of FIG. 22.
At block 2506, the system software allocates space in the key register file 124 that is already in use by (i.e., has already been allocated to) another encrypted program, sets the K bit 2402 of the allocated registers, and populates the index field 2104 of the flag register 128 to indicate the location of the newly allocated space within the key register file 124. The set K bit 2402 indicates that the key values of the other encrypted program held in those registers will be overwritten with the key values of the new encrypted program at block 2212. However, as described below with respect to FIG. 26, the key values of the other encrypted program are reloaded at block 2609 when that program is resumed. The flow of FIG. 25 proceeds from block 2506 to block 2212 of FIG. 22 and ends at block 2216.
Referring now to FIG. 26, a flowchart illustrates operation of the microprocessor 100 of FIG. 1 in which, according to an alternate embodiment of the present invention, the flag register 128 of FIG. 21 and the key register file 124 of FIG. 24 are used to handle a task switch among multiple encrypted programs. The flow of FIG. 26 is similar to the flow of FIG. 23. The difference is that if decision block 2306 determines that the E bit 402 of the flag register 128 is set, the flow of FIG. 26 proceeds to decision block 2607 rather than to block 2308.
At decision block 2607, the microprocessor 100 determines whether the K bit 2402 of any register of the key register file 124 specified by the index field 2104 value of the flag register 128 (restored at block 2304) is set. If so, flow proceeds to block 2609; otherwise, flow proceeds to block 2308.
At block 2609, the microprocessor 100 generates an exception that is handled by an exception handler. In one embodiment, the exception handler is part of the system software. In one embodiment, the exception handler is provided by the secure execution mode architecture. Based on the index field 2104 value of the flag register 128 restored at block 2304, the exception handler reloads the keys of the encrypted program being restored (i.e., the encrypted program now being resumed) into the key register file 124. The exception handler may operate similarly to block 1908 described above with respect to FIG. 19, loading the keys of the restored encrypted program into the key register file 124 and, if necessary, loading the keys from outside the microprocessor 100 into the secure storage area 122. Likewise, if the registers of the key register file 124 being reloaded are in use by another encrypted program, the system software sets the K bit 2402 of those registers. Flow proceeds from block 2609 to block 2308, and blocks 2308 through 2316 are as described with respect to FIG. 23.
As taught by FIGS. 24 through 26, the embodiments described herein enable the microprocessor 100 to multitask among multiple encrypted programs even when those programs together require more key storage space than the key register file 124 provides.
Referring now to FIG. 27, a block diagram illustrates an alternate embodiment of a portion of the microprocessor 100 of FIG. 1. Elements similar to those of FIG. 1 are given the same reference numerals, namely the instruction cache 102, the fetch unit 104, and the key register file 124. However, the fetch unit 104 here is modified to further include key switch logic 2712, which is coupled to the master key register 142 and the key register file 124 introduced in FIG. 1. The microprocessor 100 of FIG. 27 further includes a branch target address cache (BTAC) 2702. The BTAC 2702 receives the fetch address 134 of FIG. 1 and is accessed in parallel with the instruction cache 102, both accesses being based on the fetch address 134. In response to the fetch address 134, the BTAC 2702 supplies a branch target address 2706 to the fetch address generator 164 of FIG. 1, supplies a taken/not taken (T/NT) indicator 2708 and a type indicator 2714 to the key switch logic 2712, and supplies a key register file (KRF) index 2716 to the key register file 124.
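The K-bit scheme of FIGS. 24 through 26 can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the disclosed hardware or system software: the class and function names, the dictionary standing in for the secure storage area 122, and the policy of always reusing register 0 when no space is free are all hypothetical choices made for illustration.

```python
class KeyRegisterFile:
    """Toy model of key register file 124 with a per-register K bit (2402)."""

    def __init__(self, size):
        self.keys = [None] * size      # key value held in each register
        self.in_use = [False] * size   # allocated to some encrypted program
        self.k_bit = [False] * size    # set once a register is shared

    def allocate(self):
        """Blocks 2204/2208/2506: pick a register for a new program."""
        for i, used in enumerate(self.in_use):
            if not used:               # enough free space (block 2208)
                self.in_use[i] = True
                return i
        victim = 0                     # assumption: always reuse register 0
        self.k_bit[victim] = True      # block 2506: mark register as shared
        return victim


def start_program(krf, secure_storage, name):
    """Blocks 2208-2216: allocate, load keys, return the flag fields."""
    index = krf.allocate()
    krf.keys[index] = secure_storage[name]      # block 2212
    return {"E": True, "index": index}          # E bit 402, index field 2104


def resume_program(krf, secure_storage, name, flags):
    """Blocks 2306, 2607, 2609, 2308: return the key for the master key register."""
    if not flags["E"]:
        return None                             # plain text program
    index = flags["index"]
    if krf.k_bit[index]:                        # decision block 2607
        # Block 2609: the exception handler reloads this program's key.
        # The register stays shared, so its K bit remains set.
        krf.keys[index] = secure_storage[name]
    return krf.keys[index]                      # block 2308


secure_storage = {"A": 0xAAAA, "B": 0xBBBB}     # stand-in for area 122
krf = KeyRegisterFile(size=1)
flags_a = start_program(krf, secure_storage, "A")
flags_b = start_program(krf, secure_storage, "B")   # overwrites A's key
key_a = resume_program(krf, secure_storage, "A", flags_a)
key_b = resume_program(krf, secure_storage, "B", flags_b)
```

With a one-register file, programs A and B share register 0, so each resume finds the K bit set and reloads its own key before the master key register is loaded. With a larger file the K bits stay clear and no reload occurs.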
Referring now to FIG. 28, a block diagram illustrates in more detail the BTAC 2702 of FIG. 27 according to the present invention. The BTAC 2702 includes a BTAC array 2802 having a plurality of BTAC entries 2808, whose contents are illustrated in FIG. 29. The BTAC 2802 stores history information about previously executed branch instructions in order to predict the direction and target address of subsequently fetched branch instructions. In particular, the BTAC 2802 uses the stored history information to predict, based on the fetch address 134, subsequent fetches of previously executed branch instructions. The operation of a branch target address cache follows well-known branch prediction techniques. However, the BTAC 2802 disclosed herein is further modified to record history information about previously executed branch and switch key instructions 900/1200 so that it can make predictions about them. In particular, the stored history enables the BTAC 2802 to predict, at fetch time, the set of values that a fetched branch and switch key instruction 900/1200 will load into the master key register 142.
This enables the key switch logic 2712 to load the key values before the branch and switch key instruction 900/1200 is actually executed, which avoids having to flush the pipeline of the microprocessor 100 when the branch and switch key instruction 900/1200 executes, as discussed in detail below. Furthermore, according to one embodiment, the BTAC 2802 is also modified to store history information about previously executed key switch instructions 600 to achieve the same benefit.
Referring now to FIG. 29, a block diagram illustrates in more detail the contents of a BTAC entry 2808 of FIG. 28 according to the present invention. Each entry 2808 includes a valid bit 2902 that indicates whether the entry 2808 is valid. Each entry 2808 also includes a tag field 2904 that is compared with a tag portion of the fetch address 134. If the entry 2808 selected by the index portion of the fetch address 134 is valid and its tag 2904 matches the tag portion of the fetch address 134, then the fetch address 134 hits in the BTAC 2802. Each entry 2808 also includes a target address field 2906 that stores the target address of a previously executed branch instruction, including a branch and switch key instruction 900/1200. Each entry 2808 also includes a taken/not taken field 2908 that stores the direction (taken/not taken) history of a previously executed branch instruction, including a branch and switch key instruction 900/1200. Each entry 2808 also includes a key register file index field 2912 that stores the key register file index 904/1304 history of a previously executed branch and switch key instruction 900/1200, as discussed in detail below.
According to one embodiment, the BTAC 2802 stores in the key register file index field 2912 the key register file index 604 history of a previously executed key switch instruction 600. Each entry 2808 also includes a type field 2914 that indicates the type of the recorded instruction. For example, the type field 2914 may indicate that the recorded instruction is a call, a return, a conditional jump, an unconditional jump, a branch and switch key instruction 900/1200, or a key switch instruction 600.
Referring now to FIG. 30, a flowchart illustrates operation of the microprocessor 100 of FIG. 27, which includes the BTAC 2802 of FIG. 28, according to the present invention. Flow begins at block 3002.
At block 3002, the microprocessor 100 executes a branch and switch key instruction 900/1200, as described in detail below with respect to FIG. 32. Flow proceeds to block 3004.
At block 3004, the microprocessor 100 allocates an entry 2808 in the BTAC 2802 for the executed branch and switch key instruction 900/1200 and records the resolved direction, target address, key register file index 904/1304, and instruction type of the branch and switch key instruction 900/1200 in the taken/not taken field 2908, target address field 2906, key register file index field 2912, and type field 2914, respectively, of the allocated entry 2808 as history information for the instruction. Flow ends at block 3004.
Referring now to FIG. 31, a flowchart illustrates operation of the microprocessor 100 of FIG. 27, which includes the BTAC 2802 of FIG. 28, according to the present invention. Flow begins at block 3102.
At block 3102, the fetch address 134 is supplied to the instruction cache 102 and the BTAC 2802.
Flow proceeds to block 3104.
At block 3104, the fetch address 134 hits in the BTAC 2802, and the BTAC 2802 outputs the contents of the target address 2906, taken/not taken 2908, key register file index 2912, and type 2914 fields of the corresponding entry 2808 as the target address 2706, taken/not taken indicator 2708, key register file index 2716, and type indicator 2714, respectively. In particular, the type field 2914 indicates that the stored instruction is a branch and switch key instruction 900/1200. Flow proceeds to decision block 3106.
At decision block 3106, the key switch logic 2712 examines the taken/not taken output 2708 to determine whether the BTAC 2802 predicts that the branch and switch key instruction 900/1200 will be taken. If the taken/not taken output 2708 indicates that the instruction is predicted taken, flow proceeds to block 3112; otherwise, flow proceeds to block 3108.
At block 3108, the microprocessor 100 pipes down, along with the branch and switch key instruction 900/1200, an indication that the BTAC 2802 predicted it not taken. (Likewise, if the taken/not taken output 2708 indicates that the branch and switch key instruction is predicted taken, the microprocessor 100 at block 3112 pipes down, along with the branch and switch key instruction 900/1200, an indication that the BTAC 2802 predicted it taken.) Flow ends at block 3108.
At block 3112, the fetch address generator 164 updates the fetch address 134 with the target address 2706 predicted by the BTAC 2802 at block 3104. Flow proceeds to block 3114.
At block 3114, the key switch logic 2712 updates the key values in the master key register 142 from the key register file 124 location specified by the key register file index 2716 predicted by the BTAC 2802 at block 3104.
In one embodiment, if necessary, the key switch logic 2712 stalls the fetch unit 104 from fetching blocks of instruction data 106 until the master key register 142 has been updated. Flow proceeds to block 3116.
At block 3116, the fetch unit 104 continues fetching and decrypting instruction data 106 using the new master key register 142 contents loaded at block 3114. Flow ends at block 3116.
Referring now to FIG. 32, a flowchart illustrates operation of the microprocessor 100 of FIG. 27 to execute a branch and switch key instruction 900/1200 according to the present invention. The flow of FIG. 32 is similar in some respects to the flow of FIG. 10, and similar blocks are given the same reference numerals. Although FIG. 32 is discussed with reference to FIG. 10, it may also be applied to the operation of the branch and switch key instruction 1200 described with respect to FIG. 14. The flow of FIG. 32 begins at block 1002.
At block 1002, the decode unit 108 decodes a branch and switch key instruction 900/1200 and traps to the microcode routine in the microcode unit 132 that implements the branch and switch key instruction 900/1200. Flow proceeds to block 1006.
At block 1006, the microcode resolves the branch direction (i.e., taken or not taken) and the target address. Flow proceeds to block 3208.
At block 3208, the microcode determines whether the BTAC 2802 provided a prediction for the branch and switch key instruction 900/1200. If so, flow proceeds to decision block 3214; otherwise, flow proceeds to block 1008 of FIG. 10.
At decision block 3214, the microcode determines whether the prediction made by the BTAC 2802 was correct by comparing the taken/not taken indicator 2708 and target address 2706 piped down from the BTAC 2802 with the direction and target address resolved at block 1006.
If the BTAC 2802 prediction was correct, flow ends; otherwise, flow proceeds to decision block 3216.
At decision block 3216, the microcode determines whether the incorrect BTAC 2802 prediction was taken. If so, flow proceeds to block 3222; otherwise, flow proceeds to block 1014 of FIG. 10.
At block 3222, the microcode restores the contents of the master key register 142, because the BTAC 2802 incorrectly predicted the branch and switch key instruction 900/1200 as taken, which caused block 3104 of FIG. 31 to load the wrong key values into it. In one embodiment, the key switch logic 2712 includes the storage elements and logic needed to restore the master key register 142. In one embodiment, the microcode generates an exception, and an exception handler restores the master key register 142. In addition, the microcode causes the microprocessor 100 to branch to the next sequential x86 instruction after the branch and switch key instruction 900/1200, which flushes all x86 instructions in the microprocessor 100 that are newer than the branch and switch key instruction 900/1200 and flushes all microcode in the microprocessor 100 that is newer than the microcode that branches to the target address. The flushed content includes all instruction bytes 106 read from the instruction cache 102 and buffered in the fetch unit 104 and the decode unit 108 awaiting decoding. Upon the branch to the next sequential instruction, the fetch unit 104 begins fetching and decrypting instruction data 106 from the instruction cache 102 using the restored set of key values in the master key register 142. Flow ends at block 3222.
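The interplay of FIGS. 29 through 32 (record history, predict at fetch time, switch keys early, repair on a wrong taken prediction) can be sketched as a small software model. This is an illustrative sketch only: the 16-byte fetch block, the way the fetch address is split into index and tag, the `saved_key` slot used for repair, and all identifiers are assumptions rather than details of the disclosed design. The model also checks only the predicted direction, whereas decision block 3214 compares the target address as well.

```python
from dataclasses import dataclass

BLOCK = 16  # assumed fetch block size in bytes


@dataclass
class BtacEntry:
    valid: bool       # valid bit 2902
    tag: int          # tag field 2904
    target: int       # target address field 2906
    taken: bool       # taken/not taken field 2908
    krf_index: int    # key register file index field 2912
    kind: str         # type field 2914


class Btac:
    def __init__(self, index_bits=4):
        self.index_bits = index_bits
        self.entries = [None] * (1 << index_bits)

    def _split(self, fetch_addr):
        """Split a fetch address into (index, tag); the split is an assumption."""
        block = fetch_addr // BLOCK
        index = block & ((1 << self.index_bits) - 1)
        tag = block >> self.index_bits
        return index, tag

    def record(self, fetch_addr, target, taken, krf_index, kind):
        """Block 3004: store history for an executed instruction."""
        index, tag = self._split(fetch_addr)
        self.entries[index] = BtacEntry(True, tag, target, taken, krf_index, kind)

    def lookup(self, fetch_addr):
        """Blocks 3102-3104: return the entry on a hit, else None."""
        index, tag = self._split(fetch_addr)
        entry = self.entries[index]
        if entry is not None and entry.valid and entry.tag == tag:
            return entry
        return None


def fetch_step(btac, krf, state, fetch_addr):
    """Blocks 3106-3116: predict, switch keys early, return the next fetch address."""
    entry = btac.lookup(fetch_addr)
    if entry is not None and entry.kind == "branch_switch_key" and entry.taken:
        state["saved_key"] = state["master_key"]     # kept for block 3222
        state["master_key"] = krf[entry.krf_index]   # block 3114
        return entry.target                          # block 3112
    return fetch_addr + BLOCK                        # sequential fetch


def resolve(state, predicted_taken, actual_taken):
    """Blocks 3216-3222: undo the early key switch on a wrong taken prediction."""
    if predicted_taken and not actual_taken:
        state["master_key"] = state["saved_key"]     # block 3222


krf = [0x11, 0x22, 0x33, 0x44]                       # stand-in for file 124
state = {"master_key": krf[0], "saved_key": None}
btac = Btac()
btac.record(0x100, target=0x400, taken=True, krf_index=2, kind="branch_switch_key")
next_addr = fetch_step(btac, krf, state, 0x100)
key_after_prediction = state["master_key"]
resolve(state, predicted_taken=True, actual_taken=False)
key_after_repair = state["master_key"]
```

In this run the fetch at 0x100 hits, so the model redirects fetch to 0x400 and installs key 0x33 before the instruction executes. When the instruction later resolves as not taken, the previous key 0x11 is restored, mirroring the repair of block 3222.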
In addition to the security advantages afforded by the instruction decryption embodiments implemented by the microprocessor 100 as described above, the inventors have developed recommended coding guidelines that, when used in conjunction with the embodiments above, weaken attacks on encrypted x86 code that rely on statistical analysis of actual x86 instruction usage.
First, because an attacker will typically assume that all 16 bytes of fetched instruction data 106 are x86 instructions, the code should have "holes" in the 16-byte blocks relative to the program execution flow. That is, the code should include instructions that jump over some of the instruction bytes, creating "holes" that can be filled with suitable values to increase the entropy of the plain text bytes. In addition, the code may use immediate data values wherever doing so further increases the entropy of the plain text bytes. Furthermore, the immediate data values may serve as false clues that point to wrong instruction opcode locations.
Second, the code may include special NOP instructions that contain "don't care" fields filled with suitable values to increase the entropy. For example, the x86 instruction 0x0F0D05xxxxxxxx is a 7-byte NOP whose last four bytes may be any value. Other variations of the NOP opcode form and of the number of "don't care" bytes are also possible.
Third, many x86 instructions have the same basic function as other x86 instructions. Where functionally equivalent instructions exist, the code may avoid reusing the same instruction and instead use multiple forms and/or the form that increases the plain text entropy. For example, the instruction 0xC10107 and the instruction 0xC10025 do the same thing.
Moreover, some equivalent instructions come in versions of different lengths, for example 0xEB22 and 0xE90022; the code may therefore use multiple instruction forms that have different lengths but the same effect.
Fourth, the architecture permits redundant and meaningless opcode prefixes, which the code may apply with care to further increase the entropy. For example, the instructions 0x40 and 0x2627646567F2F340 do exactly the same thing. Because there are only 8 safe x86 prefixes, they must be placed in the code carefully to avoid appearing too frequently.
Although embodiments have been described in which the key expander performs rotation and addition/subtraction on a pair of the master key register values, other embodiments are contemplated in which the key expander operates on more than two master key register values and in which the operations differ from rotation and addition/subtraction. Furthermore, other embodiments of the key switch instruction 600 of FIG. 6 and the branch and switch key instruction 900 of FIG. 9 are contemplated, for example ones that load the new key values into the master key register 142 from the secure storage area 122 rather than from the key register file 124, as are other embodiments of the branch and switch key instruction 1500 of FIG. 15 in which the index field 2104 stores an address in the secure storage area 122. Furthermore, although embodiments have been described in which the BTAC 2702 is modified to store a KRF index for use with branch and switch key instructions 900/1200, other embodiments are contemplated in which the BTAC 2702 is modified to store a secure storage area address for use with branch and switch key instructions 1500.
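To make the entropy argument of the coding guidelines concrete, the sketch below measures the Shannon entropy of plain text bytes for two ways of filling the "don't care" bytes of the 7-byte NOP form quoted above. The function names and the particular filler values are hypothetical and chosen only for illustration; the example does not model a real compiler or post-processor.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def nop7(dont_care):
    """Build the 7-byte NOP form 0F 0D 05 xx xx xx xx quoted in the text."""
    if len(dont_care) != 4:
        raise ValueError("exactly four don't-care bytes are required")
    return bytes([0x0F, 0x0D, 0x05]) + bytes(dont_care)


# Four NOPs with constant filler versus four NOPs with varied filler.
constant_fill = b"".join(nop7(b"\x00\x00\x00\x00") for _ in range(4))
varied_fill = b"".join(
    nop7(bytes([0x3A + 17 * i, 0x91 - 9 * i, 0xC7 - 23 * i, 0x5E + 5 * i]))
    for i in range(4)
)
entropy_constant = byte_entropy(constant_fill)
entropy_varied = byte_entropy(varied_fill)
```

Both byte strings execute as the same sequence of NOPs, yet `entropy_varied` is higher than `entropy_constant`, which is the property the second guideline relies on to blunt frequency analysis of the plain text.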
The embodiments of the present invention described above are presented by way of example only and are not intended to limit the scope of the invention. Persons skilled in the relevant computer arts may make various changes in form and detail without departing from the scope of the invention. For example, software can enable the function, fabrication, modeling, simulation, description, and/or testing of the apparatus and methods discussed herein. This can be accomplished using general programming languages (e.g., C, C++), hardware description languages (HDL) including Verilog HDL, VHDL, and so on, or other available programs. Such software can be disposed in any known computer readable medium, such as magnetic tape, semiconductor, magnetic disk, or optical disc (e.g., CD-ROM, DVD-ROM, etc.), or in a network, wire line, wireless, or other communications medium. Embodiments of the apparatus and methods described herein may be included in a semiconductor intellectual property core, such as a microprocessor core (e.g., embodied in HDL), and transformed to hardware in the production of integrated circuits. Additionally, the apparatus and methods described herein may be embodied as a combination of hardware and software. Thus, the scope of the present invention should not be limited by any of the embodiments described herein, but should be defined by the following claims and their equivalents. Specifically, the present invention may be implemented within a microprocessor used in a general purpose computer.
Finally, those skilled in the art may, without departing from the scope of the invention as defined by the claims, use the disclosed concepts and specific embodiments as a basis for designing or modifying other structures that achieve the same effects as the present invention.
[Brief Description of the Drawings]
FIG. 1 is a block diagram illustrating a microprocessor according to the present invention;
FIG. 2 is a block diagram illustrating in more detail the fetch unit of FIG. 1;
FIG. 3 is a flowchart illustrating operation of the fetch unit of FIG. 2 according to the present invention;
FIG. 4 is a block diagram illustrating the fields of the flag register of FIG. 1 according to the present invention;
FIG. 5 is a block diagram illustrating the format of a key load instruction according to the present invention;
FIG. 6 is a block diagram illustrating the format of a key switch instruction according to the present invention;
FIG. 7 is a flowchart illustrating operation of the microprocessor of FIG. 1 to execute the key switch instruction of FIG. 6 according to the present invention;
FIG. 8 is a block diagram illustrating the memory footprint of an encrypted program that includes multiple key switch instructions of FIG. 6 according to the present invention;
FIG. 9 is a block diagram illustrating the format of a branch and switch key instruction according to the present invention;
FIG. 10 is a flowchart illustrating operation of the microprocessor of FIG. 1 to execute the branch and switch key instruction of FIG. 9 according to the present invention;
FIG. 11 is a flowchart illustrating operation of a post-processor, implemented by a software tool, that may be used to post-process a program and encrypt it for execution by the microprocessor of FIG. 1 according to the present invention;
FIG. 12 is a block diagram illustrating the format of a branch and switch key instruction according to an alternate embodiment of the present invention;
FIG. 13 is a block diagram illustrating a block address range table according to the present invention;
FIG. 14 is a flowchart illustrating operation of the microprocessor of FIG. 1 to execute the branch and switch key instruction of FIG. 12 according to the present invention;
FIG. 15 is a block diagram illustrating the format of a branch and switch key instruction according to an alternate embodiment of the present invention;
FIG. 16 is a block diagram illustrating a block address range table according to the present invention;
FIG. 17 is a flowchart illustrating operation of the microprocessor of FIG. 1 to execute the branch and switch key instruction of FIG. 15 according to the present invention;
FIG. 18 is a flowchart illustrating, according to an alternate embodiment of the present invention, operation of a post-processor that post-processes a program and encrypts it for execution by the microprocessor of FIG. 1;
FIG. 19 is a flowchart illustrating operation of the microprocessor of FIG. 1 to handle a task switch between an encrypted program and a plain text program according to the present invention;
FIG. 20 is a flowchart illustrating operation of system software executed by the microprocessor of FIG. 1 according to the present invention;
FIG. 21 is a block diagram illustrating the fields of the flag register of FIG. 1 according to an alternate embodiment of the present invention;
FIG. 22 is a flowchart illustrating operation of the microprocessor of FIG. 1, using the flag register of FIG. 21, to handle a task switch among multiple encrypted programs according to the present invention;
FIG. 23 is a flowchart illustrating operation of the microprocessor of FIG. 1, using the flag register of FIG. 21, to handle a task switch among multiple encrypted programs according to the present invention;
FIG. 24 is a block diagram illustrating a single register of the key register file of FIG. 1 according to an alternate embodiment of the present invention;
FIG. 25 is a flowchart illustrating, according to an alternate embodiment of the present invention, operation of the microprocessor of FIG. 1, using the flag register of FIG. 21 and the key register file of FIG. 24, to handle a task switch among multiple encrypted programs;
FIG. 26 is a flowchart illustrating, according to an alternate embodiment of the present invention, operation of the microprocessor of FIG. 1, using the flag register of FIG. 21 and the key register file of FIG. 24, to handle a task switch among multiple encrypted programs;
FIG. 27 is a block diagram illustrating an alternate embodiment of a portion of the microprocessor 100 of FIG. 1;
FIG. 28 is a block diagram illustrating in more detail the branch target address cache (BTAC) of FIG. 27 according to the present invention;
FIG. 29 is a block diagram illustrating in more detail the contents of the entries of the BTAC of FIG. 28 according to the present invention;
FIG. 30 is a flowchart illustrating, according to the present invention, operation of the FIG. 27
Graph microprocessor The operation of the BTAC is illustrated in Figure 28; Figure 31 is a flow chart illustrating the operation of the BTAC of Figure 28 in accordance with the technique of the present invention, CNTR2449100-TW/0608-A43129-TW/ Final 65 201203108; Micro-production flow chart 'According to the technology of the present invention, the operation of the 27th micro-processing benefit-branch and switching secret recording instructions; and [main component symbol description] 100~microprocessor; 104~ extraction unit; 108~ decoding unit; 114~Exporting unit; 122~secure storage area; 128~flag register; 134~ extracting address; 144~ control register; 102~ instruction cache; 106~ private data (can be encrypted); 112~execution unit; 118~general register; 124~key register file; 132~microcode unit; 142~main secret register; 148~E bit; 152~key expander; ~ multiplexer; 156 ~ mutual exclusion logic; 164 ~ extraction instruction generator 162 ~ plain text instruction data; ; 172 ~ two sets of keys; Π 4 ~ decryption key; 176 ~ multi-bit binary zero value 178 ~ Output of multiplexer 154; 212~ multiplexer a 214~ multiplexer B; 216~ rotator; 218~addition/subtractor; 234~first key; 236~second key; 23 8~ rotator output; 402~E bit blocker; 302-316~step block; 408~standard x86 mark of multiple bits; 5 00~ secret record load instruction; 5 02~ operation code; CNTR2449100-T W/0608-Α43129-TW/ Final 66 201203108 504- Key register file target address; 506~ secure memory source address; 600~ secret record switching instruction; 602~Operation code; 604~key register file index; 702-708~block step; 800~ Memory usage; 900~ branch and switch key command; 902~ opcode; 906~ branch information; 1102-1106~step block; 1202~ opcode; 13 02~ address range; 1402-1418~ step block; 02~Operation code; 1604~secured memory area address 1714~step block; 904~key register file index; 1002-1018~step block; 1200~ branch and switch key instruction 1300~block address range table: 1304 ~ Key register file 
index 1500 ~ branch and switch key command 1600 ~ block address range table: 1802-1806 ~ step block; 1902-1944 ~ step block; 2002-2008 ~ step block; 2104 ~ index; 2202-2216 ~ step block; 2302-2316 ~ step block; 2402 ~ eliminated Bit; 2506~step block; 2607, 2609~step block; 2702~ branch target address cache memory (BTAC); 2706~ target address; 2708~ adopt/not use indicator; 2712~key switching logic; 2714~type index; 2716~key register file index; 2802~BTAC array; 2808~BTAC unit; 2902~effective bit; 2904~marker block; CNTR2449100-TW/0608-A43129-TW/ Final 67 201203108 2906~target address; 2908~ adopt/do not use field 2912~key register file index; 2914~type field; 3002-3004~step block; and 3102-3116~step block; 3208-3222~step Square; ZEROS ~ multi-bit binary zero value. CNTR2449100-TW/0608-A43129-TW/ Final 68
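For illustration only, the relationship among several of the symbols listed above (E bit 148, multiplexer 154, XOR logic 156, decryption key 174, ZEROS 176, and plain text instruction data 162) can be sketched in software. The following Python model is a minimal sketch and not the disclosed hardware: the 16-byte block size, the function names, and the assumption that the E bit is what steers the multiplexer are all hypothetical choices made for this example.

```python
# Hypothetical software model of the fetch-unit data path. All names and the
# block size are assumptions for illustration; they do not come from the text.

BLOCK_BYTES = 16            # assumed fetch block size
ZEROS = bytes(BLOCK_BYTES)  # models ZEROS 176: a multi-bit binary zero value


def select_xor_operand(e_bit: bool, decryption_key: bytes) -> bytes:
    """Model of multiplexer 154: pick the key when the E bit is set, else zeros."""
    return decryption_key if e_bit else ZEROS


def fetch_block(block: bytes, e_bit: bool, decryption_key: bytes) -> bytes:
    """Model of XOR logic 156: always XOR the fetched block with the mux output.

    Because the same XOR is applied in both cases, encrypted and plain text
    blocks go through an identical path; XOR with zeros leaves data unchanged.
    """
    if len(block) != BLOCK_BYTES or len(decryption_key) != BLOCK_BYTES:
        raise ValueError("block and key must both be BLOCK_BYTES long")
    operand = select_xor_operand(e_bit, decryption_key)
    return bytes(b ^ k for b, k in zip(block, operand))
```

In this sketch, calling `fetch_block` with the E bit clear returns the block unchanged, and calling it twice with the E bit set and the same key returns the original block, since XOR is its own inverse. This mirrors the idea that a single XOR path can serve both encrypted and unencrypted instruction data.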
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US34812710P | 2010-05-25 | 2010-05-25 | |
US13/091,547 US8700919B2 (en) | 2010-05-25 | 2011-04-21 | Switch key instruction in a microprocessor that fetches and decrypts encrypted instructions |
US13/091,487 US8671285B2 (en) | 2010-05-25 | 2011-04-21 | Microprocessor that fetches and decrypts encrypted instructions in same time as plain text instructions |
US13/091,828 US8645714B2 (en) | 2010-05-25 | 2011-04-21 | Branch target address cache for predicting instruction decryption keys in a microprocessor that fetches and decrypts encrypted instructions |
US13/091,698 US8683225B2 (en) | 2010-05-25 | 2011-04-21 | Microprocessor that facilitates task switching between encrypted and unencrypted programs |
US13/091,641 US8639945B2 (en) | 2010-05-25 | 2011-04-21 | Branch and switch key instruction in a microprocessor that fetches and decrypts encrypted instructions |
US13/091,785 US8719589B2 (en) | 2010-05-25 | 2011-04-21 | Microprocessor that facilitates task switching between multiple encrypted programs having different associated decryption key values |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201203108A TW201203108A (en) | 2012-01-16 |
TWI437489B TWI437489B (en) | 2014-05-11 |
Family
ID=46756316
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW100118074A TWI437489B (en) | 2010-05-25 | 2011-05-24 | Microprocessors and operating methods thereof and encryption/decryption methods |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI437489B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104573537A (en) * | 2013-10-11 | 2015-04-29 | 群联电子股份有限公司 | Data processing method, memory storage device and memory control circuit unit |
US9864879B2 (en) | 2015-10-06 | 2018-01-09 | Micron Technology, Inc. | Secure subsystem |
WO2018106570A1 (en) * | 2016-12-09 | 2018-06-14 | Cryptography Research, Inc. | Programmable block cipher with masked inputs |
TWI743692B (en) * | 2020-02-27 | 2021-10-21 | 威鋒電子股份有限公司 | Hardware trojan immunity device and operation method thereof |
2011-05-24 TW TW100118074A patent/TWI437489B/en active
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104573537A (en) * | 2013-10-11 | 2015-04-29 | 群联电子股份有限公司 | Data processing method, memory storage device and memory control circuit unit |
CN104573537B (en) * | 2013-10-11 | 2017-09-15 | 群联电子股份有限公司 | Data processing method, memory storage apparatus and memorizer control circuit unit |
US9864879B2 (en) | 2015-10-06 | 2018-01-09 | Micron Technology, Inc. | Secure subsystem |
TWI633457B (en) * | 2015-10-06 | 2018-08-21 | 美光科技公司 | Apparatuses and methods for performing secure operations |
US10068109B2 (en) | 2015-10-06 | 2018-09-04 | Micron Technology, Inc. | Secure subsystem |
TWI672610B (en) * | 2015-10-06 | 2019-09-21 | 美商美光科技公司 | Apparatuses and methods for performing secure operations |
US10503934B2 (en) | 2015-10-06 | 2019-12-10 | Micron Technology, Inc. | Secure subsystem |
WO2018106570A1 (en) * | 2016-12-09 | 2018-06-14 | Cryptography Research, Inc. | Programmable block cipher with masked inputs |
US11463236B2 (en) | 2016-12-09 | 2022-10-04 | Cryptography Research, Inc. | Programmable block cipher with masked inputs |
TWI743692B (en) * | 2020-02-27 | 2021-10-21 | 威鋒電子股份有限公司 | Hardware trojan immunity device and operation method thereof |
US11574048B2 (en) | 2020-02-27 | 2023-02-07 | Via Labs, Inc. | Hardware trojan immunity device and operation method thereof |
Also Published As
Publication number | Publication date |
---|---|
TWI437489B (en) | 2014-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI502497B (en) | Apparatus and method for generating a decryption key | |
US9892283B2 (en) | Decryption of encrypted instructions using keys selected on basis of instruction fetch address | |
US20160104011A1 (en) | Microprocessor with on-the-fly switching of decryption keys | |
TWI627556B (en) | Microprocessor and method for securely executing instructions therein | |
CN107102843B (en) | Microprocessor and method for safely executing instruction therein | |
TW201203108A (en) | Microprocessors and operating methods thereof and encryption/decryption methods |