TWI253268B - Microprocessor apparatus and method for optimizing block cipher cryptographic functions - Google Patents

Microprocessor apparatus and method for optimizing block cipher cryptographic functions

Info

Publication number
TWI253268B
Authority
TW
Taiwan
Prior art keywords
cryptographic
block
instruction
text block
register
Prior art date
Application number
TW093129342A
Other languages
Chinese (zh)
Other versions
TW200513084A (en)
Inventor
Thomas A Crispin
Glenn G Henry
Terry Parks
Timothy A Elliott
Original Assignee
Via Tech Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Via Tech Inc
Publication of TW200513084A
Application granted
Publication of TWI253268B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Storage Device Security (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The present invention provides an apparatus and method for performing cryptographic operations on a plurality of input data blocks within a processor. In one embodiment, an apparatus for performing cryptographic operations is provided. The apparatus includes a cryptographic instruction and translation logic. The cryptographic instruction is received by a computing device as part of an instruction flow. The cryptographic instruction prescribes one of the cryptographic operations. The translation logic translates the cryptographic instruction into micro instructions. The micro instructions are ordered to direct the computing device to load a second input text block and to execute the one of the cryptographic operations on the second input text block prior to directing the computing device to store an output text block corresponding to a first input text block. Consequently, the output text block is stored during execution of the one of the cryptographic operations on the second input text block.
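
The micro instruction ordering described in the abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the micro-op names `LOAD_CRYPT` and `STORE` and both helper functions are hypothetical and are not taken from the patent. The sketch builds a sequence in which the load-and-execute step for block i+1 is issued before the store of the output for block i, so that each store can proceed while the next block is being processed.

```python
# Hypothetical illustration: the micro-op names and these helpers are
# assumptions made for this sketch, not the patent's micro instruction set.

def order_micro_ops(num_blocks):
    """Return a micro-op sequence in which the load-and-crypt of block i+1
    is issued before the store of the output for block i."""
    ops = []
    if num_blocks == 0:
        return ops
    ops.append(("LOAD_CRYPT", 0))            # start work on the first block
    for i in range(num_blocks - 1):
        ops.append(("LOAD_CRYPT", i + 1))    # next block enters the crypto unit first
        ops.append(("STORE", i))             # then the previous output is stored
    ops.append(("STORE", num_blocks - 1))    # drain the last output
    return ops


def store_overlaps_crypt(ops):
    """True if every STORE except the last comes after the LOAD_CRYPT
    of the following block."""
    position = {op: idx for idx, op in enumerate(ops)}
    stores = [block for kind, block in ops if kind == "STORE"]
    return all(
        position[("LOAD_CRYPT", b + 1)] < position[("STORE", b)]
        for b in stores[:-1]
    )


if __name__ == "__main__":
    sequence = order_micro_ops(3)
    for kind, block in sequence:
        print(f"{kind:10s} block {block}")
    print("stores overlap the next block's crypt:", store_overlaps_crypt(sequence))
```

A straightforward per-block ordering (load, execute, store, then the next block) fails the `store_overlaps_crypt` check, which is the difference the abstract points to.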

Description

[Technical Field of the Invention]

The present invention relates to the field of microelectronics, and more particularly to an apparatus and method for performing cryptographic operations in a computing device, in which an optimized ordering of instructions is used to increase the throughput of the computing device.

[Prior Art]

Early computer systems operated independently of other computer systems, in the sense that all of the input data required by an application program executing on such a system either resided on that system or was supplied by the application programmer at run time. The output data produced by executing the application program generally took the form of a paper printout or a file written to a magnetic tape, a disk, or another type of storage device belonging to the computer system. The output file could then serve as the input file of an application program subsequently executed on the same computer system or, if the output data had previously been stored as a file on a removable or transportable storage device, it could be provided to a different but compatible computer system for use by application programs there. Even on these early systems, the need to protect sensitive information was recognized, and, among other information security measures, cryptographic application programs were developed and used to protect sensitive information from unauthorized disclosure. These cryptographic programs generally scrambled and unscrambled the output data stored as files on storage devices.

Not many years later, users began to discover the benefits of connecting computers over networks to provide shared access to information. Network architectures, operating systems, and data transmission protocols consequently developed to the point where the ability to access shared data was not merely supported but prominently featured.

For example, it is now commonplace for the user of a computer workstation to access files on a different workstation or on a network file server, to use the Internet to obtain news and other information, to send electronic messages (e.g., email) to and receive them from hundreds of other computers, to connect to a vendor's computer system and supply credit card or banking information in order to purchase products, or to use a wireless network in a restaurant, an airport, or another public place to carry out any of these activities. The need to protect sensitive data and transmissions from unauthorized disclosure has therefore grown rapidly, and the number of occasions on which a user is obliged to protect his or her sensitive data has increased substantially. Current news headlines regularly focus on computer information security problems; spam, hacking, identity theft, reverse engineering, spoofing, and credit card fraud are among the foremost public concerns. Because the motives behind these invasions of privacy range from innocent mistakes to premeditated network attacks, the responsible agencies have responded with new laws, stringent enforcement, and public education programs. None of these responses, however, has effectively stemmed the tide of computer information compromise. What was once the exclusive concern of governments, financial institutions, the military, and spies has now become a significant issue for ordinary people who read their email or check their account transactions from a home computer. On the business side, those skilled in the art will appreciate that corporations, from small to large, now devote a considerable portion of their resources to protecting proprietary information.

The field of information security that provides the techniques and means for encoding data so that it can be decoded only by designated individuals is known as cryptography. When applied specifically to protecting information stored on or transmitted between computers, cryptography is most often used to transform sensitive information (known as "plaintext" or "cleartext") into an unintelligible form (known as "ciphertext").

The process of transforming plaintext into ciphertext is called encryption (also enciphering or ciphering), and the reverse process of transforming ciphertext back into plaintext is called decryption (also deciphering or inverse ciphering).

Within the field of cryptography, several procedures and protocols have been developed that allow users to perform cryptographic operations without great knowledge or effort, and that allow those users to transmit or otherwise provide their information products in encoded form to other users. Along with the encoded information, a sender typically provides the recipient with a "cryptographic key" that enables the recipient to decode it and thereby recover, or otherwise gain access to, the original unencoded information. Those skilled in the art will appreciate that these procedures and protocols generally take the form of password protection, mathematical algorithms, and application programs specifically designed to encrypt and decrypt sensitive information.

Several classes of algorithms are currently used to encrypt and decrypt data. Algorithms of one such class (public key cryptographic algorithms, of which the RSA algorithm is an example) employ two cryptographic keys, a public key and a private key, to encrypt or decrypt data. Under some public key algorithms, the sender uses the recipient's public key to encrypt the data sent to the recipient; because a mathematical relationship exists between a user's public key and private key, the recipient must use his or her private key to decrypt the transmission and recover the data. Although this class of cryptographic algorithms is in widespread use today, its encryption and decryption operations are extremely slow, even on small amounts of data. A second class of algorithms, known as symmetric key algorithms, provides a comparable level of data security and can be executed much faster. These algorithms are called symmetric key algorithms because they use a single cryptographic key both to encrypt and to decrypt information.

In the public sector there are currently three prevailing single-key cryptographic algorithms: the Data Encryption Standard (DES), Triple DES, and the Advanced Encryption Standard (AES). Because of the strength with which these algorithms protect sensitive data, they are currently used by U.S. government agencies, and those skilled in the art anticipate that one or more of them will become the standard for commercial and private transactions in the near future. Under all of these symmetric key algorithms, plaintext and ciphertext are divided into blocks of a specified size for encryption and decryption. For example, AES performs cryptographic operations on blocks 128 bits in size and uses cryptographic key lengths of 128, 192, and 256 bits. Other symmetric key algorithms, such as the Rijndael cipher, also allow 192-bit and 256-bit data blocks. Accordingly, for a block encryption operation, a 1024-bit plaintext message is encrypted as eight 128-bit blocks.

All symmetric key algorithms use the same types of sub-operations to encrypt a block of plaintext. Under many of the more commonly used symmetric key algorithms, an initial cryptographic key is expanded into a plurality of keys (a "key schedule"), each of which is used in a corresponding cryptographic round of sub-operations performed on the block of plaintext. For example, the first key of the key schedule is used to perform a first cryptographic round of sub-operations on the block of plaintext. The result of the first round serves as the input to a second round, which uses the second key of the key schedule to produce a second result, and a specified number of subsequent rounds are performed to yield a final round result, which is the ciphertext itself. Under the AES algorithm, the sub-operations within each round are referred to in the literature as SubBytes (or S-box), ShiftRows, MixColumns, and AddRoundKey.

Decryption of a block of ciphertext is accomplished similarly, with the exceptions that the ciphertext is the input to the inverse cipher and that inverse sub-operations (e.g., Inverse MixColumns, Inverse ShiftRows) are performed in each round; the final result of the rounds is a block of plaintext.

DES and Triple DES use different specific sub-operations, but these are analogous to the sub-operations of AES because they are employed in a similar manner to transform a block of plaintext into a block of ciphertext.
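
The block partitioning and round structure described above can be sketched in Python. This is a toy under loud assumptions: the round function (a byte rotation followed by an XOR with the round key) and the hash-based key schedule are stand-ins invented for this sketch. They are not the AES sub-operations or the AES key expansion, and they offer no security. The sketch shows only the shape of the process: one key is expanded into a schedule, each round consumes one key from the schedule, and decryption applies the inverse rounds in reverse order.

```python
import hashlib

BLOCK_BYTES = 16  # 128-bit blocks, the AES block size


def split_blocks(message):
    """Split a message whose length is a multiple of 16 bytes into 128-bit blocks."""
    if len(message) % BLOCK_BYTES:
        raise ValueError("message length must be a multiple of 16 bytes")
    return [message[i:i + BLOCK_BYTES] for i in range(0, len(message), BLOCK_BYTES)]


def toy_key_schedule(key, rounds):
    """Expand one key into one 16-byte round key per round (NOT the AES key expansion)."""
    return [hashlib.sha256(key + bytes([r])).digest()[:BLOCK_BYTES] for r in range(rounds)]


def toy_round(block, round_key):
    """One invertible toy round: rotate the bytes left by one, then XOR the round key."""
    rotated = block[1:] + block[:1]
    return bytes(b ^ k for b, k in zip(rotated, round_key))


def toy_inverse_round(block, round_key):
    """Undo toy_round: XOR the round key, then rotate the bytes right by one."""
    unmixed = bytes(b ^ k for b, k in zip(block, round_key))
    return unmixed[-1:] + unmixed[:-1]


def toy_encrypt_block(block, schedule):
    """Feed the result of each round into the next, one round key per round."""
    for round_key in schedule:
        block = toy_round(block, round_key)
    return block


def toy_decrypt_block(block, schedule):
    """Apply the inverse rounds with the round keys in reverse order."""
    for round_key in reversed(schedule):
        block = toy_inverse_round(block, round_key)
    return block


if __name__ == "__main__":
    message = bytes(range(128))                       # 1024 bits of sample data
    blocks = split_blocks(message)
    schedule = toy_key_schedule(b"example key", 10)   # 10 rounds chosen arbitrarily
    encrypted = [toy_encrypt_block(b, schedule) for b in blocks]
    decrypted = [toy_decrypt_block(c, schedule) for c in encrypted]
    print("blocks:", len(blocks))
    print("round trip ok:", b"".join(decrypted) == message)
```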

To perform cryptographic operations on multiple successive blocks of text, all symmetric key algorithms use the same types of modes. These modes include electronic code book (ECB) mode, cipher block chaining (CBC) mode, cipher feedback (CFB) mode, and output feedback (OFB) mode. Some of these modes use an additional initialization vector during performance of the sub-operations, and some use the ciphertext output of a first set of cryptographic rounds, performed on a first block of plaintext, as an additional input to a second set of cryptographic rounds performed on a second block of plaintext. An in-depth discussion of each of the cryptographic algorithms and sub-operations used by present-day symmetric key cryptographic algorithms is beyond the scope of this application. For the specific implementation standards, the reader is referred to Federal Information Processing Standards Publication 46-3 (FIPS-46-3), dated October 25, 1999, for a detailed discussion of DES and Triple DES, and to Federal Information Processing Standards Publication 197 (FIPS-197), dated November 26, 2001, for a detailed discussion of AES.

Both of the aforementioned standards are issued and maintained by the National Institute of Standards and Technology (NIST) and are incorporated herein by reference for all intents and purposes. In addition to these standards, tutorials, white papers, toolkits, and resource articles are available over the Internet from NIST's Computer Security Resource Center (CSRC) at http://csrc.nist.gov/.

Those skilled in the art will appreciate that numerous application programs capable of performing cryptographic operations (e.g., encryption and decryption) are available for execution on a computer system. In fact, some operating systems (e.g., Microsoft Windows XP, Linux) provide direct encryption/decryption services in the form of cryptographic primitives, cryptographic application program interfaces, and the like. The present inventors have observed, however, that present-day computer cryptography techniques are deficient in several respects; these deficiencies are highlighted and discussed below with reference to FIG. 1.

FIG. 1 is a block diagram 100 illustrating present-day computer cryptography applications. The block diagram 100 depicts a first computer workstation 101 connected to a local area network 105. Also connected to the local area network 105 are a second computer workstation 102, a network file storage device 106, a first router 107 or other form of interface to a wide area network 110 (e.g., the Internet), and a wireless network router 108, such as one compliant with IEEE 802.11. A laptop computer 104 interfaces with the wireless router 108 over a wireless network 109. Elsewhere on the wide area network 110, a second router 111 provides an interface for a third computer workstation 103.

As outlined above, present-day users confront computer information security issues many times during a work session.

For example, under the control of a present-day multitasking operating system, the user of workstation 101 may be performing several tasks simultaneously, each of which requires cryptographic operations. The user of workstation 101 must run an encryption/decryption application 112 (either part of the operating system or invoked by the operating system) to store a local file on the network file storage device 106. While the file is being stored, the user may send an encrypted message to a second user at workstation 102, which also requires executing an instance of the encryption/decryption application 112; the encrypted message may be real-time (e.g., an instant message) or non-real-time (e.g., email). In addition, the user may be accessing or providing financial data (e.g., credit card numbers, financial transactions) or other forms of sensitive data over the wide area network 110 from workstation 103. Workstation 103 may also represent a home office or other remote computer 103 that the user of workstation 101 employs, when away from the office, to access any of the shared resources 101, 102, 106, 107, 108, and 109 on the local area network 105. Each of the aforementioned activities requires that a corresponding instance of the encryption/decryption application 112 be invoked. Moreover, wireless networks 109 are now routinely provided in coffee shops, airports, schools, and other public places, so that the user of laptop computer 104 needs to encrypt and decrypt not only the messages exchanged with other users but also all communications over the wireless network 109 to the wireless router 108.

Those skilled in the art will therefore appreciate that each activity at a workstation 101-104 that requires cryptographic operations carries a corresponding requirement to invoke an instance of the encryption/decryption application 112. A computer 101-104 may therefore, in the near future, be performing hundreds of cryptographic operations concurrently.

The present inventors have noted several limitations of the above approach, in which the computer systems 101-104 perform cryptographic operations by invoking one or more instances of the encryption/decryption application 112. For example, performing a prescribed function by means of programmed software is much slower than performing the same function in hardware. Each time the encryption/decryption application 112 is required, the task currently executing on the computer 101-104 must be suspended, and the parameters of the cryptographic operation (e.g., plaintext, ciphertext, mode, and key) must be passed through the operating system to the instance of the encryption/decryption application 112 that is invoked to accomplish the cryptographic operation. Furthermore, because cryptographic algorithms necessarily involve many rounds of sub-operations on a given block of data, running the encryption/decryption application 112 entails executing a great many computer instructions, to the detriment of overall system processing speed. Those skilled in the art will appreciate that sending a small encrypted email message in Microsoft Outlook can take five times as long as sending an unencrypted one.

In addition, current techniques are limited by the delays associated with operating system intervention. Most application programs do not provide integral key generation or encryption/decryption components; instead, they rely on components of the operating system or on plug-in applications to accomplish these tasks. Operating systems, moreover, are diverted from this work by interrupts and by the requests of other currently executing application programs.

The present inventors have also noted that the performance of cryptographic operations on present-day computer systems 101-104 is analogous to the performance of floating point mathematical operations before microprocessors had dedicated floating point units. Early floating point operations were performed in software and therefore executed very slowly; like floating point operations, cryptographic operations performed in software are extremely slow. As floating point technology developed further, floating point instructions were provided for execution on floating point co-processors. These floating point co-processors executed floating point operations much faster than software implementations, but they added to the cost of a system.

Likewise, cryptographic co-processors exist today in the form of add-on boards or external devices that interface with the host processor through a parallel port or another interface bus (e.g., USB). These co-processors do accomplish cryptographic operations much faster than pure software implementations. Cryptographic co-processors, however, add to the cost of a system configuration, require extra power, and reduce the overall reliability of the system. Cryptographic co-processor implementations are also vulnerable to deliberate snooping, because the data channel is not on the same die as the host microprocessor.

The present inventors have therefore recognized a need for cryptographic hardware within a present-day microprocessor, such that an application program requiring a cryptographic operation can direct the microprocessor to perform it by means of a single, atomic cryptographic instruction. The present inventors have also recognized that this capability should be provided in a way that limits the need for operating system intervention and management. It is also desirable that the cryptographic instruction be usable at the privilege level of application programs, that the cryptographic hardware comport with the prevailing architecture of present-day microprocessors, and that the cryptographic hardware and its associated cryptographic instruction support compatibility with legacy operating systems and applications. It is further desirable to provide an apparatus and method for performing cryptographic operations that resists unauthorized observation; that supports, and is programmable with respect to, multiple cryptographic algorithms; that supports verification and testing of the particular cryptographic algorithm embodied; that allows for user-provided keys as well as self-generated keys; that supports multiple data block sizes and key sizes; that provides efficient pipelining of multiple blocks of data; and that provides programmable block encryption/decryption modes such as ECB, CBC, CFB, and OFB.
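
To make two of the block modes named above more concrete, the following Python sketch contrasts ECB, in which each block is processed independently, with CBC, in which an initialization vector and then each ciphertext block are fed into the processing of the next block. The block function here is a toy assumed for illustration only (a byte-wise addition of the key followed by a rotation); it is not DES or AES and provides no security. The function names are likewise hypothetical.

```python
BLOCK_BYTES = 16  # 128-bit blocks; this toy also expects a 16-byte key


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_encrypt(block, key):
    """Stand-in for a real block cipher: add key bytes mod 256, rotate left by one."""
    added = bytes((b + k) % 256 for b, k in zip(block, key))
    return added[1:] + added[:1]


def toy_block_decrypt(block, key):
    """Inverse of toy_block_encrypt: rotate right by one, subtract key bytes mod 256."""
    unrotated = block[-1:] + block[:-1]
    return bytes((b - k) % 256 for b, k in zip(unrotated, key))


def ecb_encrypt(blocks, key):
    """ECB: every block is encrypted on its own."""
    return [toy_block_encrypt(b, key) for b in blocks]


def cbc_encrypt(blocks, key, iv):
    """CBC: XOR each plaintext block with the previous ciphertext block
    (the initialization vector for the first block) before encrypting."""
    out, previous = [], iv
    for b in blocks:
        previous = toy_block_encrypt(xor_bytes(b, previous), key)
        out.append(previous)
    return out


def cbc_decrypt(blocks, key, iv):
    """CBC decryption: decrypt each block, then XOR with the previous ciphertext block."""
    out, previous = [], iv
    for c in blocks:
        out.append(xor_bytes(toy_block_decrypt(c, key), previous))
        previous = c
    return out


if __name__ == "__main__":
    key = bytes(range(16))
    iv = bytes(16)
    blocks = [b"A" * 16, b"A" * 16]   # two identical plaintext blocks
    ecb = ecb_encrypt(blocks, key)
    cbc = cbc_encrypt(blocks, key, iv)
    print("ECB repeats identical blocks:", ecb[0] == ecb[1])
    print("CBC repeats identical blocks:", cbc[0] == cbc[1])
    print("CBC round trip ok:", cbc_decrypt(cbc, key, iv) == blocks)
```

The chaining in CBC is why a block's encryption depends on the ciphertext of the block before it, whereas ECB blocks carry no such dependency.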

[Summary of the Invention]

The present invention is directed to solving the above-described problems and disadvantages of the prior art. It provides a superior technique for performing cryptographic operations within a microprocessor. In one embodiment, an apparatus for performing cryptographic operations is provided. The apparatus includes a cryptographic instruction circuit and a translation logic circuit. The cryptographic instruction circuit produces a cryptographic instruction, which a computing device receives as part of an instruction flow executing on the computing device and which prescribes one of the cryptographic operations. The translation logic circuit is operatively coupled to the cryptographic instruction circuit and translates the cryptographic instruction into micro instructions. The micro instructions direct the computing device to load a second input text block and to execute the cryptographic operation on it before they direct the computing device to store the output text block corresponding to a first input text block. As a result, the output text block can be stored while the cryptographic operation is being executed on the second input text block.

One aspect of the present invention provides an apparatus for performing cryptographic operations that includes a translation logic circuit configured to translate a cryptographic instruction into a sequence of micro instructions. The sequence includes a first micro instruction and a second micro instruction. The first micro instruction directs that a second input text block be loaded and that a cryptographic operation be executed on it. The second micro instruction directs that a first output text block be stored, the first output text block corresponding to a first input text block on which the cryptographic operation has been executed. The translation logic circuit issues the second micro instruction after issuing the first micro instruction.

Another aspect of the present invention provides a method for performing cryptographic operations in a device. The method includes translating a cryptographic instruction into a first micro instruction and a second micro instruction, where the cryptographic instruction prescribes execution of one of the cryptographic operations, the first micro instruction directs the device to load a second input text block and to execute the cryptographic operation on it, and the second micro instruction directs the device to store a first output text block, which corresponds to a first input text block on which the cryptographic operation has been executed; and issuing the first micro instruction to a cryptography unit and then issuing the second micro instruction to the cryptography unit. The output text block can thereby be stored while the cryptographic operation is being executed on the second input text block.

[Embodiments]

Some embodiments of the present invention are described in detail below. Beyond these descriptions, however, the invention can be practiced broadly in other embodiments; its scope is not limited by them and is defined by the claims that follow. To provide a clearer description and make the invention easier to understand, the parts of the drawings are not drawn to relative scale: some dimensions are exaggerated relative to others, and irrelevant details are not fully drawn, for the sake of simplicity.

In view of the above discussion of cryptographic operations and of the related techniques that present-day computer systems use to encrypt and decrypt data, the discussion of these techniques and their limitations continues with reference to FIG. 2. The present invention is then discussed with reference to FIGS. 3 through 16. The present invention provides an apparatus and method for performing cryptographic operations in a present-day computer system that exhibits superior performance characteristics over prevailing mechanisms and that also satisfies the goals noted above, such as limiting operating system intervention, compatibility with legacy architectures, programmability of algorithms and modes, efficient pipelining of multiple blocks of data, resistance to hacking, and testability.


furthermore satisfy the goals noted above, such as limiting operating system intervention, compatibility with legacy architectures, algorithm and mode programmability, efficient pipelining of multiple data blocks, resistance to hacking, and testability.

Referring to FIG. 2, block diagram 200 depicts the technique used to perform cryptographic operations in present-day computer systems. Block diagram 200 includes a microprocessor 201 that fetches instructions and accesses application-related data from a region of system memory called application memory 203. Program control and access to the data in application memory 203 are generally managed by operating system software 202, which resides in a protected region of system memory. As noted above, when an executing application (for example, an e-mail program or a file storage program) requires a cryptographic operation, it must direct the microprocessor 201 to execute a considerable number of instructions to accomplish that operation. These instructions may be subroutines of the executing application itself, plug-in applications linked to the executing application, or services provided by the operating system 202. Regardless of their association, one skilled in the art will appreciate that these instructions reside in some designated or allocated region of memory. For purposes of discussion, these memory regions are shown within application memory 203 and include a cryptographic key generation application 204, which generates or accepts a cryptographic key and expands the key into a key schedule 205 for use in cryptographic round operations. For a multi-block encryption operation, a block encryption application 206 is invoked. The encryption application 206 executes instructions that access blocks of plaintext 210, the key schedule

205, and cryptographic parameters 209, which further specify the particulars of the cryptographic operation, such as the mode and the location of the key schedule. When a particular mode requires it, the encryption application 206 also accesses an initialization vector 208. The encryption application 206 executes its instructions to generate corresponding blocks of ciphertext 211. Similarly, a decryption application 207 is invoked to perform block decryption operations. The decryption application 207 executes instructions that access blocks of ciphertext 211, the key schedule 205, and the cryptographic parameters 209, which further specify the particulars of the cryptographic operation, and, when a particular mode requires it, the initialization vector 208. The decryption application 207 executes its instructions to generate corresponding blocks of plaintext 210.

It is noteworthy that a considerable number of instructions must be executed to generate a cryptographic key and to encrypt or decrypt blocks of text. The FIPS specifications mentioned above contain many pseudocode examples that call for a considerable number of instructions. One skilled in the art will therefore appreciate that even a simple encryption operation requires hundreds of instructions, each of which must be executed by the microprocessor 201 to accomplish the requested cryptographic operation. Moreover, executing the instructions that accomplish a cryptographic operation is generally superfluous to the primary purpose of the executing application (for example, file management, instant messaging, e-mail, remote file access, or credit card transactions), so the user perceives the application as performing inefficiently. For stand-alone or plug-in encryption and decryption applications 206 and 207, invoking and managing these applications is also subject to the other demands of the operating system 202, such as supporting interrupts, exceptions, and similar events that further exacerbate the problem. In addition, for every concurrent cryptographic operation required of the computer system, the key generation


application 204, the decryption application 207, and the initialization vector 208 must each be allocated as a separate instance in application memory 203, and the number of concurrent cryptographic operations that the microprocessor 201 is required to perform is expected to grow over time.

The present inventors have noted the problems and limitations of current computer system cryptographic techniques and have recognized the need for an apparatus and method for performing cryptographic operations within a microprocessor. Accordingly, the invention provides a microprocessor and an associated method that perform cryptographic operations through a cryptography unit within the microprocessor, where the cryptography unit performs a cryptographic operation under program control of a single cryptographic instruction. The invention is now discussed with reference to FIGS. 3 through 11.

Referring to FIG. 3, block diagram 300 shows a microprocessor that performs cryptographic operations according to the invention. Block diagram 300 depicts a microprocessor 301 that is coupled to system memory 321 through a memory bus 319. The microprocessor 301 includes translation logic 303 that receives instructions from an instruction register. The translation logic 303 comprises logic, circuits, devices, or microcode (for example, microinstructions or native instructions), or a combination of logic, circuits, devices, or microcode, or equivalent elements used to translate instructions into associated sequences of microinstructions. The elements that perform translation within the translation logic 303 may be shared with other circuits or microcode that perform other functions within the microprocessor 301. Within the scope of this application, microcode is a term that refers to one or more microinstructions. A microinstruction (also referred to as a native instruction) is an instruction at the level that a unit executes. For example, microinstructions are executed directly by a reduced instruction set computer (RISC) microprocessor. For a complex instruction set computer (CISC) microprocessor, such as an x86-compatible

microprocessor, x86 instructions are translated into associated microinstructions, which are executed directly by a unit or units within the CISC microprocessor. The translation logic 303 is coupled to a microinstruction queue 304, which has a plurality of microinstruction entries. Microinstructions are provided from the microinstruction queue 304 to register stage logic that includes a register file 307. The register file 307 includes a plurality of registers 308-313 whose contents are established before the cryptographic operation is performed. The registers 308-313 point to corresponding locations 323-327 in system memory 321 that hold the data required to perform the prescribed cryptographic operation. The register stage is coupled to load logic 314, which interfaces with a data cache 315 to retrieve the data used to perform the prescribed cryptographic operation. The data cache 315 is coupled to system memory 321 through the memory bus 319. Execution logic 328 performs the operations prescribed by microinstructions. The execution logic 328 comprises logic, circuits, devices, or microcode (for example, microinstructions or native instructions), or a combination of logic, circuits, devices, or microcode, or equivalent elements used to perform the operations prescribed by instructions. The elements that perform operations within the execution logic 328 may be shared with other circuits or microcode that perform other functions within the microprocessor 301. The execution logic 328 includes a cryptography unit 316, which receives from the load logic 314 the data required to perform the prescribed cryptographic operation. The cryptography unit 316 performs the prescribed cryptographic operation on a plurality of input text blocks 326 to generate a corresponding plurality of output text blocks 327. The cryptography unit 316 comprises logic, circuits, devices, or microcode (for example, microinstructions or native instructions), or a combination of logic, circuits, devices, or microcode, or equivalent elements used to perform cryptographic operations. The elements that perform cryptographic operations within the cryptography unit 316

may be shared with other circuits or microcode that perform other functions within the microprocessor 301. In one embodiment, the cryptography unit 316 operates in parallel with other execution units (not shown) within the execution logic 328, such as an integer unit, a floating-point unit, and so on. Within the scope of this application, an embodiment of a "unit" comprises logic, circuits, devices, or microcode (for example, microinstructions or native instructions), or a combination of logic, circuits, devices, or microcode, or equivalent elements used to perform specified functions or specified operations. The elements that perform specified functions or operations within a particular unit may be shared with other circuits or microcode that perform other functions within the microprocessor 301. For example, in one embodiment, an integer unit comprises such logic, circuits, devices, or microcode, or equivalent elements, used to execute integer instructions, and a floating-point unit comprises such logic, circuits, devices, or microcode, or equivalent elements, used to execute floating-point instructions. The elements that execute integer instructions within the integer unit may be shared with other circuits or microcode that execute floating-point instructions within the floating-point unit. In one embodiment compatible with the x86 architecture, the cryptography unit 316 operates in parallel with an x86 integer unit, an x86 floating-point unit, an x86 MMX unit, and an x86 streaming SIMD extensions (SSE) unit. Within the scope of this application, an embodiment is compatible with the x86 architecture when it can correctly execute the majority of application programs designed for execution on an x86 microprocessor. An application program is correctly executed when its expected results are obtained. Alternative x86-compatible embodiments contemplate the cryptography unit operating in parallel with a subset of the x86 execution units mentioned above. The cryptography unit 316 is coupled to store logic 317 and provides the corresponding plurality of output text blocks 327. The store logic 317 is also coupled to the data cache 315, which routes the output text data 327 to system memory 321 for storage. The data cache 315 is coupled to write-back logic 318, which updates the registers 308-313 in the register file 307 as the prescribed cryptographic operation is accomplished. In one embodiment, microinstructions proceed through each of the aforementioned logic stages 302, 303, 304, 307, 314, 316-318 in synchronization with a clock signal (not shown), so that operations can be executed concurrently, much like operations performed on an assembly line.

Within system memory 321, an application program that requires the prescribed cryptographic operation can direct the microprocessor 301 to perform the operation through a single cryptographic instruction 322, referred to here for illustration as an XCRYPT instruction 322. In a CISC embodiment, the XCRYPT instruction 322 comprises a microinstruction that prescribes a cryptographic operation. In one embodiment, the XCRYPT instruction 322 uses a spare or otherwise unused instruction opcode within an existing instruction set architecture. In one x86-compatible embodiment, the XCRYPT instruction 322 is a 4-byte instruction comprising an x86 REP prefix (for example, 0xF3), two bytes of unused x86 opcode (for example, 0x0FA7), and one byte that specifies the block cipher mode to be used in performing the prescribed cryptographic operation. In one embodiment, the XCRYPT instruction 322 according to the invention can be executed at the level of system privileges afforded to application programs, and can therefore be programmed into a stream of instructions provided to the microprocessor 301 either directly by an application program or under the control of an operating system 320.
Because only one XCRYPT instruction 322 is needed to direct the microprocessor 301 to perform the prescribed cryptographic operation, accomplishing the operation should be entirely transparent to the operating system 320.

In operation, the operating system 320 invokes an application program to execute on the microprocessor 301. As part of the instruction flow during execution of the application, an XCRYPT instruction 322 is provided from system memory 321 to the fetch logic 302. Before the XCRYPT instruction 322 executes, however, instructions within the program flow direct the microprocessor 301 to initialize the contents of registers 308-312 so that they point to locations 323-327 in memory 321 that contain a cryptographic control word 323, an initial cryptographic key 324 or a key schedule 324, an initialization vector 325 (if required), input text 326 for the operation, and output text 327. The registers 308-312 must be initialized before the XCRYPT instruction 322 executes because the XCRYPT instruction 322 implicitly references the registers 308-312 along with an additional register 313 that contains a block count, that is, the number of blocks of data within the input text area 326 to be encrypted or decrypted. The translation logic 303 thus retrieves the XCRYPT instruction from the fetch logic 302 and translates it into a corresponding sequence of microinstructions that directs the microprocessor 301 to perform the prescribed cryptographic operation. A first plurality of microinstructions 305-306 within the corresponding sequence directs the cryptography unit 316 to load data provided from the load logic 314, to begin executing a specified number of cryptographic rounds to generate a corresponding block of output data, and to provide the output data to the store logic 317 for storage, through the data cache 315, in the output text area 327 of system memory 321. A second plurality of microinstructions (not shown) within the corresponding sequence directs other execution units (not shown) within the microprocessor 301 to perform other operations needed to accomplish the prescribed cryptographic operation, such as managing non-architectural registers (not shown) that hold temporary results and counters, updating the input and output text pointer registers 311-312, updating the initialization vector pointer register 310 (if required) after an input text block 326 is encrypted or decrypted, and processing pending interrupts. In one embodiment, the registers 308-313 are architectural registers. An architectural register 308-313 is a register defined within the instruction set architecture of the particular microprocessor being implemented.

In one embodiment, the cryptography unit 316 is divided into a plurality of stages, which allows pipelined processing of successive input text blocks 326. Alternative embodiments of the cryptography unit 316 are contemplated. A third embodiment provides a two-stage cryptography unit that can pipeline two successive input text blocks 326. In one embodiment, the cryptography unit 316 is configured to buffer microinstructions and input text blocks 326, and to execute the prescribed cryptographic operation while the output text block 327 corresponding to a previous input text block 326 is being stored. To maximize the throughput of text blocks through the cryptography unit, the microinstructions 305-306 are ordered to direct loading of a subsequent input text block 326, and execution of the prescribed cryptographic operation on it, before the output text block 327 corresponding to the previous input text block 326 is stored. This ordering allows efficient pipelining of the text blocks 326-327 and is discussed in more detail later.

The block diagram 300 of FIG. 3 teaches the elements required by the invention, so much of the logic within a present-day microprocessor 301 has been omitted for clarity of illustration. One skilled in the art will appreciate that a particular present-day microprocessor implementation includes many stages and logic elements, some of which have been combined here for clarity. For example, the load logic 314 could comprise an address generation stage followed by a cache interface stage, followed by a cache line alignment stage. What is important to note, however, is that a complete cryptographic operation on a plurality of input text blocks 326 is directed according to the invention by a single instruction 322 whose operation is transparent to considerations of the operating system 320, and that execution of the single instruction 322 is accomplished by

a cryptography unit 316 that operates in parallel with, and in coordination with, the other execution units within the microprocessor 301. Providing the cryptography unit 316 in embodiment configurations is analogous to providing floating-point unit hardware in microprocessors of earlier years. Operation of the cryptography unit 316 and the associated XCRYPT instruction 322 is fully compatible with the concurrent operation of legacy operating systems and application programs, as discussed in more detail below.

Referring to FIG. 4, a block diagram shows one embodiment of an atomic cryptographic instruction 400 according to the invention. The cryptographic instruction 400 includes an optional prefix field 401, a repeat prefix field 402, an opcode field 403, and a block cipher mode field 404. In one embodiment, the contents of fields 401-404 conform to the x86 instruction set architecture. Alternative embodiments contemplate compatibility with other instruction set architectures.

Operationally, the optional prefix field 401 is used in many instruction set architectures to enable or disable certain processing features of the host microprocessor, such as directing 16-bit or 32-bit operations or directing processing of or access to particular memory segments. The repeat prefix field 402 indicates that the cryptographic operation prescribed by the cryptographic instruction 400 is to be accomplished on a plurality of blocks of input data (that is, plaintext or ciphertext). The repeat prefix field 402 also implicitly directs a conforming microprocessor to use the contents of a plurality of its architectural registers as pointers to locations in system memory that contain the cryptographic data and parameters needed to accomplish the prescribed cryptographic operation. As noted above, in an x86-compatible embodiment, the value of the repeat prefix field 402 is 0xF3 and, according to x86 architectural conventions, the cryptographic instruction is very similar in form to an x86 repeat string instruction such as REP.MOVS. For example, when executed by an x86-compatible microprocessor embodiment of the invention, the repeat prefix field 402 references


a block count variable stored in architectural register ECX, a source address pointer (to the input data for the cryptographic operation) stored in register ESI, and a destination address pointer (to the output data in memory) stored in register EDI. In an x86-compatible embodiment, the invention extends the conventional repeat-string instruction concept to further reference a control word pointer stored in register EDX, a cryptographic key pointer stored in register EBX, and a pointer to an initialization vector (if required by the prescribed cipher mode) stored in register EAX.
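The implicit register operands described above can be modeled as a small record. The following is a minimal sketch, not the patented implementation: the class and function names are hypothetical, and the 16-byte block size is an assumption made only so that a block count can be derived from a byte length.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption for illustration only: 128-bit blocks. The passage above does
# not state a block size.
BLOCK_BYTES = 16


@dataclass
class XcryptRegisters:
    """Hypothetical model of the implicit operands in the x86-compatible embodiment."""
    ecx: int            # block count
    esi: int            # source pointer (input text)
    edi: int            # destination pointer (output text)
    edx: int            # pointer to the control word
    ebx: int            # pointer to the cryptographic key
    eax: Optional[int]  # pointer to the initialization vector, if the mode needs one


def setup_registers(src: int, dst: int, nbytes: int,
                    ctrl_ptr: int, key_ptr: int,
                    iv_ptr: Optional[int] = None) -> XcryptRegisters:
    """Derive the register contents that must be in place before the instruction runs."""
    if nbytes % BLOCK_BYTES != 0:
        raise ValueError("input length must be a whole number of blocks")
    return XcryptRegisters(
        ecx=nbytes // BLOCK_BYTES,
        esi=src,
        edi=dst,
        edx=ctrl_ptr,
        ebx=key_ptr,
        eax=iv_ptr,
    )
```

The sketch only captures which register carries which operand. It says nothing about how the hardware consumes them.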

The opcode field 403 directs the microprocessor to accomplish a cryptographic operation as further specified by a control word stored in memory, which is implicitly referenced through the control word pointer. The invention contemplates that a preferred choice of opcode value is one of the spare or unused opcode values within an existing instruction set architecture, so as to preserve compatibility within a conforming microprocessor with legacy operating system and application software. For example, as noted above, an x86-compatible embodiment of the opcode field 403 uses the value 0x0FA7 to direct execution of the prescribed cryptographic operation. The block cipher mode field 404 specifies the particular block cipher mode to be used for the prescribed cryptographic operation, as discussed with reference to FIG. 5.

FIG. 5 is a table 500 showing example values of the block cipher mode field of the atomic cryptographic instruction of FIG. 4. Value 0xC8 indicates that the cryptographic operation is to be accomplished using electronic code book (ECB) mode; value 0xD0 indicates cipher block chaining (CBC) mode; value 0xE0 indicates cipher feedback (CFB) mode; and value 0xE8 indicates output feedback (OFB) mode. All other values of the block cipher mode field 404 are reserved. These modes are described in the FIPS documents mentioned above.
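To make the 4-byte form of the instruction concrete, the sketch below assembles the repeat prefix (0xF3), the two opcode bytes (0x0FA7), and one of the example mode values from table 500. It is an illustration under stated assumptions: the function and constant names are hypothetical, the optional prefix field 401 is omitted, and all mode values other than the four listed are treated as reserved.

```python
# Field values taken from the x86-compatible embodiment in this description.
REP_PREFIX = 0xF3             # repeat prefix field 402
XCRYPT_OPCODE = (0x0F, 0xA7)  # opcode field 403, emitted as two bytes

# Example block cipher mode values from table 500 (field 404).
BLOCK_CIPHER_MODES = {
    "ECB": 0xC8,  # electronic code book
    "CBC": 0xD0,  # cipher block chaining
    "CFB": 0xE0,  # cipher feedback
    "OFB": 0xE8,  # output feedback
}


def encode_xcrypt(mode: str) -> bytes:
    """Return the 4-byte encoding for the given mode name (hypothetical helper)."""
    key = mode.upper()
    if key not in BLOCK_CIPHER_MODES:
        raise ValueError(f"reserved or unknown block cipher mode: {mode!r}")
    return bytes((REP_PREFIX, *XCRYPT_OPCODE, BLOCK_CIPHER_MODES[key]))


def decode_mode(encoding: bytes) -> str:
    """Recover the mode name from a 4-byte encoding, rejecting reserved values."""
    if len(encoding) != 4 or encoding[:3] != bytes((REP_PREFIX, *XCRYPT_OPCODE)):
        raise ValueError("not an XCRYPT encoding")
    for name, value in BLOCK_CIPHER_MODES.items():
        if value == encoding[3]:
            return name
    raise ValueError(f"reserved mode value: {encoding[3]:#04x}")


if __name__ == "__main__":
    print(encode_xcrypt("CBC").hex())  # prints "f30fa7d0"
```

A lookup table keeps the reserved-value check in one place, so adding a mode would only require one new dictionary entry.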

Referring to FIG. 6, a block diagram shows in more detail an embodiment of a cryptography unit 617 within an x86-compatible microprocessor 600 according to the invention. The microprocessor 600 includes fetch logic 601 that fetches instructions from memory (not shown) for execution. The fetch logic 601 is coupled to translation logic 602. The translation logic 602 comprises logic, circuits, devices, or microcode (for example, microinstructions or native instructions), or a combination of logic, circuits, devices, or microcode, or equivalent elements used to translate instructions into associated sequences of microinstructions. The elements that perform translation within the translation logic 602 may be shared with other circuits or microcode that perform other functions within the microprocessor 600. The translation logic 602 includes a translator 603 that is coupled to a microcode read-only memory (ROM) 604. Interrupt logic 626 is coupled to the translation logic 602 through a bus 628. A plurality of software and hardware interrupt signals 627 are processed by the interrupt logic 626, which indicates pending interrupts to the translation logic 602. The translation logic 602 is coupled to successive stages of the microprocessor 600, including a register stage 605, an address stage 606, a load stage 607,

an execute stage 608, a store stage 618, and a write-back stage 619. Each successive stage includes logic to accomplish specific functions related to the execution of instructions provided by the fetch logic 601, as previously discussed with reference to the like-named elements in the microprocessor of FIG. 3. The x86-compatible embodiment 600 depicted in FIG. 6 features execution logic 632 within the execute stage 608 that includes parallel execution units 610, 612, 614, 616, 617. An integer unit 610 receives integer microinstructions for execution from a microinstruction queue 609; a floating-point unit 612 receives floating-point microinstructions for execution from a microinstruction queue 611; an MMX unit 614 receives MMX microinstructions for execution from a microinstruction queue 613; and an SSE unit 616 receives SSE microinstructions for execution from a microinstruction queue 615. In the x86 embodiment shown, a cryptography unit 617 is coupled to the SSE unit 616 through a load bus 620, a stall signal 621, and a store bus. The cryptography unit 617 shares the microinstruction queue 615 of the SSE unit. An alternative embodiment contemplates stand-alone parallel operation of the cryptography unit 617 in a manner like that of units 610, 612, and 614. The integer unit 610 is coupled to an x86 EFLAGS register 624, which includes an X bit 625. The state of the X bit 625 indicates whether a cryptographic operation is currently in process. In one embodiment, the X bit 625 is bit 30 of the x86 EFLAGS register 624. In addition, a machine specific register is accessed to evaluate the state of an E bit 629, and the state of the E bit 629 indicates whether the cryptography unit 617 is enabled or disabled. Like the microprocessor of FIG. 3, the microprocessor 600 of FIG. 6 features the elements needed to teach the invention in the context of an x86-compatible embodiment. For clarity of illustration, other elements, such as a data cache, a bus interface unit, and clock generation and distribution logic, are combined or not shown.
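As a small illustration of the X bit described above, the sketch below tests and updates bit 30 of a 32-bit EFLAGS value. The helper names are hypothetical, and the code models only the bit arithmetic. When and how the hardware sets or clears the bit is not covered by this passage and is not modeled here.

```python
X_BIT_POSITION = 30             # per the embodiment above: X bit is bit 30 of EFLAGS
X_BIT_MASK = 1 << X_BIT_POSITION
EFLAGS_MASK = 0xFFFFFFFF        # EFLAGS is treated as a 32-bit value in this sketch


def crypto_in_process(eflags: int) -> bool:
    """Return True if the X bit indicates a cryptographic operation in process."""
    return bool(eflags & X_BIT_MASK)


def with_x_bit(eflags: int, in_process: bool) -> int:
    """Return a copy of the EFLAGS value with the X bit set or cleared."""
    if in_process:
        return (eflags | X_BIT_MASK) & EFLAGS_MASK
    return eflags & ~X_BIT_MASK & EFLAGS_MASK
```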
In operation, instructions are fetched from memory (not shown) by the fetch logic 601 and are provided, in synchronization with a clock signal (not shown), to the translation logic 602. The translation logic 602 translates each instruction into a corresponding sequence of micro instructions that are sequentially provided, in synchronization with the clock signal, to subsequent stages 605-608, 618, 619 of the microprocessor 600. Each micro instruction within a sequence directs execution of a sub-operation that is required to accomplish the overall operation prescribed by the corresponding instruction, such as generation of an address by the address stage 606, addition within the integer unit 610 of two operands retrieved from prescribed registers (not shown) within the register stage 605, or storage in memory by the store logic 618 of a result generated by one of the execution units 610, 612, 614, 616, 617. Depending on the instruction being translated, the translation logic 602 employs the translator 603 to generate the sequence of micro instructions directly, or fetches the sequence from the microcode ROM 604, or employs the translator 603 to generate a portion of the sequence directly and fetches the remaining portion from the microcode ROM 604. The micro instructions proceed sequentially through the successive stages 605-608, 618, 619 of the microprocessor 600 in synchronization with the clock.
As micro instructions reach the execute stage 608, the execution logic 632 routes them, along with their operands (retrieved from registers within the register stage 605, generated by logic within the address stage 606, or retrieved from a data cache by the load logic 607), to a designated execution unit 610, 612, 614, 616, 617 by placing the micro instructions in a corresponding micro instruction queue 609, 611, 613, 615. The execution units 610, 612, 614, 616, 617 execute the micro instructions and provide results to the store stage 618. In one embodiment, the micro instructions include fields indicating whether they can be executed in parallel with other operations.

Responsive to fetching an XCRYPT instruction as described above, the translation logic 602 generates associated micro instructions that direct logic within subsequent stages 605-608, 618, 619 of the microprocessor 600 to perform the prescribed cryptographic operation. Accordingly, a first plurality of the associated micro instructions is routed directly to the cryptography unit 617 and directs the unit 617 to load data provided over the load bus 620, or to load a block of input data and begin execution of a prescribed number of cryptographic rounds to produce a block of output data, or to provide a produced block of output data over the store bus 622 for storage in memory. Advantageously, the cryptography unit 617 is configured to increase throughput: a following input text block can be loaded before the preceding output text block has been stored, and a prescribed cryptographic operation can execute while an output text block is being stored. A second plurality of the associated micro instructions is routed to the other execution units 610, 612, 614, 616 to perform other sub-operations that are necessary to accomplish the prescribed cryptographic operation, such as setting the X bit 625 to indicate that a cryptographic operation is in process, updating registers (e.g., a count register, an input text pointer register, an output text pointer register) within the register stage 605, and processing interrupts 627 indicated by the interrupt logic 626. The associated micro instructions are ordered to provide optimum performance of the prescribed cryptographic operation on multiple blocks of input data by interlacing integer unit micro instructions within the sequence of cryptography unit micro instructions, so that integer operations are accomplished in parallel with cryptography unit operations. Micro instructions are included among the associated micro instructions to allow for, and to recover from, pending interrupts 627. Because all pointers to cryptographic parameters and data are provided in x86 architectural registers, their states are saved when an interrupt is processed and are restored upon return from the interrupt. Upon return from an interrupt, micro instructions test the state of the X bit 625 to determine whether a cryptographic operation was in progress. If so, the operation is repeated on the particular block of input data that was being processed when the interrupt occurred. The associated micro instructions are ordered so that the pointer registers and intermediate results of a sequence of block cryptographic operations on a sequence of input text blocks are updated before an interrupt 627 is processed.

Referring now to FIG. 7, a block diagram is presented showing the fields of an exemplary micro instruction 700 for directing cryptographic sub-operations within the microprocessor of FIG. 6.
The micro instruction 700 includes a micro opcode field 701, a data register field 702, and a register field 703. The micro opcode field 701 specifies a particular sub-operation to be performed and designates logic within one or more stages of the microprocessor 600 to perform the sub-operation. Specific values of the micro opcode field 701 designate that the micro instruction is directed for execution by a cryptography unit according to the present invention. In one embodiment, there are two such values. A first value (XLOAD) designates that data is to be retrieved from a memory location whose address is specified by the contents of an architectural register denoted by the contents of the data register field 702. The data is loaded into a register within the cryptography unit that is specified by the contents of the register field 703. The retrieved data (e.g., cryptographic key data, a control word, input text data, an initialization vector) is provided to the cryptography unit. A second value (XSTOR) of the micro opcode field 701 designates that data generated by the cryptography unit is to be stored in a memory location whose address is specified by the contents of an architectural register denoted by the contents of the data register field 702. In a multi-stage embodiment of the cryptography unit, the contents of the register field 703 prescribe one of a plurality of output data blocks for storage in memory. The output data block is provided by the cryptography unit in a data field 704 for access by the store logic. XLOAD and XSTOR micro instructions for execution by a cryptography unit according to the present invention are discussed more specifically with reference to FIGS. 8 and 9.

Referring to FIG. 8, a table is presented showing values of the register field 703 for an XLOAD micro instruction according to the format 700 of FIG. 7. As previously noted, a sequence of micro instructions is generated in response to an XCRYPT instruction.
The sequence of micro instructions comprises a first plurality of micro instructions that are directed for execution by the cryptography unit, and a second plurality of micro instructions that are executed by one or more parallel functional units within the microprocessor other than the cryptography unit. The second plurality directs sub-operations such as updating registers and testing and setting status bits. The first plurality provides input data to the cryptography unit and directs the cryptography unit to generate key schedules (or to load key schedules retrieved from memory), to load and encrypt (or decrypt) input text data, and to store output text data. An XLOAD micro instruction is provided to the cryptography unit to load control word data, to load a cryptographic key or key schedule, to load initialization vector data, to load input text data, or to load input text data and direct the cryptography unit to begin a prescribed cryptographic operation. Value 0b010 in the register field 703 of an XLOAD micro instruction directs the cryptography unit to load a control word into its internal control word register. As this micro instruction proceeds down the pipeline, an architectural register within the register stage is accessed to obtain the address in memory where the control word is stored. The load logic fetches the control word from cache and places it in the data field 704, which is then passed to the cryptography unit. Likewise, register field value 0b100 directs the cryptography unit to load input text data provided in the data field 704 into its internal register IN-0; loading IN-0 directs the cryptography unit to begin the prescribed cryptographic operation. Like the control word, the input data is accessed through a pointer stored in an architectural register. Value 0b101 directs that input data provided in the data field 704 be loaded into internal register IN-1. Data loaded into the IN-1 register is either input text data (when pipelining) or an initialization vector. Values 0b110 and 0b111 direct the cryptography unit to load the lower and upper bits, respectively, of a cryptographic key or of one of the keys in a user-generated key schedule. For purposes of this application, a user is defined as that which performs a specified function or a specified operation; the user can be embodied as an application program, an operating system, a machine, or a person.

In one embodiment, register field values 0b100 and 0b101 contemplate a cryptography unit that has two stages, whereby successive blocks of input text data can be pipelined. Hence, to pipeline successive blocks of input data, a first XLOAD micro instruction is executed that provides a first block of input text data to IN-1, followed by a second XLOAD micro instruction that provides a second block of input text data to IN-0 and that also directs the cryptography unit to begin performing the prescribed cryptographic operation.
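The register field encodings described above can be summarized in a short Python sketch. The class and function names are hypothetical labels chosen for illustration, and the numeric values are those read from the description of the XLOAD table; this is a sketch of the decoding, not the patented logic itself.

```python
from enum import IntEnum


class XloadTarget(IntEnum):
    """XLOAD register field 703 values as described in the text (names are illustrative)."""
    CONTROL_WORD = 0b010
    IN_0 = 0b100      # loading IN-0 also starts the prescribed operation
    IN_1 = 0b101      # input text (when pipelining) or an initialization vector
    KEY_LOW = 0b110   # lower bits of a key
    KEY_HIGH = 0b111  # upper bits of a key


def decode_xload(register_field: int) -> XloadTarget:
    """Map a 3-bit register field value to its target, rejecting unlisted encodings."""
    try:
        return XloadTarget(register_field)
    except ValueError:
        raise ValueError(f"unlisted XLOAD register field value: {register_field:#05b}") from None


def starts_operation(target: XloadTarget) -> bool:
    """Only a load of IN-0 directs the unit to begin the cryptographic operation."""
    return target is XloadTarget.IN_0


def pipelined_pair_loads() -> list:
    """Load order for two successive blocks in the two-stage embodiment: IN-1 first, then IN-0."""
    return [XloadTarget.IN_1, XloadTarget.IN_0]
```

Loading IN-1 before IN-0 matters because the second load is the one that triggers execution, so both blocks are in place when the operation begins.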
If a user-generated key schedule is employed to perform the cryptographic operation, then a number of XLOAD micro instructions corresponding to the number of keys within the user-generated key schedule are routed to the cryptography unit, directing the unit to load each round key within the key schedule.

All other values of the register field 703 in an XLOAD micro instruction are reserved.

Referring to FIG. 9, a table is presented showing values of the register field 703 for an XSTOR micro instruction according to the format 700 of FIG. 7. An XSTOR micro instruction is issued to the cryptography unit to direct it to provide a generated output text block to the store logic for storage in memory at the address provided in the address field 702. Accordingly, translation logic according to the present invention issues an XSTOR micro instruction for a particular output text block following issuance of an XLOAD micro instruction for its corresponding input text block. Value 0b100 of the register field 703 directs the cryptography unit to provide the output text block associated with its internal OUT-0 register to the store logic for storage. The contents of OUT-0 are associated with the input text block provided to IN-0. Likewise, the contents of the internal OUT-1 register, referenced by register field value 0b101, are associated with the input text data provided to IN-1. Accordingly, following loading of the keys and control word data, a plurality of input text blocks can be pipelined through the cryptography unit by issuing cryptographic micro instructions in the order XLOAD.IN-1, XLOAD.IN-0 (XLOAD.IN-0 also directs the cryptography unit to start the cryptographic operation), XSTOR.OUT-1, XSTOR.OUT-0, XLOAD.IN-1, XLOAD.IN-0 (starting the operation for the next two input text blocks), and so on.

Referring to FIG. 10, a block diagram is presented showing an exemplary control word format 1000 for prescribing the parameters of a cryptographic operation according to the present invention.
The control word 1000 is programmed into memory by a user, and a pointer to the control word 1000 is provided in an architectural register within a conforming microprocessor before cryptographic operations are performed. Accordingly, as part of the sequence of micro instructions corresponding to an XCRYPT instruction, an XLOAD micro instruction is issued directing the microprocessor to read the architectural register containing the pointer, to retrieve the control word 1000 from memory (cache), and to load the control word 1000 into the cryptography unit's internal control word register. The control word 1000 includes a reserved (RSVD) field 1001, a key size (KSIZE) field 1002, an encryption/decryption (E/D) field 1003, an intermediate result (IRSLT) field 1004, a key generation (KGEN) field 1005, an algorithm (ALG) field 1006, and a round count (RCNT) field 1007.

All values of the reserved field 1001 are reserved. The contents of the KSIZE field 1002 prescribe the size of the cryptographic key that is employed to accomplish encryption or decryption. In one embodiment, the KSIZE field 1002 prescribes a 128-bit key, a 192-bit key, or a 256-bit key. The E/D field 1003 specifies whether the cryptographic operation is an encryption operation or a decryption operation. The KGEN field 1005 indicates whether a user-generated key schedule or a single cryptographic key is provided in memory. If a single cryptographic key is provided, micro instructions are issued to the cryptography unit along with the cryptographic key, directing the unit to expand the key into a key schedule according to the cryptographic algorithm specified by the contents of the ALG field 1006. In one embodiment, specific values of the ALG field 1006 specify the DES algorithm, the Triple-DES algorithm, or the AES algorithm, as discussed previously. Alternative embodiments contemplate other cryptographic algorithms such as the Rijndael Cipher, the Twofish Cipher, and so on. The contents of the RCNT field 1007 prescribe the number of cryptographic rounds to be accomplished on each input text block according to the specified algorithm.
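The control word fields can be captured in a small Python record, shown here as a hedged sketch. The text does not give bit positions or field encodings in this passage, so the sketch stores decoded values rather than packing bits; the attribute names, the algorithm labels, and the validation are illustrative assumptions, and the 0 to 15 bound on the round count follows the embodiment described for the RCNT field.

```python
from dataclasses import dataclass

ALLOWED_KEY_SIZES = (128, 192, 256)      # KSIZE options named in the embodiment
ALLOWED_ALGORITHMS = ("DES", "3DES", "AES")  # labels are illustrative, not encodings


@dataclass(frozen=True)
class ControlWord:
    """Decoded view of control word 1000; bit layout is deliberately not modeled."""
    ksize: int                 # KSIZE 1002: key size in bits
    decrypt: bool              # E/D 1003: False = encrypt, True = decrypt
    irslt: bool                # IRSLT 1004: True = last round yields an intermediate result
    user_key_schedule: bool    # KGEN 1005: True = user-generated schedule is in memory
    alg: str                   # ALG 1006
    rcnt: int                  # RCNT 1007: rounds per input text block

    def __post_init__(self):
        if self.ksize not in ALLOWED_KEY_SIZES:
            raise ValueError("KSIZE must be 128, 192, or 256 bits")
        if self.alg not in ALLOWED_ALGORITHMS:
            raise ValueError("unsupported ALG value in this sketch")
        if self.rcnt not in range(16):
            raise ValueError("RCNT must be between 0 and 15 in this embodiment")

    def needs_key_expansion(self) -> bool:
        """A single key must be expanded by the unit; a user schedule is loaded as-is."""
        return not self.user_key_schedule
```

A record such as `ControlWord(128, False, False, False, "AES", 10)` describes a standard AES-128 encryption in which the unit expands the key itself.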
Although the standards for the above-noted algorithms prescribe a fixed number of cryptographic rounds per input text block, the RCNT field 1007 allows a programmer to vary the number of rounds from that specified by the standards. In one embodiment, the programmer can specify from 0 to 15 rounds per block. Finally, the contents of the IRSLT field 1004 specify whether encryption or decryption of an input text block is performed for the number of rounds specified in RCNT 1007 according to the standard for the cryptographic algorithm specified in ALG 1006, or whether it is performed for the number of rounds specified in RCNT 1007 such that the final round performed yields an intermediate result rather than a final result. One skilled in the art will appreciate that many cryptographic algorithms perform the same sub-operations in every round except the final round. Hence, programming the IRSLT field 1004 to provide intermediate results rather than final results allows a programmer to verify intermediate steps of the implemented algorithm. For example, incremental intermediate results for verifying an algorithm implementation can be obtained by performing one round of encryption on a text block, then two rounds on the same text block, then three rounds, and so on. The capability to provide programmable rounds and intermediate results enables users to verify cryptographic performance, to troubleshoot, and to explore the effect of varying key structures and round counts.

Referring to FIG. 11, a block diagram is presented of a preferred embodiment of a cryptography unit 1100 according to the present invention. The cryptography unit 1100 includes a micro opcode register 1103 that receives cryptographic micro instructions (e.g., XLOAD and XSTOR micro instructions) via a micro instruction bus 1114. The cryptography unit 1100 also includes a control word register 1104, an input-0 register 1105, an input-1 register 1106, a key-0 register 1107, and a key-1 register 1108.
Data is provided to registers 1104-1108 via a load bus 1111, as prescribed by the contents of an XLOAD micro instruction within the micro instruction register 1103. The input-0 and input-1 registers 1105-1106 are configured to enable buffering of a following input text block while a cryptographic operation is being performed on a current input text block. The cryptography unit 1100 also includes block cipher logic 1101 that is coupled to all of the registers 1103-1108 and to a cryptographic key RAM 1102. The block cipher logic 1101 provides a stall signal 1113 and also provides block results to an output-0 register 1109 and an output-1 register 1110. The output registers 1109-1110 route their contents to successive stages in a conforming microprocessor via a store bus 1112. The cryptography unit 1100 is configured to enable storage of data from the output registers 1109-1110 while a cryptographic operation is performed on a following input text block. In one embodiment, the micro instruction register 1103 is 32 bits in size and the remaining registers 1104-1110 are 128-bit registers.

In operation, cryptographic micro instructions are provided sequentially to the micro instruction register 1103 along with data that is designated for the control word register 1104, for one of the input registers 1105-1106, or for one of the key registers 1107-1108. In the embodiment discussed with reference to FIGS. 8 and 9, a control word is loaded into the control word register 1104 via an XLOAD micro instruction. The cryptographic key or key schedule is then loaded via successive XLOAD micro instructions. If a 128-bit cryptographic key is to be loaded, an XLOAD micro instruction designating register KEY-0 1107 is provided; for a longer key, it is accompanied by an XLOAD micro instruction designating register KEY-1 1108. If a user-generated key schedule is to be loaded, successive XLOAD micro instructions designating register KEY-0 1107 are provided. Each key of the key schedule is loaded and placed, in order, in the key RAM 1102 for use during its corresponding cryptographic round.
Following this, input text data (if an initialization vector is not required) is loaded into the IN-1 register 1106. If an initialization vector is required, it is loaded into the IN-1 register 1106 via an XLOAD micro instruction. An XLOAD micro instruction to the IN-0 register 1105 directs the cryptography unit to load input text data into the IN-0 register 1105 and to begin performing cryptographic rounds on the input text data in the IN-0 register 1105, using the initialization vector in IN-1, or on the data in both input registers 1105-1106 (if input data is being pipelined), according to the parameters provided by the contents of the control word register 1104. Upon receipt of an XLOAD micro instruction designating IN-0, the block cipher logic 1101 begins performing the cryptographic operation prescribed by the contents of the control word. If expansion of a single cryptographic key is required, the block cipher logic 1101 generates each of the keys in the key schedule and stores them in the key RAM 1102. Regardless of whether the block cipher logic 1101 generates the key schedule or the key schedule is loaded from memory, the key for the first round is cached within the block cipher logic 1101 so that the first block cryptographic round can proceed without accessing the key RAM 1102. Once initiated, the block cipher logic 1101 continues executing the prescribed cryptographic operation on one or more input text blocks until the operation is complete, successively fetching round keys from the key RAM 1102 as required by the cryptographic algorithm being employed. The cryptography unit 1100 performs a prescribed block cryptographic operation on designated input text blocks, and successive input text blocks are encrypted or decrypted through execution of corresponding successive XLOAD and XSTOR micro instructions. When an XSTOR micro instruction is executed, if the prescribed output data (e.g., OUT-1) has not yet been completely generated, the block cipher logic 1101 asserts the stall signal 1113.
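The stall behavior described above can be illustrated with a toy Python model. This is a sketch under stated assumptions: the class, its method names, and the cycle-counting scheme are hypothetical, only one input and one output register are modeled, and the byte-inverting transform is a placeholder that stands in for the real cipher rounds.

```python
class CryptoUnitModel:
    """Toy model: a load of IN-0 starts the operation; a store stalls until output exists."""

    def __init__(self, latency: int):
        self.latency = latency    # cycles needed to produce an output block (assumed parameter)
        self.remaining = 0        # cycles left on the block currently in process
        self.in0 = None
        self.out0 = None

    def xload_in0(self, block: bytes) -> None:
        """Loading IN-0 both supplies the data and starts the operation."""
        self.in0 = block
        self.remaining = self.latency

    def tick(self) -> None:
        """Advance one clock cycle; place the result in OUT-0 when the last round finishes."""
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0:
                self.out0 = self._placeholder_cipher(self.in0)

    def xstor_out0(self):
        """Return (stall, data): stall is True while the output block is not yet generated."""
        if self.out0 is None:
            return True, None
        data, self.out0 = self.out0, None
        return False, data

    @staticmethod
    def _placeholder_cipher(block: bytes) -> bytes:
        # Placeholder only: inverts each byte. It is NOT a cipher and not the patented logic.
        return bytes(b ^ 0xFF for b in block)
```

In use, a store issued immediately after the load returns `(True, None)`, and the same store succeeds once `tick()` has been called `latency` times.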
Once the output data has been generated and placed in the corresponding output register 1109-1110, the contents of that register 1109-1110 are transferred to the store bus 1112. Although the stall signal 1113 is asserted when the prescribed output data has not yet been completely generated, the input registers 1105-1106 allow input text blocks to be buffered. Efficient pipelining of data blocks through the cryptography unit 1100 is therefore achieved by ordering the load and store micro instructions so that a following input text block is already available when storage of data from the output registers 1109-1110 is required.

Referring to FIG. 12, a block diagram is presented of an embodiment of block cipher logic 1200 according to the present invention for performing cryptographic operations in accordance with the Advanced Encryption Standard (AES) algorithm. The block cipher logic 1200 includes a round engine 1220.


The round engine 1220 is coupled to a round engine controller 1210 via buses 1211-1214 and buses 1216-1218. The round engine controller 1210 includes store logic 1230 and accesses a micro instruction register 1201, a control word register 1202, a KEY-0 register 1203, and a KEY-1 register 1204 to obtain key data, micro instructions, and the parameters of the directed cryptographic operation. The contents of input registers 1205-1206 are provided to the round engine 1220, and the round engine 1220 provides corresponding output text to output registers 1207-1208. The output registers 1207-1208 are also coupled to the round engine controller 1210 via buses 1216-1217 so that the round engine controller 1210 can access the result of each successive cryptographic round, which is provided to the round engine 1220 for the next cryptographic round via bus NEXTIN 1218. Cryptographic keys from the key RAM (not shown) are accessed via bus 1215.
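For orientation, the following Python sketch lists the transformations that the AES standard (FIPS 197) assigns to an initial key addition, to the intermediate rounds, and to the final round of an encryption. It is offered as background under stated assumptions: the function name and the labels are illustrative, and how the round engine controller 1210 encodes these cases on its control buses is not modeled here.

```python
def aes_encrypt_round_plan(nr: int) -> list:
    """Return (label, transformations) pairs for an AES encryption with nr rounds.

    Per FIPS 197: an initial AddRoundKey, then nr - 1 full rounds, then a final
    round that omits MixColumns. nr is 10, 12, or 14 for the standard key sizes.
    """
    if not nr >= 1:
        raise ValueError("an AES cipher has at least one round")
    plan = [("initial", ["AddRoundKey"])]
    for _ in range(nr - 1):
        plan.append(("intermediate", ["SubBytes", "ShiftRows", "MixColumns", "AddRoundKey"]))
    plan.append(("final", ["SubBytes", "ShiftRows", "AddRoundKey"]))
    return plan
```

For AES-128, `aes_encrypt_round_plan(10)` yields eleven entries, nine of which include MixColumns; the final entry does not.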
The ENC/DEC signal 1 2 11 indicates that the round engine uses a secondary operation to perform either encryption (eg, S-Box) or decryption (eg, reverse S-Box). The content of the RNDC0N bus 1212 indicates that the round engine 122 is not performing a first AES round, an intermediate AES round is a final AES round. The recording bus 1213 is used to provide each round of the key to the round engine 1 2 2 0 at the time of its corresponding round execution. The round engine 1 220 includes a first key x〇R logic circuit 1221. The first key XOR logic circuit 1221 is coupled to a first register reg - 〇 1222. The first register 1222 is coupled to the S-Box. Logic circuit; [223, the S-Box logic circuit 1 223 is coupled to the Shift Row logic circuit 1 224, and the Shift Row logic circuit 1224 is coupled to a second register reg 1 1225 'the second register 1225 side Coupled to the Mix Colum logic circuit 1 226, the Mix Col um logic circuit 1 226 is coupled to a third register

第41頁 ^'1.12. 2¾ 1253268 五、發明說明(35) •REG-2 1 227。第一鑰匙邏輯1221、s —β〇χ邏輯電路1 223、 Shift Row邏輯電路1 224以及Mix (^丨聽邏輯電路1 226係配 置用以執行次運算於輸入文字資料,像是具體指定於先前 討淪的AES FIPS標準。Mix Colum邏輯電路1226在中間回 合期間於要求使用藉由鑰匙匯流排1213所提供的回合鑰匙 時’係附加配置以執rAES X0R功能於輸入資料。第一鑰 匙邏輯1221、S~Box邏輯1 223 'Shift Row邏輯1 224以及Page 41 ^'1.12. 23⁄4 1253268 V. INSTRUCTIONS (35) • REG-2 1 227. The first key logic 1221, the s-β〇χ logic circuit 1 223, the Shift Row logic circuit 1 224, and the Mix (the listening logic circuit 1 226 are configured to perform sub-operations on the input text data, as specified in the previous The AES FIPS standard is discussed. The Mix Colum logic circuit 1226 is configured to perform the rAES X0R function on the input data during the intermediate round when the round key provided by the key bus 1213 is required to be used. The first key logic 1221. S~Box Logic 1 223 'Shift Row Logic 1 224 and

Mix Colum邏輯1 22 6在藉由ENC/DEC 1211的狀態指示時, 也配置用以執行其相對之反向AES次運算於解密期間。熟 悉該項技藝者可察知中間回合資料係根據控制字組暫存器 1 202内容所指定的具體區塊加密模式而回饋給回合引擎时 1 220。初始向量資料(如果要求)透過ΝΕχτΐΝ匯流排i2i8 供給回合引擎1220。 在第十二圖所示的實施例中,回合引擎分為兩階段: 一第一階段介於第一暫存器REG —〇 1 222與第二暫存器 REG- 1 1 225以及一第二階段介於第二暫存器REG —丨盥 第二暫存器REG-2 1 227。中間回合資料同步一時脈信號 (未繪出)於階段間管線處理。當一區塊的輸入資料+ 碼運算,其關聯的輸出資料放置於相對應輸出暫存器μ 1 20 7- 1 208。回應到一 XST0R微指令,儲存邏輯電路/ 立STORE信號電路1214,以告知回合引擎122〇說指 確 暫存器1207-1208的内容正提供給儲存匯流排(未繪出^。 § k後的輸入文字區塊已緩衝於輸入暫存器 且當回合引擎1220正在處理隨後的輸入文字區塊 匕w ’輸出The Mix Colum Logic 1 22 6 is also configured to perform its relative reverse AES operations during the decryption period as indicated by the state of the ENC/DEC 1211. It will be appreciated by those skilled in the art that the intermediate round data is fed back to the round engine 1 220 based on the particular block encryption mode specified by the contents of the control block register 1 202. The initial vector data (if required) is supplied to the round engine 1220 via the ΝΕχτΐΝ bus i2i8. In the embodiment shown in the twelfth embodiment, the round engine is divided into two phases: a first phase between the first register REG_〇1 222 and the second register REG-1 1 225 and a second The phase is between the second register REG and the second register REG-2 1 227. The intermediate round data synchronization clock signal (not shown) is processed during the interstage pipeline. When a block of input data + code operation, its associated output data is placed in the corresponding output register μ 1 20 7- 1 208. In response to an XST0R microinstruction, the storage logic circuit / STORE signal circuit 1214 is instructed to inform the round engine 122 that the contents of the scratchpad 1207-1208 are being provided to the storage bus (not drawn ^. § k The input text block is buffered in the input register and when the round engine 1220 is processing the subsequent input text block 匕w 'output

How load and store microinstructions are ordered for efficient processing of multiple data blocks according to the present invention is discussed in more detail with reference to Figures 13 through 16.

Figure 13 is a table 1300 showing one embodiment of a cryptographic microinstruction flow for a single-stage embodiment of the cryptography unit. As noted above, a single-stage cryptography unit processes one input text block at a time. The single-stage embodiment (compare Figure 12) is nevertheless arranged in the same manner: while the round engine performs the specified cryptographic operation on the current input data, the input registers allow a subsequent block of input data to be buffered, and while the specified cryptographic operation is performed on the subsequent block, the output registers and store logic allow the output block corresponding to the current input block to be stored. For purposes of comparison, a load is taken to require two pipeline cycles. The round engine starts automatically when input data is loaded into input register 0. Also for purposes of comparison, the round engine requires 20 clock cycles to produce an output block, during which a store instruction ST.OUT-0 is stalled. Similarly, the store operation prescribed by a store instruction ST.OUT-0 requires two clock cycles. Accordingly, when a first load instruction LD.IN-0 is provided at cycle 0, the input data is loaded two cycles later and the round engine begins execution, so an output block is produced at cycle 22. The corresponding store instruction ST.OUT-0 is stalled until the output data block is ready, so the store completes at cycle 24. A subsequent load instruction LD.IN-0 is stalled behind the previous store instruction ST.OUT-0 until the store completes, so no subsequent input text block is loaded before cycle 26. As described, this load-store-load-store ordering of microinstructions does not benefit from the features of the cryptography unit noted above. As a result, cryptographic operations on multiple data blocks require 24 cycles per block.

Figure 14 is a table 1400 showing another embodiment of a microinstruction flow for the single-stage embodiment of the cryptography unit. In contrast to the flow discussed with reference to Figure 13, this alternative flow exploits the advantageous features of the single-stage cryptography unit. For purposes of comparison, the number of clock cycles taken to execute the load instruction LD.IN-0, the store instruction ST.OUT-0, and the cryptographic operation in the round engine is the same as in the embodiment discussed with reference to Figure 13.

In this alternative flow, a first load instruction LD.IN-0 is provided to the cryptography unit at cycle 0; two cycles later the input data is loaded and the round engine begins execution, so the corresponding output data block is produced at cycle 22. Because input data can be buffered, however, the translation logic issues a second load instruction LD.IN-0, which completes at cycle 4, to load a subsequent input text block. The cryptographic operation on the subsequent input text block is stalled until the output text block corresponding to the first input text block has been produced (cycle 22), but because the subsequent input text block was already buffered at cycle 4, its cryptographic operation can begin at cycle 23 and complete at cycle 42. The store instruction ST.OUT-0 for the output text corresponding to the first input block is provided by the translation logic after the load instruction LD.IN-0 for the subsequent block. This store instruction ST.OUT-0 is stalled until the corresponding output data block is ready at cycle 22, and the store completes at cycle 24. A following load instruction LD.IN-0 is stalled behind the previous store instruction ST.OUT-0 until the store completes, so no further input block is loaded before cycle 26. By then, however, the round engine is already two cycles into processing the subsequent input block. By executing two loads at the outset, this ordering benefits from the features of the cryptography unit noted above, improving throughput to 20 cycles per block. The two clock cycles required to store an output block are effectively absorbed into the execution of the cryptographic operation on a following input text block. Likewise, the two clock cycles required to load a following input text block are absorbed into the execution of the cryptographic operation on the current input text block.

Figure 15 is a table 1500 showing one embodiment of a microinstruction flow for a two-stage embodiment of the cryptography unit. The two-stage embodiment can process two successive input text blocks during the same round engine cycles. Like the single-stage flow of table 1300, the flow of table 1500 does not exploit the features of the cryptography unit to overlap clock cycles. For purposes of comparison, the number of clock cycles taken to execute a load instruction, a store instruction, and the cryptographic operation in the round engine is the same as in the embodiments discussed with reference to Figures 13 and 14. As noted above, a load instruction LD.IN-1 merely loads input data into input register 1. A load instruction LD.IN-0 loads input text data into input register 0 and also starts round engine processing of the input data in input registers 0 and 1. Because the round engine has two stages, encrypting or decrypting the input data in both input registers takes only 20 clock cycles.
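The cycle counts quoted for the single-stage flows of Figures 13 and 14 can be reproduced with a small timing model. This is a sketch under stated assumptions, not the patent's mechanism: it assumes microinstructions issue strictly in order through one load/store path, that a stalled microinstruction blocks those behind it, and that an input register is always free when a load issues. The names `simulate`, `serialized`, and `overlapped` are hypothetical.

```python
# Cycle counts assumed in the text for purposes of comparison.
LOAD, ENGINE, STORE = 2, 20, 2


def simulate(flow):
    """Return (output_ready, store_done): the cycle at which each block's
    output is produced and the cycle at which its store completes."""
    port_free = 0      # cycle at which the in-order load/store path is free
    engine_free = 0    # cycle at which the round engine finishes its block
    output_ready = {}
    store_done = {}
    for op, blk in flow:
        if op == "LD":
            port_free += LOAD
            # The engine starts once the input is loaded and the engine is idle.
            engine_free = max(port_free, engine_free) + ENGINE
            output_ready[blk] = engine_free
        else:  # "ST": stalled until the output block is ready
            port_free = max(port_free, output_ready[blk]) + STORE
            store_done[blk] = port_free
    return output_ready, store_done


def serialized(n):
    """Load-store-load-store ordering (the Figure 13 style of flow)."""
    return [step for b in range(n) for step in (("LD", b), ("ST", b))]


def overlapped(n):
    """Two loads up front, then store/load interleaved (the Figure 14 style)."""
    flow = [("LD", 0)]
    for b in range(1, n):
        flow += [("LD", b), ("ST", b - 1)]
    return flow + [("ST", n - 1)]


print(simulate(serialized(3))[1])   # {0: 24, 1: 48, 2: 72}
print(simulate(overlapped(3))[1])   # {0: 24, 1: 44, 2: 64}
```

Under these assumptions the serialized ordering completes a store every 24 cycles, while issuing two loads at the outset completes one every 20 cycles, in line with the figures quoted in the text.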

The translation logic issues an LD.IN-1 microinstruction followed by an LD.IN-0 microinstruction. LD.IN-1 completes at cycle 2 and LD.IN-0 completes at cycle 4; the round engine begins processing the two input text blocks at cycle 5 and finishes at cycle 24. The two following store instructions ST.OUT-1 and ST.OUT-0 are stalled until cycle 24, waiting for processing of the two input text blocks to finish. The stall is released at cycle 24 and the stores complete at cycle 28. Because no other input data has been buffered, the two subsequent load instructions LD.IN-1 and LD.IN-0 are stalled until the stores complete. The subsequent input text blocks are therefore loaded during cycles 29-32 and processed by the round engine during cycles 33-52.

Like the load-store-load-store ordering discussed for the single-stage cryptography unit with reference to Figure 13, the load-load-store-store ordering of table 1500 gains nothing from the features of the cryptography unit that support efficient processing of data blocks. As a result, a two-stage cryptography unit performing cryptographic operations on multiple data blocks requires 28 cycles for every two blocks.

Figure 16 is a table 1600 showing another embodiment of a microinstruction flow for the two-stage embodiment of the cryptography unit. In contrast to the flow of Figure 15, the alternative flow of table 1600 exploits the advantageous features of the two-stage cryptography unit. For purposes of comparison, the number of clock cycles taken to execute a load instruction, a store instruction, and the cryptographic operation in the round engine is the same as in the embodiment discussed with reference to Figure 15.

In this alternative flow, a first load instruction LD.IN-1 is provided to the cryptography unit at cycle 0, followed by an LD.IN-0; four cycles later the input data is loaded and the round engine begins execution, so the corresponding output data blocks are produced at cycle 24. Because input data can be buffered, the translation logic then issues the load instructions LD.IN-1 and LD.IN-0 for the next two input text blocks. The cryptographic operation on these blocks is stalled until the two output text blocks corresponding to the first two input text blocks have been produced (cycle 24), but because the next two input text blocks were already buffered at cycle 8, their cryptographic operation can begin at cycle 25 and complete at cycle 44. The store instructions ST.OUT-1 and ST.OUT-0 for the two output text blocks corresponding to the first two input text blocks are provided by the translation logic after the load instructions LD.IN-1 and LD.IN-0 for the following blocks. The store instructions ST.OUT-1 and ST.OUT-0 are stalled until the corresponding output data blocks are ready at cycle 24, and the stores complete at cycle 28. By then the round engine is already four cycles into processing the following input text blocks. By executing four loads at the outset, this ordering benefits from the features of the cryptography unit noted above, improving throughput to 20 cycles for every two blocks. The four clock cycles required to store the output blocks are effectively absorbed into the execution of the cryptographic operation on the two following input text blocks. Likewise, the four clock cycles required to load the two following input text blocks are absorbed into the execution of the cryptographic operation on the two current input text blocks.

Although the present invention and its objects, features, and advantages have been described in detail, other embodiments are also encompassed by the invention. For example, the invention has been discussed at length in terms of embodiments compatible with the x86 architecture. The discussion was presented in this way because the x86 architecture is widely understood and therefore provides a sufficient vehicle for teaching the invention. The invention nevertheless covers embodiments suited to other instruction set architectures, such as PowerPC, MIPS, and the like, as well as entirely new instruction set architectures.

The invention further covers the execution of cryptographic operations in elements other than the microprocessor itself. For example, an embodiment need not be part of the same integrated circuit as the microprocessor, but may instead be part of a computer system. Such an embodiment is suited to a chipset surrounding the microprocessor (for example, a north bridge or south bridge), or to a processor dedicated to performing cryptographic operations, to which the cryptographic instruction is handed off by the host microprocessor. The invention can be applied to embedded controllers, industrial controllers, signal processors, array processors, and any similar device that processes data. The invention also covers embedded devices that perform cryptographic operations. For clarity, the invention refers to these alternative processing elements as processors, as noted above.

In addition, although the invention refers to 128-bit blocks, many different block sizes can be supported by changing the size of the registers that carry input data, output data, keys, and control words.

Although the present application features DES, Triple-DES, and AES prominently, the invention also covers lesser-known block cipher algorithms, such as the Rijndael, Twofish, Blowfish, Serpent, and RC6 ciphers. It suffices to understand that the invention provides block cipher apparatus and supporting algorithms within a microprocessor, in which atomic block cipher operations can be invoked through a single instruction.

Also, although the invention has been characterized in terms of block cipher functions, other forms of cryptography are also within its scope, with the cryptography unit carrying out the cryptographic function prescribed by the instruction.

Furthermore, the round engine discussed here is a two-stage apparatus that can pipeline two blocks of input data, but other embodiments with more than two stages are also contemplated. Stage division that supports pipelining of more input data blocks is expected to evolve in concert with the stage division of the other parts of a comparable microprocessor.

Finally, although the invention has been discussed specifically in terms of a single cryptography unit that supports a plurality of algorithms, the invention also contemplates multiple cryptography units operatively coupled in parallel with the other execution units of a comparable microprocessor, each configured to perform a specific cryptographic algorithm. For example, a first unit is configured to perform AES and a second unit is configured to perform DES.

The foregoing describes only preferred embodiments of the invention and is not intended to limit the scope of the claims. All equivalent changes or modifications made without departing from the spirit disclosed by the invention are intended to fall within the scope of the following claims.
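The two-stage flows of Figures 15 and 16 described above can be checked with a similar timing sketch. It is a model under stated assumptions rather than the patent's implementation: microinstructions issue in order through one load/store path, LD.IN-1 only fills input register 1, LD.IN-0 fills input register 0 and starts the round engine on both registers, and a pair of blocks occupies the engine for 20 cycles. The function names are hypothetical.

```python
# Cycle counts assumed in the text for purposes of comparison.
LOAD, ENGINE, STORE = 2, 20, 2


def simulate_pairs(flow):
    """Return (ready, stored): the cycle at which each pair's output blocks
    are produced and the cycle at which the pair's second store completes."""
    port_free = 0      # cycle at which the in-order load/store path is free
    engine_free = 0    # cycle at which the round engine finishes its pair
    ready = {}
    stored = {}
    for op, pair in flow:
        if op == "LD.IN-1":
            port_free += LOAD                      # only buffers a block
        elif op == "LD.IN-0":
            port_free += LOAD                      # buffers a block and ...
            engine_free = max(port_free, engine_free) + ENGINE  # ... starts the engine
            ready[pair] = engine_free
        else:  # ST.OUT-1 or ST.OUT-0, stalled until the pair is ready
            port_free = max(port_free, ready[pair]) + STORE
            stored[pair] = port_free               # last write is the second store
    return ready, stored


def loads(pair):
    return [("LD.IN-1", pair), ("LD.IN-0", pair)]


def stores(pair):
    return [("ST.OUT-1", pair), ("ST.OUT-0", pair)]


def figure15_order(pairs):
    """Load-load-store-store, repeated for each pair of blocks."""
    return [step for p in range(pairs) for step in loads(p) + stores(p)]


def figure16_order(pairs):
    """Four loads up front, then stores interleaved with the next loads."""
    flow = loads(0)
    for p in range(1, pairs):
        flow += loads(p) + stores(p - 1)
    return flow + stores(pairs - 1)


print(simulate_pairs(figure15_order(2)))  # ({0: 24, 1: 52}, {0: 28, 1: 56})
print(simulate_pairs(figure16_order(3)))  # ({0: 24, 1: 44, 2: 64}, {0: 28, 1: 48, 2: 68})
```

With these assumptions the first ordering finishes a pair of blocks every 28 cycles and the second every 20 cycles, which agrees with the throughput stated for tables 1500 and 1600.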

Brief description of the drawings

Figure 1 is a block diagram of present-day cryptographic applications;
Figure 2 is a block diagram of techniques for performing cryptographic operations;
Figure 3 is a block diagram of a microprocessor apparatus according to the present invention for performing cryptographic operations;
Figure 4 is a block diagram of an embodiment of an atomic cryptographic instruction according to the present invention;
Figure 5 is a table of example values of the block cipher mode field of the atomic cryptographic instruction of Figure 4;
Figure 6 is a block diagram of a cryptography unit according to the present invention within an x86-compatible microprocessor;
Figure 7 is a block diagram of example microinstruction fields for directing cryptographic sub-operations in the microprocessor of Figure 6;
Figure 8 is a table of register field value formats for the XLOAD microinstruction of Figure 7;
Figure 9 is a table of register field value formats for the XSTOR microinstruction of Figure 7;
Figure 10 shows an example control word format according to the present invention for specifying cryptographic operation parameters;
Figure 11 is a block diagram of a preferred embodiment of a cryptography unit according to the present invention;
Figure 12 is a block diagram of an embodiment of block cipher logic according to the present invention for performing cryptographic operations under the Advanced Encryption Standard (AES) algorithm;
Figure 13 is a table of one embodiment of a microinstruction flow according to the present invention for a single-stage embodiment of the cryptography unit;
Figure 14 is a table of another embodiment of a microinstruction flow according to the present invention for a single-stage embodiment of the cryptography unit;
Figure 15 is a table of one embodiment of a microinstruction flow according to the present invention for a two-stage embodiment of the cryptography unit; and
Figure 16 is a table of another embodiment of a microinstruction flow according to the present invention for a two-stage embodiment of the cryptography unit.

Reference numerals of the main parts:

112 encryption/decryption application
110 wide area network
201 microprocessor
202 operating system
203 application memory
204 cryptographic key generation application
205 key schedule
206 encryption application
207 decryption application
208 initialization vector
209 cryptographic parameters
210 plaintext
211 ciphertext
301 microprocessor
302 fetch logic
303 translation logic
308 control word pointer
309 key pointer
310 initialization vector pointer
311 input text pointer
312 output text pointer
314 load logic
315 data cache
316 cryptography unit
317 store logic
318 write-back logic
319 memory bus
320 operating system
321 system memory
322 XCRYPT instruction
323 cryptographic control word
324 initial cryptographic key or key schedule
325 initialization vector
326 input text
327 output text
401 optional prefix
402 repeat prefix
403 opcode
404 block cipher mode
601 fetch logic
602 translation logic
603 translator
604 microcode ROM
605 register
606 address
607 load
608 execute
609, 611, 613, 615 micro queues
610 integer unit
612 floating point unit
614 MMX unit
616 SSE unit
617 cryptography unit
618 store
619 write back
620 load
621 stall
622 store
623 execution logic
626 interrupt logic
701 micro opcode
702 data register
703 register
1101 block cipher logic
1102 key RAM
1103 microinstruction
1104 control word
1111 load bus
1112 store bus
1113 stall
1114 microinstruction bus
1210 round engine controller
1213 key
1214 store
1215 to key RAM
1220 round engine
1221 first key XOR logic
1223 S-Box logic

Claims (1)

1. An apparatus for performing cryptographic operations, comprising:
a cryptographic instruction circuit that generates a cryptographic instruction, the cryptographic instruction being received by a computing device as part of an instruction flow executing on the computing device, wherein the cryptographic instruction prescribes a cryptographic operation; and
translation logic, operatively coupled to the cryptographic instruction circuit and configured to translate the cryptographic instruction into microinstructions, wherein the microinstructions direct the computing device to load a second input text block and to perform the cryptographic operation on the second input text block before directing the computing device to store an output text block corresponding to a first input text block;
whereby the output text block can be stored while the cryptographic operation is being performed on the second input text block.

2. The apparatus for performing cryptographic operations as recited in claim 1, wherein the cryptographic operation comprises:
an encryption operation, the encryption operation comprising encryption of a plurality of plaintext blocks to generate a corresponding plurality of ciphertext blocks;
wherein the plaintext blocks comprise the first and the second input text blocks; and
wherein the corresponding ciphertext blocks comprise the output text block.

3. The apparatus for performing cryptographic operations as recited in claim 1, wherein the cryptographic operation comprises:
a decryption operation, the decryption operation comprising decryption of a plurality of ciphertext blocks to generate a corresponding plurality of plaintext blocks;
wherein the ciphertext blocks comprise the first and the second input text blocks; and
wherein the corresponding plaintext blocks comprise the output text block.

4. The apparatus for performing cryptographic operations as recited in claim 1, further comprising:
execution logic, operatively coupled to receive the microinstructions and configured to store the output text block while the cryptographic operation is performed on the second input text block.

5. The apparatus for performing cryptographic operations as recited in claim 4, wherein the execution logic comprises a cryptography unit.

6. The apparatus for performing cryptographic operations as recited in claim 5, wherein the cryptography unit is configured to perform the cryptographic operation according to the Advanced Encryption Standard.

7. The apparatus for performing cryptographic operations as recited in claim 5, wherein the cryptography unit comprises:
a two-stage round engine configured to pipeline the first and the second input text blocks.

8. The apparatus for performing cryptographic operations as recited in claim 1, wherein the microinstructions comprise:
a load microinstruction configured to direct the computing device to load the second input text block and to perform the cryptographic operation on the second input text block; and
a store microinstruction configured to direct the computing device to store the output text block.

9. The apparatus for performing cryptographic operations as recited in claim 1, wherein the cryptographic instruction is prescribed according to the x86 instruction format.

10. The apparatus for performing cryptographic operations as recited in claim 1, wherein the cryptographic instruction implicitly references a plurality of registers within the computing device.

11. The apparatus for performing cryptographic operations as recited in claim 10, wherein the registers comprise:
a first register, wherein the contents of the first register comprise a first pointer to a first memory address, the first memory address specifying a first location in memory for access of a plurality of input text blocks upon which the cryptographic operation is to be accomplished, wherein the input text blocks comprise the first and the second input text blocks.

12. The apparatus for performing cryptographic operations as recited in claim 10, wherein the registers comprise:
a second register, wherein the contents of the second register comprise a second pointer to a second memory address, the second memory address specifying a second location in memory for storage of a corresponding plurality of output text blocks, the output text blocks being generated as a result of accomplishing the cryptographic operation on a plurality of input text blocks, wherein the output text blocks comprise the output text block.

13. The apparatus for performing cryptographic operations as recited in claim 10, wherein the registers comprise:
a third register, wherein the contents of the third register indicate the number of text blocks within a plurality of input text blocks.

14. The apparatus for performing cryptographic operations as recited in claim 10, wherein the registers comprise:
a fourth register, wherein the contents of the fourth register comprise a third pointer to a third memory address, the third memory address specifying a third location in memory for access of cryptographic key data for use in accomplishing the cryptographic operation.

15. The apparatus for performing cryptographic operations as recited in claim 10, wherein the registers comprise:
a fifth register, wherein the contents of the fifth register comprise a fourth pointer to a fourth memory address, the fourth memory address specifying a fourth location in memory, the fourth location comprising an initialization vector location, the contents of which comprise an initialization vector or initialization vector equivalent for use in accomplishing the cryptographic operation.

16. The apparatus for performing cryptographic operations as recited in claim 10, wherein the registers comprise:
a sixth register, wherein the contents of the sixth register comprise a fifth pointer to a fifth memory address, the fifth memory address specifying a fifth location in memory for access of a control word for use in accomplishing the cryptographic operation, wherein the control word prescribes cryptographic parameters for the cryptographic operation.
17 - a device for performing a cryptographic operation, the device for performing a cryptographic operation - a translation logic circuit configured to translate a cryptographic instruction into a microinstruction, the microinstruction of the sequence comprising: j a first microinstruction, the indication Entering a second input text area, a cryptographic operation is performed on the second input text block; and a second micro instruction is executed to indicate storage - an output text block is output according to the executed Snapdragon one, a first text block; the code operation corresponds to a first block, wherein the translation logic circuit sends the table cryptographic operation to the first micro-instruction before the second micro-instruction The output text block can be saved. ~ μ一、爭請i利範圍 ,1 8 ·如申請專利範圍第i 7項所述之執行密碼運异的裝置, •其中該密碼運算包含: 一加密運算,該加密運算包含複數個明文區塊的加密 以產生相對複數個密文區塊; 其中該複數個明文區塊包含: 該第一及該第二輸入文字區塊;以及 其中相對的該些密文區塊包含: 該輪出文字區塊。 1 9 ·如申請專利範圍第1 7項所述之執行密碼運算的裝置, 其中該密碼運算包含·· 一解密運算,該解密運算包含複數個密文區塊的解密 以產生相對複數個明文區塊; 其中該複數個密文區塊包含: 該第一及該第二輸入文字區塊;以及 其中相對的該些明文區塊包含: 該輪出文字區塊。 2 0 ·如申請專利範圍第1 7項所述之執行密碼運算的裝詈 更包含·· 、 一费碼單元, 算執行於該第二輸 區塊。Μ1, contending for i-profit range, 1 8 · As described in the patent application scope i 7 item, the cryptographic operation includes: an encryption operation, the encryption operation includes a plurality of plaintext blocks Encryption to generate a plurality of ciphertext blocks; wherein the plurality of plaintext blocks comprise: the first and the second input text block; and wherein the opposite ciphertext blocks comprise: the round text area Piece. 
19. The apparatus for performing cryptographic operations as recited in claim 17, wherein the cryptographic operation comprises: a decryption operation, the decryption operation comprising decryption of a plurality of ciphertext blocks to generate a corresponding plurality of plaintext blocks; wherein the plurality of ciphertext blocks comprises the first and the second input text blocks; and wherein the corresponding plaintext blocks comprise the output text block.

20. The apparatus for performing cryptographic operations as recited in claim 17, further comprising: a cryptography unit, configured to execute the cryptographic operation on the second input text block.

21. The apparatus for performing cryptographic operations as recited in claim 20, wherein the cryptography unit is configured to execute the cryptographic operation according to the Advanced Encryption Standard (AES).

22. The apparatus for performing cryptographic operations as recited in claim 20, wherein the cryptography unit comprises: a two-stage round engine, configured to pipeline the first and the second input text blocks.

23. The apparatus for performing cryptographic operations as recited in claim 17, wherein the cryptographic instruction is prescribed according to the x86 instruction format.
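The two-stage round engine of claim 22 can be pictured as a stage-occupancy schedule. The following is a sketch under stated assumptions, not the patented design: it assumes each cipher round occupies stage 0 for one cycle and then stage 1 for one cycle, and that the second block trails the first by one cycle. The function name `two_stage_schedule` and the block labels `A` and `B` are illustrative.

```python
from typing import List, Optional, Tuple

# (block in stage 0, block in stage 1) for one clock cycle.
Cycle = Tuple[Optional[str], Optional[str]]


def two_stage_schedule(rounds: int) -> List[Cycle]:
    """Return per-cycle stage occupancy for two interleaved blocks.

    Assumption: a round is stage 0 followed by stage 1, one cycle each.
    Block "A" starts at cycle 0; block "B" starts at cycle 1, so the two
    blocks alternate between the stages and keep both stages busy.
    """
    timeline: List[Cycle] = []
    for cycle in range(2 * rounds + 1):
        stages: List[Optional[str]] = [None, None]
        if cycle < 2 * rounds:             # block A occupies cycles 0 .. 2R-1
            stages[cycle % 2] = "A"
        if cycle >= 1:                     # block B trails A by one cycle
            stages[(cycle - 1) % 2] = "B"
        timeline.append((stages[0], stages[1]))
    return timeline


if __name__ == "__main__":
    for n, occupancy in enumerate(two_stage_schedule(2)):
        print(n, occupancy)
```

Under these assumptions, two blocks of `rounds` rounds each complete in `2 * rounds + 1` cycles, whereas passing them through the same two stages one after the other would take `4 * rounds` cycles.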
24. A method for performing cryptographic operations in a device, the method comprising: translating a cryptographic instruction into a first microinstruction and a second microinstruction, the cryptographic instruction prescribing a cryptographic operation, the first microinstruction directing the device to load a second input text block and to execute the cryptographic operation on the second input text block, the second microinstruction directing the device to store a first output text block, the first output text block corresponding to execution of the cryptographic operation on a first input text block; and issuing the first microinstruction to a cryptography unit and then issuing the second microinstruction to the cryptography unit; whereby the output text block can be stored while the cryptographic operation is being executed on the second input text block.

25. The method for performing cryptographic operations in a device as recited in claim 24, wherein the translating comprises: prescribing, via the first microinstruction, that an encryption operation be executed on the second text block to generate a corresponding second ciphertext block.

26. The method for performing cryptographic operations in a device as recited in claim 24, wherein the translating comprises: prescribing, via the first microinstruction, that a decryption operation be executed on the second text block to generate a corresponding second plaintext block.
27. The method for performing cryptographic operations in a device as recited in claim 24, further comprising: executing the first and the second microinstructions in a cryptography unit, wherein the executing comprises: storing the output text block while executing the cryptographic operation on the second input text block.

28. The method for performing cryptographic operations in a device as recited in claim 24, wherein the cryptographic instruction prescribes that the cryptographic operation be executed according to the Advanced Encryption Standard (AES).

29. The method for performing cryptographic operations in a device as recited in claim 24, further comprising: executing the first and the second microinstructions in a cryptography unit, wherein the executing comprises pipelining the first and the second input text blocks through a two-stage round engine.

Designated representative figure
(1) The representative figure of this case is: Figure 6.
(2) Brief description of the reference numerals in the representative figure:
601 Fetch logic
602 Translation logic
603 Translator
604 Microcode ROM
605 Register
606 Address
607 Load
608 Execute
609 Micro queue
610 Integer unit
611, 613, 615 Micro
612 Floating-point unit
614 MMX unit
616 SSE unit
617 Cryptography unit
618 Store
619 Write back
620 Load
621 Stall
622 Store
626 Interrupt logic
632 Execution logic
TW093129342A 2003-09-29 2004-09-29 Microprocessor apparatus and method for optimizing block cipher cryptographic functions TWI253268B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US50697103P 2003-09-29 2003-09-29

Publications (2)

Publication Number Publication Date
TW200513084A (en) 2005-04-01
TWI253268B (en) 2006-04-11

Family

ID=34619303

Family Applications (1)

Application Number Title Priority Date Filing Date
TW093129342A TWI253268B (en) 2003-09-29 2004-09-29 Microprocessor apparatus and method for optimizing block cipher cryptographic functions

Country Status (2)

Country Link
CN (1) CN100527664C (en)
TW (1) TWI253268B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104683093B (en) * 2013-11-27 2018-01-26 财团法人资讯工业策进会 Block encryption device, block encryption method, block decryption device, and block decryption method with integrity verification
CN107330552A (en) * 2017-06-28 2017-11-07 无锡井通网络科技有限公司 Intelligent trade matching method for digital assets in a distributed system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4250546A (en) * 1978-07-31 1981-02-10 Motorola, Inc. Fast interrupt method

Also Published As

Publication number Publication date
CN100527664C (en) 2009-08-12
TW200513084A (en) 2005-04-01
CN1592189A (en) 2005-03-09

Similar Documents

Publication Publication Date Title
US7321910B2 (en) Microprocessor apparatus and method for performing block cipher cryptographic functions
TWI303936B (en) Apparatus and method for generating a cryptographic key schedule in a microprocessor
TWI351864B (en) Apparatus and method for employing cyrptographic f
US7844053B2 (en) Microprocessor apparatus and method for performing block cipher cryptographic functions
US7532722B2 (en) Apparatus and method for performing transparent block cipher cryptographic functions
EP1519509B1 (en) Apparatus and method for providing user-generated key schedule in a microprocessor cryptographic engine
US7392400B2 (en) Microprocessor apparatus and method for optimizing block cipher cryptographic functions
CN105302522A (en) Gf256 SIMD instructions and logic to provide general purpose Gf256 SIMD cryptographic arithmetic functionality
US7502943B2 (en) Microprocessor apparatus and method for providing configurable cryptographic block cipher round results
US7529368B2 (en) Apparatus and method for performing transparent output feedback mode cryptographic functions
US7536560B2 (en) Microprocessor apparatus and method for providing configurable cryptographic key size
US7900055B2 (en) Microprocessor apparatus and method for employing configurable block cipher cryptographic algorithms
US7542566B2 (en) Apparatus and method for performing transparent cipher block chaining mode cryptographic functions
TWI253268B (en) Microprocessor apparatus and method for optimizing block cipher cryptographic functions
US7519833B2 (en) Microprocessor apparatus and method for enabling configurable data block size in a cryptographic engine
CN1661958B (en) Microprocessor apparatus of block cryptographic functions and method
TWI247241B (en) Microprocessor apparatus and method for performing block cipher cryptographic functions
US7529367B2 (en) Apparatus and method for performing transparent cipher feedback mode cryptographic functions
TWI274280B (en) Microprocessor apparatus and method for employing configurable block cipher cryptographic algorithms
TW200536335A (en) Apparatus and method for performing transparent cipher feedback mode cryptographic functions
TW200536332A (en) Microprocessor apparatus and method for enabling configurable data block size in a cryptographic engine
TWI272815B (en) Apparatus and method for performing transparent output feedback mode cryptographic functions
TW200536329A (en) Apparatus and method for performing transparent cipher block chaining mode cryptographic functions