TWI282066B

TWI282066B - Apparatus and method for extending data modes in a microprocessor

Info

Publication number: TWI282066B
Application number: TW91124005A
Authority: TW
Inventors: G Glenn Henry; Rodney E Hooker; Terry Parks
Original assignee: Ip First Llc
Priority date: 2002-08-22
Filing date: 2002-10-18
Publication date: 2007-06-01
Also published as: CN1218243C; CN1431584A

Abstract

An apparatus and method are provided for extending a microprocessor instruction set beyond its current capabilities to allow for extended size operands specifiable by programmable instructions in the microprocessor instruction set. The apparatus includes translation logic and extended execution logic. The translation logic translates an extended instruction into corresponding micro instructions for execution by the microprocessor. The extended instruction has an extended prefix and an extended prefix tag. The extended prefix specifies an extended operand size for an operand corresponding to a prescribed operation, where the extended operand size cannot be specified by an existing instruction set. The extended prefix tag indicates the extended prefix, where the extended prefix tag is an otherwise architecturally specified opcode within the existing instruction set. The extended execution logic is coupled to the translation logic. The extended execution logic receives the corresponding micro instructions and performs the prescribed operation using the operand.

Description

IX. Description of the Invention

[Cross-Reference to Related Applications]

[0001] This application claims priority to U.S. application No. 10/227008, filed August 22, 2002.

[0002] This application is related to the following co-pending U.S. patent applications, all of which have the same applicant and inventors. They are identified here by Taiwan application number, filing date, and docket number.

Taiwan application no.   Filing date   Docket number   Title
91116957                 7/30/02       CNTR:2176       Apparatus and method for extending a microprocessor instruction set
91116958                 7/30/02       CNTR:2186       Apparatus and method for executing conditional instructions
91124008                 10/18/02      CNTR:2187       Apparatus and method for selectively controlling memory attributes
91116956                 7/30/02       CNTR:2188       Apparatus and method for selectively controlling condition code write-back
91116959                 7/30/02       CNTR:2189       Mechanism for increasing the number of registers in a microprocessor
91124006                 10/18/02      CNTR:2191       Apparatus and method for extending microprocessor address modes
(none given)             (none given)  CNTR:2192       Suppression of store checking
(none given)             (none given)  CNTR:2193       Selective interrupt suppression
91124007                 10/18/02      CNTR:2195       Non-temporal memory reference control mechanism
91116672                 7/26/02       CNTR:2198       Apparatus and method for selectively controlling result write-back

Client's Docket No.:   TT's Docket No.: 0608-A40748TWF1/DEVICHEN/2006-07-05

[Technical Field]

[0003] The present invention relates in general to the field of microelectronics, and more particularly to a technique for incorporating extended data modes into an existing microprocessor instruction set architecture.

[Prior Art]

[0004] Since their inception in the early 1970s, the use of microprocessors has grown exponentially. Originally applied in scientific and technical fields, microprocessors have since moved from those specialized fields into commercial consumer products such as desktop and laptop computers, video game controllers, and many other common household and commercial devices.

[0005] Along with this explosive growth in use, the art has experienced a corresponding technological advance, characterized by ever-increasing demands for faster speed, greater addressing capability, faster memory access, larger operand sizes, more general-purpose operation types (e.g., floating point, single-instruction multiple-data (SIMD), conditional moves), and additional special-purpose operations (e.g., digital signal processing functions and other multimedia operations). These demands have driven remarkable advances in the field, all of which have been incorporated into microprocessor designs: extensive pipelining, superscalar architecture, cache structures, out-of-order processing, burst access mechanisms, branch prediction, and speculative execution. Compared with its predecessors of 30 years ago, a present-day microprocessor is an amazingly complex and capable machine.

[0006] Unlike many other products, however, microprocessors are subject to another very important factor that has constrained, and continues to constrain, the evolution of their architecture. This factor, legacy software compatibility, accounts for much of the complexity of today's microprocessors. For market reasons, many manufacturers have chosen to incorporate new architectural features into updated microprocessor designs while retaining in those latest products all of the capabilities required to ensure compatibility with older, so-called "legacy" application programs.

[0007] Nowhere is this burden of legacy software compatibility more evident than in the development history of x86-compatible microprocessors. It is well known that present-day 32/16-bit virtual-mode x86 microprocessors can still execute 8-bit real-mode application programs written in the 1980s. Those skilled in the art also acknowledge that a significant amount of architectural "baggage" is carried in the x86 architecture solely to support compatibility with legacy applications and operating modes. Although in the past developers could add newly developed architectural features to an existing instruction set architecture, the means by which those features are used, namely programmable instructions, have now become scarce. Put more simply, in certain important instruction sets there are no "spare" instructions left that would let designers incorporate newer features into an existing architecture.

[0008] In the x86 instruction set architecture, for example, there is no undefined one-byte opcode state that remains unused. In the primary one-byte x86 opcode map, all 256 opcode states are already occupied by existing instructions. As a result, x86 microprocessor designers must now choose between providing new features and preserving legacy software compatibility. To provide new programmable features, opcode states must be assigned to those features. If the existing instruction set architecture has no spare opcode states, then some existing opcode states must be redefined to provide the new features. Legacy software compatibility is thus sacrificed in order to provide new features.

[0009] One problem that continues to plague microprocessor designers is operand size. Early microprocessor designs provided operations on 8-bit operands. As the computations performed by applications grew more demanding, operand sizes and the associated operations grew to 16 bits. Today's microprocessors for desktop/laptop applications provide 32-bit operands and operations. The operand size of a microprocessor's operations is commonly referred to as its data mode. To retain compatibility with legacy applications, modern desktop/laptop microprocessors can operate in 32-bit, 16-bit, and 8-bit data modes.

[0010] Even now, however, some application domains are adversely affected because microprocessors cannot support extended data modes such as 64-bit and 128-bit data modes. Yet supporting these extended modes within an architecture that has no spare opcode values would require redefining existing opcodes, which would make it impossible to support legacy applications.

[0011] What is needed, therefore, is an apparatus and method for incorporating extended data modes into an existing microprocessor instruction set architecture, where the instruction set architecture is fully populated with defined opcodes, and where incorporation of the extended data modes allows a conforming legacy microprocessor to retain the ability to execute legacy application programs.
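To make the limitation described in paragraphs [0009] and [0010] concrete, the following minimal Python sketch models how software restricted to a 32-bit data mode might synthesize a 64-bit addition from two 32-bit additions linked by a carry. The function names and the two-step decomposition are illustrative assumptions and are not taken from the application.

```python
MASK32 = 0xFFFFFFFF  # widest operand available in the assumed 32-bit data mode


def add32_with_carry(a, b, carry_in=0):
    """Add two 32-bit values plus a carry-in; return (32-bit sum, carry-out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64_on_32bit_machine(x, y):
    """Hypothetical helper: add two 64-bit values using only 32-bit additions."""
    lo, carry = add32_with_carry(x, y)                 # low halves: plain add
    hi, _ = add32_with_carry(x >> 32, y >> 32, carry)  # high halves: add with carry
    return (hi << 32) | lo                             # result wraps modulo 2**64
```

Under these assumptions, every 64-bit addition costs two dependent 32-bit operations; a 64-bit data mode would let the same result come from a single operation on 64-bit operands.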

[Summary of the Invention]

[0012] The present invention, like the other applications cited above, is directed to overcoming these and other problems and disadvantages of the prior art. The present invention provides a superior technique for extending a microprocessor instruction set beyond its current capabilities so as to provide extended operand sizes on which the programmable instructions of the instruction set can operate. In one embodiment, an apparatus is provided. The apparatus includes translation logic and extended execution logic. The translation logic translates an extended instruction into corresponding micro instructions for execution by the microprocessor. The extended instruction has an extended prefix and an extended prefix tag. The extended prefix specifies an extended operand size for an operand corresponding to a prescribed operation, where the extended operand size cannot be specified by the existing instruction set. The extended prefix tag indicates the extended prefix, where the extended prefix tag is an otherwise architecturally specified opcode within the existing instruction set. The extended execution logic is coupled to the translation logic; it receives the corresponding micro instructions and performs the prescribed operation using the operand.

[0013] Another object of the present invention is to provide an instruction set extension module that adds extended data mode capability to an existing instruction set. The instruction set extension module has an escape tag, an extended prefix, and extended execution logic. The escape tag is received by translation logic and indicates that accompanying parts of a corresponding instruction prescribe an extended operation to be performed by the microprocessor, where the escape tag is a first opcode within the existing instruction set. The extended operand size specifier is coupled to the escape tag, is one of the accompanying parts, and prescribes one of a plurality of data modes corresponding to the extended operation. The extended execution logic is coupled to the translation logic and executes the extended operation using the prescribed data mode, where the existing instruction set does not provide the prescribed data mode.

[0014] A further object of the present invention is to provide a method for extending an existing instruction set architecture to provide extended data modes. The method includes: providing an extended instruction that includes an extended tag and an extended prefix, where the extended tag is a first opcode entry in the existing instruction set architecture; translating the extended instruction into micro instructions, which includes detecting the extended tag and translating the extended prefix and the remainder of the extended instruction according to extended translation rules so as to override the default data mode for the extended operation; prescribing, via the extended prefix and the remainder of the extended instruction, an extended data mode and a prescribed operation, where the existing instruction set architecture only provides instructions that specify existing data modes and not the extended data mode; and executing the prescribed operation according to the extended data mode.

[Brief Description of the Drawings]

[0015] The foregoing and other objects, features, and advantages of the present invention will be better understood with regard to the following description and accompanying drawings:

[0016] Figure 1 is a block diagram of a related-art microprocessor instruction format;

[0017] Figure 2 is a table depicting how instructions in an instruction set architecture map to logic states of the bits in an 8-bit opcode byte within the instruction format of Figure 1;

[0018] Figure 3 is a block diagram of an extended instruction format according to the present invention;

[0019] Figure 4 is a table showing how extended architectural features map to logic states of the bits in an 8-bit extended prefix embodiment according to the present invention;

[0020] Figure 5 is a block diagram illustrating a pipelined microprocessor for employing extended data modes according to the present invention;

[0021] Figure 6 is a block diagram of one embodiment of an extended prefix for specifying an extended data mode in a microprocessor according to the present invention;

[0022] Figure 7 is a block diagram showing details of the translate stage logic within the microprocessor of Figure 5;

[0023] Figure 8 is a block diagram of the extended execution stage logic within the microprocessor of Figure 5; and

[0024] Figure 9 is a flow chart depicting a method for translating and executing instructions that specify an extended data mode operation in a microprocessor according to the present invention.

[Description of Main Reference Numerals]

(Application No. 91124005; replacement page dated January 4, 2007.)

100  instruction format
101  prefix
102  opcode
103  address specifier
200  8-bit opcode map
201  opcode value
202  opcode F1H
300  extended instruction format
301  prefix
302  opcode
303  address specifier
304  escape tag
305  extended prefix
400  8-bit prefix map
401  architectural feature
500  pipelined microprocessor
501  fetch logic
(numeral lost in source)  external memory
504  translation logic
505  extended translation logic
506  micro instruction queue
507  execution logic
508  extended execution logic
600  extended prefix
700  translate stage logic
701  boot state signal
702  machine specific register
703  extended feature field
704  instruction buffer
705  translation logic
706  translation controller
707  disable signal
708  escape instruction detector
709  extended decoder
710  instruction decoder
711  control read-only memory (ROM)
712  micro instruction buffer
713  opcode extension field
714  micro opcode field
715  destination field
716  source field
717  displacement field
800  extended register stage logic
801  micro instruction buffer
802  register logic
803  extended register file
804  source operand address field
805  source operand address field
806  extended read logic
807  extended write-back logic
808  micro instruction buffer
809  operand buffer
810  operand buffer
811  completed micro instruction buffer
812  result buffer
813  result buffer
814  operand field
815  operand extension field
816  extended registers
900-924  steps of the method for translating and executing instructions that specify an extended data mode operation in a microprocessor

[Detailed Description]

[0025] The following description is presented in the context of a particular embodiment and its requirements, to enable one of ordinary skill in the art to make and use the present invention. Various modifications to the preferred embodiment will, however, be apparent to those skilled in the art, and the general principles discussed herein may be applied to other embodiments. Therefore, the present invention is not limited to the particular embodiments shown and described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

[0026] In view of the above background discussion of the techniques employed within present-day microprocessors to extend their architectural features beyond the capabilities of their associated instruction sets, a related-art example will now be discussed with reference to Figures 1 and 2. The discussion highlights the dilemma that microprocessor designers continually face: on the one hand, they want to incorporate newly developed architectural features into a microprocessor design; on the other hand, they want to retain the ability to execute legacy applications. In the example of Figures 1 and 2, a fully populated opcode map rules out adding new opcodes to the example architecture, forcing designers either to incorporate the new features at the expense of some degree of legacy software compatibility, or to forgo the latest architectural advances altogether in order to keep the microprocessor compatible with legacy applications. Following the related-art discussion, the present invention is discussed with reference to Figures 3 through 9. By employing an existing but unused opcode as a prefix tag for an extended instruction, the present invention allows microprocessor designers to overcome the limitations of a fully populated instruction set architecture, providing programmers with operands longer than those presently available while retaining all of the capabilities required to execute legacy applications.

[0027] Referring to Figure 1, a block diagram is presented of a related-art microprocessor instruction format 100. The related-art instruction 100 has a variable number of entries 101-103, each set to a specific value, that together make up a specific instruction 100 for a microprocessor. The specific instruction 100 directs the microprocessor to perform a specific operation, such as adding two operands together, or moving an operand from memory to an internal register or from the internal register to memory. In general, an opcode entry 102 within the instruction 100 prescribes the specific operation to be performed, and optional address specifier entries 103 follow the opcode 102 to prescribe additional information about the specific operation, such as how the operation is to be performed and where the operands are located. The instruction format 100 also allows a programmer to prefix an opcode 102 with prefix entries 101. The prefixes 101 indicate whether particular architectural features are to be employed during execution of the specific operation prescribed by the opcode 102. Typically, these architectural features can be applied to most of the operations prescribed by any of the opcodes 102 in the instruction set. For example, prefixes 101 exist today in a number of microprocessors that can execute operations using operands of different sizes (e.g., 8-bit, 16-bit, 32-bit). While many such processors are programmed to a default operand size (say, 32-bit), the prefixes 101 provided in their respective instruction sets enable programmers to selectively override the default operand size on an instruction-by-instruction basis (e.g., to perform 16-bit operations). Selectable operand size is merely one example of an architectural feature that, in many present-day microprocessors, spans a significant number of the operations (e.g., add, subtract, multiply, Boolean logic) that can be specified by opcodes 102.

[0028] One well-known instance of the instruction format 100 shown in Figure 1 is the x86 instruction format 100, which is employed by all present-day x86-compatible microprocessors. More specifically, the x86 instruction format 100 (also known as the x86 instruction set architecture 100) uses 8-bit prefixes 101, 8-bit opcodes 102, and 8-bit address specifiers 103. The x86 architecture 100 has several prefixes 101 as well: two of them override the default address/data sizes of an x86 microprocessor (opcode states 66H and 67H); another directs the microprocessor to interpret the following opcode byte 102 according to different translation rules (prefix value 0FH, which causes translation to be performed according to the so-called 2-byte opcode rules); and others cause particular operations to be repeated until repetition criteria are satisfied (the REP opcodes: F0H, F2H, and F3H).

[0029] Referring now to Figure 2, a table 200 is presented depicting how instructions 201 of an instruction set architecture map to the bit values of an 8-bit opcode byte 102 within the instruction format of Figure 1. The table 200 presents an exemplary 8-bit opcode map 200 that associates up to 256 values of an 8-bit opcode entry 102 with corresponding microprocessor opcode instructions 201. The table 200 maps a particular value of an opcode entry 102, say 02H, to a corresponding opcode instruction 201 (i.e., instruction I02 201). In the particular case of the x86 opcode map, it is well known in the art that opcode value 14H maps to the x86 Add With Carry (ADC) instruction, which adds an 8-bit immediate operand to the contents of architectural register AL. One skilled in the art will also appreciate that the x86 prefixes 101 noted above (i.e., 66H, 67H, 0FH, F0H, F2H, and F3H) are actual opcode values 201 that, in different contexts, specify that particular architectural extensions are to be applied to the operation prescribed by a following opcode entry 102. For example, preceding opcode 14H (normally, the ADC opcode discussed above) with prefix 0FH causes an x86 processor to execute an "Unpack and Interleave Low Packed Single-Precision Floating-Point Values" operation instead of the ADC operation. Features such as those described in this x86 example are enabled in part in a present-day microprocessor because the instruction translation/decoding logic within the microprocessor interprets the entries 101-103 of an instruction 100 in order. Hence, the use of specific opcode values as prefixes 101 in an instruction set architecture has, in the past, allowed microprocessor designers to incorporate a significant number of advanced architectural features into designs that remain compatible with legacy software, without adversely affecting the execution of older programs that do not use those specific opcode states. For example, a legacy program that never uses x86 opcode 0FH will still run on a present-day x86 microprocessor. And a newer application, by employing x86 opcode 0FH as a prefix 101, can make use of a substantial number of more recently incorporated x86 architectural features, such as single-instruction multiple-data (SIMD) operations and conditional move operations.

[0030] Although architectural features have been provided in the past by designating available/spare opcode values 201 as prefixes 101 (also known as architectural feature tags/indicators 101 or escape instructions 101), many instruction set architectures 100 have run into a barrier to further enhancement for a very straightforward reason: all of the available/spare opcode values have been used up. That is, all of the opcode values in the opcode map 200 have been architecturally specified. When all of the available values have been assigned as either opcode entries 102 or prefix entries 101, no values remain to serve as a means for incorporating new features. This significant problem exists in many microprocessor architectures today, and it forces designers to choose between adding architectural features and retaining compatibility with legacy programs.
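The in-order interpretation of prefix and opcode entries described in paragraphs [0027] through [0029] can be sketched as follows. This is a simplified model rather than an x86 decoder: it recognizes only the prefix values named in the text (66H, 67H, F0H, F2H, F3H) and the 0FH two-byte escape, it does not interpret address specifiers, and the function and constant names are hypothetical.

```python
# Prefix byte values named in the text; real x86 defines further prefixes
# that this simplified sketch deliberately ignores.
LEGACY_PREFIXES = {0x66, 0x67, 0xF0, 0xF2, 0xF3}
TWO_BYTE_ESCAPE = 0x0F  # next byte is decoded under the 2-byte opcode rules


def split_instruction(code):
    """Split instruction bytes into (prefix bytes, opcode bytes, remaining bytes).

    Entries are consumed strictly in order: prefixes first, then the
    opcode, then whatever follows it (left uninterpreted here).
    """
    pos = 0
    while pos < len(code) and code[pos] in LEGACY_PREFIXES:
        pos += 1
    prefixes = bytes(code[:pos])

    if pos >= len(code):
        raise ValueError("instruction ends before an opcode byte")

    # 0FH changes how the next byte is read, so both bytes form the opcode.
    opcode_len = 2 if code[pos] == TWO_BYTE_ESCAPE else 1
    if pos + opcode_len > len(code):
        raise ValueError("two-byte escape without a second opcode byte")
    opcode = bytes(code[pos:pos + opcode_len])

    return prefixes, opcode, bytes(code[pos + opcode_len:])
```

Under this model, the bytes 14H 05H split into the one-byte opcode 14H (ADC in the text) followed by one trailing byte, whereas a sequence beginning 0FH 14H yields the two-byte opcode 0FH 14H, the unpack operation mentioned above. Because every one of the 256 first-byte values already means something, a model like this leaves no unassigned value to which a new prefix could be attached.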

[0031] Note that the instructions 201 shown in FIG. 2 are depicted in generic terms (i.e., I24, I86) rather than as actual operations (such as add with carry, subtract, or exclusive-OR). This is because, in a number of different microprocessor architectures, a fully occupied opcode map 200 architecturally precludes the incorporation of more recent advances. Although the example of FIG. 2 refers to an 8-bit opcode entity 102, one skilled in the art will appreciate that the specific size of the opcode 102 is irrelevant except as a specific case for discussing the problem caused by a fully occupied opcode structure 200. Accordingly, a fully occupied 6-bit opcode map would have 64 architecturally specified opcodes/prefixes 201 and would likewise offer no available/spare opcode values for expansion.

[0032] An alternative, rather than discarding the existing instruction set entirely and replacing it with a new format 100 and opcode map 200, is to substitute new instruction meanings for only a subset of the existing opcodes 201, such as opcodes 40H through 4FH in FIG. 2. Under this hybrid technique, a microprocessor operates exclusively in one of two modes: a legacy mode, in which opcodes 40H-4FH are interpreted according to the legacy rules, or an enhanced mode, in which opcodes 40H-4FH are interpreted according to the enhanced architectural rules. This technique does allow designers to incorporate new features into a design. A drawback remains, however: while a legacy-compatible microprocessor is running in enhanced mode, it cannot execute any application program that uses opcodes 40H-4FH. From the standpoint of retaining compatibility with legacy software, the legacy-compatible/enhanced-mode technique is therefore unacceptable.

[0033] Consider, however, an instruction set 100 whose opcode space is completely occupied, where that space covers all application programs that execute on legacy microprocessors.
For such an instruction set, the present inventors have noted how the opcodes 201 are actually used, and they have observed that some instructions 202, although architecturally specified, are not employed in application programs that microprocessors can execute. Instruction IF1 202 in FIG. 2 is one example of this phenomenon. In fact, the same opcode value 202 (i.e., F1H) maps to a valid instruction 202 that goes unused in the x86 instruction set architecture. Although the unused x86 instruction 202 is a valid x86 instruction 202 that directs an x86 microprocessor to perform an architecturally specified operation, it is not used in any application program that can execute on a modern x86 microprocessor. This particular x86 instruction 202 is known as In Circuit Emulation Breakpoint (i.e., ICE BKPT, opcode value F1H), and it was formerly used exclusively in a class of microprocessor emulation equipment that no longer exists. ICE BKPT 202 was never employed in an application program outside of an in-circuit emulator, and the in-circuit emulation equipment that formerly used ICE BKPT 202 is gone. Hence, in the x86 case, the present inventors have identified a means within a fully occupied instruction set architecture 200 whereby a valid yet unused opcode 202 allows advanced architectural features to be incorporated into a microprocessor design without sacrificing compatibility with legacy software. Within a fully occupied instruction set architecture 200, the present invention uses an architecturally specified but unused opcode 202 as an indicator tag for an n-bit prefix that follows it. This allows microprocessor designers to incorporate up to 2^n newly developed architectural features into a design while retaining complete compatibility with all legacy software.

[0034] The present invention applies the prefix tag/extended prefix concept by providing an n-bit extended operand size specifier prefix, which allows a programmer to specify, on a per-instruction basis, an extended data mode for a corresponding operation in a microprocessor.
The extended data mode is used in place of an existing data mode supported by the microprocessor's existing instruction set architecture. The present invention will now be discussed with reference to FIGS. 3 through 9.

[0035] Referring now to FIG. 3, a block diagram of an extended instruction format 300 according to the present invention is shown. Much like the format 100 discussed with reference to FIG. 1, the extended format 300 has a variable number of instruction entities 301-305, each set to a specific value, that together make up a specific instruction 300 for a microprocessor. The specific instruction 300 directs the microprocessor to perform a specific operation, such as adding two operands or moving an operand from memory to a register in the microprocessor. In general, the opcode entity 302 of the instruction 300 prescribes the specific operation to be performed, and optional address specifier entities 303 follow the opcode 302 to provide additional information about that operation, such as how the operation is to be performed and where the operands are located. The instruction format 300 also allows a programmer to place prefix entities 301 before the opcode 302. The prefix entities 301 indicate whether existing architectural features are to be used when the specific operation prescribed by the opcode 302 is performed.

[0036] The extended instruction 300 of the present invention, however, is a superset of the instruction format 100 described with reference to FIG. 1. It has two additional entities 304 and 305, which are optionally provided as an instruction extension and which precede all of the remaining entities 301-303 in a formatted extended instruction 300. These two additional entities 304 and 305 let a programmer specify an extended data mode within a legacy-compatible microprocessor so that an operation is performed according to that mode, where the extended data mode is not otherwise programmable with the existing instruction set of the legacy-compatible microprocessor. The two additional entities 304 and 305 allow larger operands/operations to be incorporated into a microprocessor design that has a fully occupied instruction set architecture.
The optional entities 304 and 305 are an extended instruction tag 304 and an extended operand size specifier prefix 305. The extended instruction tag 304 is an otherwise architecturally specified opcode within the microprocessor instruction set. In an x86 embodiment, the extended instruction tag 304, or escape tag 304, is opcode state F1H, the formerly used ICE BKPT instruction. The escape tag 304 indicates to the microprocessor logic that the extended prefix 305, or extended features specifier 305, follows, where the extended prefix 305 prescribes an operand/operation size, or data mode, for a corresponding operation. In one embodiment, the escape tag 304 indicates that the accompanying parts 301-303 and 305 of a corresponding extended instruction 300 prescribe an extended operation to be performed by the microprocessor. The extended operand size specifier 305, or extended prefix 305, specifies one of a plurality of data modes for an associated operation. When performing the extended operation, extended execution logic in the microprocessor accesses operands in extended-size registers and processes them using rules consistent with the specified operand size or data mode.
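
The layout described above can be illustrated with a short sketch. This is not the patent's implementation, only a minimal Python model under stated assumptions: the escape tag F1H and the 8-bit extended prefix come from the x86 embodiment in the text, while the function name `split_extended_instruction`, the dictionary keys, and the sample bytes are hypothetical. The sketch also assumes the tag is the first byte, which follows from the statement that entities 304 and 305 precede entities 301-303.

```python
# Minimal sketch (assumptions labeled): split an instruction byte string into
# the escape tag 304, the extended prefix 305, and the remaining entities.
ESCAPE_TAG = 0xF1  # opcode state F1H (ICE BKPT) in the x86 embodiment


def split_extended_instruction(raw: bytes) -> dict:
    """Return the parts of an instruction laid out as [tag][prefix][rest].

    Hypothetical helper: the key names are illustrative, not from the source.
    """
    if len(raw) >= 2 and raw[0] == ESCAPE_TAG:
        return {
            "extended": True,
            "tag": raw[0],      # extended instruction tag 304
            "prefix": raw[1],   # 8-bit extended prefix 305
            "rest": raw[2:],    # entities 301-303, left undecoded here
        }
    # No tag: the whole byte string is treated as a conventional instruction.
    return {"extended": False, "tag": None, "prefix": None, "rest": raw}


# Sample bytes are made up for illustration only.
parts = split_extended_instruction(bytes([0xF1, 0x01, 0x03, 0xC3]))
print(parts["extended"], hex(parts["prefix"]), parts["rest"].hex())
```

Keeping the remaining bytes untouched mirrors the idea that the rest of the instruction is decoded exactly as an existing instruction would be.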

[0037] To summarize the extended data mode technique of the present invention: an extended instruction is configured to prescribe an extended data mode within an existing microprocessor instruction set, where the extended data mode cannot otherwise be prescribed by that instruction set. The extended instruction includes one of the opcodes/instructions 304 of the existing instruction set and an n-bit extended features prefix 305. The selected opcode/instruction serves as an indicator 304 that the instruction 300 is an extended features instruction 300 (i.e., it prescribes an extension to the microprocessor architecture), and the n-bit features prefix 305 indicates the extended data mode. In one embodiment, the extended prefix 305 is eight bits wide and can specify up to 256 different data modes.
An n-bit prefix embodiment can specify up to 2^n different data modes. In an x86 embodiment, a 64-bit data mode is provided in place of the default data mode (e.g., 32-bit or 16-bit) of a legacy-compatible microprocessor. When the corresponding operation is performed, the execution logic therefore carries out a 64-bit operation (e.g., addition, subtraction, or a logical operation) on 64-bit operands. Another embodiment further allows a programmer to specify either a 64-bit or a 128-bit data mode.

[0038] Referring now to FIG. 4, a table 400 shows how extensions are mapped to the logic states of the bits in an 8-bit extended prefix embodiment according to the present invention. Like the opcode map 200 discussed with reference to FIG. 2, the table 400 of FIG. 4 presents an exemplary 8-bit extended data mode prefix map 400, which associates up to 256 values of an 8-bit extended prefix entity 305 with corresponding extended data modes 401 (e.g., E34, E4D, etc.) of a legacy-compatible microprocessor. In an x86 embodiment, the 8-bit extended features prefix 305 of the present invention serves to provide data modes 401 (i.e., E00-EFF) that the current x86 instruction set architecture does not provide.

[0039] The extended features 401 shown in FIG. 4 are depicted generically rather than as specific features, because the technique of the present invention applies to a variety of different architectural extensions 401 and specific instruction set architectures. One skilled in the art will appreciate that many different architectural features 401, some of which are noted above, can be incorporated into an existing instruction set using the escape tag 304/extended prefix 305 technique described herein. The 8-bit prefix embodiment of FIG. 4 provides for up to 256 different features 401, while an n-bit prefix embodiment allows the programming of up to 2^n different features 401.
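
To make the effect of an extended data mode concrete, the sketch below performs the same addition under a default 32-bit mode and under a 64-bit extended mode. It is an illustration only: the prefix-to-mode assignments in `PREFIX_TO_BITS` are hypothetical (FIG. 4 lists the modes only generically as E00 through EFF), and `mode_count` and `add_in_mode` are invented helper names.

```python
# Illustrative sketch: how the operand size chosen by an extended prefix
# changes the result of one operation. Prefix values here are hypothetical.
DEFAULT_BITS = 32  # default data mode of the legacy x86 example in the text

# Only two invented assignments are shown; an 8-bit prefix allows up to 256.
PREFIX_TO_BITS = {0x00: 64, 0x01: 128}


def mode_count(prefix_width_bits: int) -> int:
    """Maximum number of data modes an n-bit prefix can specify (2**n)."""
    return 2 ** prefix_width_bits


def add_in_mode(a: int, b: int, prefix=None) -> int:
    """Add two operands, wrapping at the width selected by the prefix."""
    bits = DEFAULT_BITS if prefix is None else PREFIX_TO_BITS[prefix]
    mask = (1 << bits) - 1
    return (a + b) & mask


x = 0xFFFF_FFFF
print(mode_count(8))                 # 256
print(hex(add_in_mode(x, 1)))        # 0x0 (wraps at 32 bits)
print(hex(add_in_mode(x, 1, 0x00)))  # 0x100000000 (64-bit mode)
```

The masking step stands in for the execution logic operating at the selected width; real hardware would also handle flags and sign, which this sketch ignores.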

[0040] Referring now to FIG. 5, a block diagram illustrates a pipeline microprocessor 500 for performing extended data mode operations according to the present invention. The microprocessor 500 has three notable stage categories: fetch, translate, and execute. The fetch stage has fetch logic 501 that retrieves instructions from an instruction cache 502 or external memory 502. The retrieved instructions are provided to the translate stage via an instruction queue 503. The translate stage has translation logic 504 that is coupled to a micro instruction queue 506. The translation logic 504 includes extended translation logic 505. The execute stage has execution logic 507 that contains extended execution logic 508.

[0041] In operation according to the present invention, the fetch logic 501 retrieves formatted instructions from the instruction cache/external memory 502 and places them in the instruction queue 503 in execution order. The instructions are then retrieved from the instruction queue 503 and provided to the translation logic 504. The translation logic 504 translates/decodes each provided instruction into a corresponding sequence of micro instructions that directs the microprocessor 500 to perform the operations the instruction prescribes. According to the present invention, the extended translation logic 505 detects instructions that have an extended prefix tag and translates/decodes the corresponding extended data mode specifier prefix. In an x86 embodiment, the extended translation logic 505 is configured to detect an extended prefix tag of value F1H, which is the x86 ICE BKPT opcode. Extended micro instruction fields are provided in the micro instruction queue 506 so that extended data modes can be specified within the microprocessor 500.

[0042] The micro instructions are provided from the micro instruction queue 506 to the execution logic 507, where the extended execution logic 508 is configured to access internal microprocessor registers as specified by the extended micro instruction fields. The source operands specified for a prescribed operation are retrieved from source operand extended registers.
The extended execution logic 508 performs the operation prescribed by the micro instructions and generates corresponding results. As the results are generated, the extended execution logic 508 writes them back to the destination operand extended registers specified by the extended micro instruction fields.

[0043] One skilled in the art will appreciate that the microprocessor 500 shown in FIG. 5 is a simplified representation of a modern pipeline microprocessor 500. In fact, a modern pipeline microprocessor 500 can comprise as many as 20 to 30 different pipeline stages. These stages can nevertheless be grouped into the three stage categories shown in the block diagram, so the block diagram 500 of FIG. 5 serves to point out the elements that the embodiments of the present invention described above require. For clarity, extraneous elements of the microprocessor 500 are not shown.

[0044] Referring now to FIG. 6, a block diagram shows one embodiment of an extended prefix 600 for specifying an extended operand/operation in a microprocessor according to the present invention. The extended operand/operation specifier prefix 600 is 8 bits in size. In one embodiment, the value of the 8-bit prefix 600 prescribes an extended data mode for a corresponding operation, where the corresponding operation is prescribed by the remaining parts of an extended instruction according to the present invention, as described herein. In an x86 embodiment, the extended data mode (e.g., 64-bit operands/operations) is prescribed in place of a default data mode (e.g., 32-bit operands/operations).

[0045] In the exemplary extended prefix 600 of FIG. 6, the entire prefix 600 is used to specify an extended data mode. One skilled in the art will appreciate, however, that the number of bits required to specify one of a plurality of extended data modes depends on how many such modes there are. An embodiment that can specify either a 64-bit or a 128-bit data mode thus needs only one bit of the prefix 600 to distinguish between the two modes.
The remaining bits of the prefix 600 can then be used to specify other extended features that the existing instruction set architecture does not provide.

[0046] Referring now to FIG. 7, a block diagram shows details of translate stage logic 700 within the microprocessor of FIG. 5. The translate stage logic 700 has an instruction buffer 704 that, according to the present invention, provides extended instructions to translation logic 705. The translation logic 705 is coupled to a machine specific register 702 that has an extended features field 703. The translation logic 705 has a translation controller 706 that provides a disable signal 707 to an escape instruction detector 708 and an extended decoder 709. The escape instruction detector 708 is coupled to the extended decoder 709 and to an instruction decoder 710. The extended decoder 709 and the instruction decoder 710 access a control read-only memory (ROM) 711, which stores template micro instruction sequences corresponding to certain extended instructions. The translation logic 705 also has a micro instruction buffer 712 with an opcode extension field 713, a micro opcode field 714, a destination field 715, a source field 716, and a displacement field 717.

[0047] Operationally, during power-up of the microprocessor, the state of the extended field 703 within the machine specific register 702 is established via signal power-up state 701 to indicate whether the particular microprocessor is capable of translating and executing the extended instructions of the present invention that provide extended data modes. In one embodiment, the signal 701 is derived from a feature control register (not shown) that reads a fuse array (not shown) configured during fabrication. The machine specific register 702 provides the state of the extended features field 703 to the translation controller 706. The translation control logic 706 controls whether instructions from the instruction buffer 704 are interpreted according to extended translation rules or according to conventional translation rules.
Providing such a control feature allows supervisory applications (e.g., BIOS) to enable or disable the extended execution features of the microprocessor. If the extended features are disabled, instructions having the opcode state selected as the extended features tag are translated according to the conventional translation rules. In an x86 embodiment in which opcode state F1H is selected as the tag, an occurrence of F1H under conventional translation rules results in an illegal instruction exception. With extended translation disabled, the instruction decoder 710 translates/decodes all provided instructions and configures all fields 713-717 of the micro instruction 712. Under extended translation rules, however, an occurrence of the tag is detected by the escape instruction detector 708. The escape instruction detector 708 then directs the instruction decoder 710 to translate/decode the remaining parts of the extended instruction and to configure the micro opcode field 714, destination field 715, source field 716, and displacement field 717 of the micro instruction 712, while the extended decoder 709 decodes/translates the extended prefix to configure the opcode extension field 713 of the micro instruction 712. Certain instructions cause the control ROM 711 to be accessed to obtain corresponding micro instruction sequence templates. Configured micro instructions 712 are provided to a micro instruction queue (not shown) for subsequent execution by the processor.

[0048] Referring now to FIG. 8, a block diagram shows extended register stage logic 800 within the microprocessor of FIG. 5. The extended register stage logic 800 has register logic 802 that retrieves extended micro instructions according to the present invention from a micro instruction buffer 801 or micro instruction queue 801. The register logic 802 has an extended register file 803 that comprises a plurality of extended registers 816. Each extended register 816 has a default operand field 814 and an operand extension field 815. In an x86 embodiment, the default operand field 814 is 32 bits wide to support storage and retrieval of existing 32-bit x86 operands.
In a 64-bit embodiment, the operand extension field 815 is 32 bits wide to allow for 64-bit operands. A 128-bit embodiment has a correspondingly wider operand extension field 815. The register file 803 is accessed by extended read logic 806 to retrieve source operands and by extended write back logic 807 to store result operands. The extended read logic 806 outputs source operands OP1, OP2 to two operand buffers 809, 810. Result operands RS1, RS2 are provided to the extended write back logic 807 via two result buffers 812, 813.

[0049] In operation, extended micro instructions are provided from the micro instruction queue 801 to the register logic 802 in synchronization with a clock signal (not shown). During one clock cycle, the extended read logic 806 decodes the source operand address fields (not shown) of the extended micro instruction to determine which registers 816 contain the source operands for a prescribed operation. The value of the opcode extension field (not shown) of the extended micro instruction 801 determines the data mode that the read logic 806 uses when accessing the register file 803. For the default data mode, only the default operand fields 814 of the addressed registers 816 are accessed. For an extended data mode, both the default operand field 814 and a corresponding portion of the operand extension field 815 are accessed to retrieve the source operands. The source operands OP1, OP2 are thus retrieved from the register file 803 and provided to the source operand buffers 809, 810. In addition, the extended micro instruction is piped to buffer 808 for use by subsequent pipeline stages (not shown) of the microprocessor. During the same clock cycle, the results RS1, RS2 of a recently executed operation are written back to the destination registers 816 specified by destination register fields (not shown) in a completed micro instruction buffer 811. The value of the opcode extension field (not shown) in the completed micro instruction buffer 811 determines whether results are written back to the operand extension fields 815 of the destination registers 816 and which portion of the extension fields 815 is written.
The corresponding result operands RS1, RS2 are provided in the buffers 812, 813.

[0050] The register stage logic 800 shown in FIG. 8 provides the ability to access two source registers and two result registers coherently within a single clock cycle. An alternative embodiment provides for two source operands and a single destination operand. To ensure coherency of the registers 816, the extended register logic 802 accesses the source operands OP1, OP2 before writing back the results RS1, RS2.

[0051] Referring now to FIG. 9, a flow chart depicts a method according to the present invention for translating and executing instructions that allow a programmer to specify extended data modes in a microprocessor. Flow begins at block 902, where a program configured with extended feature instructions is provided to the microprocessor. Flow then proceeds to block 904.

[0052] At block 904, the next instruction is fetched from cache/external memory. Flow then proceeds to decision block 906.

[0053] At decision block 906, the instruction fetched in block 904 is evaluated to determine whether it contains an extended escape code according to the present invention. In an x86 embodiment, the evaluation detects opcode value F1 (ICE BKPT). If the extended escape code is detected, flow proceeds to block 908. If it is not detected, flow proceeds to block 912.

[0054] At block 908, the extended prefix portion of the extended instruction is decoded/translated to determine the extended data mode prescribed for the current operation. Flow then proceeds to block 910.

[0055] At block 910, the extended data mode for the current operation is prescribed in an extension field of a corresponding micro instruction sequence. Flow then proceeds to block 912.
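
Blocks 904 through 910 can be sketched as a small translator. This is a simplified model under stated assumptions, not the patented logic: `MicroInstruction`, `translate`, and the `extensions_enabled` flag (standing in for the extended features field 703) are hypothetical names, the remaining instruction bytes are passed through undecoded, and raising an exception models the illegal instruction behavior the text describes for F1H under conventional translation rules.

```python
from dataclasses import dataclass
from typing import Optional

ESCAPE_CODE = 0xF1  # ICE BKPT opcode used as the escape code (x86 embodiment)


class IllegalInstruction(Exception):
    """Models the exception for F1H under conventional translation rules."""


@dataclass
class MicroInstruction:
    opcode_extension: Optional[int]  # extension field; None means default mode
    body: bytes                      # remaining instruction bytes, undecoded


def translate(raw: bytes, extensions_enabled: bool = True) -> MicroInstruction:
    # Decision block 906: does the instruction begin with the escape code?
    if raw and raw[0] == ESCAPE_CODE:
        if not extensions_enabled:
            raise IllegalInstruction("F1H with extended translation disabled")
        if len(raw) < 2:
            raise ValueError("escape code without an extended prefix")
        # Blocks 908 and 910: decode the prefix and record the data mode
        # in the extension field of the micro instruction.
        return MicroInstruction(opcode_extension=raw[1], body=raw[2:])
    # No escape code: continue to block 912 with the default data mode.
    return MicroInstruction(opcode_extension=None, body=raw)


mi = translate(bytes([0xF1, 0x00, 0x01, 0xD8]))
print(mi.opcode_extension, mi.body.hex())
```

Passing `extensions_enabled=False` shows the fallback path, in which the same byte stream is rejected rather than treated as an extended instruction.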

Client’s Docket No.: TT^ Docket N〇：0608-A40748TWF1/DEVICHEN/2006-07-05 28 1282066 [0=6]於方塊912巾’該指令之所有其餘部分被解碼/轉 =，以決定該指定運算、該運算之運算元的位㈣及既 =微處理器指令集架構’由前置碼所衫之既有架構特徵的使用。流程接著進行至方塊914。 …[=057]於方塊914 t ’-微指令序列被組態為指定該指定運算及其對應之運算碼延伸項。流程接著進行至方塊916。 [0058]於方塊916巾’該微指令序列被送至一微指令佇列，由微處理器執行。流程接著進行至 _於方…，該微指丁令=刺之一延伸· 暫存器邏輯進行提取。該延伸暫存器邏輯從指定暫存器中提取對應於該指定運算之運算元。運算元係依該微指令序二内指定之資料模式（即預設或延伸）定其大小。流程接著進行至方塊 920 〇 [0060] 於方塊920中，延伸執行邏輯單元運用該指定之資料模式，使用於方塊918中所存取之運算元執行該指定運算，以產生結果運算元。流程接著進行至方塊922。 [0061] 於方塊922中，該結果運算元被送至該延伸暫存籲器邏輯，並以該微指令序列所指定之資料模式被回寫至延伸暫存器中。流程接著進行至方塊924。 [0062] 於方塊924中’本方法完成。 [0063] 雖然本發明及其目的、特徵與優點已詳細敘述，其它實施例亦可包含在本發明之範圍内。例如，本發明已就如下的技術加以敘述：利用已完全佔用之指令集架構内一單_、未使用之運算碼狀態作為標記，以指出其後之延伸特徵前置Client's Docket No.: TT^ Docket N〇:0608-A40748TWF1/DEVICHEN/2006-07-05 28 1282066 [0=6] at block 912, 'All the rest of the instruction is decoded/turned = to determine the designation The operation, the bit (4) of the operand of the operation, and the = microprocessor instruction set architecture 'use of the existing architectural features of the preamble. Flow then proceeds to block 914. ...[=057] at block 914 t '-the microinstruction sequence is configured to specify the specified operation and its corresponding opcode extension. Flow then proceeds to block 916. [0058] At block 916, the sequence of microinstructions is sent to a microinstruction queue, which is executed by the microprocessor. The flow then proceeds to _方方, the micro-finger = thorn one extension · the register logic to extract. The extension register logic extracts an operand corresponding to the specified operation from the specified scratchpad. The operand is sized according to the data mode (ie, preset or extended) specified in the second instruction sequence. Flow then proceeds to block 920. [0060] In block 920, the extended execution logic unit applies the specified data pattern, and the specified operation is performed using the operand accessed in block 918 to produce the resulting operand. Flow then proceeds to block 922. 
[0061] In block 922, the result operand is provided to the extended register logic and is written back to an extended register in the data mode specified by the microinstruction sequence. Flow then proceeds to block 924.

[0062] In block 924, the method completes.

[0063] Although the present invention and its objects, features, and advantages have been described in detail, other embodiments are encompassed by the invention as well. For example, the present invention has been described in terms of a technique that employs a single, unused opcode state within a fully occupied instruction set architecture as a tag to indicate that an extended feature prefix follows. However, the scope of the present invention is not limited in any respect to a fully occupied instruction set architecture, to unused instructions, or to a single tag. On the contrary, the present invention comprehends instruction sets that are not fully mapped, embodiments that use opcodes already in use, and embodiments that employ more than one instruction tag. For example, consider an instruction set architecture that has no unused opcode states. One embodiment of the present invention selects an opcode state that is currently in use as the escape tag, where the selection criteria are determined by market factors. Another embodiment uses a particular combination of opcodes as the tag, such as back-to-back occurrences of opcode state 7FH. The essence of the present invention thus lies in the use of a tag sequence followed by an n-bit extended prefix, which allows a programmer to specify, in an extended instruction, an extended data mode that is not otherwise provided by the existing instructions of the microprocessor instruction set.
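The generalization to a tag sequence followed by an n-bit extended prefix can be sketched in Python as follows. The function name `split_extended`, the rounding of the prefix up to whole bytes, and the choice to take the n most-significant bits are assumptions for illustration; the text specifies only that a tag (single or combined) precedes an n-bit prefix.

```python
def split_extended(instr: bytes, tag: bytes, prefix_bits: int = 8):
    """Return (prefix_value, remainder) if instr starts with the tag sequence,
    otherwise (None, instr) so the bytes decode as an ordinary instruction."""
    prefix_len = (prefix_bits + 7) // 8  # whole bytes needed to hold the n-bit prefix
    if instr.startswith(tag) and len(instr) >= len(tag) + prefix_len:
        start = len(tag)
        raw = int.from_bytes(instr[start:start + prefix_len], "big")
        # Keep the n most-significant bits of the bytes read
        # (this byte alignment is an assumption of the sketch).
        prefix = raw >> (prefix_len * 8 - prefix_bits)
        return prefix, instr[start + prefix_len:]
    return None, instr


# Single-opcode tag (F1) versus a tag made of two consecutive 7FH opcodes.
print(split_extended(bytes([0xF1, 0x02, 0x90]), b"\xF1"))            # (2, b'\x90')
print(split_extended(bytes([0x7F, 0x7F, 0x01, 0x90]), b"\x7F\x7F"))  # (1, b'\x90')
```

Passing the tag as a byte string keeps the detector the same for a single escape opcode and for a combined tag; a lone 7FH byte that is not followed by a second one falls through and is treated as an ordinary instruction.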
[0064] In addition, the present invention has been described primarily in terms of 64-bit and 128-bit extended data modes. These modes, however, serve only to illustrate aspects of the invention in the context of the data modes found in present-day desktop/laptop microprocessors. One skilled in the art will appreciate that the scope of the present invention extends to applications that require very large or very small operands/operations, and to applications having variable-sized operands/operations, where the size of a particular operand/operation is specified on an instruction-by-instruction basis.

[0065] Furthermore, although a microprocessor has been used above as the example for explaining the present invention and its objects, features, and advantages, one skilled in the art will appreciate that the scope of the invention is not limited to microprocessor architecture, but covers all forms of programmable devices, such as signal processors, industrial controllers, array processors, and the like. In summary, the foregoing is merely a preferred embodiment of the present invention and should not be taken to limit the scope in which the invention may be practiced. All equivalent changes and modifications made in accordance with the claims of the present invention remain within the scope covered by this patent. The Examiner's consideration and allowance of this application are respectfully requested.


Claims

Application No. 91124005, amended January 4, 2007

X. Claims:

1. An apparatus for extending data modes in a microprocessor, comprising:
a translation logic unit, for translating an extended instruction into corresponding microinstructions for execution by the microprocessor, wherein the extended instruction comprises:
an extended prefix, for specifying an extended operand size for an operand corresponding to a prescribed operation, wherein the extended operand size cannot be specified by an existing instruction set; and
an extended prefix tag, for indicating the extended prefix, wherein the extended prefix tag is an otherwise architecturally specified opcode within the existing instruction set; and
an extended execution logic unit, coupled to the translation logic unit, for receiving the corresponding microinstructions and performing the prescribed operation using the operand;
wherein the translation logic unit comprises:
an escape instruction detector, for detecting the extended prefix tag;
an instruction decoder, for determining the prescribed operation to be performed; and
an extended decoder, coupled to the escape instruction detector and the instruction decoder, for determining the extended operand size and for specifying the extended operand size within the corresponding microinstructions.

2. The apparatus of claim 1, wherein the extended instruction further comprises instruction entities of the existing instruction set.

3. The apparatus of claim 2, wherein the instruction entities specify the prescribed operation to be performed on the operand, and wherein the operand is retrieved from/stored to a register according to one of a plurality of operand sizes.

4. The apparatus of claim 3, wherein the operand sizes comprise a 64-bit operand size.

5. The apparatus of claim 3, wherein the operand sizes further comprise a 128-bit operand size.

[...]

10. [...] wherein the extended prefix tag comprises opcode F1 (ICE BKPT).

11. An instruction set extension module, for adding extended data mode capability to an existing instruction set, comprising:
an extended instruction tag, received by a translation logic unit, for indicating that accompanying parts of a corresponding instruction specify an extended operation to be performed by a microprocessor, wherein the extended instruction tag is a first opcode within the existing instruction set;
an extended prefix, coupled to the extended instruction tag and being one of the accompanying parts, for specifying one of a plurality of data modes corresponding to the extended operation; and
an extended execution logic unit, coupled to the translation logic unit, for performing the extended operation using the specified data mode, wherein the existing instruction set provides only existing data modes and does not provide the specified data mode;
wherein the translation logic unit comprises:
an escape instruction detector, for detecting the extended instruction tag and for directing that the accompanying parts be translated according to extended translation conventions; and
an extended decoder, coupled to the escape instruction detector, for translating instructions according to the conventions of the existing instruction set, and for translating the corresponding instruction according to the extended translation conventions so that the extended operation is performed according to the specified data mode.

12. The instruction set extension module of claim 11, wherein the accompanying parts further comprise a second opcode and an optional plurality of address specifiers, for specifying the extended operation and a plurality of operands.

[...]

15. The instruction set extension module of claim [...], wherein the first opcode comprises the ICE BKPT opcode in the x86 instruction set (i.e., opcode F1).

16. [...] translated into corresponding microinstructions [...] to retrieve/store an extended operand.

17. The instruction set extension module of claim 11, wherein the specified data mode comprises a 64-bit data mode.

18. The instruction set extension module of claim 11, wherein the specified data mode comprises a 128-bit data mode.

19. A method for extending an existing instruction set architecture to provide for programmable specification of an extended data mode in a microprocessor, the method comprising:
providing an extended instruction, the extended instruction comprising an extended tag and an extended prefix, wherein the extended tag is a first opcode within the existing instruction set architecture;
specifying, via the extended prefix and the remaining parts of the extended instruction, the extended data mode and a specified operation, wherein the existing instruction set architecture provides only instructions that specify a default data mode, the default data mode being different from the extended data mode;
translating the extended instruction into microinstructions, the microinstructions directing an extended execution logic unit to perform the specified operation according to the extended data mode [...]; and
performing the specified operation according to the extended data mode.

20. The method of claim 19, wherein the specifying comprises employing a second opcode entity within the existing instruction set architecture to specify the specified operation.

21. The method of claim 19, wherein the providing comprises employing an 8-bit entity as the extended prefix.

22. The method of claim 19, wherein the providing comprises selecting the first opcode from the x86 microprocessor instruction set architecture.

23. The method of claim 22, wherein the selecting comprises choosing the x86 ICE BKPT opcode (i.e., opcode F1) as the first opcode.

24. The method of claim 19, wherein the specifying comprises specifying a 64-bit data mode as the extended data mode.

[Figure 5, drawing sheet headed "Microprocessor using extended data mode". Legible reference numerals: 500; 501 fetch logic unit; 502 instruction cache/external memory; 503 instruction queue.]

[Figure 7, amended drawing sheet headed "Microprocessor using extended data mode". Legible reference numerals: 700, 701.]

VII. Designated representative drawing:

(1) The designated representative drawing of this case is: Figure 5.

(2) Brief description of the reference numerals of the representative drawing:

500 pipelined microprocessor
501 fetch logic unit
502 instruction cache/external memory
503 instruction queue
504 translation logic unit
505 extended translation logic unit
506 microinstruction queue
507 execution logic unit
508 extended execution logic unit

VIII. If this case contains a chemical formula, disclose the chemical formula that best shows the characteristics of the invention: None.