TWI238943B - Apparatus and method for masked move to and from flags register in a processor

Info

Publication number: TWI238943B
Application number: TW92128964A
Authority: TW
Inventors: Gerard M Col; G Glenn Henry; Terry Parks
Original assignee: IP-First LLC
Priority date: 2002-10-22
Filing date: 2003-10-20
Publication date: 2005-09-01
Also published as: TW200406684A

Abstract

A method and apparatus are provided for writing to, and reading from, the EFLAGS register in a processor. For a given request to write to EFLAGS, a mask is generated from the destination information and the privilege level information for the write. The mask is then ANDed with the new EFLAGS value, and the result is written to the EFLAGS register in a single instruction cycle. For a given request to read from EFLAGS, a mask is generated from the privilege information for the read to specify those bits of EFLAGS that can be updated during the read. The mask is then ANDed with the contents of the EFLAGS register, and the result is stored in a stack in memory.

Description

Cross-reference to related applications

[0001] This application claims priority based on U.S. Patent Application No. 60/345455, filed 10/23/2001, entitled "APPARATUS AND METHOD FOR MASKED MOVE TO FLAGS REGISTER".

Technical field

[0002] The present invention relates to the execution of computer instructions, and more particularly to an apparatus and method for reducing the number of instruction cycles required to write to or read from the EFLAGS register.

Prior art

[0003] In an x86 pipeline microprocessor, executing an instruction that writes to the EFLAGS register (for example POPF/POPFD, CLI/STI, CLD/STD, CLC/STC) requires a considerable number of cycles, because a write to the EFLAGS register is affected by the current I/O privilege level (IOPL) and by the state of certain bits in the EFLAGS register at the time of the write. Under the Microsoft Windows® operating system, the EFLAGS register must be popped from the stack each time a subroutine returns, which causes significant operating system delay.

[0004] Therefore, the present invention provides a microprocessor technique that reduces the delay associated with executing an instruction that writes to the EFLAGS register, such as a pop instruction.

[0005] Likewise, in an x86 pipeline microprocessor, the push instruction that stores the EFLAGS register onto the stack, namely PUSHF/PUSHFD, also requires a considerable number of cycles, because the state of the bits read from the EFLAGS register and the execution state of the microprocessor depend on the current I/O privilege level (IOPL) and on the state of certain bits in the EFLAGS register at the time of the push. Under the Microsoft Windows® operating system, the EFLAGS register must be pushed onto the stack each time a subroutine is called, which causes significant operating system delay.

[0006] Therefore, the present invention provides a microprocessor technique that reduces the delay associated with executing an instruction that reads the EFLAGS register and stores it onto the stack, such as a push instruction.

Summary of the invention

[0007] In one embodiment, the present invention provides a method for performing a write operation on a multi-bit flags register in a microprocessor. The method includes using a translate stage of the microprocessor to receive a macroinstruction that requests a write to the multi-bit flags register. The method also includes using the translate stage to generate a microinstruction configured to complete the write to the multi-bit flags register in a single write cycle. The method further includes generating a flag mask and ANDing the flag mask with a predetermined operand to produce a result, which is then stored in the multi-bit flags register. In this embodiment, the multi-bit flags register is the EFLAGS register.

[0008] Advantageously, the present invention completes a write to the EFLAGS register in a single instruction cycle, and thereby reduces processor delay.

[0009] In another embodiment, the present invention provides a method for performing a read operation on a multi-bit flags register in a microprocessor. The method includes using a translate stage of the microprocessor to receive a macroinstruction that requests a read of the multi-bit flags register. The method also includes using the translate stage to generate a microinstruction configured to complete the read of the multi-bit flags register in a single write cycle. The method further includes generating a flag mask. The flag mask includes privilege information that identifies which bits of the multi-bit flags register are eligible for updating during a read operation at the current privilege level. The method further includes ANDing the flag mask with a predetermined operand to produce a result, and storing the result in a stack in memory.

[0010] The present invention can effectively reduce the delay a processor incurs when performing EFLAGS read operations, for example the delay of pushing the EFLAGS register onto the stack. The apparatus and method of the present invention complete such pushes of EFLAGS onto the stack in a single instruction cycle.

[0011] Other objects and advantages of the present invention will become more apparent from the following detailed description and the accompanying drawings.

Detailed description

[0018] The following description is provided in the context of a particular embodiment and its requirements, to enable one of ordinary skill in the art to make use of the present invention. Various modifications to the preferred embodiment will, however, be apparent to those skilled in the art, and the general principles discussed herein may be applied to other embodiments. Therefore, the present invention is not limited to the particular embodiments shown and described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

[0019] Referring to Figure 1, a block diagram depicts a conventional pipeline microprocessor 100. The microprocessor has a fetch stage 105, a translate stage 110, a register stage 115, an address stage 120, a data/ALU or execute stage 125, and a write back stage 130.

[0020] In operation, the fetch stage 105 fetches macroinstructions from memory (not shown) for execution by the microprocessor 100. The translate stage 110 translates the fetched macroinstructions into corresponding microinstructions.

[0021] Each microinstruction directs the microprocessor 100 to perform a specific subtask that is part of the overall operation of a fetched macroinstruction. The register stage 115 retrieves the operands specified by the microinstruction from a register file for use by later stages of the pipeline. The address stage 120 computes the memory addresses specified by the microinstruction for data storage and retrieval operations. Stage 125 either performs arithmetic logic unit (ALU) operations on data retrieved from the register file, or uses the memory address computed in the address stage 120 to read data from, or write data to, memory. The write back stage 130 writes the result of a data read operation or of an ALU operation to the register file. To review: macroinstructions are fetched by the fetch stage 105 and decoded into microinstructions by the translate stage 110, and the translated microinstructions then flow through stages 115-130, which perform the operations. This constitutes the pipeline operation of the microprocessor 100.

[0022] To aid understanding, the following discussion uses standard x86 microprocessor nomenclature. Those skilled in the art will appreciate, however, that the registers and macroinstructions of the x86 architecture are used only by way of example, and that other microprocessors and architectures could serve equally well.

[0023] The data/ALU stage 125 includes an EFLAGS register 132, which holds the state of the processor. The EFLAGS register 132 can be modified by many instructions and serves as the comparison parameter for conditional loops and conditional jumps. Each bit of the EFLAGS register holds the state of a particular parameter of the last instruction. Table 1 below shows the 32 bits of the EFLAGS register and the function of each bit.

Table 1. EFLAGS register

Bit(s)   Name      Function
31:22    Reserved  Tied low
21       ID        ID flag
20       VIP       Virtual interrupt pending
19       VIF       Virtual interrupt flag
18       AC        Alignment check
17       VM        Virtual 8086 mode
16       RF        Resume flag
15       0         Tied low
14       NT        Nested task flag
13:12    IOPL      I/O privilege level
11       OF        Overflow flag
10       DF        Direction flag
9        IF        Interrupt enable flag
8        TF        Trap flag
7        SF        Sign flag
6        ZF        Zero flag
5        0         Tied low
4        AF        Auxiliary carry flag
3        0         Tied low
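
By way of illustration only, the layout of Table 1 can be written down as bit masks. The Python sketch below is not part of the disclosure. The names FLAG_BITS, RESERVED_MASK, iopl and set_flags are hypothetical, and the positions of PF (bit 2) and CF (bit 0) are the standard x86 positions, added on the assumption that the table as reproduced here stops at bit 3. The reserved positions follow the description's statement that bits 1, 3, 5, 15 and 22-31 are reserved.

```python
# Bit positions of the named EFLAGS fields in Table 1.
FLAG_BITS = {
    "ID": 21, "VIP": 20, "VIF": 19, "AC": 18, "VM": 17, "RF": 16,
    "NT": 14, "OF": 11, "DF": 10, "IF": 9, "TF": 8,
    "SF": 7, "ZF": 6, "AF": 4,
    # Assumption: standard x86 positions, not listed in the table above.
    "PF": 2, "CF": 0,
}

# IOPL is a two-bit field occupying bits 13:12.
IOPL_SHIFT = 12
IOPL_MASK = 0b11 << IOPL_SHIFT

# Reserved positions: bits 1, 3, 5, 15 and 22..31.
RESERVED_MASK = (1 << 1) | (1 << 3) | (1 << 5) | (1 << 15) | (0x3FF << 22)


def iopl(eflags):
    """Extract the I/O privilege level (0..3) from an EFLAGS value."""
    return (eflags & IOPL_MASK) >> IOPL_SHIFT


def set_flags(eflags):
    """Return the names of the single-bit flags that are set in eflags."""
    return [name for name, bit in FLAG_BITS.items() if eflags & (1 << bit)]
```

Taken together, the named flags, the IOPL field and the reserved mask account for all 32 bit positions exactly once, which is a convenient sanity check on a layout of this kind.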

[0024] In a present-day pipeline microprocessor such as processor 100, executing any of the instructions that write to the EFLAGS register (that is, POPF/POPFD, CLC/STC, CLD/STD, CLI/STI) requires a considerable number of machine cycles, because a write to the EFLAGS register 132 is affected by the current I/O privilege level (IOPL) and by the state of certain bits in the EFLAGS register at the time of the write. In particular, bit 1, bit 3, bit 5, bit 15, and bits 22-31 are reserved, and their state cannot be changed. In addition, when the processor is operating in protected mode at privilege level 0 (or in real-address mode), all non-reserved bits except VIP, VIF, and VM can be modified. The VIP and VIF flags must be cleared, and the VM flag must keep its current state.

[0025] Executing any of the foregoing macroinstructions that write to the EFLAGS register 132 causes several microinstructions to be generated. Specifically, a microinstruction is first executed to determine the current I/O privilege level (IOPL). Subsequent microinstructions then read the current state of certain EFLAGS bits (for example VM, RF, IOPL, VIP, VIF, and IF) and set up the bit states in a new value to be written to EFLAGS. A final microinstruction writes the new value to the EFLAGS register 132.

[0026] This conventional method of updating the EFLAGS register has a significant drawback: a large number of microinstructions must be executed to complete the write. Because several microinstructions must be generated and processed, updating the EFLAGS register in this way consumes considerable time and degrades the performance of the microprocessor.

[0027] The inventors have observed that, under the Microsoft Windows® operating system, the EFLAGS register must be popped from the stack each time a subroutine returns, and that the same occurs in many desktop applications in common use today. Because instructions of this type are so widely used, it is highly desirable to minimize their execution time.

[0028] The object of the present invention is to reduce the number of instruction cycles required to execute a write to the EFLAGS register. To this end, the present invention provides an apparatus and method for dynamically generating an EFLAGS mask. The mask is logically ANDed with a specified operand (that is, either the EFLAGS image popped from the stack or a selected EFLAGS bit state), and the result is written to the EFLAGS register. The processor disclosed herein uses a new microinstruction, Move To EFLAGS (MTEF), which, together with dedicated logic in the execute stage, allows a write to EFLAGS to complete in a single instruction cycle.

[0029] Referring now to Figure 2, a block diagram depicts a processor 200 that uses a single microinstruction, Move To EFLAGS (MTEF), to complete a write to the EFLAGS register in a single instruction cycle. The processor 200 includes a fetch stage 202, which contains instruction fetch logic 204 coupled to an instruction memory 206. An instruction pointer 208 is coupled to the fetch logic 204 and directs the fetch logic 204 to the location in the memory 206 from which the current instruction is to be fetched.

[0030] When the fetch logic 204 fetches a macroinstruction such as a POPF/POPFD, CLI/STI, CLC/STC, or CLD/STD instruction, the translator 210 in the translate stage 212 responds by generating an MTEF D,S microinstruction, which directs a move to the EFLAGS register in the data/ALU execute stage 216. In the MTEF D,S microinstruction, S is a source field that indicates the source of the data to be moved to the EFLAGS register 214, and D is a destination field that specifies the bits of the EFLAGS register 214 that are to be written.

[0031] Before the processing of the MTEF microinstruction is discussed, the remainder of the architecture of the processor 200 is described. As shown in Figure 2, the MTEF D,S microinstruction is sent to a translate instruction queue (XIQ) 218. The MTEF D,S microinstruction then flows to an MTEF register 220 in the register stage 222. The register stage 222 includes a register file 224 that stores the architectural state of the processor 200. The register file 224 includes an ESP architectural register 226. As shown in Figure 2, the register stage 222 also includes an OP1 register 228 and an OP2 register 230.

[0032] The register stage 222 is coupled downstream, through the address stage, to the load stage 232. The processor 200 uses a conventional address stage to compute the addresses that the processor 200 uses in processing instructions. The contents of the MTEF register 220 in the register stage 222 are sent to, and stored in, a corresponding MTEF register 234 in the load stage 232. The load stage 232 includes load/align logic 236, which is coupled to the OP1 register 228 and the OP2 register 230 of the register stage 222. The load/align logic 236 is also coupled to a data memory 238. The output of the load/align logic 236 is coupled to an OP3 register 240. As shown in Figure 2, the contents of the OP1 register 228 and the OP2 register 230 of the register stage 222 are passed down to an OP1 register 242 and an OP2 register 244, respectively, in the load stage 232.

[0033] The processor 200 further includes a data/ALU or execute stage 216, which includes the aforementioned EFLAGS register 214. The execute stage 216 also includes a TVAL register 246 and a TMASK register 248. The contents of the TVAL register 246 and the TMASK register 248 are combined by an AND gate 250, and the result of the AND operation is stored in the EFLAGS register 214. The AND operation and the masking provided by the TMASK register 248 are discussed below. The execute stage 216 also includes a privilege level register PRIV 252, which supplies privilege information for the currently executing instruction to the TMASK register 248. The results of instruction execution are sent to a result register 254 and are written to the register file 224 over a result bus (not shown).

[0034] As described above, in response to a fetched POPF/POPFD, CLI/STI, CLC/STC, or CLD/STD macroinstruction supplied to the translator 210 by the fetch logic 204, the translator 210 generates a single microinstruction, MTEF D,S. The MTEF D,S microinstruction is sent to the translate instruction queue (XIQ) 218 coupled to the translator 210, and from there to the register stage 222 coupled to the queue 218.

[0035] The MTEF microinstruction includes a source field S and a destination field D. The destination field specifies the bits of the EFLAGS register 214 that are to be written. For example, D = 0 specifies a write to the carry flag CF, which is bit 0 of the EFLAGS register. As further examples, D = 9 specifies a write to the IF bit of the EFLAGS register, and D = 10 specifies a write to the DF bit. For a pop of the EFLAGS register 214 from the stack, the MTEF microinstruction is set to D = 31. The source field S of the MTEF microinstruction specifies the state to be written to the destination bit. For example, with D = 0, S = 0 directs the processor 200 to clear the carry flag, and S = 1 directs the processor 200 to set the carry flag. In other words, in the MTEF microinstruction, S = 0 means that the destination bit is cleared and S = 1 means that the destination bit is set. For a pop of EFLAGS, the S field is ignored.
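
The D and S fields of the MTEF microinstruction can be illustrated with a short Python sketch. It is offered as an assumption-laden model, not as the disclosed hardware: the function name decode_mtef, the constant POP_EFLAGS and the popped_image parameter are hypothetical, and the sketch assumes that for D other than 31 the D field carries the bit number of the flag (consistent with D = 0 for CF and D = 9 for IF, and with DF sitting at bit 10 in Table 1). It returns the new value that would be presented as TVAL, together with a mask of the bits the microinstruction is aimed at.

```python
POP_EFLAGS = 31  # D = 31 selects a pop of the whole EFLAGS image


def decode_mtef(d, s, popped_image=0):
    """Model MTEF D,S: return (tval, dest_mask) as 32-bit integers.

    Assumption: for D in 0..30 the D field is the bit number of the
    destination flag. popped_image stands in for the EFLAGS image read
    from the stack in the load stage.
    """
    if d == POP_EFLAGS:
        # The S field is ignored for a pop; every bit is a candidate.
        return popped_image & 0xFFFFFFFF, 0xFFFFFFFF
    if not 0 <= d <= 30:
        raise ValueError("D must be in the range 0..31")
    if s not in (0, 1):
        raise ValueError("S must be 0 (clear) or 1 (set)")
    dest_mask = 1 << d
    tval = dest_mask if s == 1 else 0
    return tval, dest_mask
```

For instance, decode_mtef(0, 1) models STC (set CF) and decode_mtef(9, 0) models CLI (clear IF). Whether a targeted bit is actually written is a separate decision, made by the privilege-dependent mask.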

[0036] When the EFLAGS register is written, execution logic 256 in the data/ALU execute stage 216 provides a mask to ensure that only the correct bit positions are written. The contents of the TMASK register 248 are generated dynamically when an MTEF microinstruction is executed. The execution logic 256 obtains the current operating mode from the privilege register PRIV 252 and the state of other bits from the EFLAGS register 214. The contents supplied to the new-value register TVAL 246 are either the value of the S field or the EFLAGS image read from the stack in the load stage 232. As shown in Figure 2, TVAL and TMASK are combined by the AND gate 250, and the result of the AND operation is written to the EFLAGS register 214. Advantageously, TMASK is configured so that only certain bits are changed, namely those bits that may be changed as a function of the current operating mode read from the PRIV register. In this embodiment, the highest privilege level an instruction can have is 0. Instructions at the highest privilege level have the greatest latitude in updating the designated bits of the EFLAGS register, and instructions at lower privilege levels are more restricted. The lowest privilege level is 3. In one embodiment of the present invention, the mask has the same number of bits as the EFLAGS register. When a particular mask bit is set, the corresponding EFLAGS bit may be updated at the current privilege level held in the PRIV register 252. Conversely, when a particular mask bit is not set, the corresponding bit may not be updated. In summary, TVAL provides the particular value to be written to EFLAGS, and TMASK determines whether the write is permitted, according to the privilege level of the particular instruction held in the PRIV register 252.

[0037] The present invention allows an instruction that writes to the EFLAGS register to complete in a single instruction cycle, and thereby significantly increases the throughput of the processor.

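The masking step can be sketched in Python as follows. This is a minimal model under stated assumptions, not the disclosed logic. The description says that the mask is ANDed with the new value and that reserved bits and VM keep their state. The sketch therefore assumes that bits outside the mask retain their old value, which is expressed here as a merge. The rules applied when the current privilege level (cpl) is greater than 0 (IOPL not writable, and IF not writable when cpl exceeds IOPL) reflect the usual x86 POPF behavior and are an assumption rather than something taken from the text. The names make_tmask and masked_write are hypothetical.

```python
RESERVED = 0xFFC0802A              # bits 1, 3, 5, 15 and 22..31
VM, VIP, VIF = 1 << 17, 1 << 20, 1 << 19
IF = 1 << 9
IOPL = 0b11 << 12


def make_tmask(dest_mask, cpl, iopl):
    """Build the mask of EFLAGS bits that may be updated.

    dest_mask: bits targeted by the microinstruction.
    cpl: current privilege level (0 is most privileged).
    iopl: current I/O privilege level field of EFLAGS.
    """
    # Reserved bits never change; VM keeps its state; VIP and VIF are
    # never loaded from the new value.
    tmask = dest_mask & ~RESERVED & ~(VM | VIP | VIF)
    if cpl > 0:
        tmask &= ~IOPL             # assumption: usual x86 rule
    if cpl > iopl:
        tmask &= ~IF               # assumption: usual x86 rule
    return tmask & 0xFFFFFFFF


def masked_write(old, tval, dest_mask, cpl):
    """Return the EFLAGS value after a masked move of tval."""
    iopl = (old & IOPL) >> 12
    tmask = make_tmask(dest_mask, cpl, iopl)
    # Assumption: unmasked bits keep their old value.
    new = (old & ~tmask) | (tval & tmask)
    # VIP and VIF are cleared when they fall inside the destination.
    new &= ~((VIP | VIF) & dest_mask)
    return new & 0xFFFFFFFF
```

The point of the design is that the mask depends only on the destination, the privilege level and the current IOPL, so it can be formed by combinational logic and applied in the same cycle as the write, instead of being assembled by a sequence of microinstructions.
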
[0038] Referring now to Figure 3, a flowchart outlines the high-level processing flow by which the microprocessor 200 performs a write to the EFLAGS register according to the present invention. Flow begins at block 300, where a macroinstruction such as POPF/POPFD, CLI/STI, CLC/STC, or CLD/STD is fetched from memory. Flow then proceeds to block 305, where the translator 210 translates the macroinstruction into a microinstruction that directs a write to the EFLAGS register 214. Flow then proceeds to block 310, where an EFLAGS mask is generated in the EFLAGS mask register TMASK 248. Flow then proceeds to block 315, where, as described above, the new value to be written to EFLAGS as a result of executing the macroinstruction is loaded into the TVAL register 246. Flow then proceeds to block 320, where the destination information is provided to the TMASK register 248. Flow then proceeds to block 325, where the current privilege level is provided to the TMASK register 248. If the privilege level permits the particular EFLAGS bits designated by the destination information to be updated, the TMASK register 248 is configured with a value that permits those bits to be updated. Flow then proceeds to block 330, where the contents of the TMASK register are ANDed with the contents of the new-value register TVAL. Flow then proceeds to block 335, where, as a result of the AND operation of block 330, only those EFLAGS bits permitted by the current privilege level are updated.

[0039] It is worth noting that, in a conventional pipeline processor, executing PUSHF/PUSHFD to push the EFLAGS register onto the stack takes a large number of processor cycles, because a read from the EFLAGS register is affected by the current I/O privilege level (IOPL) and by the state of certain bits in the EFLAGS register at the time of the push. In particular, bit 1, bit 3, bit 5, bit 15, and bits 22-31 are reserved, and their state cannot be changed. Furthermore, the RF and VM flags of the EFLAGS register (bits 16 and 17) are not copied. Instead, the values of these flags are cleared in the EFLAGS image stored on the stack.

[0040] When an x86 processor is operating in virtual 8086 mode and the I/O privilege level (IOPL) is less than 3, execution of a PUSHF/PUSHFD instruction causes a general protection fault or exception. In real-address mode, if the ESP register or the SP register equals 1, 3, or 5 when a PUSHF/PUSHFD instruction is executed, the processor shuts down for lack of stack space.
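
The read-side behavior described for PUSHF/PUSHFD can be illustrated in the same way. The Python sketch below is only a model under stated assumptions: the function name pushf_image and the exception class GeneralProtectionFault are hypothetical, virtual 8086 mode is inferred from the VM bit of the EFLAGS value itself, and the real-address-mode shutdown for lack of stack space is not modeled. It shows that the image stored on the stack is simply the EFLAGS contents ANDed with a mask in which RF and VM are clear.

```python
RF = 1 << 16
VM = 1 << 17
IOPL_SHIFT = 12


class GeneralProtectionFault(Exception):
    """Stands in for the processor's general protection fault."""


def pushf_image(eflags):
    """Return the 32-bit EFLAGS image that a push would store."""
    iopl = (eflags >> IOPL_SHIFT) & 0b11
    # Assumption: virtual 8086 mode is indicated by the VM bit.
    if eflags & VM and iopl < 3:
        raise GeneralProtectionFault("PUSHF in virtual 8086 mode with IOPL < 3")
    # RF and VM are not copied; they read as 0 in the stored image.
    read_mask = 0xFFFFFFFF & ~(RF | VM)
    return eflags & read_mask
```

Because the stored image is a single AND of the register contents with a mask, the clearing step does not need a microinstruction of its own, provided the pipeline can perform the AND and the store for the same microinstruction.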

1238943 修正 Ά% 921289Β4 五、發明說明（15) 清除該VM位元與RF位元；緊接著，執行又—微指令以決定該現時I/O特權等級（Ι〇η)，使得該處理器得以知道是否應產生異f ’或是停止運作；最後’執行以將該EFLAGS儲存到該堆疊儲存器。 " [0 042]前述之傳統微處理器有非常顯著的缺失，即必須要執行為數眾多的微指令才能完成一下推sh EFLAGS到堆疊儲存器上。在該些眾多的微指令中一些用以取传現時I/O特權等級（I〇PL)，以將該eflags的内容移到該堆f儲存器的微指令係為必要的，另一些用以命令在下f之刖，先行清除EFLAGS的某些特定位元亦為必要的， J是：然有許多微指令的產生係因為現今之管線處理器架構，並不能完成適合一包括一 AU運算與一儲存運算之指 f的執行。今日的執行階段邏輯僅允許執行一存取運算。因&，任何包括命令一則類型之 U = 之運算的指令的執行，均必須產生兩個連續”々集，並且該兩個連續的微指令集的執行係需 :U的,？週期。在一作業系統下，例如微軟視、、、母_人回應一子程式，該EFLAGS暫存器即須 =叠儲存器中，因此若能減少該下推EFUGS暫存器到該堆叠儲存器的執行時間，將有助於增進處的效能。 σ ^043 ]圖四之處理器400提供一單一微指令，即M〇ve (MFEF)，以將EFLAGS暫存器414的内容移至諸存器。如圖所示，執行階段（data/ALU階段）之執行 Π·· 第20頁 1238943 __案號92128964__年月日_修正 __ 五、發明說明（16) 邏輯416及一load-ALU儲存管線架構致能EFLAGS之一下推可以在一單一指令週期内完成。因此，處理器的效能將可有顯著的增進。 [ 0 044 ]該處理器400包括一提取階段402，並且該提取階段402内含一耗接至指令記憶體406的提取邏輯4〇4 (instruction fetch logic) ° 一指令指標408 (instruction pointer)係耦接至該提取邏輯404，以指示該提取邏輯4 0 4到該指令記憶體4 0 6的特定位置去提取現行指令。 [0045]當該提取邏輯404提取一巨集指令，例如一 PUSHF/PUSHFD指令’轉譯階段之轉譯器即回應產生一評ef 微指令’該微指令係用以在data/ALU-執行階段執行一從 EFLAGS暫存器4 1 4之移出。 [ 0 046 ]如圖四所示，該MFEF微指令被送至轉譯指令佇列（XIQ)419，隨後到達在暫存階段422内之一 MFEF暫存器 420。暫存階段422包括一暫存檔案424 ,該暫存檔案424儲存该處理器4 0 0的架構狀態。暫存檔案4 2 4包括一堆疊指標暫存器ESP 426。暫存階段422亦包括0P1暫存器428與0P2 暫存器430。定址階段431係緊鄰該暫存階段422。定址階段431係用以計算被儲存數值之位址，使得該些數值可以被從記憶體取還及被寫入記憶體。 [ 0047 ]該MFEF暫存器420的内容被送到並且儲存在載入階段434之對應的MFEF暫存器432。該載入階段434包括載入/;正邏輯436。>圖所示，該載人/調正邏輯436係搞1238943 Modification Ά% 921289B4 V. Description of the invention (15) Clear the VM bit and RF bit; Then, execute another micro instruction to determine the current I / O privilege level (Ι〇η), so that the processor can Know if it should generate 'f' or stop operation; finally 'execute to store the EFLAGS in the stack storage. " [0 042] The aforementioned traditional microprocessor has a very significant deficiency, that is, it must execute a large number of micro instructions to complete the push of sh EFLAGS to the stack memory. 
Some of these microinstructions are necessary: those that obtain the current I/O privilege level (IOPL) and move the contents of EFLAGS to the stack, and those that clear certain EFLAGS bits before the push. Many others, however, are generated only because present-day pipeline architectures are not suited to executing an instruction that combines an ALU operation with a store operation. Present-day execute-stage logic permits only one such operation to be performed. Consequently, any instruction that prescribes both types of operation must generate two consecutive microinstructions, and executing those two microinstructions takes additional pipeline cycles. Under an operating system such as Microsoft Windows, the EFLAGS register must be pushed onto the stack each time a subroutine is called, so reducing the execution time of an EFLAGS push improves processor performance.

[0043] Processor 400 of Figure 4 provides a single microinstruction, Move From EFLAGS (MFEF), that moves the contents of EFLAGS register 414 to memory. As shown, execution logic 416 of the execute (data/ALU) stage, together with a load-ALU-store pipeline architecture, enables a push of EFLAGS to complete in a single instruction cycle. Processor performance is therefore significantly improved.

[0044] Processor 400 includes fetch stage 402, which contains instruction fetch logic 404 coupled to instruction memory 406. Instruction pointer 408 is coupled to fetch logic 404 and directs it to the location in instruction memory 406 from which the current instruction is fetched.
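The contrast drawn in paragraphs [0041] through [0043] can be illustrated with a small software model. This is only a sketch under stated assumptions: the function names, the step counting, and the exception class are hypothetical and do not come from the patent, and the step counts stand in for microinstructions rather than measured cycles. The flag positions (VM at bit 17, RF at bit 16, IOPL at bits 12 and 13) are the architectural x86 ones.

```python
VM = 1 << 17      # virtual-8086 mode flag
RF = 1 << 16      # resume flag
IOPL_SHIFT = 12   # IOPL occupies bits 12-13


class GeneralProtectionFault(Exception):
    """PUSHF/PUSHFD attempted in virtual-8086 mode with IOPL < 3."""


def check_privilege(eflags):
    """Raise if the push is not permitted in the current mode."""
    iopl = (eflags >> IOPL_SHIFT) & 0b11
    if eflags & VM and iopl < 3:
        raise GeneralProtectionFault("PUSHF/PUSHFD not permitted")


def conventional_push_image(eflags):
    """Four separate steps, one per microinstruction listed in [0041]."""
    steps = []
    temp = eflags               # 1. copy EFLAGS to a temporary register
    steps.append("move")
    temp &= ~(VM | RF)          # 2. clear the VM and RF bits
    steps.append("clear")
    check_privilege(eflags)     # 3. consult IOPL; may fault
    steps.append("check")
    image = temp                # 4. store the image to the stack
    steps.append("store")
    return image, len(steps)


def masked_push_image(eflags):
    """Hypothetical single step: check, mask, and store folded together."""
    check_privilege(eflags)
    return eflags & ~(VM | RF), 1


flags = 0x00033202  # VM and RF set, IOPL = 3, IF set, reserved bit 1 set
print(conventional_push_image(flags))
print(masked_push_image(flags))
```

Both functions produce the same stack image; the only difference in this model is how many separate steps are counted, which is the point the description makes about the MFEF microinstruction.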
[0045] When fetch logic 404 fetches a macro instruction such as a PUSHF/PUSHFD instruction, the translator in the translate stage responds by generating an MFEF microinstruction, which directs a move from EFLAGS register 414 to be performed in the data/ALU execute stage.

[0046] As shown in Figure 4, the MFEF microinstruction is sent to translated instruction queue (XIQ) 419 and then arrives at MFEF register 420 in register stage 422. Register stage 422 includes register file 424, which stores the architectural state of processor 400. Register file 424 includes stack pointer register ESP 426. Register stage 422 also includes OP1 register 428 and OP2 register 430. Address stage 431 immediately follows register stage 422 and computes the addresses of stored values so that those values can be retrieved from, and written to, memory.

[0047] The contents of MFEF register 420 are passed to and stored in the corresponding MFEF register 432 of load stage 434. Load stage 434 includes load/align logic 436. As shown, load/align logic 436 is coupled to OP1 register 428 and OP2 register 430 of register stage 422. Load/align logic 436 is also coupled to data memory 438, and its output is coupled to OP3 register 440. As shown in Figure 4, the contents of OP1 register 428 and OP2 register 430 of register stage 422 are passed down to OP1 register 442 and OP2 register 444 of load stage 434, respectively.

[0048] MFEF register 432, OP1 register 442, OP2 register 444, and OP3 register 440 are all coupled to execution logic 416 of data/ALU execute stage 418, so that the values stored in those registers can be supplied to execution logic 416.

[0049] The processing of the MFEF microinstruction from translate stage 412 to data/ALU execute stage 418 is discussed in detail below. When translator 410 receives a PUSHF or PUSHFD instruction, it responds by generating and outputting an MFEF microinstruction. The MFEF microinstruction directs microprocessor 400 to adjust and read stack pointer register ESP 426.
The MFEF microinstruction further directs the processor to read EFLAGS register 414 and to dynamically modify the resulting EFLAGS image as a function of the current operating mode. The EFLAGS image is then stored to the stack in memory 445.

[0050] Execution logic 416 of data/ALU execute stage 418 includes privilege register PRIV 446, which stores the IOPL of the currently executing instruction. Privilege register PRIV 446 is coupled to FMASK register 448, so that the current IOPL is one input to FMASK register 448. EFLAGS register 414 is also coupled to FMASK register 448 and provides a second input. The execution logic provides a mask, FMASK, which is generated dynamically whenever an MFEF microinstruction is executed. In one embodiment of the invention, the mask has the same number of bits as EFLAGS register 414. A mask bit that is set indicates that the corresponding bit may be updated under the current privilege level held in PRIV register 446; a mask bit that is not set indicates that the corresponding bit may not be updated. Execution logic 416 obtains the current operating mode from privilege register PRIV 446 and the state of the other bits from EFLAGS register 414. It then reads the contents of EFLAGS register 414, ANDs them with FMASK, and stores the result in result register 452. Specifically, the output of FMASK register 448 is coupled to one input of AND gate 450, and the other input of AND gate 450 is coupled to EFLAGS register 414.

[0051] In the next machine cycle, the result is written to the stack in memory at the address indicated by ESP register 426. Temporary storage of the result is not required, because in the immediately following cycle the result is provided to store logic 454 of store stage 456; no temporary register is needed to hold the EFLAGS image. Store logic 454 stores the result to the stack in memory 458.

[0052] Processor 400 can thus perform the EFLAGS push in a single instruction cycle. The processor's throughput is therefore significantly improved.
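The masked read described in paragraphs [0050] through [0052] can be sketched in software as a mask, an AND, and a store at the new top of stack. This is an illustrative model only, not the patented circuit: the function names `make_fmask` and `mfef`, the `bytearray` standing in for memory, and the exception class are assumptions introduced here, and the mask contents follow the architectural PUSHFD behavior (VM and RF cleared in the pushed image) rather than anything the patent specifies bit by bit.

```python
import struct

VM = 1 << 17          # virtual-8086 mode flag
RF = 1 << 16          # resume flag
IOPL_SHIFT = 12       # IOPL occupies bits 12-13
ALL_BITS = 0xFFFFFFFF


class GeneralProtectionFault(Exception):
    """PUSHF/PUSHFD attempted in virtual-8086 mode with IOPL < 3."""


def make_fmask(iopl, eflags):
    """Build a 32-bit mask from the privilege input and the EFLAGS input."""
    if eflags & VM and iopl < 3:
        raise GeneralProtectionFault("PUSHF/PUSHFD not permitted")
    return ALL_BITS & ~(VM | RF)   # VM and RF are cleared in the pushed image


def mfef(eflags, esp, memory):
    """Masked move from EFLAGS: mask, AND, then store at the new stack top."""
    iopl = (eflags >> IOPL_SHIFT) & 0b11        # stands in for PRIV register 446
    fmask = make_fmask(iopl, eflags)            # stands in for FMASK register 448
    result = eflags & fmask                     # AND gate 450 feeding result 452
    esp -= 4                                    # a 32-bit push lowers ESP by 4
    memory[esp:esp + 4] = struct.pack("<I", result)   # role of store logic 454
    return esp


memory = bytearray(64)
new_esp = mfef(0x00010246, 32, memory)   # RF is set in the source flags
pushed = struct.unpack_from("<I", memory, new_esp)[0]
print(hex(pushed), new_esp)
```

Because the result goes straight from the AND to the store, the model needs no intermediate copy of the flags, which mirrors the remark in [0051] that no temporary register is required.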

[0053] Referring now to Figure 5, a flowchart describes how microprocessor 400 performs, according to the present invention, a read of the EFLAGS register, the read being performed as part of a push onto the stack. According to an embodiment of the invention, fetcher 404 fetches a PUSHF or PUSHFD macro instruction from instruction memory. In this case the fetched instruction requires a read from EFLAGS register 414 before the transfer to the stack, as shown at block 500, where flow begins. At block 505, translator 410 translates the macro instruction into an MFEF microinstruction, which is configured to complete the read from the EFLAGS register within a single microinstruction cycle; flow then proceeds to block 510. At block 510, an EFLAGS mask is generated in FMASK mask register 448. In one embodiment of the invention, the EFLAGS mask has the same number of bits as EFLAGS, so the bits of the EFLAGS mask in the FMASK register correspond one-to-one with the bits of the EFLAGS register. In generating the EFLAGS mask at block 515, execution logic 416 examines the current privilege level and sets those mask bits that correspond to the EFLAGS bits the privilege level permits to be updated. Mask bits corresponding to EFLAGS bits that the privilege level does not permit to be updated are left unset, that is, their value is 0. Flow then proceeds to block 520. At block 520, the mask is ANDed with the contents of the EFLAGS register, and flow proceeds to block 525. At block 525, the result of the AND operation is written to the stack.

[0054] The foregoing discussion of Figures 2 and 3 describes an apparatus and method for improving a processor's execution of a write to the EFLAGS register. The foregoing discussion of Figures 4 and 5 likewise describes an apparatus and method for improving a processor's execution of a read from the EFLAGS register.
Notably, both the write and the read operations can be completed within a single instruction cycle, rather than requiring multiple cycles as in a conventional processor.

[0055] Although specific embodiments of the present invention have been described above, the invention is not limited to them. The invention may be implemented not only in hardware but also as computer-readable code (for example, computer-readable program code or data) embodied in a computer-usable (for example, readable) medium. The computer code enables the functions, and the testing, of the invention disclosed herein. For example, the invention can be realized with the following kinds of computer code: general programming languages (for example, C or C++); GDSII databases; hardware description languages (HDL) such as Verilog HDL; or other programming and/or circuit capture tools available in the art. The computer code can be disposed in any known computer-usable (for example, readable) medium, including semiconductor memory, magnetic disk, and optical disc (for example, CD-ROM or DVD-ROM), and can be embodied as a computer data signal in a computer-usable (for example, readable) transmission medium (for example, a carrier wave or any other medium, including digital, optical, or analog media). The computer code can therefore be transmitted over communication networks, including the Internet and intranets. It is understood that the invention can be embodied in computer code whose structure can be built into a processor (for example, HDL or GDSII) and then transformed into hardware as part of an integrated circuit. The invention may also be realized as a combination of hardware and computer code.

[0056] Furthermore, although the present invention and its objects, features, and advantages have been described in detail, other embodiments are still encompassed within the scope of the invention.

[0057] Finally, although specific embodiments of the present invention have been described above, the invention is not limited to them. The foregoing are merely preferred embodiments and may not be used to limit the scope of the invention; they are provided to enable those skilled in the art to make or use the invention. All equivalent changes and modifications made within the scope of the claims remain within the gist, spirit, and scope of the invention and should be regarded as further embodiments of it.
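For the write path summarized in paragraph [0054] (Figures 2 and 3), a comparable software sketch is a masked merge of a new value into EFLAGS. Several things here are assumptions rather than statements from the patent: the names `write_mask` and `masked_write` are hypothetical, the rule for which bits are writable follows the architectural protected-mode POPF behavior (IF only when CPL is no greater than IOPL, the IOPL field only at CPL 0), and keeping the old state of non-permitted bits via a merge is one plausible reading of "only the permitted bits are updated". Flags that POPFD treats specially (RF, VM, VIF, VIP) are simply left unwritable in this simplification.

```python
CF, PF, AF, ZF, SF, TF = 1 << 0, 1 << 2, 1 << 4, 1 << 6, 1 << 7, 1 << 8
IF, DF, OF = 1 << 9, 1 << 10, 1 << 11
IOPL = 0b11 << 12
NT, AC, ID = 1 << 14, 1 << 18, 1 << 21

# Flags treated as writable at any privilege level in this simplified model.
ALWAYS_WRITABLE = CF | PF | AF | ZF | SF | TF | DF | OF | NT | AC | ID


def write_mask(cpl, iopl):
    """Set a mask bit for every EFLAGS bit the current privilege may update."""
    mask = ALWAYS_WRITABLE
    if cpl <= iopl:
        mask |= IF        # interrupt flag needs CPL <= IOPL
    if cpl == 0:
        mask |= IOPL      # only ring 0 may change the IOPL field
    return mask


def masked_write(old_eflags, new_value, cpl):
    """Merge new_value into EFLAGS, touching only the bits the mask permits."""
    iopl = (old_eflags & IOPL) >> 12
    mask = write_mask(cpl, iopl)
    return (old_eflags & ~mask) | (new_value & mask)


# Ring 3 with IOPL 0 cannot clear IF or raise IOPL; ring 0 can do both.
print(hex(masked_write(0x202, 0x3001, cpl=3)))
print(hex(masked_write(0x202, 0x3001, cpl=0)))
```

The same new value therefore yields different register contents depending on privilege, which is the effect the description attributes to the dynamically generated mask.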

Brief Description of the Drawings

[0012] The foregoing and other objects, features, and advantages of the present invention will be better understood with reference to the following description and the accompanying drawings:

[0013] Figure 1 is a block diagram describing the pipeline stages of a conventional microprocessor;

[0014] Figure 2 is a block diagram describing one embodiment of the microprocessor of the present invention;

[0015] Figure 3 is a flowchart illustrating the operation of the microprocessor of Figure 2;

[0016] Figure 4 is a block diagram describing another embodiment of the microprocessor of the present invention;

[0017] Figure 5 is a flowchart illustrating the operation of the microprocessor of Figure 4.

Description of Reference Numerals

100 Pipelined microprocessor architecture
105 Fetch stage
110 Translate stage
115 Register stage
120 Address stage
125 Data/ALU stage (execute stage)
130 Write-back stage
200 Processor
202 Fetch stage
204 Instruction fetch logic
206 Instruction memory
208 Instruction pointer
212 Translate stage
210 Translator
222 Register stage
224 Register file
232 Load stage
236 Load/align logic
238 Data memory
216 Data/ALU stage
254 Result
300-335 Flowchart
400 Processor
402 Fetch stage
404 Instruction fetch logic
406 Instruction memory
408 Instruction pointer
412 Translate stage
410 Translator
422 Register stage
424 Register file
431 Address stage
434 Load stage
436 Load/align logic
438 Data memory
418 Data/ALU stage
452 Result
456 Store stage
454 Store logic
445 Stack in memory
500-525 Flowchart

Claims


1. A method of performing a write operation to a multi-bit flag register in a processor, the method comprising: receiving a macro instruction at a translation stage, the macro instruction requiring a write to the multi-bit flag register; and generating a microinstruction from the translation stage, the microinstruction being configured to complete the write to the multi-bit flag register in a single write cycle.

2. The method of claim 1, further comprising: generating a flag mask.

3. The method of claim 2, further comprising: performing a logical AND of the flag mask with a pre-specified operand to produce a result.

4. The method of claim 2, further comprising: storing the result in the multi-bit flag register.

5. The method of claim 1, wherein the processor is an x86 processor.

6. The method of claim 1, wherein the multi-bit flag register is an EFLAGS register.

7. The method of claim 1, wherein the macro instruction is one of the following instructions: POPF, POPFD, CLI, STI, CLC, STC, CLD, and STD.

8. A method of performing a read operation from a multi-bit flag register in a processor, the method comprising: receiving a macro instruction at a translation stage, the macro instruction requiring a read from the multi-bit flag register; and generating a microinstruction from the translation stage, the microinstruction being configured to complete the read from the multi-bit flag register in a single write cycle.

9. The method of claim 8, further comprising: generating a flag mask, the flag mask including privilege information that specifies which bits of the multi-bit flag register may be updated during the read operation according to a current privilege level.

10. The method of claim 9, further comprising: performing a logical AND of the flag mask with the multi-bit flag register to produce a result.

11. The method of claim 10, further comprising: storing the result in a stack in memory.

12. The method of claim 8, wherein the processor is an x86 processor.

13. The method of claim 12, wherein the multi-bit flag register is an EFLAGS register.
