TW552556B - Data processing apparatus for executing multiple instruction sets - Google Patents
- Publication number
- TW552556B (TW90101006A)
- Authority
- TW
- Taiwan
- Prior art keywords
- instruction
- instructions
- memory
- main
- decoder
Landscapes
- Memory System Of A Hierarchy Structure (AREA)
- Executing Machine-Instructions (AREA)
- Advance Control (AREA)
Abstract
Description
06439twf1.doc/006 Date of amendment: 92.7.14

Description of the invention:

The present invention relates to a data processing apparatus, and in particular to a data processing apparatus for executing multiple instruction sets.

A data processing apparatus usually includes a processor core that executes program instructions of a predetermined instruction set. Besides the processor core, it may include a system memory that stores the program instructions to be executed, and a program counter register that indicates the address in memory of the next instruction. An apparatus of this type, however, can execute only one type of instruction set. An apparatus able to execute more than one type of instruction set would be more flexible and more powerful.

Figure 1 is a block diagram showing the structure of a conventional data processing apparatus that executes two instruction sets, as disclosed in U.S. Patent No. 6,021,265, entitled "Interoperability with multiple instruction sets".

As shown in Figure 1, the processor core of the conventional apparatus includes a register bank 30, a Booth multiplier 40, a barrel shifter 50, a 32-bit arithmetic logic unit (ALU) 60, and a write data register 70.

The other components of the apparatus include a first instruction decoder and logic controller 100, a second instruction decoder and logic controller 110, a program counter controller (PC controller) 140, a program counter (PC) 130, a multiplexer 90, a read data register 120, an instruction pipeline 80, and a memory system 20.

In this conventional apparatus, the two instruction sets require separate instruction decoding and logic control. The first instruction decoder and logic controller 100 decodes program instructions of the first instruction set, and the second instruction decoder and logic controller 110 decodes program instructions of the second instruction set. Program instructions of the first instruction set are typically 32 bits wide, and those of the second instruction set are typically 16 bits wide. The programmer can thus choose between the 32-bit instruction set, which offers more functionality, and the 16-bit instructions, which save memory.

A controller must also be included to control which instruction decoder decodes the current program instruction. This is done by the program counter controller 140 setting or resetting the most significant bit (MSB) or the least significant bit (LSB) of the program counter 130. That bit controls the multiplexer 90, which selects between the first and second instruction decoders and logic controllers 100 and 110.

In such a conventional apparatus, the type of instruction set is determined in real time. That is, the two instruction sets can be mixed, and the programmer can freely decide which instruction set to use anywhere in the program without handling them separately. In hardware, however, the conventional apparatus needs two decoders and logic controllers that decode continuously and consume power. Such a design therefore makes the processor core 10 consume more power and require a longer processing cycle, which is unacceptable given the current trend toward low power and high clock frequency.
Another conventional data processing apparatus designed to execute two different instruction sets is disclosed in U.S. Patent No. 5,568,646, entitled "Multiple instruction set mapping". The architecture it discloses needs no controller to select which decoder decodes the current program instruction. That is, there is no need to set or reset the most significant bit (MSB) or the least significant bit (LSB) of the program counter.

In a typical pipeline processor, data processing is divided into three stages: a fetch stage, a decode stage, and an execute stage. The design disclosed in that patent works in the decode stage. Within one decode clock cycle it performs two steps: mapping, and generating control signals (decode). Instructions of the different instruction sets are first mapped to instructions of a primary instruction set, and control signals are then generated by decoding the primary-set instruction so that the processor core executes it.

However, because the mapping must be performed in the decode stage, the cycle time of the decode stage increases significantly, which makes a high-frequency design difficult. Power consumption also increases substantially. This hardware design likewise cannot meet the demand for low power and high frequency.

The present invention therefore provides a data processing apparatus for processing multiple instruction sets that better meets low-power and high-frequency design goals. It includes a memory for storing instructions of a plurality of instruction sets; a processor core for executing instructions of a primary instruction set among those instruction sets; a program counter register for specifying the address of the next instruction stored in the memory; a plurality of data registers for storing the data of the instructions; a processor status register (PSR) for storing the state of the processor core, the PSR including an instruction set selection flag (Instruction Set Selector, ISS) that indicates the current instruction set; a pre-decoder for translating non-primary instructions into primary instructions and outputting them; a cache memory (Icache) for storing the primary instructions; a decoder for decoding the primary instructions, the processor core executing the primary instructions decoded by the decoder; a program counter controller for adjusting the program counter register to accommodate instructions whose length differs from that of the primary instructions; and a bus serving as the interface between the system memory and the cache memory.

The processor core executes instructions from the primary instruction set A and stores the results in data registers R0 to R14 or, for a branch instruction, in the program counter register. After each instruction is executed, the PSR records the execution flags, the instruction set selection flag (ISS), the execution mode, and similar information so as to keep the processor state up to date. The pre-decoder converts instructions of non-primary instruction sets into primary instructions according to the ISS. The cache memory then stores the pre-decoder output in its data memory (Data RAM) and stores the ISS in its tag memory (Tag RAM), and the decoder decodes the cached instructions into control signals for the processor core. In this apparatus the processor core processes only instructions of the primary instruction set, yet by means of the pre-decoder and the ISS it can execute program instructions of other instruction sets.

When an instruction set switch occurs, one or more instructions store the target address (TA) in bits 31 to 1 of a data register and, at the same time, store the new instruction set selection flag (Instruction Select, IS) in bit 0 of that register.
A specific branch instruction (Branch and Exchange, BX) then copies bits 31 to 1 of the data register, which hold the target address (TA), to the program counter and sets bit 0 of the program counter to 0. At the same time, it copies the new instruction set selection flag (IS), held in bit 0 of the data register, to the instruction set selection flag (ISS) in the processor status register. After the branch instruction has executed, the program counter gives the address of the first instruction of the new instruction set, and the ISS indicates the new instruction set mode. When the new instruction addressed by the program counter is fed to the pre-decoder, the way it is pre-decoded is determined by the new value of the ISS. If the ISS indicates instruction set B, the pre-decoder treats the instruction as coming from instruction set B, converts it into an instruction of instruction set A, and outputs that instruction to the cache memory. If the ISS indicates instruction set A, the pre-decoder treats the input as an instruction of instruction set A and passes it to the cache memory. The cache memory stores only instructions of instruction set A, and the decoder and the processor core always process instructions of instruction set A.
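The switch sequence and the pre-decode choice described above can be illustrated with a minimal Python sketch. It is a behavioral model only, not the patented hardware. The names `CpuState`, `branch_and_exchange`, `predecode` and `translate_b_to_a`, the flag encodings `SET_A = 0` and `SET_B = 1`, and the placeholder translation are all assumptions made for illustration, since the text does not specify how a B instruction maps to an A instruction.

```python
from dataclasses import dataclass

# Hypothetical ISS encodings; the text does not fix the actual values.
SET_A, SET_B = 0, 1


@dataclass
class CpuState:
    pc: int = 0        # program counter
    iss: int = SET_A   # instruction set selection flag, PSR(ISS)


def branch_and_exchange(state: CpuState, reg: int) -> None:
    """Model of BX: reg holds TA in bits 31..1 and IS in bit 0."""
    state.pc = reg & 0xFFFF_FFFE  # copy bits 31..1, force bit 0 to 0
    state.iss = reg & 1           # copy IS into PSR(ISS)


def translate_b_to_a(word: int) -> int:
    """Placeholder B-to-A translation (assumed, not taken from the text)."""
    return 0xA000_0000 | (word & 0xFFFF)


def predecode(state: CpuState, word: int) -> int:
    """Return the primary (set A) instruction word to place in the cache."""
    if state.iss == SET_B:
        return translate_b_to_a(word)
    return word  # already a set A instruction: pass it through unchanged
```

For instance, calling `branch_and_exchange(state, 0x1001)` leaves `state.pc == 0x1000` and `state.iss == SET_B`, so the next fetched word is routed through the B-to-A translation before it reaches the cache.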
To make the above and other objects, features, and advantages of the present invention clearer, a preferred embodiment is described in detail below with reference to the accompanying drawings.

Brief description of the drawings:
Figure 1 is a block diagram showing the structure of a conventional data processing apparatus for executing two instruction sets.
Figure 2 is a block diagram showing a data processing apparatus for processing multiple instruction sets according to a preferred embodiment of the present invention.
Figure 3 is a flowchart showing the instruction execution flow in the preferred embodiment of the present invention.
Figure 4 is a flowchart showing the instruction set switching flow in the preferred embodiment of the present invention.
Figures 5A and 5B show the cache memory operation of the preferred embodiment of the present invention.
Figure 6A shows a conventional architecture for handling instructions of different types.
Figure 6B shows an architecture for handling instructions of different types according to the preferred embodiment of the present invention.
Figure 7A shows how a conventional processor handles mixed instructions.
Figure 7B shows how the processing apparatus according to the preferred embodiment of the present invention handles mixed instructions.

Description of reference numerals:
10 processor core
20 memory system
30 register bank
40 Booth multiplier
50 barrel shifter
60 32-bit arithmetic logic unit
70 write data register
80 instruction buffer
90 multiplexer
100 first instruction decoder and controller
110 second instruction decoder and controller
120 data read/write register
130 program counter
140 program counter controller
200 processor core
210 memory
215 bus
220 program counter register
225 program counter controller
230 data registers
240 instruction set selection bits
245 target address
250 processor status register
260 instruction set selection flag
270 pre-decoder
272 sub-decoder
280 cache memory
290 decoder
510 address register
512 instruction set selection flag
520 tag memory of the cache memory
610 bus line
620, 640, 660 switches
630, 670 data memory of the cache memory
650 pre-decoder
310-395 steps of the instruction execution flow
400-420 steps of the instruction set switching flow

Embodiment

Figure 2 shows a data processing apparatus for executing multiple instruction sets according to a preferred embodiment of the present invention, directed in particular to pipeline processors.
The data processing apparatus provided by the present invention executes multiple instruction sets. It includes a processor core 200, a memory 210, a program counter (PC) 220, a plurality of data registers R0 to R14, a processor status register (Program Status Register, PSR) 250, a cache memory (Icache) 280, a pre-decoder 270 having at least one sub-decoder 272, a decoder 290, a program counter controller (PC controller) 225, and a bus 215.

The memory 210 stores instruction words of multiple instruction sets (for example A or B instructions) or data. The program counter register (PC) 220 gives the address of the next instruction stored in the memory 210. The data registers (R0 to R14) 230 store the data or results of instructions. The bits of a data register 230 fall into two parts. When a specific branch instruction is executed, one or more bits are treated as instruction set selection bits (hereinafter IS) 240, and the remaining bits are treated as the target address (hereinafter TA) 245. The IS bits are stored into the processor status register (PSR), and the target address TA is stored into the program counter. When the instruction set is switched, IS is the flag of the new instruction set and the target address 245 is the start address of the new instruction set.

The processor status register (PSR) 250 stores the state of the processor core 200. It includes one or more bits forming the instruction set selection flag (Instruction Set Selector, hereinafter ISS) 260, which indicates the current instruction set. For convenience, these bits are referred to as PSR(ISS). PSR(ISS) can be set by a specific branch instruction according to one or more IS bits in the data registers R0 to R14.

The pre-decoder 270 includes one or more sub-decoders 272 for translating instructions of one or more instruction sets into primary instruction words. The primary instructions pass through the decoder 290 and are executed by the processor core 200. In this embodiment the processor core 200 need only execute primary instructions, while the apparatus as a whole can execute many different types of instruction set by means of the pre-decoder 270. For ease of understanding, "A" denotes the primary instruction set below, and other instruction sets are denoted "B" or "C". Instruction set B or C is a subset of the primary instruction set A, so a B instruction can be converted into an A instruction. The sub-decoders 272 are controlled by PSR(ISS), and the output of the pre-decoder 270 is a primary instruction A.

The cache memory 280 stores primary instructions. The tag memory of the cache stores, for each memory line, the tag bits, a valid bit, and the instruction set selection flag PSR(ISS) from the processor status register.
The data memory of the cache stores the instruction words. Because the cache stores only instructions of the primary instruction set A, the instruction set selection flag (ISS) must also be stored in the tag memory so that the original instruction set can be told apart. The flag stored in the tag memory is referred to below as TAG(ISS). If the bit width of A instructions differs from that of other instructions, for example if A instructions are 32 bits wide and B instructions are 16 bits wide, the tag memory 520 of the cache memory 280 needs one additional tag bit. When TAG(ISS) indicates instruction set B, this bit tells whether the instruction word stored in the data memory is the one indicated by the program counter. Basically, if the A instruction width X is not equal to the B instruction width Y, the address of a B instruction in the cache differs from its address in memory. For example, B instructions stored in memory at addresses (0, 2, 4, 6) or at addresses (8, A, C, E) all end up at addresses (0, 4, 8, C) when stored in the cache. The tag bits in the tag memory therefore carry one extra bit to tell whether the stored instructions are those of (0, 2, 4, 6) or those of (8, A, C, E).

The decoder 290 decodes A instructions. The processor core 200 executes the A instructions decoded by the decoder 290. The program counter controller 225 corrects the program counter value (hereinafter PC value) according to the instruction set selection flag PSR(ISS) 260 so that it matches the instruction length of each instruction set. The bus 215 is the interface between the pre-decoder 270 and the memory 210.

Figure 3 is the instruction execution flowchart of the preferred embodiment of the present invention. Only two types of instruction set are described in this example; the present invention is of course not limited to this.

In step 320, the processor core uses the PC value to read an instruction from the cache memory. Instructions of multiple instruction sets are stored in memory. For example, the memory holds both A instructions and B instructions, where A instructions are X bits wide and B instructions are Y bits wide. Each instruction occupies its own memory address. While the processor core executes an instruction, the program counter always points to the memory address of the next instruction. That is, in step 320 the processor core uses the program counter to obtain the next instruction.

In the following step 330, it is determined whether the next instruction indicated by the program counter is already stored in the cache memory (a hit), shown in the figure as "Icache Hit?". This decision depends on three conditions. The first condition is whether the valid bit in the cache memory is set; the valid bit indicates whether the tag bits in the tag memory are valid. The second condition is whether the tag bits held in the tag memory at the cache address equal the tag bits held in the program counter. The third condition is whether the TAG(ISS) at the cache address equals the PSR(ISS) in the processor status register.
If all three conditions hold, that is, on an Icache hit, the required instruction is already in the data memory of the cache and the type of the cached instruction word matches the required type. Then, in step 380, the cache memory outputs the instruction word to the decoder, which decodes it.

The tag bits of the cache's tag memory are m bits of the program counter. N bits of the program counter address the tag memory, which outputs tag bits, and the tag bits in the program counter are compared with the tag bits output by the tag memory. If they are equal, the address of the stored instruction equals the address in the program counter. To tell whether the tag bits are valid, the valid bit is set to invalid when the cache memory is switched on and set to valid when an instruction is cached. TAG(ISS) represents the type of the cached instruction. When instructions are cached, the instruction type is recorded for the entire memory line. A line here is the unit transferred in the pipelined design; for example, each line can carry four instructions at a time.
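The three-condition hit test of step 330 can be sketched in Python as follows. This is an illustrative model under stated assumptions: `TagEntry`, `split_pc` and `icache_hit` are hypothetical names, and the field widths `LINE_BITS` and `INDEX_BITS` are arbitrary example values, since the text gives the index and tag widths only symbolically.

```python
from dataclasses import dataclass

LINE_BITS = 4    # assumed: 16-byte line, i.e. four 32-bit instruction words
INDEX_BITS = 6   # assumed: 64 lines in the tag memory


@dataclass
class TagEntry:
    valid: bool = False  # cleared when the cache is switched on
    tag: int = 0         # tag bits recorded when the line was cached
    iss: int = 0         # TAG(ISS): instruction set of the cached line


def split_pc(pc: int) -> tuple:
    """Split a PC value into (tag memory index, tag bits)."""
    index = (pc >> LINE_BITS) & ((1 << INDEX_BITS) - 1)
    tag = pc >> (LINE_BITS + INDEX_BITS)
    return index, tag


def icache_hit(entry: TagEntry, pc_tag: int, psr_iss: int) -> bool:
    """Hit only if the entry is valid, the tags match, and TAG(ISS) == PSR(ISS)."""
    return entry.valid and entry.tag == pc_tag and entry.iss == psr_iss
```

The third comparison is what distinguishes this check from an ordinary tag match: a line cached while the processor was in one instruction set mode is treated as a miss if the same address is fetched in another mode.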
Then, in step 390, the processor core executes the instruction and stores the result in R0 to R14 or in the program counter. If the instruction is a branch instruction, the program counter must be changed to control the flow of execution.
If the three conditions are not all met, there is no Icache hit: either TAG(ISS) does not equal PSR(ISS), or the instruction is not in the cache memory (Icache miss). This case is shown as step 340. The bus uses the PC value to request the memory and waits for the memory to return the required line, as shown in step 350. When the instruction returned by the memory is fed to the pre-decoder, the pre-decoder selects a sub-decoder according to PSR(ISS) and translates the input instruction into the corresponding A instruction, as shown in step 360. In step 370, the output of the pre-decoder is stored in the cache memory: the cache sets the valid bit and the tag bits, stores PSR(ISS) into TAG(ISS), and stores the pre-decoder output in its data memory.
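The miss path of steps 360 and 370 can be sketched in the same spirit. This is a simplified model, not the actual circuit: `fill_line`, `SUB_DECODERS`, `passthrough` and `translate_b_to_a` are hypothetical names, the B-to-A mapping is a placeholder, and the sketch ignores the address remapping needed when B instructions are narrower than A instructions.

```python
def passthrough(word: int) -> int:
    """Set A words are already primary instructions."""
    return word


def translate_b_to_a(word: int) -> int:
    """Placeholder for the B-to-A sub-decoder (assumed mapping)."""
    return 0xA000_0000 | (word & 0xFFFF)


# Hypothetical ISS encoding: 0 selects set A, 1 selects set B.
SUB_DECODERS = {0: passthrough, 1: translate_b_to_a}


def fill_line(tag_ram: dict, data_ram: dict, index: int, pc_tag: int,
              psr_iss: int, line_words: list) -> None:
    """Pre-decode a line returned by memory and record it in the cache."""
    sub_decoder = SUB_DECODERS[psr_iss]  # step 360: select by PSR(ISS)
    # Step 370: the data memory receives only primary (set A) instruction words.
    data_ram[index] = [sub_decoder(w) for w in line_words]
    # Step 370: set the valid bit and tag bits, and copy PSR(ISS) into TAG(ISS).
    tag_ram[index] = {"valid": True, "tag": pc_tag, "iss": psr_iss}
```

Because translation happens once, when the line is filled, the decoder on the hit path only ever sees set A instructions, which is the property the design relies on to keep the decode stage short.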
The instruction is then executed as usual.

After each instruction is executed, the processor status register is updated to hold the latest flags, status, mode, and ISS flag. As described in step 395, the program counter is updated according to the ISS value so that it points to the next instruction of the instruction set in use.

Figure 4 shows the steps of instruction set switching in a preferred embodiment of the present invention.

Instruction set switching is controlled by software, specifically by a specific branch instruction (Branch and Exchange instruction). When an instruction set switch occurs, in step 400 one or more instructions write the target address of the new instruction set into the target address (TA) field of one of the registers R0 to R14, and write the selection flag of the new instruction set into the instruction set selection bits (IS) of that register.

In step 410, a specific branch instruction copies the target address (TA) field of the register to the program counter, as described in the following step 420. All other bits of the program counter are set to 0. At the same time, the specific branch instruction copies the IS part of the register to the ISS field of the PSR.

After the specified branch instruction completes, the program counter locates the first instruction of the new instruction set, and PSR(ISS) indicates the new instruction set mode.

The decisions made in step 330 of Figure 3, namely whether the cache hits and whether TAG(ISS) equals PSR(ISS), are described in detail below with reference to Figures 5A and 5B, which show the operation of the cache memory. Figure 5A shows an ordinary conventional cache operation, an example in which PSR(ISS) takes no part in the comparison. An address signal 510 is held in the program counter (PC) and passed to the cache memory 520. The m bits of the address signal 510 are fed to the tag memory, which outputs the corresponding tag bits. The N bits of the address signal 510 are compared with the output tag bits (comparison operation). A valid bit in the tag memory indicates whether the selected output tag bits are valid.
FIG. 5B shows the operation of the cache memory in the preferred embodiment of the present invention, in which PSR(ISS) is added to the comparison operation. An address signal 510 is held in the program counter. The m bits of the address signal are input to the tag memory, which outputs the corresponding tag bits. The N bits of the address signal 510 are compared with the tag bits output by the tag memory. A valid bit in the tag memory indicates whether the output tag bits are valid or invalid. In this embodiment, PSR(ISS) is also compared with TAG(ISS) in the tag memory to determine whether the two are equal. PSR(ISS) is the one or more bits of the instruction set selection flag (ISS) in the processor status register (PSR) described above, and TAG(ISS) denotes the ISS bits in the tag memory. If instructions of different widths are mixed, for example 16-bit instructions with 32-bit instructions, one additional bit of the address signal 510 can be used to identify the first or second half of an instruction. For example, as shown in FIG. 5B, the comparison that originally tested whether N bits equal the tag bits of the selected tag memory entry becomes a test of whether N+1 bits of the address signal 510 equal N+1 bits of the tag memory.

The pre-decoder 270 shown in FIG. 2, which has one or more sub-decoders, translates non-main instructions into main instructions, such as the A instructions mentioned above. For a more detailed description, refer to FIGS. 6A and 6B. FIG. 6A shows a conventional structure for handling instructions of different types. Each line contains, for example, four instructions arriving from the bus interface unit (BIU) 610. Selected by a switch 620, one of the instructions is sent to the data memory 630 of the cache memory. To execute, one of the instructions is sent to the decoder, where it is first mapped and then decoded. After mapping and decoding, the instruction is passed to the processor core for execution. In the preferred embodiment of the present invention, as shown in FIG. 6B, after selection by the switch 640, the selected instruction is sent to both the pre-decoder 650 and the switch 660.
If the selected instruction is a B instruction, which is not a main instruction, the pre-decoder 650 translates it into a main instruction, for example an A instruction, and the pre-decoded instruction is passed to the switch 660. Under control of the ISS bits from the PSR, the instruction is then stored in the memory 670 of the cache memory.

FIGS. 7A and 7B illustrate what happens when a line arriving from the bus contains mixed instructions, for example both A and B instructions. Referring first to FIG. 7A, the cache memory requests a line from the BIU with a program counter value of 0, and the BIU responds with a line 710 containing four instructions in the format "ABBA". TAG(ISS) always records the type of the first instruction fetched, and cache hits are also judged against TAG(ISS). In this example TAG(ISS) is A, because the instruction at the address where the program counter is 0 is an A instruction. The data memory of the cache memory is therefore filled as type A, and the instruction sequence becomes "AAAA", as shown in the figure.

After n clock cycles, the BIU line may have been written into the cache memory and may have changed. The central processor then starts executing at program counter value PC=4 with PSR(ISS)=B. At this point TAG(ISS)=A, so the cache memory misses. The cache memory again requests the BIU with PC=4, and the BIU again responds with the four-instruction "ABBA" line. Referring next to FIG. 7B, the instruction at PC=4 is translated by the pre-decoder, stored in the data memory, and output to the decoder. Now TAG(ISS)=B, the data memory of the cache memory is filled as type B, and the instruction sequence is "BBBB". After another n clock cycles, the central processor starts executing the instruction at PC=8. Now TAG(ISS) equals PSR(ISS), so the cache memory hits and can output the instruction to the decoder. Whatever the order of instruction types, the cache memory determines the correct instruction type and translates to the main instruction set. In practice, mixing different instruction types on the same line is quite rare.

The data processing apparatus of the present invention has several advantages over conventional data processing apparatus. One is that it can execute instructions from multiple instruction sets, without being limited to one or two. This gives programmers great flexibility in designing programs. If performance is the main concern, more powerful instructions, such as 32-bit instructions, can be used.
If memory capacity is the main concern, more memory-efficient instructions, such as 16-bit instructions, can be used.

Another advantage is reduced power consumption. In a conventional apparatus, every instruction set has its own dedicated instruction decoder and logic control unit. This is an expensive and power-hungry design, because multiple instruction decoders and logic control units must run continuously. In the present invention, the pre-decoder only needs to operate when an instruction is first fetched, after which the translated result is stored in the cache memory. In general, with a cache hit rate of 95%, the pre-decoder of the present invention needs to operate on only 5 of every 100 instruction fetches, which greatly reduces power consumption.

A further advantage is a shorter cycle time in the decode stage. In a conventional apparatus, whether two decoders or mapping is used to handle multiple instruction sets, the cycle time of the decode stage increases, and mapping increases it especially sharply. As a result, these conventional apparatuses lengthen the overall cycle time of the central processing unit. In the present invention, only one instruction decoder is used regardless of the number of instruction sets. That decoder decodes only instructions of the main instruction set, so it adds nothing to the decode-stage cycle time. Moreover, in high-frequency designs the cache memory is generally synchronous RAM, so the pre-decoder does not increase the cycle time of the fetch stage either.

The present invention has been disclosed above by way of preferred embodiments, which are not intended to limit it. Anyone skilled in the art may make various changes and refinements without departing from the spirit and scope of the present invention, and the scope of protection of the present invention is therefore defined by the appended claims.
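The tag comparison of FIG. 5B, the translate-on-fill path of FIG. 6B, and the PC=0, 4, 8 sequence of FIGS. 7A and 7B can be modeled together in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented circuit: instructions are a fixed 4 bytes, a line holds four of them, instruction sets are identified by the strings "A" and "B", and `predecode` is a hypothetical stand-in for the pre-decoder. The class and field names are invented for illustration, and the extra address bit used for mixed 16-bit and 32-bit instructions is left out.

```python
from dataclasses import dataclass, field

INSTR_BYTES = 4   # assumed fixed instruction size in this sketch
LINE_WORDS = 4    # four instructions per line, as in the FIG. 6 and FIG. 7 examples
NUM_LINES = 4     # assumed number of cache lines
LINE_BYTES = INSTR_BYTES * LINE_WORDS


@dataclass
class Line:
    valid: bool = False
    tag: int = 0
    iss: str = ""                          # TAG(ISS), stored beside the address tag
    words: list = field(default_factory=list)


class IssTaggedCache:
    def __init__(self, memory, predecode):
        self.memory = memory               # raw instruction words, one per slot
        self.predecode = predecode         # hypothetical translator to the main set
        self.lines = [Line() for _ in range(NUM_LINES)]
        self.line_fills = 0                # translation happens only on a fill

    def fetch(self, pc, psr_iss):
        line_no, byte_off = divmod(pc, LINE_BYTES)
        index, tag = line_no % NUM_LINES, line_no // NUM_LINES
        line = self.lines[index]
        # FIG. 5B: a hit needs the valid bit, a matching address tag,
        # and TAG(ISS) equal to PSR(ISS).
        hit = line.valid and line.tag == tag and line.iss == psr_iss
        if not hit:
            first = line_no * LINE_WORDS
            raw = self.memory[first:first + LINE_WORDS]
            # FIG. 6B: translate on the way into the data memory.
            line.words = [self.predecode(word, psr_iss) for word in raw]
            line.valid, line.tag, line.iss = True, tag, psr_iss
            self.line_fills += 1
        return line.words[byte_off // INSTR_BYTES], hit


def to_main_set(raw, iss):
    # Stand-in pre-decoder: "A" is treated as the main set and passes through.
    return raw if iss == "A" else f"A({raw})"


cache = IssTaggedCache(["i0", "i1", "i2", "i3"], to_main_set)
trace = []
for pc, iss in [(0, "A"), (4, "B"), (8, "B")]:
    instr, hit = cache.fetch(pc, iss)
    trace.append((pc, iss, hit, instr))
    print(pc, iss, "hit" if hit else "miss", instr)
```

Run as written, the first two fetches miss (the second because TAG(ISS) is still A while PSR(ISS) is B) and the third hits, so the line is filled twice and translation is not repeated on the hit. That is the behavior the power-consumption argument above relies on.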
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW90101006A TW552556B (en) | 2001-01-17 | 2001-01-17 | Data processing apparatus for executing multiple instruction sets |
Publications (1)
Publication Number | Publication Date |
---|---|
TW552556B (en) | 2003-09-11 |
Family ID=31713376
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW90101006A TW552556B (en) | 2001-01-17 | 2001-01-17 | Data processing apparatus for executing multiple instruction sets |
Country Status (1)
Country | Link |
---|---|
TW (1) | TW552556B (en) |
- 2001-01-17: TW application TW90101006A (patent TW552556B, en), status: not active, IP right cessation
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI552080B (en) * | 2011-04-01 | 2016-10-01 | 英特爾股份有限公司 | Processor |
US10573383B2 (en) | 2017-07-31 | 2020-02-25 | Micron Technology, Inc. | Data state synchronization |
TWI699780B (en) * | 2017-07-31 | 2020-07-21 | 美商美光科技公司 | Data state synchronization |
US10943659B2 (en) | 2017-07-31 | 2021-03-09 | Micron Technology, Inc. | Data state synchronization |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5911057A (en) | Superscalar microprocessor having combined register and memory renaming circuits, systems, and methods | |
US20020004897A1 (en) | Data processing apparatus for executing multiple instruction sets | |
US9619750B2 (en) | Method and apparatus for store dependence prediction | |
US7818542B2 (en) | Method and apparatus for length decoding variable length instructions | |
US6195735B1 (en) | Prefetch circuity for prefetching variable size data | |
TWI639952B (en) | Method, apparatus and non-transitory machine-readable medium for implementing and maintaining a stack of predicate values with stack synchronization instructions in an out of order hardware software co-designed processor | |
TWI691897B (en) | Instruction and logic to perform a fused single cycle increment-compare-jump | |
US7818543B2 (en) | Method and apparatus for length decoding and identifying boundaries of variable length instructions | |
US6275927B2 (en) | Compressing variable-length instruction prefix bytes | |
US5218711A (en) | Microprocessor having program counter registers for its coprocessors | |
JPH0395629A (en) | Data processor | |
US6212621B1 (en) | Method and system using tagged instructions to allow out-of-program-order instruction decoding | |
JP2010501913A (en) | Cache branch information associated with the last granularity of branch instructions in a variable length instruction set | |
US6460116B1 (en) | Using separate caches for variable and generated fixed-length instructions | |
JPH0215331A (en) | Data processor | |
TWI522910B (en) | Microprocessor, methods of selectively decompressing microcode, generating selectively compressed microcode, and generating a description, and computer program product | |
JP4748918B2 (en) | Device having a cache for storing and supplying decrypted information and method for doing so | |
TW552556B (en) | Data processing apparatus for executing multiple instruction sets | |
TWI697836B (en) | Method and processor to process an instruction set including high-power and standard instructions | |
EP3905034A1 (en) | A code prefetch instruction | |
US5895497A (en) | Microprocessor with pipelining, memory size evaluation, micro-op code and tags | |
JP3644892B2 (en) | Data processing apparatus for executing a plurality of instruction sets | |
US20210200538A1 (en) | Dual write micro-op queue | |
US6865665B2 (en) | Processor pipeline cache miss apparatus and method for operation | |
JP5229383B2 (en) | Prefetch request circuit |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
GD4A | Issue of patent certificate for granted invention patent | ||
MM4A | Annulment or lapse of patent due to non-payment of fees |