TW455814B - Software directed target address cache and target address register - Google Patents

Software directed target address cache and target address register

Info

Publication number
TW455814B
Authority
TW
Taiwan
Prior art keywords
branch
target address
instruction
branch prediction
information
Prior art date
Application number
TW88111693A
Other languages
Chinese (zh)
Inventor
Tse-Yu Yeh
Harshvardhan Sharangpani
Judge K Arora
Original Assignee
Intel Corp
Priority date
Filing date
Publication date
Application filed by Intel Corp
Application granted
Publication of TW455814B


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802Instruction prefetching
    • G06F9/3804Instruction prefetching for branches, e.g. hedging, branch folding
    • G06F9/3806Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842Speculative instruction execution
    • G06F9/3846Speculative instruction execution using static prediction, e.g. branch taken strategy

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

A branch prediction system is provided that includes a first, low-latency storage structure, a second, higher-latency storage structure, and a branch prediction manager (BPM) for updating the first and second storage structures according to software-provided hint information. In one embodiment, the BPM identifies branch hint information in branch-related instructions and writes the identified information according to the state of an importance bit in the instruction. When the importance bit is in a first state, the branch prediction information is stored in the first storage structure. When the importance bit is in a second state, the branch prediction information is stored in the second storage structure.

Description

Field of the Invention

The present invention relates to the field of branch prediction, and in particular to systems and methods for accessing prediction information related to branch instructions.

Background Art

Advanced processors employ pipelining techniques to execute instructions at very high speeds. On such processors, the overall machine is organized as a pipeline consisting of several cascaded stages of hardware. Instruction processing is divided into a sequence of operations, and each operation is performed by hardware in a corresponding pipeline stage ("pipe stage"). Independent operations from several instructions may be processed simultaneously by different pipe stages, increasing the instruction throughput of the processor. Where a pipelined processor includes multiple execution resources in each pipe stage, the throughput of the processor can exceed one instruction per clock cycle. To make full use of this instruction execution capability, the execution resources of the processor must be provided with sufficient instructions from the correct execution path.

Branch instructions pose a major challenge to keeping the pipeline filled with instructions from the correct execution path. When a branch instruction is executed and the branch condition is met, the control flow of the processor jumps to a new code sequence, and instructions from the new code sequence are transferred to the pipeline. Branch execution typically occurs at the back end of the pipeline, while instructions are fetched at the front end. If instruction fetching relies on branch execution to determine the correct execution path, the pipeline may fill with instructions from the wrong execution path before the branch condition is resolved. These instructions must then be flushed from the pipeline, leaving the resources in the affected pipe stages idle while instructions from the correct execution path are fetched. The idle pipe stages are referred to as pipeline bubbles, since they provide no useful output until they are filled with instructions from the correct execution path.

Modern processors incorporate branch prediction modules at the front ends of their pipelines to reduce the number of pipeline bubbles. When a branch instruction enters the front end of the pipeline, the branch prediction module forecasts whether the branch will be taken when it is executed at the back end of the pipeline. If the branch is predicted taken, the branch prediction module provides a branch target address to the instruction fetch module. The fetch module, which is also located at the front end of the pipeline, begins fetching instructions from the target address.

Conventional branch prediction modules employ branch target buffers (BTBs) to store prediction information, such as whether a branch will be taken and the likely target address when it is taken. Looking up an instruction in the BTB, determining whether the branch is taken, and providing a target address to the fetch module on a taken prediction all delay the redirection of the processor to the target address. This delay allows instructions from the wrong execution path to enter the pipeline and propagate down it. Because these instructions do not contribute to forward progress on the predicted execution path, they create "bubbles" in the pipeline when they are flushed. The more clock cycles required to redirect the pipeline, the more bubbles are created. Moreover, the more accurate and complete a branch prediction algorithm is, the longer it takes to complete and the greater the delay it introduces into the redirection process.

Currently available branch prediction techniques typically require two or more clock cycles to redirect the processor pipeline, which reduces but does not eliminate pipeline bubbles. When these bubbles arise for selected branch instructions, such as those that control tight loops, performance can degrade significantly. For example, if a one-cycle bubble is introduced into a loop that executes in four clock cycles, execution of the loop may be degraded by 20%.
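
The 20% figure in the background discussion follows from simple arithmetic: a four-cycle loop that gains one bubble per iteration spends one cycle in five doing no useful work. The sketch below only illustrates that calculation; the function name and its parameters are hypothetical and do not come from the patent.

```python
def bubble_throughput_loss(loop_cycles: int, bubbles_per_iteration: int) -> float:
    """Fraction of cycles wasted when each loop iteration incurs bubble cycles.

    loop_cycles: cycles one iteration takes with no bubbles (illustrative name).
    bubbles_per_iteration: idle cycles added per iteration by redirection.
    """
    total_cycles = loop_cycles + bubbles_per_iteration
    return bubbles_per_iteration / total_cycles


# A four-cycle loop with a one-cycle bubble wastes 1 cycle out of every 5.
print(bubble_throughput_loss(4, 1))  # 0.2
```

The same function shows why tight loops are the critical case: the shorter the loop body, the larger the share of cycles a fixed one-cycle bubble consumes.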


Summary of the Invention

The present invention is a hierarchical branch prediction system in which branch prediction information is provided to selected branch prediction structures under software direction.

In accordance with the present invention, a branch prediction system includes first and second storage structures for storing branch prediction information for first and second categories of branch instructions, respectively. A branch prediction manager (BPM) provides the branch prediction information for a branch to the first or second storage structure according to an importance hint associated with the branch.

In one embodiment of the invention, the first storage structure is a register file that can be accessed in a single clock cycle, and the second storage structure is a cache that can be accessed in two clock cycles.

Brief Description of the Drawings

The present invention may be understood with reference to the following drawings, in which like elements are indicated by like numbers. These drawings are provided to illustrate selected embodiments of the invention and are not intended to limit its scope.

Fig. 1 is a block diagram of the front-end stages of a processor pipeline that includes a conventional branch prediction module.

Fig. 2 is a block diagram of a processor pipeline that includes a branch prediction system in accordance with the present invention.

Fig. 3 is a block diagram of one embodiment of a branch prediction manager (BPM) for updating the branch storage structures of the branch prediction system of Fig. 2.

Fig. 4 is a block diagram of a write pipeline for updating the target address register (TAR) and target address cache (TAC) of Fig. 2.

Figs. 5A-5C are block diagrams of embodiments of branch and branch-related instructions suitable for providing importance hint information to the branch prediction system of Fig. 2.

Fig. 6 is a flowchart of a method for updating branch prediction information in accordance with the present invention.

Fig. 7 is a flowchart of a method for providing branch prediction information from a hierarchy of branch prediction structures in accordance with the present invention.

Detailed Description of the Invention

The following discussion sets forth numerous specific details to provide a thorough understanding of the invention. However, those of ordinary skill in the art, having the benefit of this disclosure, will appreciate that the invention may be practiced without these specific details. In addition, various well-known methods, procedures, components, and circuits have not been described in detail, in order to focus attention on the features of the present invention.

Fig. 1 shows the front-end pipe stages 102 and 104 of a conventional processor pipeline 100. In the disclosed example, stage 102 includes resources for indicating an instruction to be provided to pipeline 100, and stage 104 includes resources for fetching the indicated instruction. An instruction is typically indicated by a corresponding instruction pointer (IP). Stage 102 therefore includes an IP multiplexer (MUX) 130 and portions of an instruction cache (I-cache) 110 and a branch prediction module 120. Note that portions of I-cache 110 and branch prediction module 120 extend into fetch stage 104. The positions and sizes of I-cache 110 and branch prediction module 120 in pipeline 100 indicate, respectively, when they receive an IP and how long they need to process it.

For example, IP MUX 130 selects an IP in the first half of stage 102. I-cache 110 and branch prediction module 120 receive the IP approximately halfway through stage 102 (the IPG stage) and complete their processing during stage 104.

IP MUX 130 receives IPs from various sources, including branch prediction module 120. Depending on inputs from branch prediction module 120 and other control circuitry (not shown), IP MUX 130 couples the IP at one of its inputs to I-cache 110 and branch prediction module 120.

On receiving the selected IP, I-cache 110 and branch prediction module 120 initiate lookups to obtain information about it. In particular, I-cache 110 stores copies of selected instructions, indexed by their corresponding IPs. The received IP is compared with the I-cache entries. When it matches an entry (a hit), the corresponding instruction is passed to circuitry in the next stage of the pipeline (not shown). If the IP misses in I-cache 110, the instruction is retrieved through a longer-latency transaction to the memory subsystem (not shown).

Branch prediction module 120 stores branch prediction information for selected branch instructions, indexed by the IPs of the branch instructions. This information includes, for example, an indication of whether the corresponding branch is likely to be taken and a predicted target address (IP) to which pipeline 100 is redirected if the branch is predicted taken.

When the IP forwarded by IP MUX 130 hits in branch prediction module 120, the branch prediction information associated with the hit entry is accessed and read to determine whether the branch is predicted taken. If it is, the corresponding target address (IP) is coupled back to IP MUX 130 to redirect the pipeline to the code sequence beginning at the target address. A stage latch 122 controls the timing with which signals from branch prediction module 120 are coupled to MUX 130.

Branch instructions are relatively common in computer code, occurring on average once every 5 to 12 instructions. To accommodate prediction information for a reasonable fraction of the frequently encountered branch instructions, branch prediction module 120 must be a relatively large structure. The size of branch prediction module 120 is limited by timing considerations in pipeline 100: if it is too large, its access latency will be relatively long. Besides size, the accuracy of the branch prediction also affects the speed of a branch prediction system. More accurate branch prediction algorithms tend to be more complex and need additional time to provide a prediction for a branch instruction.

As noted above, each additional clock cycle that branch prediction module 120 needs to access its data and predict a taken branch allows additional instructions from the wrong execution path ("bubbles") to enter pipeline 100. Branch prediction module 120 is therefore usually sized, and its prediction algorithm chosen, so that pipeline 100 can be redirected within a few clock cycles of a branch instruction. Conventional branch prediction strategies thus involve a trade-off between redirecting the processor pipeline quickly and providing accurate branch predictions for the relatively large number of branches that appear in most computer code.
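
The lookup performed by a conventional branch prediction module can be sketched in a few lines. This is a minimal software model under stated assumptions, not the hardware described in the patent: the class and method names are hypothetical, a Python dictionary stands in for the tagged storage array, and timing is not modeled at all.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BTBEntry:
    """Prediction information stored for one branch (illustrative layout)."""
    predicted_taken: bool
    target_ip: int


class BranchTargetBuffer:
    """Toy model of a conventional branch prediction module indexed by IP."""

    def __init__(self) -> None:
        # Assumption: unbounded dictionary; real hardware has a fixed size.
        self._entries: Dict[int, BTBEntry] = {}

    def update(self, branch_ip: int, predicted_taken: bool, target_ip: int) -> None:
        self._entries[branch_ip] = BTBEntry(predicted_taken, target_ip)

    def next_ip(self, ip: int, fallthrough_ip: int) -> int:
        """Return the IP the front end should fetch from next.

        On a hit with a taken prediction, the pipeline is redirected to the
        predicted target; otherwise fetch continues sequentially.
        """
        entry = self._entries.get(ip)
        if entry is not None and entry.predicted_taken:
            return entry.target_ip
        return fallthrough_ip


btb = BranchTargetBuffer()
btb.update(branch_ip=0x1000, predicted_taken=True, target_ip=0x0F00)
print(hex(btb.next_ip(0x1000, fallthrough_ip=0x1010)))  # 0xf00
print(hex(btb.next_ip(0x2000, fallthrough_ip=0x2010)))  # 0x2010
```

In hardware, the cost of this lookup grows with the size of the structure, which is the trade-off between speed and capacity that the text describes.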

The trade-off between speed and size/accuracy can have a significant performance impact. For example, when redirection requires two clock cycles (a one-bubble redirection), a loop with a large iteration count incurs a bubble on every iteration. If the loop body contains only a few instructions (a "tight loop"), the bubbles can represent a significant loss of resource utilization.

The present invention is a hierarchical branch prediction system that optimizes or improves the trade-off between speed and size/accuracy for selected branch instructions. In accordance with the present invention, the branch prediction system includes first and second storage structures and write logic that allocates branch prediction information to the first and second storage structures according to hint information generated by the compiler. The first storage structure stores prediction information for selected branch instructions. The second storage structure stores branch prediction information for a larger group of branch instructions and may implement a more accurate branch prediction algorithm, or some combination of more accurate branch prediction and larger branch prediction capacity.

In one embodiment, the first storage structure provides single-cycle access to the branch prediction information for selected branches, and the second storage structure provides access to branch prediction information in two or more cycles. A branch prediction manager (BPM) stores the branch prediction information in the first or second storage structure according to hint information provided through various branch-related instructions.

Branch prediction information may be designated for storage in the first or second storage structure by an importance field in the branch-related instruction that specifies the branch prediction information. In the disclosed embodiment, a single-bit importance field indicates which of the storage structures should store the prediction information. More generally, an n-bit importance field can be used to store branch prediction information selectively in a 2^n-level hierarchy of branch prediction storage structures.
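
The mapping from an n-bit importance field to a level of a 2^n-level hierarchy can be illustrated as follows. This is a sketch under an assumed encoding: the patent does not say which bit pattern selects which structure, so the convention here (larger importance value selects the faster level, with level 0 the fastest) and the function name are hypothetical.

```python
def select_storage_level(importance: int, n_bits: int = 1) -> int:
    """Map an n-bit importance field to a level in a 2**n-level hierarchy.

    Assumed encoding (not specified in the source): the highest importance
    value maps to level 0, the fastest and smallest structure.
    """
    levels = 2 ** n_bits
    if not 0 <= importance < levels:
        raise ValueError(f"importance must fit in {n_bits} bit(s)")
    return (levels - 1) - importance


# Single-bit field, as in the disclosed embodiment: two structures.
print(select_storage_level(1))  # 0
print(select_storage_level(0))  # 1

# Two-bit field: four levels.
print([select_storage_level(i, n_bits=2) for i in range(4)])  # [3, 2, 1, 0]
```

With a single bit, level 0 corresponds to the fast first storage structure and level 1 to the larger second storage structure.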

In one embodiment of the invention, the branch-related instructions include branch prediction instructions (BRPs), branch instructions (BRs), and MOV_TO_BR instructions (MOVs). A BRP specifies the predicted target address of an associated IP-relative branch, the predicted direction (taken/not taken), and an importance hint. An IP-relative BR instruction specifies its target address and may also include an importance hint. A MOV instruction specifies the target address of an indirect branch and may also include an importance hint. In addition to importance hints, each of these branch-related instructions may provide prediction hints that indicate which resources should be used to predict the associated branch. For example, a prediction hint may indicate whether a branch is to be predicted statically or dynamically. Static prediction uses compile-time information provided through the branch-related instructions, while dynamic prediction uses information gathered while the code executes.

Providing branch prediction information early in the processor pipeline helps instructions along the proper path to be fetched quickly and subsequently executed. This strategy is beneficial as long as the structures that store the information do not load critical paths in the processor pipeline or become so unwieldy that they introduce unnecessary pipeline bubbles into frequently taken inner-loop branches. By providing a hierarchy of structures for storing branch prediction information, the present invention promotes the use of branch prediction information for all branch instructions without impeding access to the branch prediction information for an important class of branch instructions.

Fig. 2 is a block diagram of a portion of a processor pipeline 200 that includes a branch prediction system 280 in accordance with the present invention. Also shown are an IP multiplexer (MUX) 250, an instruction cache (I-cache) 210, an instruction buffer 214, dispersal logic 216, and a branch execution unit (BRU) 270.

IP MUX 250 receives IPs from various sources and couples a selected IP to I-cache 210 and branch prediction system 280 for processing. Instructions from I-cache 210 or a bypass (not shown) are queued in buffer 214 and routed through dispersal logic 216 to execution resources such as BRU 270. When buffer 214 is empty, instructions may also be passed directly from I-cache 210 to dispersal logic 216.

In one embodiment of pipeline 200, instructions that can be processed concurrently are grouped into bundles. A template associated with each bundle indicates the instruction types to dispersal logic 216, which routes the instructions to the appropriate execution resources. An instruction bundle may include multiple branch instructions, in which case BRU 270 includes resources for processing multiple branch instructions concurrently. Certain features of branch prediction system 280 are described for an embodiment designed for bundled instructions, but the invention does not depend on how instructions are provided to pipeline 200.

Pipeline 200 is represented as a series of pipeline ("pipe") stages 201-20x to indicate when the different elements of branch prediction system 280 operate on a given instruction. Except where noted, signals propagate from left to right, so that, for example, the response of circuitry in pipe stage 201 in clock (CLK) cycle N propagates to circuitry in pipe stage 202 in CLK cycle N+1. Stage latches 218 control the flow of signals between pipe stages 201-20x. Other embodiments of the invention may use different relative configurations of the branch prediction elements and stage latches 218. The invention does not depend on the particular configuration.

In the disclosed embodiment, branch prediction system 280 includes a target address register (TAR) 230, a target address cache (TAC) 240, a branch prediction table (BPT) 220, selection logic 254, and a branch prediction manager (BPM) 260. TAR 230, TAC 240, and BPT 220 (collectively, the "branch storage structures") are coupled to receive an IP from IP MUX 250 and to provide branch prediction information to selection logic 254 when the received IP matches one of their entries. Selection logic 254 routes the branch prediction information from one of the branch storage structures back to IP MUX 250 for processing. The branch storage structures thus operate together with selection logic 254 and IP MUX 250 to redirect pipeline 200 when a branch that is predicted taken hits in branch prediction system 280.

The extents of TAR 230, TAC 240, I-cache 210, and BPT 220 relative to stages 201 and 202 indicate the time each structure needs to process a received IP. In the disclosed embodiment, TAR 230 is designed to process a received IP and provide the associated branch prediction information to selection logic 254 within a single clock cycle. This is accomplished, for example, by limiting the size of TAR 230 to reduce its access latency. In one embodiment, TAR 230 stores the predicted target addresses of four branch (BR) instructions in four fully associative entries.

O:\59\59052.PTD 第14頁 a b 5 8 1 4 五'發明說明 標住址暫存器(TAR) 23 0中的一命中在時脈週期N + 1前,亦 即在零氣泡重新導引之時,將一預測a標住址提供予指令 指標(IP)多工器(·Χ) 250。 因為目標住址暫存器(TAR) 230支援零氣泡重新導引, 所以適於儲存具有衝擊處理器效能之最大潛能的分枝指令 其分枝預測資訊◦對於一具禮實施例’預測採用之迴路分 枝指令其分枝預測資訊,標示有目標住址暫存器(了 ) 230中的儲存處。如以下所討論,分枝預測管理員(βρμ) 2 6 0處理分枝預測資訊,並根據分枝對於處理器效能之重 要性的一指示,而將其寫入目標住址暫存器(TAR) 230與 目標住址快捷記憶體(TAC) 240。此具體實施例中,目標 住址暫存器(TAR) 230不必儲存預測之分枝決定,因為目 標住址暫存器(TAR) 230提供一目標住址之各分枝指令均 預測採用。當一指令指標(I P)於目標住址暫存器(TAR ) 230中命中時,選擇邏輯254可將一百標住址從目標住址暫 存器(TAR) 230轉送至多工器(MUX) 250。 小大小之目標住址暫存器(TAR) 23 0部分係藉由分枝預 測表(BPT) 22 0與目標住址快捷記憶體(TAC) 24〇予以補償 ’以調整大小’而調節一相對較大數目之分枝指令的分枝 預測資訊。較大大小之分枝預測表(βρτ) 2 2 0與目標住址 快捷圮憶體(TAC) 240防止其響應,直到級20 2部分完成為 止。對於所揭露之具體實施例,此造成兩時脈週期之分枝 預測潛在期’以及對於目標住址快捷記憶體(TAC) 24〇中 命中之分枝指令其管線2 0 0的單一氣泡重新導引。因此,O: \ 59 \ 59052.PTD Page 14 ab 5 8 1 4 Five 'invention description A hit in the address register (TAR) 23 0 is before the clock cycle N + 1, that is, the zero bubble is redirected When quoted, a predicted a-mark address is provided to the instruction index (IP) multiplexer (· X) 250. Because the target address register (TAR) 230 supports zero-bubble redirection, it is suitable for storing branch prediction information for branch instructions that have the greatest potential to impact processor performance. For a courteous embodiment, the circuit used for prediction The branch instruction has branch prediction information marked with a storage location in the target address register (L) 230. As discussed below, the branch prediction manager (βρμ) 2 60 processes branch prediction information and writes it to the target address register (TAR) based on an indication of the importance of the branch to processor performance. 230 and target address quick access memory (TAC) 240. In this specific embodiment, the target address register (TAR) 230 does not need to store the predicted branch decision, because each branch instruction provided by the target address register (TAR) 230 for a target address is predicted to be adopted. 
When an IP hits in TAR 230, selection logic 254 can forward a target address from TAR 230 to MUX 250.

The small size of TAR 230 is compensated in part by BPT 220 and TAC 240, which are sized to accommodate branch prediction information for a relatively large number of branch instructions. The larger sizes of BPT 220 and TAC 240 prevent them from responding until part way through stage 202. For the disclosed embodiment, this results in a branch prediction latency of two clock cycles and a single-bubble redirection of pipeline 200 for branch instructions that hit in TAC 240. Therefore, although the outputs of TAR 230, TAC 240, and BPT 220 are all coupled to selection logic 254 in stage 201, the outputs of BPT 220 and TAC 240 are generated a full clock cycle after the output of TAR 230. In the disclosed embodiment, TAC 240 and BPT 220 are shown as separate structures, but they could equally be implemented in a unified structure.

In one embodiment of the invention, BPT 220 and TAC 240 store prediction decisions and target address information, respectively, for 64 entries in a four-way set-associative configuration.
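The latency difference between the two target structures can be illustrated with a small behavioral model. This is a minimal sketch, not the disclosed circuit: the class name `FrontEndModel`, the dictionary-based storage, and the choice to treat a BPT miss on a TAC hit as a taken prediction (following cause 2 of the TAC redirect in Table 1) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FrontEndModel:
    """Toy model of the TAR/TAC lookup hierarchy (hypothetical, for illustration)."""
    tar: dict = field(default_factory=dict)  # IP -> target; small, answers in one cycle
    tac: dict = field(default_factory=dict)  # IP -> target; larger, answers a cycle later
    bpt: dict = field(default_factory=dict)  # IP -> True if predicted taken

    def predict(self, ip):
        """Return (target, bubbles) for a fetch IP, or (None, 0) if no redirect."""
        if ip in self.tar:
            # Every branch held in the TAR is predicted taken: zero-bubble redirect.
            return self.tar[ip], 0
        if ip in self.tac and self.bpt.get(ip, True):
            # TAC hit with a taken prediction (or a BPT miss): one-bubble redirect.
            return self.tac[ip], 1
        return None, 0


model = FrontEndModel()
model.tar[0x100] = 0x40                          # loop branch marked as important
model.tac[0x200] = 0x80; model.bpt[0x200] = True
model.tac[0x300] = 0x90; model.bpt[0x300] = False
for ip in (0x100, 0x200, 0x300, 0x400):
    print(hex(ip), model.predict(ip))
```

The model only captures which structure answers and how many bubbles the redirect costs; it does not model stage timing, ports, or replacement.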
When a branch IP provided by IP MUX 250 in clock cycle N misses in TAR 230, a hit in BPT 220 and/or TAC 240 is registered in clock cycle N+1, and the corresponding branch prediction information is available at MUX 250 in clock cycle N+2. Depending on how the branch prediction information is updated, BPT 220 and TAC 240 need not register hits for the same IPs.

BPM 260 processes branch prediction information and subsequently updates TAR 230, TAC 240, and BPT 220. In particular, BPM 260 identifies branch prediction information from various branch-related instructions and, according to hints contained in the branch prediction information, provides the identified information to the branch storage structures. BPM 260 may also provide prediction information for branch instructions that miss in TAR 230, TAC 240, and BPT 220 (static prediction), check the accuracy of branch prediction information from TAR 230, TAC 240, and BPT 220, and provide an IP to redirect pipeline 200 when an error is detected.

BPM 260 may also receive branch prediction information from branch execution unit (BRU) 270. For example, BRU 270 may decode hint and target address information from branch (BR) and MOV_TO_BR instructions and provide this information to BPM 260 for updating TAR 230, TAC 240, and BPT 220.
This information provides a dynamic component of the branch prediction information, which supplements the static component provided by software-directed hints.

BPM 260 operates in conjunction with TAR 230, TAC 240, and BPT 220 to provide a hierarchy of branch prediction structures. Table 1 summarizes the different redirect events, conditions, and targets implemented by one embodiment of branch prediction system 280. For the embodiment represented by Table 1, N instructions are grouped into a bundle for concurrent processing, multiple branch instructions may be included in a bundle, and instructions are designated as syllables 1 through N according to their relative order in the program code.

Table 1

Event: Target address register (TAR) redirect (priority 4, lowest)
Precondition: none
Cause: hit in TAR
Redirect target: target address (TA) from TAR

Event: Target address cache (TAC) redirect (priority 3)
Precondition: no TAR redirect
Causes:
1) Hit in TAC; BPT predicts taken (TK) on the same bundle and the branch is not a return
2) Hit in TAC; miss in BPT
3) Hit in TAC; BPT predicts TK on the same bundle and the branch is a return
Redirect targets:
1) TA from TAC
2) TA from TAC
3) TA from return stack buffer (RSB)

Event: First branch address correction unit (BAC1) redirect (priority 2)
Preconditions: no TAR or TAC redirect; TAR redirect
Causes:
1) Miss in TAC; BPT predicts TK and the branch is not a return
2) Any valid branch in the bundle is statically predicted TK, and the last branch in the bundle is not an indirect branch
3) Any valid branch in the bundle is statically predicted TK, and the last branch in the bundle is an indirect branch
4) False branch prediction; no valid branch in the bundle
5) Counted loop, as indicated by the loop predictor: exit on count or at top of loop
Redirect targets:
1) TA of the last branch in the bundle
2) TA of the last branch in the bundle
3) TA from RSB
4) Address of the next bundle
5) Address of the next bundle

Event: Second branch address correction unit (BAC2) redirect (priority 1, highest)
Preconditions: TAC redirect on the bundle; BAC1 redirect on the bundle; TAC/BAC1 redirect on the bundle
Causes:
1) TAC target syllable ID does not match the first taken branch prediction, and the branch is not an indirect branch
2) BAC1 redirect is taken with an overflow in the computed TA
3) With the first TK branch instruction (BR) from BAC1 masked, a TK branch that is not an indirect BR exists in the bundle
4) The first TK BR from BAC1 is not in the last syllable and is not an indirect branch
5) The first TK BR from BAC1 is not in the last syllable and is an indirect branch
6) With the first taken branch (FTB) of the TAC/BAC1 redirect masked, a TK branch instruction exists in the bundle, and that TK branch is an indirect BR
7) With the FTB of the TAC/BAC1 redirect masked, no other BR in the bundle is TK
Redirect targets:
1) First TK TA from BAC2
2) First TK TA from BAC2
3) First TK TA from BAC2
4) First TK TA from BAC2
5) TA from RSB
6) TA from RSB
7) Address of the next bundle

In Table 1, TK denotes a branch that is predicted or resolved taken, TA denotes a target address, FTB denotes the first taken branch, and RSB denotes a return stack buffer, which holds target addresses (TA) for returns.

FIG. 3 is a block diagram of one embodiment of BPM 260 in accordance with the present invention. For purposes of illustration, the disclosed embodiment includes the checking features noted above and can process up to two instruction bundles concurrently, including bundles that have multiple branch instructions ("multi-way branches").
Persons skilled in the art and having the benefit of this disclosure will recognize modifications of BPM 260 for handling more or fewer instructions.

The disclosed embodiment of BPM 260 includes a first branch address correction unit (BAC1) 310, a second branch address correction unit (BAC2) 360, and routing module 370. Also shown is a way MUX 306, which couples instructions and decode information from the previous stage to BPM 260 and to an instruction buffer 308. BAC1 310 and BAC2 360 process IPs and instruction decode information to identify branch prediction information, check the identified information for errors, and correct the information if it is determined to be wrong. Routing module 370 processes hints and branch prediction information from BAC1 310 and BAC2 360 to determine an appropriate storage structure for the prediction information. Routing module 370 is discussed in greater detail in conjunction with FIG. 4.
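Returning to the priorities listed in Table 1, the arbitration among competing redirect events can be sketched as a simple priority pick. This is only an illustration under stated assumptions: the function name `select_redirect` and the dictionary of pending requests are hypothetical, and the sketch ignores the fact that the events arise in different pipeline stages.

```python
# Priorities as listed in Table 1: 1 is highest (BAC2), 4 is lowest (TAR).
PRIORITY = {"BAC2": 1, "BAC1": 2, "TAC": 3, "TAR": 4}


def select_redirect(requests):
    """Pick the winning redirect from a {source: target_address} mapping.

    Returns (source, target_address), or None when no redirect is pending.
    """
    if not requests:
        return None
    source = min(requests, key=PRIORITY.__getitem__)
    return source, requests[source]


# A later-stage correction overrides an earlier, lower-priority prediction.
print(select_redirect({"TAR": 0x10, "BAC2": 0x20}))
print(select_redirect({"TAC": 0x30, "TAR": 0x10}))
```

The ordering reflects the table's design choice: the structures that respond earliest have the least information, so later checks are allowed to override them.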
One embodiment of BAC1 310 makes a preliminary determination of a target address for an instruction bundle and of the first branch instruction in the bundle that is predicted taken. It also identifies false branch predictions, that is, cases in which a hit occurs in TAR 230, TAC 240, or BPT 220 although no branch instruction is present. BAC1 310 includes merge logic 320, logic for finding the first taken branch in a bundle that has multiple branch instructions (FTB logic 324), and a decoder 328 for evaluating (predicted) branch directions. BAC1 310 further includes adders 312, 314, 316, 318 and a MUX 338 for evaluating branch target addresses, as well as an alias detector 334 and a MUX 330 for identifying false branch predictions.

Decoder 328 receives instruction information, including branch hint information, from, for example, I-cache 210. The hint information may include an indication that a branch should be predicted dynamically or statically. If static prediction is indicated, the hint information may also indicate the predicted branch direction (TK/NT). Merge logic 320 combines the hint information with any predicted branch directions from TAR 230 and TAC 240, and FTB logic 324 determines a first taken branch from the decoded and predicted information. FTB logic 324 uses MUX 338 to select an offset for calculating, in BAC2 360, the target address of the (predicted) first taken branch.

Adders 312 and 316 use decoded information from branch predict instructions (BRP) in the first and second instruction bundles, respectively, to determine branch target addresses. A target address is determined from the IP provided by MUX 250 and the address offset specified in the BRP.
Adders 314 and 318 use the IP and decoded offset information to determine the branch target address of the branch instruction associated with the IP, for example, for a static redirection. For one embodiment of the invention, adders 314 and 318 may be simplified, for example by eliminating selected carry logic, to meet the timing constraints for providing a redirect IP to selection logic 254. If an address calculation generates a carry, the correct address can be determined by a full adder (358) in BAC2 360.

To limit the size of BPM 260, adders need not be provided for every instruction slot in a bundle. For the disclosed embodiment, adders 312 and 316 compute target addresses for BRPs in specified instruction slots of the first and second bundles, respectively. Similarly, adders 314 and 318 compute target addresses for BRs in specified instruction slots of the first and second bundles, respectively. For example, when bundles are three instructions wide, the third instruction slot of each concurrently processed bundle may be coupled to adders 312, 314, 316, 318, and branch instructions may be preferentially routed to these slots. In this embodiment, target addresses for branch instructions not assigned to the third slot are evaluated by an adder (adder 358) in BAC2 360.

Alias detector 334 ensures that an IP that hits in TAR 230, TAC 240, or BPT 220 corresponds to a branch-related instruction.
For example, TAR 230, TAC 240, and BPT 220 may use only a portion of an IP to index the branch prediction information they store. This saves silicon area, but a partial IP may correspond to the addresses of more than one instruction ("aliasing"). Detector 334 uses instruction decode information to ensure that an IP that generates a hit in a branch storage structure corresponds to a branch-related instruction. MUX 330 selects an IP under control of alias detector 334 and couples it back to selection logic 254 to redirect pipeline 200 in the event an error is detected in the branch prediction information (BAC1 redirect).

One embodiment of BAC2 360 performs validity checks on the preliminary results generated by BAC1 310 and subsequently updates TAR 230, TAC 240, and BPT 220. In the disclosed embodiment, TAC/BPT check module 344 determines whether a branch predicted taken by BPT 220 matches the branch associated with the target address provided by TAC 240. If the two do not match, TAC/BPT check module 344 redirects using an address generated by BAC2 360, according to static prediction information obtained from the branch. Adder check module 348 is included to detect overflow errors when carry logic is omitted from adders 314, 318 to meet timing constraints. Adder check module 348 determines whether a target address computed by adder 314 or 318 generates a carry and, when a carry is detected, triggers full adder 358 to recompute the target address.
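The interplay between the simplified adders, the adder check, and the full adder can be sketched as follows. This is a minimal sketch under assumptions the text does not specify: the fast adder is taken to operate on the low `LOW_BITS` bits only, offsets are taken to be non-negative and smaller than `2**LOW_BITS`, and the names `fast_add`, `full_add`, and `target_address` are hypothetical.

```python
LOW_BITS = 20                     # assumed width covered by the fast adder
LOW_MASK = (1 << LOW_BITS) - 1
ADDR_MASK = (1 << 64) - 1         # assumed 64-bit address space


def fast_add(ip, offset):
    """Add the offset to the low bits only; report whether a carry was dropped."""
    low_sum = (ip & LOW_MASK) + (offset & LOW_MASK)
    carry = bool(low_sum >> LOW_BITS)
    return (ip & ~LOW_MASK) | (low_sum & LOW_MASK), carry


def full_add(ip, offset):
    """Full-width addition, standing in for the slower full adder."""
    return (ip + offset) & ADDR_MASK


def target_address(ip, offset):
    """Return (address, corrected); corrected is True when the fast result was wrong."""
    address, carry = fast_add(ip, offset)
    if carry:
        # The check detects the dropped carry and the full adder recomputes.
        return full_add(ip, offset), True
    return address, False


print(target_address(0x1000F0, 0x20))   # no carry: fast result stands
print(target_address(0x0FFFF0, 0x20))   # carry out of the low bits: corrected
```

In this model a correction corresponds to a later redirect: the fast address is used speculatively, and the full-width result replaces it only when the carry check fires.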
Mask validity module 350 determines when a predicted first taken branch is not in one of the instruction slots served by the adders in BAC1 310. In one embodiment of BPM 260, a mask indicates the unserved instruction slots. When a taken branch is present in a masked instruction slot, mask validity module 350 uses MUX 354 to select the target address from adder 358. The output of MUX 354 provides the target address to selection logic 254 to redirect the pipeline (BAC2 redirect). Routing module 370 updates TAR 230, TAC 240, and BPT 220 with data from decoded BR and BRP instructions.

FIG. 4 represents a pipeline 400 for updating TAC 240 and TAR 230. Although TAC 240 and TAR 230 are physically located ahead of BPM 260 and branch execution unit (BRU) 270 in pipeline 200, FIG. 4 shows them following BPM 260 to emphasize the relative timing of the various decode, execute, and update operations. In addition, the disclosed embodiment of write pipeline 400 is suited to the case in which processor pipeline 200 (FIG. 2) processes up to two instruction bundles concurrently.
Accordingly, TAR 230 and TAC 240 are each coupled to two read/write ports (port 0, port 1) to accommodate two concurrent update operations.

In the disclosed embodiment, BAC1 310 and BAC2 360 of BPM 260 are shown in pipe stages 402 and 403. TAR 230 is represented as a TAR read module 430 and a TAR write module 434, which are accessed in stages 403 and 404, respectively. Similarly, TAC 240 is represented as a TAC read module 440 and a TAC write module 444, which are accessed in stages 403 and 404, respectively. Routing module 370 includes source-select MUXes 412, 414, write-control MUXes 460, 464, and their control logic (not shown).

Source-select MUXes 412 and 414 select target addresses from various sources and couple the selected target addresses to port 0 and port 1, respectively, of TAR 230 and TAC 240. In the disclosed embodiment, BAC1 310 provides target addresses from decoded BRP instructions to MUXes 412, 414, while BRU 270 provides target addresses from decoded BR and MOV_TO_BR instructions to MUX 412 and MUX 414, respectively. MUX 414 may also receive a target address from BAC2 360 when a predicted target address is incorrect or when a target address calculation exceeds the capacity of BAC1 adders 314, 318.
Write-control MUXes 460, 464 use decoded hint information from various sources to control whether a target address from MUXes 412, 414 is written to TAR 230, TAC 240, or both. For one embodiment, BAC1 310 provides importance hints from decoded BRP instructions, while BRU 270 provides importance hints from BR and MOV_TO_BR instructions.

Control signals to MUXes 412, 414 select a target address source according to events detected in pipeline 200, for example, instruction type, prediction hints, and check failures. The selected target addresses are provided through stage latch 418 to port 0 and port 1 of TAR read module 430. TAR read module 430 determines whether a selected target address matches one of its entries (a "hit"). If a hit is detected and the corresponding hint bit is set, the target address is written to the hit entry in TAR write module 434. If no hit is detected but the corresponding hint bit is set, an entry is allocated in TAR write module 434 and the target address is written to the allocated entry. In one embodiment, TAR 230 implements a first-in, first-out (FIFO) allocation algorithm. Because TAR 230 is dual ported, two entries can be read and written concurrently.

A similar write procedure is implemented with TAC read and write modules 440, 444, respectively. The target address is compared with the entries in TAC read module 440. If a hit is detected, the target address is written to the hit entry in TAC write module 444. If no hit is detected, an entry is allocated and the target address is written to the allocated entry.
In one embodiment, the target address cache (TAC) 240 implements a least-recently-used (LRU) allocation algorithm.

The write pipeline 400 thus allows branch prediction information to be written to the target address register (TAR) 230 and the TAC 240 under the control of various branch-related instructions (branch (BR), branch prediction (BRP), and MOV_TO_BR instructions). For example, an importance field may be included in these instructions to indicate whether the branch prediction information specified in the instruction should be allocated to the TAR 230 or the TAC 240. A compiler may set one or more bits in the importance field of these instructions according to information about the associated branch that is available at compile time.
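The write procedure described for the two structures (overwrite the entry on a hit, allocate one on a miss, with first-in, first-out replacement in the TAR and least-recently-used replacement in the TAC) can be sketched in software. The Python model below is only an illustration under stated assumptions: the class names, the capacities, and the use of a branch tag as the lookup key are hypothetical, and the hint bit is assumed to gate every TAR write. The structures in the text are hardware, not code.

```python
from collections import OrderedDict


class FifoTargetRegister:
    """Toy TAR: small store with first-in, first-out allocation (assumed model)."""

    def __init__(self, capacity=4):  # capacity is an assumed value
        self.capacity = capacity
        self.entries = OrderedDict()  # tag -> target address, oldest allocation first

    def write(self, tag, target, hint_set):
        if not hint_set:  # assumption: the hint bit gates every TAR write
            return "ignored"
        if tag in self.entries:
            self.entries[tag] = target  # hit: overwrite in place, age unchanged
            return "updated"
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest allocation
        self.entries[tag] = target
        return "allocated"


class LruTargetCache:
    """Toy TAC: larger store with least-recently-used allocation (assumed model)."""

    def __init__(self, capacity=64):  # capacity is an assumed value
        self.capacity = capacity
        self.entries = OrderedDict()  # tag -> target address, least recent first

    def lookup(self, tag):
        if tag not in self.entries:
            return None
        self.entries.move_to_end(tag)  # a read makes the entry most recent
        return self.entries[tag]

    def write(self, tag, target):
        if tag in self.entries:
            self.entries[tag] = target  # hit: overwrite the hit entry
            self.entries.move_to_end(tag)
            return "updated"
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[tag] = target
        return "allocated"
```

In this model, updating an existing entry in the FIFO store does not protect it from eviction, whereas a recent lookup in the LRU store does. That difference is the practical distinction between the two allocation policies.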

Various rules may be used to designate branch prediction information for the target address register (TAR) 230 or the target address cache (TAC) 240. These rules concern characteristics of the branch with which the prediction information is associated. For example, because the TAR 230 provides efficient redirection of pipeline 200 when a branch is taken, branch prediction information is usefully stored in the TAR 230 when the associated branch is predicted taken. Moreover, because the branch prediction information is supplied by the compiler, branches that can be predicted taken from information available at compile time are particularly suitable candidates.

Because the TAR 230 is relatively small, it cannot accommodate prediction information for every branch that is predicted taken. Additional rules may be used to reduce the number of competing branches. For example, a branch that controls a loop is accessed repeatedly until the loop terminates. A loop that iterates through a taken branch, such as a counted top loop (CTOP) or a while top loop (WTOP), benefits from zero-bubble redirection on every iteration.
This is especially true for top loops with relatively few instructions in the loop body ("tight loops"). In these cases, even a single pipeline bubble can represent a significant fraction of the iteration time. Branches associated with counted or while top loops are therefore good candidates for storing prediction information in the target address register (TAR) 230.

Branches whose prediction information is provided through the target address cache (TAC) 240 are those less critical to processor performance than the branches associated with the TAR 230. The TAC 240 can also accommodate branch prediction information displaced from the TAR 230 by subsequently provided branch prediction information. The availability of the TAC 240 thus allows software hints to be used widely to allocate branch prediction information in branch prediction system 280 without degrading branch prediction speed or accuracy for branches in performance-critical code segments.

FIG. 5A is a block diagram of a branch prediction (BRP) instruction 500 suitable for conveying branch prediction information to branch prediction system 280. BRP instruction 500 includes an opcode field 510, a "whether" field 514, an importance hint field 518, a target field 520, and a tag field 524. Opcode field 510 indicates that the instruction is a branch prediction instruction. Whether field 514 indicates how the branch should be predicted, for example taken or not taken, dynamically or statically. Tag field 518 indicates an address of the associated branch (BR) instruction, and target field 520 indicates a predicted target address of the BR instruction. Importance hint field 524 indicates the relative importance of providing low-latency branch prediction for the associated branch.
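As a reading aid, the fields listed for the BRP instruction can be grouped into a simple record. The Python sketch below is hypothetical: the text gives no bit widths or encodings, so the field types, the `WhetherHint` values, and the class names are assumptions chosen for illustration only. The opcode field is represented implicitly by the class itself.

```python
from dataclasses import dataclass
from enum import Enum


class WhetherHint(Enum):
    """Assumed encodings for how the branch should be predicted."""
    STATIC_TAKEN = "static taken"
    STATIC_NOT_TAKEN = "static not taken"
    DYNAMIC_TAKEN = "dynamic taken"
    DYNAMIC_NOT_TAKEN = "dynamic not taken"


@dataclass(frozen=True)
class BranchPredictInstruction:
    """Hypothetical record mirroring the BRP fields described in the text."""
    whether: WhetherHint  # taken or not taken, dynamic or static
    important: bool       # importance hint: low-latency prediction is wanted
    target: int           # predicted target address of the branch
    tag: int              # address of the associated branch instruction


# Example values are made up for illustration.
brp = BranchPredictInstruction(
    whether=WhetherHint.STATIC_TAKEN, important=True, target=0x4000, tag=0x1F40
)
```

A record like this makes explicit that the target and tag identify where the branch is and where it goes, while the whether and importance hints only tell the predictor how to treat it.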
The branch prediction manager (BPM) 260 uses importance hint field 524 to route data from target field 520 to one or more branch storage structures.

FIG. 5B is a block diagram of a branch (BR) instruction 540 suitable for conveying importance information to branch prediction system 280 for instruction pointer (IP)-relative branches. BR instruction 540 includes an opcode field 544, a branch type field 548, a target field 550, a whether field 554, and an importance field 558. Opcode field 544 identifies BR instruction 540 as an indirect branch instruction, while branch type field 548 indicates the branch type, for example call, return, counted loop, modulo-scheduled counted loop, or modulo-scheduled while loop. The BPM 260 routes data from target field 550 to one or more branch storage structures according to the state of importance field 558.

FIG. 5C is a block diagram of a MOV_TO_BR instruction 560 suitable for conveying branch hint information for indirect branches to branch prediction system 280. MOV_TO_BR instruction 560 includes an opcode field 564, a branch type field 568, a target field 570, a register field 574, an importance field 578, and a whether field 580. Opcode field 564 identifies the instruction as a MOV_TO_BR instruction, target field 570 specifies the target address, and register field 574 specifies the branch register in which the target address is stored. Data from target field 570 is provided to the target address register (TAR) 230 and/or the target address cache (TAC) 240 according to the state of importance field 578.

FIG. 6 is a flowchart of a method for storing branch prediction information in accordance with the present invention.
In the disclosed embodiment, method 600 may be initiated, for example, when the branch prediction manager (BPM) 260 or the branch execution unit (BRU) 270 decodes a branch-related instruction. When an appropriate instruction is detected (610), the branch prediction information it contains is extracted (620), and it is determined (630) whether an importance bit in the instruction is set. If the importance bit is set (630), the branch prediction information is stored in the lowest-latency branch prediction structure, for example the target address register (TAR) 230. In one embodiment, the information may also be stored in the target address cache (TAC) 240 in case it is later evicted from the TAR 230. If the importance bit is not set (630), the branch prediction information is stored in a higher-latency branch prediction structure, for example the TAC 240 and the branch prediction table (BPT) 220.

In some cases, a branch (BR) instruction may follow its associated branch prediction (BRP) instruction so closely through processor pipeline 200 that there is not enough time to store the branch prediction information from the BRP instruction before the BR instruction needs it. In these cases, the branch prediction information may be coupled directly to instruction pointer (IP) multiplexer (MUX) 250 through a bypass structure (not shown).

FIG. 7 is a flowchart of a method 700 for using branch prediction information in accordance with the present invention. Method 700 is initiated when a new IP is sent to the branch prediction structures, for example in stage 202 during a first clock cycle. If the IP hits (720) an entry in the low-latency branch prediction structure (TAR 230), the predicted target IP associated with that entry is returned during the next clock cycle.
If the instruction pointer (IP) misses (720) in the low-latency structure, method 700 waits (740) on the higher-latency structures (target address cache (TAC) 240, branch prediction table (BPT) 220). If a higher-latency structure indicates a hit (740), the predicted target IP associated with the hit entry is returned at the next clock cycle. A miss (740) in the lower-latency structure indicates either that the new IP does not correspond to a branch (BR) instruction or that no branch prediction information is available for the BR instruction. If additional structures are available for branch prediction information, method 700 continues to look for a hit in those structures. If the new IP does not represent a branch instruction (770), as determined for example by branch prediction manager (BPM) 260, branch prediction information included in the instruction may be used to update (780) the first or second branch prediction structure (BPS) according to the included hint information.

The invention has been described for a system in which branch instructions are indexed by their corresponding instruction pointers (IPs). This is not required, however, and other representations of the branch instruction, such as the branch instruction opcode, may be used for this purpose. In addition, the invention has been described for a branch prediction hierarchy of two branch prediction structures. Those skilled in the art will recognize that the invention is readily applied to branch prediction hierarchies with more than two levels of branch prediction structures. In these cases, branch prediction (BRP) instructions would use correspondingly larger hint fields, and additional classes of BR instructions would be provided.

There has thus been provided a system and method for fast branch prediction, using a hierarchy of branch prediction structures that operates under software direction.
Branch prediction information for a first class of branch instructions is stored in a small, fast branch prediction structure that can be accessed in a single clock cycle. Branch prediction information for a second class of branch instructions is stored in a second branch prediction structure. When a branch instruction hits in the first structure, a target instruction pointer (IP) is provided to the first stage of the pipeline in the clock cycle following the clock cycle in which the branch instruction is initiated. Even for a processor operating at high frequency, the disclosed invention can thus provide single-cycle throughput for the most important branch predictions.
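The store and lookup flows of FIGS. 6 and 7 can be summarized in a short software model. This is a sketch under assumptions, not the hardware design: plain dictionaries stand in for the TAR and TAC, replacement and the BPT are ignored, the latencies of one and two cycles are the smallest values named in the text, and the function names are hypothetical.

```python
TAR_LATENCY = 1  # cycles; the text describes a single-cycle return on a TAR hit
TAC_LATENCY = 2  # cycles; the text says two or more, two is assumed here


def store_prediction(tar, tac, ip, target, important):
    """Route a predicted target by its importance bit (FIG. 6 flow, simplified)."""
    if important:
        tar[ip] = target  # lowest-latency structure
    tac[ip] = target      # also kept here, so an important entry survives TAR eviction


def lookup_prediction(tar, tac, ip):
    """Return (target, latency), or (None, None) on a miss (FIG. 7 flow, simplified)."""
    if ip in tar:
        return tar[ip], TAR_LATENCY  # a TAR hit takes priority over the TAC
    if ip in tac:
        return tac[ip], TAC_LATENCY
    return None, None


# Example addresses are made up for illustration.
tar, tac = {}, {}
store_prediction(tar, tac, ip=0x100, target=0x180, important=True)
store_prediction(tar, tac, ip=0x200, target=0x240, important=False)
```

Writing important targets to both stores follows the embodiment in which the TAC backs up TAR entries; a design that wrote them to the TAR alone would lose the prediction entirely on eviction.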


Claims (30)

1. A branch prediction system comprising:
a first storage structure for branch target addresses;
a second storage structure for branch target addresses; and
a branch prediction manager (BPM) coupled to the first and second storage structures, the BPM to identify a branch target address and an importance indicator in an instruction and to write the branch target address to the first or second storage structure according to the importance indicator.

2. The branch prediction system of claim 1, wherein the first storage structure has a response latency shorter than the response latency of the second storage structure.

3. The branch prediction system of claim 1, wherein the BPM includes decode logic to identify instructions that include branch prediction information and to extract the branch prediction information from the decoded instructions.

4. The branch prediction system of claim 1, wherein the BPM is coupled to receive branch prediction information for selected instructions from a branch execution unit.

5. A processor comprising:
an execution pipeline;
a target address register coupled to the execution pipeline to store selected branch prediction information for altering an instruction flow through the execution pipeline;
a target address cache coupled to the execution pipeline to store branch prediction information for altering the instruction flow through the execution pipeline; and
a branch prediction manager (BPM) coupled to the execution pipeline, the target address register (TAR), and the target address cache (TAC), to detect branch prediction information for an instruction in the execution pipeline and to provide the detected branch prediction information to the TAR and the TAC according to a hint specified by the instruction.

6. The processor of claim 5, wherein the execution unit includes a branch execution unit to extract branch prediction information from the instruction and to couple the extracted branch prediction information to the BPM.

7. The processor of claim 5, wherein the target address register is coupled to a fetch stage of the execution pipeline to receive an instruction pointer (IP) and to return a target address corresponding to the IP to the fetch stage within a single clock cycle.

8. The processor of claim 7, wherein the target address cache is coupled to the fetch stage to receive an instruction pointer and to return a target address corresponding to the IP to the fetch stage within two or more clock cycles.

9. The processor of claim 8, wherein the target address cache is inhibited from returning a target address when the target address register returns a target address.

10. A processor comprising:
an instruction fetch module;
a target address register (TAR) coupled to receive an instruction pointer (IP) from the fetch module and to return a target IP within a single clock cycle when the IP matches a TAR entry;
a target address cache (TAC) coupled to receive an IP from the fetch module and to return a target IP within two or more clock cycles when the IP matches a TAC entry; and
a branch prediction manager (BPM) to identify branch prediction information and to update the TAR and the TAC with a target IP from the branch prediction information according to an importance indicator in the branch prediction information.

11. The processor of claim 10, wherein the BPM writes the target IP to the TAR and the TAC when the importance indicator is set, and writes it to the TAC alone when the importance indicator is not set.

12. The processor of claim 10, wherein the BPM includes a decoder coupled to receive instructions from the fetch module, to identify branch-related instructions, and to extract branch prediction information from the identified branch-related instructions.

13. The processor of claim 11, further comprising a branch execution unit coupled to the fetch module and the BPM, the branch execution unit to extract branch prediction information from branch and branch-related instructions and to provide the extracted information to the BPM.

14. A branch prediction system comprising:
a target address register (TAR) to store target addresses for a first group of branch instructions;
a target address cache (TAC) to store target addresses for a second group of branch instructions; and
a branch prediction manager (BPM) to identify branch prediction information associated with the first and second groups of branch instructions and to update the TAC and the TAR with a target address from the identified branch prediction information.

15. The branch prediction system of claim 14, wherein the branch prediction information includes an importance bit, and the BPM writes the target address to the TAR when the importance bit is set.

16. A branch prediction manager for a hierarchical branch prediction system, the branch prediction manager comprising:
a branch address calculator to identify branch prediction information in branch-related instructions; and
a routing module to provide the branch prediction information from the branch address calculator to an indicated location in the hierarchical branch prediction system.

17. The branch prediction manager of claim 16, wherein the branch prediction information includes hint and target address information, and the routing module provides the target address information to a location indicated by the hint information.

18. The branch prediction manager of claim 17, wherein the branch prediction information is provided by a branch prediction instruction.

19. The branch prediction manager of claim 16, wherein the branch address calculator comprises a first branch address calculator to make a preliminary determination of a predicted branch direction and target address from the branch prediction information, and a second branch address calculator to validate the preliminarily determined branch direction and target address.

20. The branch prediction manager of claim 19, wherein the branch prediction information is provided by a branch instruction.

21. The branch prediction manager of claim 16, wherein the indicated location in the hierarchical branch prediction system is an entry in a first or a second branch prediction structure.

22. The branch prediction manager of claim 21, wherein the first branch prediction structure is a target address register file that supports pipeline redirection within a single clock cycle for selected branches, and the second branch prediction structure is a target address cache that supports pipeline redirection within two or more cycles.

23. The branch prediction manager of claim 22, wherein a predicted target address is routed to the target address register file or the target address cache according to a field in the branch prediction information.

24. A processor comprising:
first means for storing branch prediction information for a first group of branch instructions;
second means for storing branch information for a second group of branch instructions; and
means for managing branch prediction information, the managing means to detect branch prediction information in a branch-related instruction and to route the branch prediction information to the first or second storing means according to hint information in the branch-related instruction.

25. The processor of claim 24, wherein the managing means includes:
means for generating a branch target address from the branch prediction information; and
means for routing the branch target address to the first or second storing means according to the hint information.

26. The processor of claim 25, wherein multiple branches can be processed concurrently, and the generating means includes a plurality of address calculators to compute target addresses for the multiple branches and logic to identify a first taken branch among the multiple branches.

27. The processor of claim 26, wherein the first taken branch is determined using predicted branch information.

28. The processor of claim 25, wherein the managing means includes:
a first address calculator to determine a preliminary target address and branch direction from the branch prediction information; and
a second address calculator to validate the preliminary target address and branch direction.

29. The processor of claim 28, wherein the branch prediction information is provided by a branch instruction.

30. The processor of claim 25, wherein the first storing means comprises a plurality of registers that can provide a target address to redirect the processor within a single clock cycle, and the second storing means comprises a target address cache that can provide a target address to redirect the processor within two or more clock cycles.
TW88111693A 1998-08-06 1999-07-09 Software directed target address cache and target address register TW455814B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13039198A 1998-08-06 1998-08-06

Publications (1)

Publication Number Publication Date
TW455814B true TW455814B (en) 2001-09-21

Family

ID=22444490

Family Applications (1)

Application Number Title Priority Date Filing Date
TW88111693A TW455814B (en) 1998-08-06 1999-07-09 Software directed target address cache and target address register

Country Status (3)

Country Link
AU (1) AU5239599A (en)
TW (1) TW455814B (en)
WO (1) WO2000008551A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6427192B1 (en) 1998-09-21 2002-07-30 Advanced Micro Devices, Inc. Method and apparatus for caching victimized branch predictions
US20110320787A1 (en) * 2010-06-28 2011-12-29 Qualcomm Incorporated Indirect Branch Hint
US20140006752A1 (en) * 2012-06-27 2014-01-02 Qualcomm Incorporated Qualifying Software Branch-Target Hints with Hardware-Based Predictions
GB2548604B (en) 2016-03-23 2018-03-21 Advanced Risc Mach Ltd Branch instruction
GB2548602B (en) 2016-03-23 2019-10-23 Advanced Risc Mach Ltd Program loop control
GB2548603B (en) 2016-03-23 2018-09-26 Advanced Risc Mach Ltd Program loop control

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE4345028A1 (en) * 1993-05-06 1994-11-10 Hewlett Packard Co Device for reducing delays due to branching
US5860150A (en) * 1995-10-06 1999-01-12 International Business Machines Corporation Instruction pre-fetching of a cache line within a processor
US5742804A (en) * 1996-07-24 1998-04-21 Institute For The Development Of Emerging Architectures, L.L.C. Instruction prefetch mechanism utilizing a branch predict instruction

Also Published As

Publication number Publication date
AU5239599A (en) 2000-02-28
WO2000008551A1 (en) 2000-02-17
