JPH03250221A

JPH03250221A - Branch predicting system

Info

Publication number: JPH03250221A
Application number: JP4542490A
Authority: JP
Inventors: Tetsuaki Nakamigawa; 哲明中三川; Shigeya Tanaka; 成弥田中; Kenichi Kurosawa; 黒沢　憲一; Takashi Hotta; 多加志堀田; Yasuhiro Nakatsuka; 康弘中塚; Takeshi Takemoto; 毅竹本
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1990-02-28
Filing date: 1990-02-28
Publication date: 1991-11-08

Abstract

PURPOSE: To improve computer performance by making the update condition of the branch prediction table the case where the condition of a conditional branch instruction is satisfied and the branch destination is backward of the instruction itself (at a lower address). CONSTITUTION: When the address of the second instruction matches the contents of the branch instruction address (BIA) 1001 at time t2, the instruction address selector (IASEL) 105 sends the contents of the branch destination address (BTA) 1002 to the instruction address register (IAR) 106 as the next address to be read, and also sends them to the incrementer (INC) 104 so that the following address can be calculated. In addition, for recovery in case the prediction fails, the contents (a+4) of the program counter (PC) 103 are saved in the save register (PCS) 108. Reading thus starts from the predicted branch destination address (b) at time t3 and from address b+4 at time t4. If execution of the second instruction at time t4 shows that the condition is satisfied and the branch is taken, the prediction has succeeded. The conditional branch is thus executed without wasted waiting time, and the performance of the computer is improved.

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to a branch processing method for pipeline computers, which execute machine instructions in multiple stages, and in particular to speeding up branch processing in single-chip microprocessors, which cannot hold a large branch prediction table.

[Conventional technology]

In a pipeline computer, each machine instruction is divided into multiple stages, such as instruction fetch, decode, execute, and write-back, and the stages are executed in assembly-line fashion to shorten processing time. A pipeline computer can therefore execute consecutive instructions at high speed, but when a branch instruction appears, the pipeline stages must be cancelled and execution restarted from the instruction fetch, which degrades performance. Among branch instructions, an unconditional branch is known in advance to be taken, so instruction fetch can begin as soon as the branch destination address has been calculated. For a conditional branch instruction, however, the branch is taken if the operation result satisfies the condition, and the instruction at the next address is executed as is if it does not, so branch processing cannot begin until the operation result determines the condition.

Consequently, the performance loss when the condition is satisfied is significant.

To speed up conditional branches, a technique called branch prediction was therefore devised. Branch prediction schemes are discussed in IEEE Computer (Jan. 1984), pp. 6-22.

In the conventional branch prediction scheme, a branch prediction table with tens to hundreds of entries is provided, each entry having three fields: the branch instruction address, the branch destination address, and a branch prediction bit. When the address of the branch instruction being executed matches a branch instruction address in the table, whether the branch will be taken is predicted according to the state of the branch prediction bit. If the branch is predicted taken, instruction fetch starts from the branch destination address in the table without waiting for the condition to be resolved. When the prediction succeeds, the performance loss associated with the branch is reduced and overall processing performance improves. The branch prediction bit is determined by whether past branches were taken or not taken.
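The three-field table described above can be sketched as follows. This is a minimal illustration, not the scheme of the cited article: the class and method names are hypothetical, and the prediction bit is assumed to simply record the most recent outcome of the branch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    target: int          # branch destination address
    predict_taken: bool  # branch prediction bit (assumed: last outcome)


class ConventionalBPT:
    """Hypothetical model of a conventional branch prediction table."""

    def __init__(self) -> None:
        # Keyed by branch instruction address.
        self.entries: Dict[int, Entry] = {}

    def predict(self, branch_addr: int) -> Optional[int]:
        """Return the predicted fetch address, or None if no taken prediction."""
        entry = self.entries.get(branch_addr)
        if entry is not None and entry.predict_taken:
            return entry.target
        return None

    def update(self, branch_addr: int, target: int, taken: bool) -> None:
        """Record the outcome of an executed conditional branch."""
        self.entries[branch_addr] = Entry(target, taken)
```

In this sketch every conditional branch is recorded regardless of direction, which is the behaviour that requires a large table to be effective.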

[Problem to be solved by the invention]

The prior art above presupposes a branch prediction table with tens to hundreds of entries and improves performance by registering as many conditional branch instructions as possible in the table. When the table cannot have many entries, as in a single-chip microprocessor, little performance improvement can be expected.

An object of the present invention is to make branch prediction effective even when the branch prediction table has few entries, and to provide a branch prediction scheme well suited to single-chip microprocessors in particular.

[Means to solve the problem]

To achieve this object, it suffices to register preferentially in the branch prediction table the conditional branch instructions for which branch prediction is most effective, namely those that form the loops of a program. The condition for registering an entry in the table is therefore that the branch destination address be backward of (lower than) the branch instruction address.
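The registration condition can be written as a one-line predicate. The function name and parameter names below are illustrative only; the condition itself (branch taken and destination at a lower address than the branch instruction) is the one stated above.

```python
def should_register(cond_satisfied: bool, branch_addr: int, target_addr: int) -> bool:
    """Register only taken conditional branches whose destination is backward."""
    return cond_satisfied and target_addr < branch_addr
```

For example, a loop-closing branch at address 0x30 that jumps to 0x00 qualifies, while a forward branch at 0x10 that skips to 0x20 does not.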

[Effect]

In a loop of a program, the conditional branch instruction that tests for loop termination is at the end of the loop, that is, on the higher-address side, and when its condition is satisfied it branches backward, that is, toward lower addresses. If no other conditional branch instruction appears in the loop, conditional branches can simply be registered as they occur. But if the loop contains a conditional statement (an IF statement, for example), the conditional branch instruction for that statement may be registered in the table, and the conditional branch instruction for loop termination may then fail to be registered. If the table registration condition is that the branch destination address be backward of the branch instruction address, the conditional branch instruction for the conditional statement inside the loop is not registered, and the conditional branch instruction for loop termination is.

[Example]

Embodiments of the present invention are described below with reference to FIGS. 1 to 11.

FIG. 1 shows the basic structure of one embodiment of the present invention. The pipeline computer that performs branch prediction has the following main components: a branch prediction table (BPT) 100, a branch prediction table registration selector (BPSEL) 101, a program counter comparator (COMP) 102, a program counter incrementer (INC) 103, a program counter (PC) 104, an instruction address selector (IASEL) 105, an instruction address register (IAR) 106, an instruction cache (IC) 107, a program counter save register (PCS) 108, an instruction decode unit (DEC) 200, and an instruction execution unit (EXEC) 300. The branch prediction table (BPT) 100 includes a branch instruction address (BIA) 1001 and a branch destination address (BTA) 1002. The branch prediction table registration selector (BPSEL) 101 includes a program counter buffer (PCB) 1011. The instruction decode unit (DEC) 200 includes a control signal (CONTROL) 201, a displacement minus signal (DISPM) 202, and a conditional branch instruction determination signal (BCC) 203.

The instruction execution unit (EXEC) 300 includes an arithmetic logic unit (ALU) 301 and a condition code generator (COND) 302.

An instruction read from IC 107 at the address held in IAR 106 is decoded by DEC 200, which controls EXEC 300 through CONTROL 201, and EXEC 300 performs the specified operation. PC 104 holds the address of the next instruction to be read, and INC 103 increments the contents of PC 104 by the instruction word length. COMP 102 compares PC 104 with BIA 1001 and, if they match, sends the contents of BTA 1002 to IASEL 105.

Based on signal 204 from DEC 200, signal 303 from EXEC 300, and signal 1021 from COMP 102, IASEL 105 selects one of PC 104, PCS 108, BTA 1021, and the ALU 301 output ALUOUT 305, and sends it to IAR 106. When BTA 1021 is selected, the contents of BTA 1021 are also sent to INC 104, and the contents of PC 103 are saved in PCS 108. PCB 1011 holds the contents of PC 104 and sends them to BPT 100 in time with the BPT 100 update. According to the condition satisfied signal 304 from COND 302 and to DISPM 202 and BCC 203 from DEC 200, BPSEL 101 stores the contents of PCB 1011 and the branch destination address output by ALU 301 (ALUOUT 305) in BIA 1001 and BTA 1002 of BPT 100, respectively.

FIG. 2 is a timing chart showing the basic operation, including branch prediction, of the computer described in this embodiment.

t0 to t8 represent the time sequence, and each time step corresponds to one machine cycle. An instruction is executed in the pipeline stages PC, IF, D, E, and W. In the PC stage, PC 104 is incremented to calculate the address of the next instruction to be read. In this embodiment the instruction length is fixed at 4 bytes. After the PC stage ends, that is, at the same timing as the IF stage, COMP 102 compares BIA 1001 with PC 104. If they match, IASEL 105 sends the contents of BTA 1002 to IAR, and execution then proceeds from the predicted address. If they do not match, the normally incremented PC 104 is used. In the IF stage, the instruction is read from IC 107 at the address determined in the PC stage. In the D stage, the fetched instruction is decoded and various controls are performed, such as directing EXEC 300 to carry out the operation. In this embodiment, branch prediction is performed only when the destination of the conditional branch instruction is a program-counter-relative address, that is, only when the branch destination address can be determined statically. The displacement minus signal DISPM 202, which indicates that the branch destination is backward, can therefore be obtained by examining a specific bit of the operand in the instruction and can be determined in the D stage. Likewise, whether an instruction is a conditional branch can be determined from a combination of specific bits of the opcode, so the conditional branch instruction determination signal BCC 203 can also be determined in the D stage.
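How DISPM and BCC might be derived in the D stage can be illustrated with a small decoder. The instruction format below is purely an assumption made for this sketch (the text does not give an encoding): a 32-bit word with a 6-bit opcode in bits 31 to 26 and a 16-bit signed PC-relative displacement in bits 15 to 0, and a made-up opcode value for the conditional branch.

```python
from typing import Tuple

# Hypothetical opcode value for "conditional branch"; not taken from the source.
BCC_OPCODE = 0b000100


def decode_signals(instr: int) -> Tuple[bool, bool]:
    """Return (BCC, DISPM) for a 32-bit instruction word in the assumed format."""
    opcode = (instr >> 26) & 0x3F
    bcc = opcode == BCC_OPCODE      # conditional branch instruction?
    dispm = bool(instr & 0x8000)    # sign bit of the displacement: backward branch
    return bcc, dispm
```

Both signals depend only on fixed bit positions of the instruction word, which is why they are available as soon as the instruction is decoded.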

BCC 203 can also be used to restrict the kinds of instructions for which branch prediction is performed. In the E stage, EXEC 300 executes the operation specified in the D stage. In the W stage, the result obtained in the E stage is written back to a register or the like.

FIG. 2 shows the case where the first instruction is a condition-generating instruction and the second is a conditional branch instruction. Let the address of the first instruction be a-4, and suppose it is calculated at time t0. At time t1, the first instruction is fetched and the address of the second instruction is calculated. At time t2, the first instruction is decoded, the second instruction is fetched, and the address of the third instruction is calculated. Thereafter, in the same way, each instruction flows through the pipeline stages unless control is disturbed by a branch or the like. In this case, assume that the address (a) of the second instruction did not match BIA 1001.

If the result (condition) of the first instruction, computed at time t3, satisfies the condition of the second instruction (the conditional branch) at time t4, control branches to the address specified by the second instruction. The branch destination address is calculated at time t4, so instructions are fetched from it starting at time t5. When the branch is found to be taken at time t4, the third and fourth instructions have already been fetched, so their execution is cancelled partway. From time t5 onward, instructions flow through the pipeline stages unless another branch or the like occurs. Thus, when branch prediction does not operate and the condition of a conditional branch instruction is satisfied, two cycles of idle time result.
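The two-cycle figure follows from the stage timing. A small sketch, assuming an ideal pipeline in which instruction k enters the PC stage at time t(k-1) and advances one stage per cycle (function and variable names are illustrative):

```python
STAGES = ["PC", "IF", "D", "E", "W"]


def stage_time(instr_index: int, stage: str) -> int:
    """Cycle in which instruction number instr_index (1-based) occupies a stage."""
    return (instr_index - 1) + STAGES.index(stage)


# The conditional branch is the second instruction; its condition is known in E.
branch_resolved = stage_time(2, "E")              # t4
target_fetch_unpredicted = branch_resolved + 1    # destination fetched at t5

# With a table hit, the match is detected during the branch's IF stage.
target_fetch_predicted = stage_time(2, "IF") + 1  # destination fetched at t3

idle_cycles = target_fetch_unpredicted - target_fetch_predicted
```

Under these assumptions `idle_cycles` evaluates to 2, matching the two cancelled instructions (the third and fourth) in the text.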

When the branch is found to be taken at time t4, whether its destination is backward of the instruction itself (b < a) is already known from the D stage of the second instruction, that is, from time t3. If the destination is backward, the address (a) of the second instruction and the branch destination address (b) are registered in BPT 100 at time t5. That is, when the condition satisfied signal 304 is output at time t4, BPSEL 101, according to BCC 203 and DISPM 202 output at the preceding timing (time t3), stores at time t5 the contents of PCB 1011 in BIA 1001 and ALUOUT 305 in BTA 1002.

FIG. 3 shows the case where branch prediction succeeds. As in FIG. 2, the first instruction is a condition-generating instruction and the second is a conditional branch instruction. Suppose that at time t2 the address of the second instruction matches the contents of BIA 1001. In this case, IASEL 105 sends the contents of BTA 1002 to IAR 106 as the next address to be read, and also sends them to INC 104 so that the address to be read after that can be calculated. In addition, the contents (a+4) of PC 103 are saved in PCS 108 for recovery if the prediction fails.

In this way, reading starts from the predicted branch destination address (b) at time t3 and from address b+4 at time t4. If execution of the second instruction at time t4 confirms that the condition is satisfied and the branch is taken, the prediction has succeeded, and the conditional branch is executed without wasted waiting time, so the performance of the computer improves.

FIG. 4 shows the case where branch prediction fails. As in FIG. 2, the first instruction is a condition-generating instruction and the second is a conditional branch instruction. As in FIG. 3, suppose that at time t2 the address of the second instruction matches the contents of BIA 1001.

At time t2, reading starts from the predicted address (b). At time t3, reading takes place from address b+4. If execution of the second instruction at time t4 shows that the condition is not satisfied and that the instruction at address a+4 should be executed, IASEL 105 sends to IAR 106, as the next address to be read, not the contents of PC 104 (b+8) but the saved contents of PCS 108 (a+4). The third and fourth instructions are cancelled. The contents of PCS 108 are also sent to INC 104. Thus, when branch prediction fails, two cycles of idle time result, just as when branch prediction does not operate. If branch prediction fails frequently, the wasted idle time increases and, in the worst case, the performance of the computer falls. In general, however, the loops on which branch prediction operates are executed many times and the prediction is more likely to succeed than to fail, so branch prediction improves computer performance on average.
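The four outcomes described for FIGS. 2 to 4 can be summarised as a selection function for the next fetch address. This is a behavioural sketch only: the function name is hypothetical, and the real IASEL 105 is driven by signals 204, 303, and 1021 rather than by two booleans.

```python
from typing import Tuple


def resolve_branch(predicted: bool, taken: bool,
                   pc: int, pcs: int, alu_out: int) -> Tuple[int, int]:
    """Return (next fetch address, idle cycles) once the condition is known.

    pc      -- current program counter (already following the fetched stream)
    pcs     -- saved fall-through address (a+4), valid only if predicted
    alu_out -- branch destination computed by the ALU
    """
    if predicted == taken:
        return pc, 0       # fetched stream is already correct
    if predicted:
        return pcs, 2      # predicted taken, not taken: restore a+4
    return alu_out, 2      # not predicted, taken: use computed destination
```

With a branch at a = 0x100 and destination b = 0x40, a failed prediction gives `resolve_branch(True, False, 0x48, 0x104, 0x40)`, which selects the saved address 0x104 and costs two cycles.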

FIG. 5 shows BPT 100 and BPSEL 101 of this embodiment in detail. The valid bit (V) 1003 in BPT 100 indicates whether the data in the branch prediction table is valid. V 1003 is cleared at reset by the reset signal 1004. In a computer that supports virtual memory and whose instruction addresses are logical addresses, it is also cleared on a process switch. In that case, however, V 1003 need not be cleared if BIA 1001 holds the process number as part of the address.

PCB 1011 in BPSEL 101 consists of latches 1011, 1012, and 1013. These latches pass data along once per cycle, so in the example of FIG. 2 the data entered into PCB 1011 at time t2 is output at time t5. Latches 1012 and 1013 hold the DISPM 202 and BCC 203 data, respectively.

In the example of FIG. 2, the signals latched at time t3 are ANDed at time t4 with the conditional branch taken signal 304 by AND gate 1014 to produce the branch prediction table registration signal (BPTS) 1017.

BPTS 1017 opens gates 1015 and 1016, which send the contents of PCB 1011 and ALUOUT 305, respectively, to BPT 100.

BPTS 1017 is also used as the set signal for V 1003, making the branch prediction table valid.
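The registration logic of FIG. 5 can be modelled as a single-entry table with a valid bit. The class below is a behavioural sketch with hypothetical names; the latch timing (t3 to t5) is collapsed into one `clock` call, so it shows the gating but not the cycle-by-cycle delay.

```python
from dataclasses import dataclass


@dataclass
class SingleEntryBPT:
    valid: bool = False   # V 1003
    bia: int = 0          # BIA 1001
    bta: int = 0          # BTA 1002

    def reset(self) -> None:
        """Reset (or process switch): invalidate the entry."""
        self.valid = False

    def clock(self, bcc: bool, dispm: bool, cond_satisfied: bool,
              pcb: int, alu_out: int) -> bool:
        """Apply one registration opportunity and return the BPTS value."""
        bpts = bcc and dispm and cond_satisfied   # AND gate 1014
        if bpts:
            # Gates 1015 and 1016 open; BPTS also sets the valid bit.
            self.bia, self.bta, self.valid = pcb, alu_out, True
        return bpts

    def match(self, pc: int) -> bool:
        """Comparator: true only when the entry is valid and addresses match."""
        return self.valid and pc == self.bia
```

A forward branch (`dispm` false) leaves the entry untouched even when its condition is satisfied, which is the filtering effect the invention relies on.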

FIG. 6 shows another embodiment of the branch prediction table registration selector (BPSEL). This embodiment covers the case where the destination of a conditional branch instruction is not limited to program-counter-relative addresses but can still be determined statically, for example an absolute address, so that the displacement minus signal (DISPM) is not available in the D stage. The output of PCB 1011 and the contents of ALUOUT 305 are subtracted by a subtractor (SUB) 1018 to determine which is larger.

If the contents of ALUOUT 305 are found to be smaller, the displacement minus signal 1019 is output. Operation is otherwise the same as in the preceding embodiment.

FIG. 7 shows another embodiment of the branch prediction table (BPT) and the program counter comparator (COMP). In this embodiment the branch prediction table has four entries. The four-entry table is divided into two sets of two entries each, distinguished by part of the branch instruction address (PCB). Which set an address belongs to is determined by the set determiner (SET1) 400 using part of the address. Which entry within a set is to be written is determined by the replacement priority determiner (REPR) 500. REPR 500 uses the replacement priority bits (RP) of the entries in the set to identify the entry that was registered earlier, selects that entry as the one to be written, and rewrites the replacement priority bits so as to invert the previous priority order. SET1 400 and REPR 500 together determine a single entry to be written.

When reading from the branch prediction table, the set determiner (SET2) 600 selects a set for the address (PC) to be compared, and two comparators compare it with the two entries of that set simultaneously. Increasing the number of entries in the branch prediction table allows more instructions to be registered, which increases the effect of branch prediction and further improves computer performance.
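The four-entry organisation of FIG. 7 behaves like a two-way set-associative table with oldest-first replacement. The sketch below makes several assumptions not stated in the text: the set is chosen by address bit 2 (the lowest bit above the 4-byte instruction alignment), the replacement priority is kept as one victim index per set rather than one RP bit per entry, and re-registering an address already present updates it in place.

```python
from typing import List, Optional, Tuple

Entry = Optional[Tuple[int, int]]  # (BIA, BTA) or None when empty


class TwoWayBPT:
    def __init__(self) -> None:
        self.sets: List[List[Entry]] = [[None, None], [None, None]]
        self.victim: List[int] = [0, 0]   # way to overwrite next, per set

    @staticmethod
    def set_index(addr: int) -> int:
        """Assumed set selection: one address bit above the word offset."""
        return (addr >> 2) & 1

    def register(self, bia: int, bta: int) -> None:
        s = self.set_index(bia)
        ways = self.sets[s]
        for w in range(2):
            entry = ways[w]
            if entry is not None and entry[0] == bia:
                ways[w] = (bia, bta)      # assumption: update in place
                self.victim[s] = 1 - w
                return
        w = self.victim[s]
        ways[w] = (bia, bta)              # overwrite the older entry
        self.victim[s] = 1 - w            # invert the priority order

    def lookup(self, pc: int) -> Optional[int]:
        """Compare both ways of the selected set; return BTA on a match."""
        for entry in self.sets[self.set_index(pc)]:
            if entry is not None and entry[0] == pc:
                return entry[1]
        return None
```

Registering three branches that map to the same set evicts the one registered first, while the other set is unaffected.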

FIG. 8 shows an embodiment of the branch prediction table (BPT) and program counter comparator (COMP') different from that of FIG. 7. Which set an address belongs to is determined by the set determiner (SET3) 700 according to whether the branch destination is forward or backward, using the displacement minus signal (DISPM). When reading from the branch prediction table, the address (PC) to be compared is compared by four comparators simultaneously.

FIGS. 9 to 11 are an example program and flowcharts illustrating how to use embodiments of the present invention effectively. For the program of FIG. 9, the instruction expansion of FIG. 10, in which the condition is tested only at the start of the loop and the loop ends with an unconditional branch back to its head, gives branch prediction nothing to act on, so the effect of the invention does not appear. In contrast, with the expansion of FIG. 11, in which the condition is tested not only at the start of the loop but also at its end, branch prediction operates and the effect of the invention is obtained. Accordingly, in a computer system that uses the present invention, the compiler that translates a programming language into machine language should expand a program like that of FIG. 9 as in FIG. 11 rather than as in FIG. 10.
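The difference between the two expansions can be made concrete by listing which instructions meet the registration condition. The instruction layouts below are invented for illustration (the figures are not reproduced here): addresses, the `"bcc"`/`"jump"`/`"op"` labels, and the function name are all assumptions.

```python
from typing import List, Optional, Tuple

Instr = Tuple[int, str, Optional[int]]  # (address, kind, branch target)

# Test at the head only; the loop ends with an unconditional jump back.
HEAD_TEST_ONLY: List[Instr] = [
    (0x00, "bcc", 0x10),    # exit the loop when the condition fails
    (0x04, "op", None),     # loop body
    (0x08, "op", None),
    (0x0C, "jump", 0x00),   # unconditional branch to the loop head
]

# Test at the head and again at the bottom of the loop.
HEAD_AND_BOTTOM_TEST: List[Instr] = [
    (0x00, "bcc", 0x10),    # skip the loop when the condition fails initially
    (0x04, "op", None),     # loop body
    (0x08, "op", None),
    (0x0C, "bcc", 0x04),    # repeat while the condition holds (backward)
]


def registrable_branches(code: List[Instr]) -> List[int]:
    """Addresses of conditional branches whose destination is backward."""
    return [addr for addr, kind, target in code
            if kind == "bcc" and target is not None and target < addr]
```

In this sketch the first layout yields no registrable branch, while the second yields the bottom-of-loop branch at 0x0C, which is the one the table should hold.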

[Effect of the invention]

According to the present invention, branch prediction reduces the wasted idle time of conditional branch instructions and improves computer performance. In particular, even with a small branch prediction table, the conditional branch instructions that form loops can be selected and registered, so branch prediction is effective with little hardware.

According to the present invention, even when the branch prediction table has only one entry, for a program with a loop like that of FIG. 12 the first conditional branch (1201) is not registered in the table and only the second conditional branch (1202) is, so branch prediction is effective. Without the restriction that the branch destination be backward, both the first and second conditional branches would be registered in the table, so predictions would fail and branch prediction would have no effect.
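The one-entry case can be checked with a short simulation. It is a simplified model under stated assumptions: the loop contains one forward conditional branch that is always taken and one backward loop-closing branch, every wrong or missing prediction of a taken branch costs two cycles, and a failed prediction does not invalidate the entry. Addresses and the function name are hypothetical.

```python
def run_loop(iterations: int, backward_only: bool) -> int:
    """Return total idle cycles for a loop using a one-entry prediction table."""
    IF_BRANCH, IF_TARGET = 0x10, 0x20       # forward branch (first branch)
    LOOP_BRANCH, LOOP_TARGET = 0x30, 0x00   # backward branch (second branch)
    entry = None                            # (branch address, target) or None
    idle = 0
    for i in range(iterations):
        trace = [
            (IF_BRANCH, IF_TARGET, True),
            (LOOP_BRANCH, LOOP_TARGET, i < iterations - 1),
        ]
        for addr, target, taken in trace:
            predicted_taken = entry is not None and entry[0] == addr
            if predicted_taken != taken:
                idle += 2                   # two-cycle bubble
            if taken and (target < addr or not backward_only):
                entry = (addr, target)      # register in the single entry
    return idle
```

For ten iterations, `run_loop(10, backward_only=False)` gives 38 idle cycles because the two branches keep evicting each other, while `run_loop(10, backward_only=True)` gives 24, since the loop-closing branch stays in the table.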

[Brief explanation of drawings]

FIG. 1 is a block diagram of one embodiment of the present invention; FIGS. 2 to 4 are timing charts of the basic operation of the embodiment of FIG. 1; FIG. 5 is a detailed diagram of elements in FIG. 1; FIG. 6 is a detailed diagram of another embodiment of the same part as FIG. 5; FIG. 7 is a detailed diagram of another embodiment of elements in FIG. 1; FIG. 8 is a detailed diagram of another embodiment of the same part as FIG. 7; FIGS. 9 to 11 show a compiler expansion method for using the present invention effectively; and FIG. 12 is an explanatory diagram showing the effect of the present invention.

100: branch prediction table, 101: branch prediction table registration selector, 102: program counter comparator, 103: program counter, 105: instruction address selector, 200: instruction decode

[Drawings not reproduced; the FIG. 9 program is a while (a < b) loop.]

Claims

[Scope of Claims]

1. A branch prediction method for a pipeline computer that executes machine instructions in multiple stages, the computer having a branch prediction table that holds one or more pairs each consisting of the machine address of a conditional branch instruction and the branch destination machine address of that instruction, and, when a branch instruction appears, comparing its machine address with the instruction machine addresses in the branch prediction table and, if they match, starting instruction reads from the branch destination machine address in the branch prediction table, characterized in that the condition for updating the branch prediction table is that the condition of a conditional branch instruction is satisfied and its branch destination is backward of the instruction itself.

2. The branch prediction method of claim 1, wherein, if the machine address of the branch instruction does not match the instruction machine address in the branch prediction table, processing of the instruction following the conditional branch instruction is given priority over processing of the branch destination instruction.

3. The branch prediction method of claim 1, wherein, if the machine address of the branch instruction does not match the instruction machine address in the branch prediction table, processing of the branch destination instruction is given priority over processing of the instruction following the conditional branch instruction when the branch destination of the branch instruction is backward of the instruction itself, and processing of the instruction following the conditional branch instruction is given priority over processing of the branch destination instruction when the branch destination is not backward of the instruction itself.

4. The branch prediction method of claim 1, wherein, if the machine address of the branch instruction does not match the instruction machine address in the branch prediction table, processing of the branch destination instruction is given priority over processing of the instruction following the conditional branch instruction.

5. The branch prediction method of claim 1, wherein a combination of parts of the opcode and operand of an instruction is used as the update condition of the branch prediction table.

6. A branch prediction method for a computer that has one set of branch prediction table holding, as a pair, the machine address of a conditional branch instruction and the branch destination machine address of that instruction, and that, when a branch instruction appears, compares its machine address with the instruction machine address in the branch prediction table and, if they match, starts instruction reads from the branch destination machine address in the branch prediction table, characterized in that, when a program forming a single loop is executed, the loop contains two or more conditional branch instructions, two or more of their conditions are satisfied so that branches are taken, and only one of those conditional branch instructions has a branch destination backward of itself, the conditional branch instruction whose branch destination is backward of itself is registered in the branch prediction table.

7. The branch prediction method of claim 1, wherein a loop structure in a program is expanded so that the conditional branch instruction that tests the loop termination condition is placed at the end of the loop.