JP2000132391A

JP2000132391A - Branch prediction mechanism

Info

Publication number: JP2000132391A
Application number: JP10301841A
Authority: JP
Inventors: Sachiko Shimada; 幸子嶋田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1998-10-23
Filing date: 1998-10-23
Publication date: 2000-05-12

Abstract

PROBLEM TO BE SOLVED: To improve the prediction hit rate for conditional branch instructions, and thereby performance, by associating the multiple pieces of branch history information contained in one line of a branch history buffer with the addresses of conditional branch instructions. SOLUTION: A branch history buffer 804 that stores multiple pieces of branch history information in one line is provided. At instruction fetch (IF stage), one line containing multiple pieces of branch history information is read from the branch history buffer 804. When the fetched instructions contain multiple conditional branch instructions, each conditional branch instruction is predicted using the multiple pieces of branch history information read out. To associate the pieces of branch history information in one line of the branch history buffer 804 with the addresses of the conditional branch instructions, each entry in a line of the branch history buffer 804 is made to correspond to an entry in a line of an instruction cache 802.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001] Field of the Invention: The present invention relates to an information processing apparatus, and more particularly to a microprocessor having a branch prediction mechanism.

[0002] Description of the Related Art: Current microprocessors achieve higher speed by adopting pipelining, which performs processing in parallel inside the microprocessor. FIG. 8 illustrates pipeline processing. Referring to FIG. 8, each instruction in a program completes through a series of steps: instruction fetch (IF), instruction decode (ID), execution (EX), data read (MEM), and write (WB). Each step takes one cycle, and each such unit of processing is called a "pipeline stage".

[0003] In cycle T1, instruction 1 is processed in the IF stage.

[0004] In cycle T2, instruction 1 moves to the ID stage, and instruction 2 is processed in the IF stage.

[0005] In cycle T3, instruction 1 is processed in the EX stage, instruction 2 in the ID stage, and instruction 3 in the IF stage. By cycle T5, in which the processing of instruction 1 finishes, the processing of instruction 5 has already begun.

[0006] In this way, a pipelined microprocessor processes one instruction in several steps and can execute multiple instructions in parallel by overlapping them, without adding resources.
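The overlap described above can be tabulated with a short sketch. The function name `stage_at` and the assumption of an ideal pipeline with no stalls are illustrative and do not come from the source.

```python
from typing import Optional

# Stage order as named in the description of FIG. 8.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stage_at(instruction: int, cycle: int) -> Optional[str]:
    """Return the stage occupied by instruction n (1-based) in cycle Tn (1-based).

    Assumption: an ideal pipeline with no stalls, so instruction n
    enters the IF stage in cycle Tn and advances one stage per cycle.
    """
    index = cycle - instruction
    if 0 <= index < len(STAGES):
        return STAGES[index]
    return None  # not yet fetched, or already completed


if __name__ == "__main__":
    for cycle in range(1, 6):
        row = {f"instr{n}": stage_at(n, cycle) for n in range(1, 6)}
        print(f"T{cycle}", row)
```

Under these assumptions, instruction 1 reaches the WB stage in cycle T5, the same cycle in which instruction 5 enters the IF stage.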

[0007] FIG. 9 is a block diagram showing an example of the configuration of a pipelined microprocessor. Referring to FIG. 9, this pipelined microprocessor comprises a PC (program counter) register 201, an instruction cache unit 202, an instruction decode/register unit 203, an instruction execution unit 204, a data cache unit 205, an address adder 206, and pipeline registers 207, 208, 209, and 210 that separate the pipeline stages.

[0008] An instruction is processed as follows.

[0009] PC register 201 holds the address of an instruction within the program. Instruction cache unit 202 reads the instruction specified by PC register 201 (instruction fetch) and stores it in pipeline register 207.

[0010] Instruction decode/register unit 203 decodes the fetched instruction, reads the register values, and stores them in pipeline register 208.

[0011] Instruction execution unit 204 performs the operation of the decoded instruction and stores the result in pipeline register 209. The result is also stored temporarily in pipeline register 210 via data line 211 so that it can be written to the register file.

[0012] Data cache unit 205 writes operation results to the data cache and reads data from it. Data read from the data cache is stored in pipeline register 210 and written via data line 212 to a register in instruction decode/register unit 203.

[0013] Address adder 206 calculates the address of the next instruction to be executed, and the result is stored in PC register 201 via address line 213.

[0014] Pipeline registers 207, 208, 209, and 210 hold the data used by their respective stages. This separates the pipeline stages from one another and allows processing to proceed in parallel, stage by stage.

[0015] In normal program flow, the next instruction to be executed is the next instruction in program order. Address adder 206 can therefore calculate the address of that instruction in the same cycle as the instruction fetch, and the new instruction address can be stored in PC register 201 in the next cycle.

[0016] A branch instruction, however, changes the program flow to the instruction at the address specified in the branch instruction. The address to which the branch redirects the program flow is normally determined in the instruction execution unit of the ID stage, and is therefore delayed by three cycles. Because the instruction at the specified address must then be fetched anew from the instruction cache, the pipeline stalls.

[0017] A pipelined processor gains performance by keeping the pipeline full and executing instructions in parallel. In recent microprocessors with more pipeline stages in particular, a branch instruction prevents the instructions that follow it from being overlapped, and the branch penalty grows accordingly.
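The cost of such stalls can be estimated with a rough cycle count. The sketch below is a simplified model and not a formula from the source: it assumes every counted branch costs the full penalty, uses the three-cycle delay mentioned above as the default, and the name `total_cycles` is hypothetical.

```python
def total_cycles(n_instructions: int, n_stalling_branches: int,
                 stages: int = 5, branch_penalty: int = 3) -> int:
    """Estimate the cycles needed to complete n_instructions.

    Assumptions (illustrative only):
    - one instruction enters the pipeline per cycle when there is no stall;
    - the first instruction needs `stages` cycles, each later one adds 1;
    - each stalling branch adds `branch_penalty` idle cycles.
    """
    if n_instructions <= 0:
        return 0
    fill = stages - 1  # cycles before the first instruction completes
    return fill + n_instructions + n_stalling_branches * branch_penalty


if __name__ == "__main__":
    print(total_cycles(10, 0))  # no branches
    print(total_cycles(10, 2))  # two stalling branches
```

In this model a deeper pipeline usually implies a larger `branch_penalty`, which is why the penalty matters more as the number of stages grows.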

[0018] To reduce the branch penalty, the address specified by a branch instruction is therefore calculated at an early stage of the pipeline, and the instruction at that address is fetched.

[0019] FIG. 10 is a block diagram showing an example of the configuration of a microprocessor that reduces the branch penalty by calculating the address specified by a branch instruction at an early stage of the pipeline and fetching the instruction at that address. Referring to FIG. 10, this conventional processor comprises a PC register 301, an instruction cache unit 302, an instruction decode/register unit 303, an instruction execution unit 304, a data cache unit 305, an address adder 306, pipeline registers 307, 308, 309, and 310, and a branch prediction unit 312.

[0020] Instruction cache unit 302 reads the instruction specified by PC register 301 and stores it in pipeline register 307.

[0021] Instruction decode/register unit 303 decodes the instruction. If the instruction stored in pipeline register 307 is a branch instruction, branch prediction unit 312 calculates the address specified by the branch instruction and stores it in PC register 301 via address line 313, and a new instruction is fetched.

[0022] Because branch prediction unit 312 calculates the address specified by the branch instruction, the next instruction can be fetched at an early stage of the pipeline, which reduces the pipeline stall penalty caused by the branch.

[0023] Branch instructions include unconditional branch instructions, in which the branch is always taken, and conditional branch instructions. For a conditional branch instruction, whether the branch is taken or not taken depends on a condition set by the execution of a preceding instruction, for example branching when the zero flag is set.

[0024] When a conditional branch instruction is executed, the program flow is not determined until instruction execution unit 304 resolves the condition. The microprocessor shown in FIG. 10 therefore cannot reduce the pipeline stall penalty caused by conditional branch instructions.

[0025] FIG. 11 is a block diagram showing the configuration of a conventional processor that aims to reduce the pipeline stall penalty of conditional branch instructions. Referring to FIG. 11, this microprocessor comprises a PC register 401, an instruction cache unit 402, an instruction decode/register unit 403, an instruction execution unit 404, a data cache unit 405, an address adder 406, pipeline registers 407, 408, 409, and 410, a branch prediction unit 412, and a branch history buffer 414.

[0026] Branch history buffer 414 is a buffer that records whether the condition of a conditional branch instruction was satisfied or not.

[0027] Instruction cache unit 402 reads the instruction specified by PC register 401 and stores it in pipeline register 407. At the same time, the branch history information for the address specified by PC register 401 is read from branch history buffer 414 and stored in pipeline register 407.

[0028] Instruction decode/register unit 403 decodes the fetched instruction. If the instruction stored in pipeline register 407 is a conditional branch instruction, whether the branch will be taken is predicted from the branch history information stored in pipeline register 407.

[0029] If the branch is predicted taken, branch prediction unit 412 calculates the address specified in the instruction and stores it in PC register 401 via address line 413, and a new instruction is fetched.

[0030] If the branch is predicted not taken, the already fetched next instruction in program order is transferred to instruction decode/register unit 403.

[0031] The branch condition is resolved in instruction execution unit 404. If the branch prediction was correct, the instructions fetched according to the prediction are executed.

[0032] If the branch prediction was wrong, the correct instruction address is stored in PC register 401 via address line 415, and the instructions on the correct program flow are fetched again.

[0033] After the condition is resolved, the branch history information is updated and written to branch history buffer 414 via signal line 416.
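The predict, verify, and update sequence of this conventional design can be sketched as follows. The text does not specify how the history is encoded or indexed, so this sketch assumes a single last-outcome bit per entry and a direct-mapped table indexed by the low-order bits of the PC. The class and function names are hypothetical.

```python
from typing import Tuple


class BranchHistoryBuffer:
    """One history bit per entry (assumed encoding: last outcome)."""

    def __init__(self, n_entries: int = 16) -> None:
        self.n_entries = n_entries
        self.taken = [False] * n_entries  # assumed reset state: not taken

    def _index(self, pc: int) -> int:
        return pc % self.n_entries  # assumed direct-mapped indexing

    def read(self, pc: int) -> bool:
        return self.taken[self._index(pc)]

    def write(self, pc: int, outcome: bool) -> None:
        self.taken[self._index(pc)] = outcome


def execute_conditional_branch(bhb: BranchHistoryBuffer, pc: int, target: int,
                               actual_taken: bool,
                               instr_size: int = 4) -> Tuple[int, bool]:
    """Return (correct next PC, whether the prediction missed)."""
    predicted_taken = bhb.read(pc)         # history read alongside the fetch
    mispredicted = predicted_taken != actual_taken  # verified at execution
    correct_pc = target if actual_taken else pc + instr_size
    bhb.write(pc, actual_taken)            # history updated after resolution
    return correct_pc, mispredicted


if __name__ == "__main__":
    bhb = BranchHistoryBuffer()
    print(execute_conditional_branch(bhb, 0x40, 0x80, True))
    print(execute_conditional_branch(bhb, 0x40, 0x80, True))
```

With a single history entry read per fetch address, this scheme yields one prediction per fetch.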

[0034] FIG. 12 shows an example of the configuration of a conventional information processing apparatus in which multiple microprocessors are connected in parallel. For simplicity FIG. 12 shows two microprocessors, but the number is not limited to two.

[0035] Referring to FIG. 12, the apparatus consists of two processors 502a and 502b connected to an external bus 501. The processors respectively comprise PC registers 503a and 503b, instruction cache units 504a and 504b, address adders 505a and 505b, branch history buffers 506a and 506b, instruction decode units 507a and 507b, branch prediction units 508a and 508b, instruction execution units 509a and 509b, and data cache units 510a and 510b.

[0036] Each of processors 502a and 502b operates in the same way as the conventional processor shown in FIG. 11.

[0037] Conditional branch instructions processed in processors 502a and 502b are predicted by branch prediction units 508a and 508b from the information in branch history buffers 506a and 506b, which each processor holds individually. After the branch condition is resolved, the history information is updated in instruction execution units 509a and 509b and written to the respective branch history buffers 506a and 506b.

[0038] This branch history information is held within each of processors 502a and 502b and is not shared between them.

[0039] On-chip multiprocessors are also being studied as information processing apparatuses in which multiple processors are connected in parallel. For the on-chip multiprocessor, a technology that integrates multiple processors on one chip, see for example the literature (Transactions of the Institute of Electronics, Information and Communication Engineers, D-I, Vol. J81-D-I, No. 6, May 1998, pp. 718-727).

[0040] FIG. 13 shows one of the configurations discussed in the above literature. Referring to FIG. 13, this on-chip multiprocessor comprises a pipeline register 601, an instruction cache unit 602, and two processors 603a and 603b.

[0041] The two processors 603a and 603b respectively comprise PC registers 604a and 604b, instruction decode/register units 607a and 607b, branch prediction units 608a and 608b, address adders 606a and 606b, instruction execution units 610a and 610b, data cache units 612a and 612b, and pipeline registers 605a, 605b, 609a, 609b, 611a, 611b, 613a, and 613b.

[0042] Pipeline register 601 and instruction cache unit 602 are used as resources shared by the two processors 603a and 603b.

[0043] Pipeline register 601 selects and stores one of the addresses transferred from processors 603a and 603b.

[0044] In the IF stage, an instruction sequence is read from instruction cache 602 at the address stored in pipeline register 601.

[0045] The instruction sequence read out is stored in pipeline register 605a or 605b of whichever processor, 603a or 603b, requested it. At the same time, the contents of pipeline register 601 are stored in PC register 604a or 604b of the requesting processor.

[0046] Pipeline register 601 specifies only one address, so instruction cache 602 can be accessed only once per cycle. In a cycle in which one processor is accessing the instruction cache, the other processor cannot access it.

[0047] To supply enough instructions to the two processors 603a and 603b, two cycles' worth of instructions must therefore be read from the instruction cache in one cycle and stored in pipeline register 605a or 605b.

[0048] In the ID stage, the instruction sequences in pipeline registers 605a and 605b are decoded by instruction decode/register units 607a and 607b. If any of these instructions is a branch instruction, branch prediction units 608a and 608b perform branch prediction.

[0049] These branch predictions are performed within the respective processors 603a and 603b, and the branch history information is not shared between the processors.

[0050] Meanwhile, parallelism within a single processor has also been increasing through superscalar technology, in which a single processor processes multiple instructions simultaneously in multiple execution units. Because more instructions are processed per step, more instructions are fetched from the instruction cache per cycle.

[0051] FIG. 14 shows an example of the configuration of a conventional superscalar processor. Referring to FIG. 14, the processor comprises a PC register 701, an instruction cache 702, an address adder 703, a branch history buffer 704, an instruction decode/register unit 706, a branch prediction unit 707, an instruction execution unit 709, and a data cache unit 712.

[0052] Instruction execution unit 709 includes multiple operation units 709a and 709b, which allows multiple instructions to be executed in parallel.

[0053] The multiple instructions specified by PC register 701 are read from instruction cache 702 and stored in pipeline register 705. These instructions are decoded by instruction decode/register unit 706, then assigned to the operation units within instruction execution unit 709 and executed.

[0054] Conditional branch instructions are predicted by branch prediction unit 707. At instruction fetch, the single piece of branch history information corresponding to PC register 701 is read from branch history buffer 704 and stored in pipeline register 705.

[0055] When the multiple instructions stored in pipeline register 705 include more than one conditional branch instruction, not all of them can be predicted; only one conditional branch instruction is predicted.

[0056] Problems to be Solved by the Invention: In an information processing apparatus with increased parallelism, multiple conditional branch instructions need to be processed in one cycle. A conventional information processing apparatus, however, cannot predict multiple conditional branch instructions; when the multiple instructions stored in the pipeline register include more than one conditional branch instruction, only one of them is predicted. In addition, raising the prediction hit rate for conditional branch instructions is essential to improving the performance of an information processing apparatus.

[0057] Moreover, in the conventional information processing apparatus each of the multiple processors has its own branch history buffer, which increases the amount of hardware. If the amount of branch history information used by each processor is increased to improve the prediction hit rate, the amount of hardware grows further.

[0058] The present invention has been made in view of the above problems. Its object is to provide an information processing apparatus that fetches multiple instructions from an instruction cache in one cycle and that can perform branch prediction for the multiple conditional branch instructions contained in the fetched instructions.

[0059] Another object of the present invention is to provide an information processing apparatus that improves the prediction hit rate for conditional branch instructions and thereby improves performance.

[0060] Means for Solving the Problems: To achieve the above object, the information processing apparatus of the present invention has a branch history buffer configured to store multiple pieces of branch history information in one line. At instruction fetch, one line containing multiple pieces of branch history information is read from the branch history buffer, and when the fetched instructions include multiple conditional branch instructions, each conditional branch instruction is predicted using the multiple pieces of branch history information read from the branch history buffer. In the present invention, the multiple pieces of branch history information contained in one line of the branch history buffer are associated with the addresses of the conditional branch instructions.

[0061] The present invention is also characterized in that, in an information processing apparatus including multiple processors, a branch history buffer that stores the taken/not-taken history recorded when branch instructions were executed, together with the instruction address used when the branch is taken, is shared among the multiple processors.

[0062] Embodiments of the Invention: Embodiments of the present invention will now be described. In a preferred embodiment, referring to FIG. 1, the information processing apparatus of the present invention includes a branch history buffer (804) that stores multiple pieces of branch history information in one line. At instruction fetch (IF stage), one line containing multiple pieces of branch history information is read from the branch history buffer, and when the fetched instructions include multiple conditional branch instructions, each conditional branch instruction is predicted using the multiple pieces of branch history information read from the branch history buffer. The information stored in the branch history buffer takes at least one of the following forms: (a) branch history information recording whether the branch was taken or not taken when the branch instruction was executed; (b) the branch history information and the address used when the branch is taken; (c) the branch history information, the address used when the branch is taken, and the instruction at the branch target.

[0063] In the embodiment of the present invention, the multiple pieces of branch history information contained in one line of the branch history buffer are associated with the addresses of the conditional branch instructions.

[0064] As a first means of associating the multiple pieces of branch history information with the addresses of the conditional branch instructions, referring to FIG. 2, the entries in a line (902) of the branch history buffer (804) are made to correspond to the entries in a line (901) of the instruction cache (802). The first instruction in a line of the instruction cache (802) is associated with the first piece of branch history information in a line of the branch history buffer (804), the second instruction with the second piece of history information, and so on. This makes it possible to predict multiple conditional branch instructions, each using its own corresponding branch history information.
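The positional correspondence described above can be sketched as follows. The line width of four entries, the hypothetical mnemonic `"bcond"` for a conditional branch, and the use of a single boolean as the history are assumptions made for illustration; the source does not fix any of them.

```python
from typing import List, Optional

LINE_ENTRIES = 4  # assumed number of instructions per instruction cache line


def predict_line(instructions: List[str],
                 histories: List[bool]) -> List[Optional[bool]]:
    """Predict every conditional branch in one fetched line.

    instructions: opcodes of one instruction cache line, in order.
    histories:    one branch history buffer line; entry i belongs to
                  instruction i of the same line (positional mapping).
    Returns a taken/not-taken prediction per slot, or None for slots
    that do not hold a conditional branch.
    """
    if len(instructions) != LINE_ENTRIES or len(histories) != LINE_ENTRIES:
        raise ValueError("both lines must have LINE_ENTRIES entries")
    return [hist if opcode == "bcond" else None
            for opcode, hist in zip(instructions, histories)]


if __name__ == "__main__":
    line = ["add", "bcond", "sub", "bcond"]
    history_line = [False, True, False, False]
    print(predict_line(line, history_line))
```

Because the mapping is purely positional, no address comparison is needed to pair a branch with its history in this sketch.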

[0065] As a second means of association, referring to FIG. 7, each piece of branch history information in the branch history buffers (1402a, 1402b) may be provided with a tag (1403a, 1403b) holding part of the address of the branch history information, and the tag may be compared with the address of the conditional branch instruction by a comparator.

[0066] According to the embodiment of the present invention, the history information of a conditional branch instruction can be used for prediction without being confused with the history information of a different instruction, which makes it possible to improve the prediction hit rate.

[0067] In more detail, in a preferred embodiment and referring to FIG. 1, the information processing apparatus of the present invention includes an instruction cache unit (802) that can read multiple instructions from the instruction cache simultaneously in the same cycle, and a branch history buffer (804) that stores multiple pieces of branch history information per line. In the IF stage, the instruction cache unit (802) accesses the instruction cache line at the address specified by the PC register (801), reads the instructions of multiple entries from that line, and stores them in the instruction buffer (805a). The line of the branch history buffer (804) at the address specified by the PC register (801) is also accessed, and multiple entries of branch history information, for the instructions corresponding to the entries of the instruction cache (802), are read from that line and stored in the history buffer (805c). In the ID stage, the instruction decode/register unit (806) decodes the instructions stored in the instruction buffer (805a), determines whether each is a conditional branch instruction, and sends the result to the branch prediction unit (807). When an instruction stored in the instruction buffer is a conditional branch instruction, the branch prediction unit (807) predicts whether the branch will be taken from the branch history information stored in the history buffer (805c). If the branch is predicted taken, the branch target address is calculated and transferred to the PC register (801). The result of the branch prediction is also sent to the instruction execution unit (809). In the EX stage, the instruction execution unit (809), which performs the operation of each instruction and verifies conditional branches, acts as follows. If the prediction made by the branch prediction unit (807) for the conditional branch instruction was correct, it updates the branch history information and writes the updated history information to the branch history buffer (804). If the prediction was wrong, it calculates the address needed to fetch the correct instruction and writes it to the PC register (801), and it also updates the branch history information and writes the updated history information to the branch history buffer (804).
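The line read at fetch time and the history write-back at execution time can be sketched with a small model of a multi-entry branch history buffer. The geometry (8 lines, 4 entries per line, 4-byte instructions), the one-bit history, and the way the PC is split into line and slot are assumptions for illustration. The class name `LineBranchHistoryBuffer` is hypothetical.

```python
from typing import List, Tuple


class LineBranchHistoryBuffer:
    """Branch history buffer holding several history entries per line."""

    def __init__(self, n_lines: int = 8, entries_per_line: int = 4,
                 instr_size: int = 4) -> None:
        self.n_lines = n_lines
        self.entries_per_line = entries_per_line
        self.instr_size = instr_size
        # Assumed encoding: one last-outcome bit per entry, reset to not taken.
        self.lines = [[False] * entries_per_line for _ in range(n_lines)]

    def _locate(self, pc: int) -> Tuple[int, int]:
        """Split a PC into (line index, slot within the line)."""
        word = pc // self.instr_size
        line = (word // self.entries_per_line) % self.n_lines
        slot = word % self.entries_per_line
        return line, slot

    def read_line(self, pc: int) -> List[bool]:
        """Fetch time: read the whole history line selected by the PC."""
        line, _ = self._locate(pc)
        return list(self.lines[line])

    def update(self, pc: int, taken: bool) -> None:
        """Execution time: write back the resolved outcome of the branch at pc."""
        line, slot = self._locate(pc)
        self.lines[line][slot] = taken


if __name__ == "__main__":
    buf = LineBranchHistoryBuffer()
    buf.update(0x104, True)
    buf.update(0x10C, True)
    print(buf.read_line(0x100))
```

In this sketch the history is written back whether or not the prediction was correct, matching the two cases described above.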

【００６８】このように、本発明の実施においては、同
一サイクル内で複数の分岐履歴情報を参照し、それぞれ
の分岐命令の予測を、同一サイクル内に行うことができ
るAs described above, in the embodiment of the present invention, prediction of each branch instruction can be performed in the same cycle by referring to a plurality of pieces of branch history information in the same cycle.

【００６９】また図７を参照すると、複数ブロックから
なる分岐履歴バッファ（１４０２ａ、１４０２ｂ）に、
分岐履歴情報のアドレスの一部を持つタグ（１４０３
ａ、１４０３ｂ）を備え、ＩＦステージにおいて、分岐
履歴バッファ（１４０２ａ、１４０２ｂ）からＰＣレジ
スタ（１４０１）で指定されたエントリの履歴情報とタ
グアドレスを読み出してヒストリバッファ（１４０６
ａ、１４０６ｂ）に格納し、ＩＤステージにおいて、そ
のタグアドレスと、バッファ（１４０５）の条件付分岐
命令のアドレスと、をコンパレータ（１４０７）で比較
し、一致したブロックの分岐履歴情報をマルチプレクサ
で（１４０８）で選択し、分岐予測ユニットにおける条
件付分岐命令の分岐予測に用いる。Referring to FIG. 7, a branch history buffer (1402a, 1402b) composed of a plurality of blocks is provided with tags (1403a, 1403b), each holding part of the address of the branch history information. In the IF stage, the history information and tag address of the entry designated by the PC register (1401) are read from the branch history buffer (1402a, 1402b) and stored in the history buffers (1406a, 1406b). In the ID stage, a comparator (1407) compares the tag addresses with the address of the conditional branch instruction held in the buffer (1405), and a multiplexer (1408) selects the branch history information of the matching block, which is then used for branch prediction of the conditional branch instruction in the branch prediction unit.

【００７０】また本発明は、別の好ましい実施の形態と
して、複数のプロセッサを持つ情報処理装置において
は、図４を参照すると、分岐履歴バッファ（１１０３）
をプロセッサ間で共有する。同一の条件付分岐命令を複
数のプロセッサ（１１０４ａ、１１０４ｂ）で実行した
場合、複数のプロセッサ間で分岐履歴情報を共有するこ
とが可能となり、予測のヒット率を向上することができ
る。According to another preferred embodiment of the present invention, in an information processing apparatus having a plurality of processors, referring to FIG. 4, the branch history buffer (1103) is shared between the processors. When the same conditional branch instruction is executed by a plurality of processors (1104a, 1104b), branch history information can be shared among them, and the prediction hit rate can be improved.

【００７１】[0071]

【実施例】本発明の実施例について図面を参照して説明
する。Embodiments of the present invention will be described with reference to the drawings.

【００７２】［実施例１］図１は、本発明の一実施例を
なす情報処理装置の構成を示すブロック図である。図１
を参照すると、この情報処理装置は、１サイクルで２命
令の処理を行うことができるスーパースカラ方式のマイ
クロプロセッサであり、アドレスレジスタ８０１、命令
キャッシュユニット８０２、アドレス加算器８０３、分
岐履歴バッファ８０４、命令デコード／レジスタユニッ
ト８０６、分岐予測ユニット８０７、命令実行ユニット
８０９、データキャッシュユニット８１１、及びパイプ
ラインレジスタ８０５ａ、８０５ｂ、８０５ｃ、８０
８、８１０、８１２、を備えて構成されている。[Embodiment 1] FIG. 1 is a block diagram showing the configuration of an information processing apparatus according to an embodiment of the present invention. Referring to FIG. 1, this information processing apparatus is a superscalar microprocessor capable of processing two instructions in one cycle, and includes an address register 801, an instruction cache unit 802, an address adder 803, a branch history buffer 804, an instruction decode/register unit 806, a branch prediction unit 807, an instruction execution unit 809, a data cache unit 811, and pipeline registers 805a, 805b, 805c, 808, 810, and 812.

【００７３】図２は、図１に示した命令キャッシュユニ
ット８０２、分岐履歴バッファ８０４、パイプラインレ
ジスタ８０５ａ、８０５ｃについてその構成を示すブロ
ック図である。FIG. 2 is a block diagram showing the configuration of instruction cache unit 802, branch history buffer 804, and pipeline registers 805a and 805c shown in FIG.

【００７４】図２を参照すると、パイプラインレジスタ
８０５ａは、命令キャッシュユニット８０２から一度に
読み出した２命令を格納する命令バッファであり、パイ
プラインレジスタ８０５ｃは、分岐履歴バッファ８０４
から読み出した分岐履歴情報を２エントリ格納するヒス
トリバッファである。Referring to FIG. 2, the pipeline register 805a is an instruction buffer that stores two instructions read at a time from the instruction cache unit 802, and the pipeline register 805c is a history buffer that stores two entries of branch history information read from the branch history buffer 804.

【００７５】なお、図１には、左端にパイプラインレジ
スタで区切られるパイプラインステージ名を示してい
る。FIG. 1 shows pipeline stage names separated by pipeline registers at the left end.

【００７６】図１及び図２を参照して、本発明の第１の
実施例の動作について説明する。ＰＣレジスタ８０１
は、命令キャッシュユニット８０２、分岐予測バッファ
８０４にアクセスするためのアドレスを格納する。この
アドレスは、アドレス加算器８０３、分岐予測ユニット
８０７、命令実行ユニット８０９からそれぞれアドレス
線８１３、８１４、８１５を通して転送され、マルチプ
レクサにより、１つが選択される。The operation of the first embodiment of the present invention will be described with reference to FIGS. 1 and 2. The PC register 801 stores the address used to access the instruction cache unit 802 and the branch history buffer 804. Candidate addresses are transferred from the address adder 803, the branch prediction unit 807, and the instruction execution unit 809 through address lines 813, 814, and 815, respectively, and a multiplexer selects one of them.

【００７７】選択されたアドレスがＰＣレジスタ８０１
に格納される。アドレスの選択は、命令実行ユニット８
０９から転送されるアドレスが最優先であり、分岐予測
ユニット８０７から転送されるアドレス、アドレス加算
器８０３の結果の順で決定される。The selected address is stored in the PC register 801. In the selection, the address transferred from the instruction execution unit 809 has the highest priority, followed by the address transferred from the branch prediction unit 807, and then the result of the address adder 803.
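The priority rule for loading the PC register can be sketched as a small selection function. This is only an illustration: the function name `select_next_pc` and the use of `None` to mean "no address offered this cycle" are assumptions made here, not details given in the text.

```python
from typing import Optional


def select_next_pc(ex_addr: Optional[int],
                   bp_addr: Optional[int],
                   adder_addr: int) -> int:
    """Pick the next PC value using the priority order described above.

    ex_addr    -- corrected address from the instruction execution unit (line 815)
    bp_addr    -- predicted target from the branch prediction unit (line 814)
    adder_addr -- sequential address from the address adder (line 813)
    None means that source did not offer an address (hypothetical encoding).
    """
    if ex_addr is not None:      # highest priority: misprediction recovery
        return ex_addr
    if bp_addr is not None:      # next: predicted-taken branch target
        return bp_addr
    return adder_addr            # default: sequential fetch
```

For instance, when the prediction unit offers 504 while the adder offers 408, the multiplexer modeled here passes 504 to the PC register.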

【００７８】ＩＦステージでは、命令キャッシュユニッ
ト８０２、アドレス加算器８０３、分岐履歴バッファ８
０４での処理が行われる。In the IF stage, processing is performed in the instruction cache unit 802, the address adder 803, and the branch history buffer 804.

【００７９】アドレス加算器８０３は、次サイクルで命
令キャッシュユニット８０２、分岐予測バッファ８０４
にアクセスするために、ＰＣレジスタ８０１に格納され
たアドレスの値に８を加算して、２命令先のアドレスの
値を得る。結果は、アドレス線８１３を通してＰＣレジ
スタ８０１に送られる。To access the instruction cache unit 802 and the branch history buffer 804 in the next cycle, the address adder 803 adds 8 to the address value stored in the PC register 801 to obtain the address two instructions ahead. The result is sent to the PC register 801 through the address line 813.

【００８０】命令キャッシュユニット８０２では、ＰＣ
レジスタ８０１で指定されたアドレスを先頭とする２命
令を読み出す。ＰＣレジスタ８０１で指定されたアドレ
スに従い命令キャッシュのライン９０１にアクセスす
る。該ラインから、例えば４００番地の命令９０１ｅ
と、４０４番地の命令９０１ｆの２命令を読み出す。The instruction cache unit 802 reads two instructions starting from the address specified by the PC register 801. Line 901 of the instruction cache is accessed according to the address specified by the PC register 801, and two instructions are read from that line, for example the instruction 901e at address 400 and the instruction 901f at address 404.

【００８１】読み出した２命令は、命令バッファ８０５
ａに格納される。４００番地の命令が命令バッファ８０
５ａ１に、４０４番地の命令が命令バッファ８０５ａ２
に格納される。The two instructions read are stored in the instruction buffer 805a: the instruction at address 400 in the instruction buffer 805a1, and the instruction at address 404 in the instruction buffer 805a2.

【００８２】分岐履歴バッファ８０４では、ＰＣレジス
タ８０１で指定されたアドレスを先頭とする２エントリ
の履歴情報を読み出す。ＰＣレジスタ８０１で指定され
たアドレスに従い分岐予測バッファ８０４のライン９０
２にアクセスする。該ラインの中から、４００番地の命
令の分岐履歴情報９０２ｅと４０４番地の命令の分岐履
歴情報９０２ｆを読み出す。The branch history buffer 804 reads two entries of history information starting from the address specified by the PC register 801. Line 902 of the branch history buffer 804 is accessed according to the address specified by the PC register 801, and the branch history information 902e for the instruction at address 400 and the branch history information 902f for the instruction at address 404 are read from that line.

【００８３】このように、命令キャッシュのエントリと
分岐履歴バッファ８０４のエントリは常に対応付けられ
ている。As described above, the entry in the instruction cache and the entry in the branch history buffer 804 are always associated with each other.
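The one-to-one correspondence between instruction cache entries and branch history entries amounts to indexing both arrays with the same line and entry numbers derived from the PC. The sketch below assumes 4-byte instructions (suggested by the adder adding 8 for two instructions) and 8 entries per line (stated only for the second embodiment); the names `locate` and `fetch_group` are hypothetical.

```python
INSTR_BYTES = 4        # assumption: implied by "+8 = two instructions ahead"
ENTRIES_PER_LINE = 8   # assumption: borrowed from the 8-entry lines of Embodiment 2


def locate(addr):
    """Return (line index, entry index within the line) for an instruction address."""
    word = addr // INSTR_BYTES
    return word // ENTRIES_PER_LINE, word % ENTRIES_PER_LINE


def fetch_group(icache, bhb, pc, width=2):
    """Read `width` instructions and the matching history entries in one access.

    icache and bhb map a line index to a list of ENTRIES_PER_LINE items;
    the same (line, entry) pair addresses both structures.
    Assumes pc is aligned so the group does not cross a line boundary.
    """
    line, entry = locate(pc)
    instrs = icache[line][entry:entry + width]
    history = bhb[line][entry:entry + width]
    return instrs, history
```

Under these assumptions address 400 falls on entry index 4 and address 404 on entry index 5, which matches the "e" and "f" suffixes of entries 901e/901f and 902e/902f.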

【００８４】読み出された分岐履歴情報は、ヒストリバ
ッファ８０５ｃに格納する。２エントリ中４００番地の
分岐履歴情報がヒストリバッファ８０５ｃ１に、４０４
番地の分岐履歴情報がヒストリバッファ８０５ｃ２に格
納される。The branch history information that was read is stored in the history buffer 805c. Of the two entries, the branch history information for address 400 is stored in the history buffer 805c1, and that for address 404 in the history buffer 805c2.

【００８５】ヒストリバッファに格納された分岐履歴情
報は、命令バッファ８０５ａに格納された命令に常に対
応している。ヒストリバッファ８０５ｃ１は命令バッフ
ァ８０５ａ１の分岐履歴情報であり、ヒストリバッファ
８０５ｃ２は命令バッファ８０５ａ２の分岐履歴情報で
ある。The branch history information stored in the history buffer always corresponds to the instruction stored in the instruction buffer 805a. The history buffer 805c1 is branch history information of the instruction buffer 805a1, and the history buffer 805c2 is branch history information of the instruction buffer 805a2.

【００８６】ＩＤステージでは、命令デコード／レジス
タユニット８０６及び分岐予測ユニット８０７での処理
が行われる。In the ID stage, processing in the instruction decode / register unit 806 and the branch prediction unit 807 is performed.

【００８７】命令デコード／レジスタユニット８０６で
は、命令バッファ８０５ａに格納された命令のデコード
とレジスタからのデータ読み出しが行われる。The instruction decode / register unit 806 decodes the instruction stored in the instruction buffer 805a and reads data from the register.

【００８８】命令のデコードにより、命令バッファ８０
５ａに格納された命令が条件付分岐命令か否か判定され
る。判定結果は、信号線８１６を通して、分岐予測ユニ
ット８０７に送られる。Decoding determines whether each instruction stored in the instruction buffer 805a is a conditional branch instruction. The determination result is sent to the branch prediction unit 807 via the signal line 816.

【００８９】分岐予測ユニット８０７では、命令バッフ
ァ８０５ａに格納された命令が条件付分岐命令であった
ときに、条件付分岐命令が成立するか否かを予測する。When the instruction stored in the instruction buffer 805a is a conditional branch instruction, the branch prediction unit 807 predicts whether or not the conditional branch instruction is taken.

【００９０】条件付分岐命令の分岐が成立すると予測さ
れた場合には、分岐先のアドレスを計算し、結果をアド
レス線８１４を通してＰＣレジスタ８０１へ転送する。
成立しないと予測された場合には何も行わない。If the branch of the conditional branch instruction is predicted to be taken, the branch destination address is calculated and the result is transferred to the PC register 801 through the address line 814. If the branch is predicted not to be taken, nothing is done.

【００９１】分岐の予測は、ヒストリバッファ８０５ｃ
に格納された分岐履歴情報に基づいて行われる。また条
件付分岐命令内のオフセットの値を条件付分岐命令のア
ドレスの値に加算することにより、分岐先命令のアドレ
スを得る。分岐予測の結果は信号線８１７を通して命令
実行ユニット８０９に送られる。Branch prediction is performed based on the branch history information stored in the history buffer 805c. The address of the branch destination instruction is obtained by adding the offset value in the conditional branch instruction to the address of the conditional branch instruction. The result of the branch prediction is sent to the instruction execution unit 809 via the signal line 817.
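The ID-stage prediction for a fetched group can be sketched as follows. The history is modeled as a single taken/not-taken flag per entry, and when several branches in the group are predicted taken the earliest one in program order is assumed to win; neither detail is spelled out in the text, and the name `predict_group` is hypothetical.

```python
def predict_group(pc, is_cond_branch, offsets, history_taken, instr_bytes=4):
    """Return the predicted branch target for a fetch group, or None.

    pc             -- address of the first instruction in the group
    is_cond_branch -- per-slot decode result (True if conditional branch)
    offsets        -- per-slot branch offset taken from the instruction
    history_taken  -- per-slot history flag read alongside the instruction
    The first slot that is a branch with "taken" history redirects fetch
    (an assumed tie-break rule).
    """
    for slot, (is_br, off, taken) in enumerate(
            zip(is_cond_branch, offsets, history_taken)):
        addr = pc + slot * instr_bytes
        if is_br and taken:
            return addr + off    # target = branch address + offset
    return None                  # no redirect: sequential fetch continues
```

With the group at address 400 holding two branches whose histories are "not taken" and "taken" and whose second offset is 100, the sketch yields 404 + 100 = 504.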

【００９２】ＥＸステージでは、命令実行ユニット８０
９で処理が行われる。命令実行ユニット８０９は、各命
令の演算や条件付分岐命令の予測検証を行う。また分岐
履歴情報の更新を行う。分岐予測が正しい場合には、分
岐履歴情報を更新し、信号線８１８を通して分岐履歴バ
ッファ８０４に更新した履歴情報を書き込む。分岐予測
が間違っている場合には、正しい命令をフェッチするた
め、正しいアドレスを計算し、アドレス線８１５を通し
てＰＣレジスタ８０１に書き込む。また分岐履歴情報の
更新を行い、信号線８１８を通して更新した履歴情報を
分岐履歴バッファ８０４に書き込む。In the EX stage, processing is performed in the instruction execution unit 809. The instruction execution unit 809 performs the operation of each instruction, verifies the prediction of conditional branch instructions, and updates the branch history information. If the branch prediction is correct, the branch history information is updated and the updated history information is written to the branch history buffer 804 via the signal line 818. If the branch prediction is wrong, the correct address is calculated so that the correct instruction can be fetched, and it is written to the PC register 801 through the address line 815; the branch history information is also updated and written to the branch history buffer 804 via the signal line 818.
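The EX-stage check can be sketched in a few lines. Here the history is assumed to be a single "last outcome" flag keyed by instruction address, and the corrected address on a misprediction is assumed to be either the branch target or the next sequential instruction; the text does not specify the history encoding, and `verify_and_update` is a hypothetical name.

```python
def verify_and_update(bhb, addr, predicted_taken, actual_taken,
                      offset, instr_bytes=4):
    """Verify one branch prediction and update its history entry.

    bhb is a dict mapping branch address -> last outcome (assumed 1-bit history).
    Returns the corrected fetch address on a misprediction, else None.
    """
    bhb[addr] = actual_taken                 # history is written in both cases
    if predicted_taken == actual_taken:
        return None                          # prediction correct: no redirect
    if actual_taken:
        return addr + offset                 # should have branched
    return addr + instr_bytes                # should have fallen through
```

For a branch at 404 predicted taken but actually not taken, this sketch returns 408 as the refetch address and leaves a "not taken" history behind.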

【００９３】ＭＥＭステージでは、データキャッシュユ
ニット８１１において処理が行われる。データキャッシ
ュユニット８１１では、データの読み出し、及び書き込
みが行われる。In the MEM stage, processing is performed in the data cache unit 811. The data cache unit 811 reads and writes data.

【００９４】ＷＢステージでは、パイプラインレジスタ
８１２に格納されたデータをデータ線８１９を通して命
令デコード／レジスタユニット８０６内のレジスタへの
書き込む。In the WB stage, data stored in the pipeline register 812 is written to a register in the instruction decode / register unit 806 via a data line 819.

【００９５】図３は、実行するプログラムと各ステージ
で処理される命令を、サイクル毎に表形式で示した図で
ある。図３において、ａｌｕは演算命令であり、ｂｒは
条件付分岐命令である。「ｂｒ、１００」は条件が成
立するならば命令アドレスに１００を加算したアドレス
へ分岐する命令である。分岐命令の最後に付加されてい
る、（ＮＴ）、（Ｔ）は、条件付分岐命令が成立するか
否かを表し、（Ｔ）は分岐が成立することを表し、（Ｎ
Ｔ）は分岐が不成立であることを表す。FIG. 3 is a diagram showing, in table form for each cycle, the program to be executed and the instructions processed in each stage. In FIG. 3, alu is an arithmetic instruction and br is a conditional branch instruction. "br, 100" is an instruction that, if the condition is satisfied, branches to the address obtained by adding 100 to the instruction address. The (NT) and (T) appended to a branch instruction indicate whether the conditional branch is taken: (T) means the branch is taken, and (NT) means the branch is not taken.

【００９６】また、これら（分岐の成立／不成立）は、
条件が決定した後に判る分岐の結果であり、分岐履歴バ
ッファ８０４の情報に基づく分岐予測時のものとは異な
る場合がある。These outcomes (taken / not taken) are the results of the branch known after the condition has been determined, and may differ from the prediction made from the information in the branch history buffer 804.

【００９７】図１及び図３を参照すると、Ｔ１サイクル
において、ＰＣレジスタ８０１にアドレス４００番地が
セットされると、命令キャッシュユニット８０２内の命
令キャッシュから４００番地を先頭とする２命令（４０
０番地と４０４番地）を読み出され、命令バッファ８０
５ａに格納される。Referring to FIGS. 1 and 3, when address 400 is set in the PC register 801 in the T1 cycle, two instructions starting at address 400 (addresses 400 and 404) are read from the instruction cache in the instruction cache unit 802 and stored in the instruction buffer 805a.

【００９８】４００番地の条件付分岐命令は８０５ａ１
に、４０４番地の条件付分岐命令は命令バッファ８０５
ａ２に格納される。同時に分岐履歴バッファ８０４か
ら、ＰＣレジスタ８０１で指定された４００番地を先頭
とする２エントリの履歴情報を読み出し、ヒストリバッ
ファ８０５ｃに格納する。ヒストリバッファ８０５ｃ１
に、「不成立」の履歴情報が、ヒストリバッファ８０５
ｃ２に「成立」の履歴情報が格納される。すなわち４０
０番地の履歴情報が「不成立」であり、４０４番地の履
歴情報が「成立」である。The conditional branch instruction at address 400 is stored in the instruction buffer 805a1, and the conditional branch instruction at address 404 in the instruction buffer 805a2. At the same time, two entries of history information starting from address 400 specified by the PC register 801 are read from the branch history buffer 804 and stored in the history buffer 805c. "Not taken" history information is stored in the history buffer 805c1 and "taken" history information in the history buffer 805c2. That is, the history information for address 400 is "not taken" and that for address 404 is "taken".

【００９９】アドレス加算器８０３では、ＰＣレジスタ
８０１の値４００番地に、８を加算し、４０８番地の値
を得る。この結果をＰＣレジスタ８０１へ転送する。The address adder 803 adds 8 to the value 400 of the PC register 801 to obtain the value of address 408. The result is transferred to the PC register 801.

【０１００】次に、Ｔ２サイクルにおけるＩＤステージ
の動作について説明する。Next, the operation of the ID stage in the T2 cycle will be described.

【０１０１】命令デコード／レジスタユニット８０６で
は、命令バッファ８０５ａに格納された４００番地と４
０４番地の２命令のデコードが行われ、条件付分岐命令
であることが分かる。同時に、分岐予測ユニット８０７
で、条件付分岐命令の予測を行う。分岐予測は、ヒスト
リバッファ８０５ｃに格納された分岐履歴情報に基づき
行われる。The instruction decode/register unit 806 decodes the two instructions at addresses 400 and 404 stored in the instruction buffer 805a and finds that they are conditional branch instructions. At the same time, the branch prediction unit 807 predicts the conditional branch instructions based on the branch history information stored in the history buffer 805c.

【０１０２】４００番地の条件付分岐命令は「不成立」
の予測がなされる。条件付分岐命令が不成立であるた
め、プログラム・フローは変更されない。The conditional branch instruction at address 400 is predicted "not taken". Because the branch is predicted not taken, the program flow is not changed.

【０１０３】また４０４番地の条件付分岐命令は「成
立」の予測がなされる。条件付分岐命令が成立するため
プログラム・フローは、命令中に指定された１００番地
先の５０４番地の命令に分岐する。計算されたアドレス
５０４番地は、アドレス線８１４を通してＰＣレジスタ
８０１に格納される。The conditional branch instruction at address 404 is predicted "taken". Because the branch is predicted taken, the program flow branches to the instruction at address 504, which is 100 beyond address 404 as specified in the instruction. The calculated address 504 is stored in the PC register 801 through the address line 814.

【０１０４】ＩＦステージでは、４０８番地と４１２番
地の命令フェッチが行われる。しかしながら、４０４番
地の条件付分岐命令が「成立」と予測されたため、フェ
ッチされた命令は、キャンセルされる。In the IF stage, instruction fetches at addresses 408 and 412 are performed. However, the fetched instruction is canceled because the conditional branch instruction at address 404 is predicted to be “taken”.

【０１０５】次にＴ３サイクルにおけるＥＸステージの
動作について説明する。命令実行ユニット８０９では４
００番地と４０４番地の条件付分岐命令の予測検証が行
われる。４００番地の条件付分岐命令の分岐は「不成
立」であり、予測が正しいためプログラム・フローの変
更は行われない。４０４番地の条件付分岐命令の分岐は
「成立」であり、予測は正しかったことになる。この場
合、分岐履歴情報が更新され、信号線８１８を通して分
岐履歴バッファ８０４に書き込まれる。Next, the operation of the EX stage in the T3 cycle will be described. The instruction execution unit 809 verifies the predictions for the conditional branch instructions at addresses 400 and 404. The branch at address 400 is "not taken"; the prediction was correct, so the program flow is not changed. The branch at address 404 is "taken", so that prediction was also correct. In this case, the branch history information is updated and written to the branch history buffer 804 via the signal line 818.

【０１０６】ＩＦステージでは、Ｔ３サイクルで予測さ
れた５０４番地を先頭とする２命令のフェッチが行われ
る。In the IF stage of this cycle, two instructions are fetched starting from address 504, the branch target predicted in the ID stage of the T2 cycle.

【０１０７】Ｔ４サイクルにおいて、４００番地と４０
４番地の条件付分岐命令は、ＭＥＭステージの処理が行
われる。In the T4 cycle, addresses 400 and 40
The conditional branch instruction at address 4 is processed in the MEM stage.

【０１０８】ＩＦステージでは、ＰＣレジスタ８０１に
は、Ｔ３サイクルで命令実行ユニット８０９で計算され
たアドレス５１２番地が格納されている。５１２番地と
５１６番地の２つの命令が命令キャッシュユニット８０
２から読み出される。Ｔ４サイクルで、５０４番地と５
０８番地の２つの命令がＩＤステージでデコードされ
る。In the IF stage, the PC register 801 holds address 512, which the instruction execution unit 809 calculated in the T3 cycle. The two instructions at addresses 512 and 516 are read from the instruction cache unit 802. Also in the T4 cycle, the two instructions at addresses 504 and 508 are decoded in the ID stage.

【０１０９】本実施例では、分岐履歴バッファ８０４の
１ラインに、複数の条件付分岐命令の履歴情報を命令キ
ャッシュの１ラインに対応して格納し、命令フェッチ時
にこれら複数の履歴情報を読み出し、参照することによ
って、従来の方式ではできなかった複数の条件付分岐命
令の予測を可能としている。In this embodiment, the history information of a plurality of conditional branch instructions is stored in one line of the branch history buffer 804 corresponding to one line of the instruction cache, and the plurality of pieces of history information are read out at the time of instruction fetch. The reference makes it possible to predict a plurality of conditional branch instructions which cannot be performed by the conventional method.

【０１１０】［実施例２］本発明の第２の実施例につい
て説明する。図４は、本発明の第２の実施例の構成を示
すブロック図である。図４を参照すると、本発明の第２
の実施例は、パイプラインレジスタ１１０１、命令キャ
ッシュ１１０２、分岐履歴バッファ１１０３、プロセッ
サ１１０４ａ、１１０４ｂを備えて構成される。各プロ
セッサ１１０４ａ、１１０４ｂは、スーパースカラ方式
のマイクロプロセッサであり、１サイクルで２命令の処
理を行うことができる。[Embodiment 2] A second embodiment of the present invention will be described. FIG. 4 is a block diagram showing the configuration of the second embodiment. Referring to FIG. 4, the second embodiment includes a pipeline register 1101, an instruction cache 1102, a branch history buffer 1103, and processors 1104a and 1104b. Each of the processors 1104a and 1104b is a superscalar microprocessor that can process two instructions in one cycle.

【０１１１】これらの命令キャッシュユニット１１０２
及び分岐履歴バッファ１１０３へのアクセスは、前記第
１の実施例と同様、パイプラインレジスタ１１０１で指
定される共通のアドレスである。As in the first embodiment, the instruction cache unit 1102 and the branch history buffer 1103 are accessed with a common address, here specified by the pipeline register 1101.

【０１１２】また、命令キャッシュユニット１１０２及
び分岐履歴バッファ１１０３は２つのプロセッサ間共有
のリソースであり、図１３に示した従来のプロセッサと
同様に、パイプラインレジスタ１１０１は、１つのアド
レスのみを指定するため、１つのプロセッサが、命令キ
ャッシュユニット１１０２及び分岐履歴バッファ１１０
３へアクセスしているサイクルでは、他方のプロセッサ
がこれら共有のリソースにアクセスすることはできな
い。The instruction cache unit 1102 and the branch history buffer 1103 are resources shared by the two processors. As in the conventional processor shown in FIG. 13, the pipeline register 1101 specifies only one address, so in a cycle in which one processor is accessing the instruction cache unit 1102 and the branch history buffer 1103, the other processor cannot access these shared resources.

【０１１３】２命令発行のスーパースカラ方式のプロセ
ッサを２つ並列に接続していることから、命令キャッシ
ュユニット１１０２では、命令キャッシュから１サイク
ルで４命令をフェッチし、命令バッファ１１０６ａ又は
１１０６ｂに格納する。Since two superscalar processors that each issue two instructions are connected in parallel, the instruction cache unit 1102 fetches four instructions from the instruction cache in one cycle and stores them in the instruction buffer 1106a or 1106b.

【０１１４】これによりそれぞれのプロセッサ１１０４
ａ、１１０４ｂへの命令供給能力をあげる。This raises the capacity to supply instructions to each of the processors 1104a and 1104b.

【０１１５】また分岐履歴バッファ１１０３からは４命
令分の分岐履歴情報を読み出し、ヒストリバッファ１１
０６ａまたは１１０６ｂに格納する。Branch history information for four instructions is likewise read from the branch history buffer 1103 and stored in the history buffer 1107a or 1107b.

【０１１６】プロセッサ１１０４ａ、１１０４ｂは、Ｐ
Ｃレジスタ１１０５ａ、１１０５ｂ、命令デコード／レ
ジスタユニット１１０８ａ、１１０８ｂ、分岐予測ユニ
ット１１０９ａ、１１０９ｂ、命令実行ユニット１１１
１ａ、１１１１ｂ、データキャッシュユニット１１１３
ａ、１１１３ｂ、命令バッファ１１０６ａ、１１０６
ｂ、ヒストリバッファ１１０７ａ、１１０７ｂを備えて
構成される。The processors 1104a and 1104b include PC registers 1105a and 1105b, instruction decode/register units 1108a and 1108b, branch prediction units 1109a and 1109b, instruction execution units 1111a and 1111b, data cache units 1113a and 1113b, instruction buffers 1106a and 1106b, and history buffers 1107a and 1107b.

【０１１７】図５は、図４の命令キャッシュユニット１
１０２、分岐予測バッファ１１０３、及びパイプライン
レジスタ１１０６ａ、１１０６ｂ、１１０７ａ、１１０
７ｂの構成を示すブロック図である。FIG. 5 is a block diagram showing the configuration of the instruction cache unit 1102, the branch history buffer 1103, and the pipeline registers 1106a, 1106b, 1107a, and 1107b of FIG. 4.

【０１１８】図５を参照すると、命令キャッシュユニッ
ト１１０２の１ライン１２０１は、エントリ１２０１
ａ、１２０１ｂ、１２０１ｃ、１２０１ｄ、１２０１
ｅ、１２０１ｆを備え、計８つの命令を格納する。分岐
履歴バッファ１１０３の１ラインは、分岐履歴情報を８
エントリ１２０２ａ、１２０２ｂ、１２０２ｃ、１２０
２ｄ、１２０２ｅ、１２０２ｆを持つ。Referring to FIG. 5, one line 1201 of the instruction cache unit 1102 has entries 1201a, 1201b, 1201c, 1201d, 1201e, 1201f, and so on, and stores eight instructions in total. One line of the branch history buffer 1103 likewise holds eight entries of branch history information, 1202a, 1202b, 1202c, 1202d, 1202e, 1202f, and so on.

【０１１９】各プロセッサ毎に用意された命令バッファ
１１０６ａ、１１０６ｂにはそれぞれ４命令を格納する
ことができ、ヒストリバッファ１１０７ａ、１１０７ｂ
には、４分岐履歴情報を格納することができる。The instruction buffers 1106a and 1106b provided for the respective processors can each store four instructions, and the history buffers 1107a and 1107b can each store four pieces of branch history information.

【０１２０】図４及び図５を参照して、本発明の第２の
実施例の動作について説明する。パイプラインレジスタ
１１０１には、各プロセッサ１１０４ａ、１１０４ｂか
らアドレスが転送される。これらのアドレスからアービ
トレーションにより、選択された唯一つのアドレスがパ
イプラインレジスタ１１０１に格納される。The operation of the second embodiment of the present invention will be described with reference to FIGS. The addresses are transferred from the processors 1104a and 1104b to the pipeline register 1101. Only one address selected by arbitration from these addresses is stored in the pipeline register 1101.
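The text says only that arbitration selects a single address for the pipeline register; it does not name a policy. As one possible illustration, the sketch below uses round-robin arbitration between the requesting processors. The policy, the function name `arbitrate`, and the request encoding are all assumptions.

```python
def arbitrate(requests, last_winner):
    """Choose one fetch address for the shared pipeline register.

    requests    -- dict: processor id -> requested address, or None if idle
    last_winner -- id of the processor granted in the previous cycle
    Round-robin is an assumed policy; the source does not specify one.
    Returns (processor id, address), or None when nobody requests.
    """
    ids = sorted(requests)
    start = (ids.index(last_winner) + 1) % len(ids) if last_winner in ids else 0
    for i in range(len(ids)):
        pid = ids[(start + i) % len(ids)]
        if requests[pid] is not None:
            return pid, requests[pid]    # only this processor fetches this cycle
    return None
```

The returned processor id also tells the shared instruction cache and branch history buffer whose instruction buffer and history buffer should receive the data read in that cycle.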

【０１２１】ＩＦステージにおいて、パイプラインレジ
スタ１１０１により指定されたアドレスにより命令キャ
ッシュ１１０２及び分岐履歴バッファ１１０３がアクセ
スされる。In the IF stage, the instruction cache 1102 and the branch history buffer 1103 are accessed by the address specified by the pipeline register 1101.

【０１２２】パイプラインレジスタ１１０１で指定され
るアドレスは唯一つであるため、各サイクルにおいて、
一つのプロセッサのみが、命令キャッシュユニット１１
０２から命令の供給を受ける。Since the pipeline register 1101 specifies only one address, only one processor is supplied with instructions from the instruction cache unit 1102 in each cycle.

【０１２３】命令キャッシュユニット１１０２から読み
出された４命令は、パイプラインレジスタ１１０１のア
ドレスを決定したプロセッサの命令バッファに格納され
る。また、分岐履歴バッファ１１０３から読み出された
４エントリの分岐履歴情報は同じプロセッサのヒストリ
バッファに格納される。The four instructions read from the instruction cache unit 1102 are stored in the instruction buffer of the processor that has determined the address of the pipeline register 1101. Also, the branch history information of four entries read from the branch history buffer 1103 is stored in the history buffer of the same processor.

【０１２４】図６は、本発明の第２の実施例において、
実行プログラムと各サイクルでの処理を表形式で示した
図であり、図６（ａ）に、プロセッサ１（１１０４
ａ）、図６（ｂ）に、プロセッサ２（１１０４ｂ）の処
理ステージが示されている。FIG. 6 is a diagram showing, in table form, the program executed and the processing in each cycle in the second embodiment of the present invention. FIG. 6(a) shows the processing stages of processor 1 (1104a), and FIG. 6(b) shows those of processor 2 (1104b).

【０１２５】パイプラインレジスタ１１０１には、プロ
セッサ１（１１０４ａ）により決定された４００番地の
アドレスがセットされている。命令キャッシュユニット
１１０２内の命令キャッシュ１２０１ｅに４００番地の
命令が格納されている。Address 400, determined by processor 1 (1104a), is set in the pipeline register 1101. The instruction at address 400 is stored in entry 1201e of the instruction cache in the instruction cache unit 1102.

【０１２６】Ｔ１サイクルにおいて、パイプラインレジ
スタ１１０１で指定されたアドレスにより命令キャッシ
ュからは４００番地の命令を先頭とする４命令（４０
０、４０４、４０８、４１２番地）が読み出され、プロ
セッサ１（１１０４ａ）の命令バッファ１１０６ａに格
納され、分岐履歴バッファ１１０３からは４００番地の
命令の分岐履歴情報を先頭とする４情報が読み出され、
プロセッサ１（１１０４ａ）のヒストリバッファ１１０
７ａに格納される。In the T1 cycle, using the address specified by the pipeline register 1101, four instructions starting with the instruction at address 400 (addresses 400, 404, 408, and 412) are read from the instruction cache and stored in the instruction buffer 1106a of processor 1 (1104a). Four pieces of branch history information, starting with that of the instruction at address 400, are read from the branch history buffer 1103 and stored in the history buffer 1107a of processor 1 (1104a).

【０１２７】命令キャッシュの１２０１ｅに格納されて
いる４００番地の命令は命令バッファ１１０６ａ１へ、
１２０１ｆに格納されている４０４番地の命令は命令バ
ッファ１１０６ａ２へ、１２０１ｇに格納されている４
０８番地の命令は命令バッファ１１０６ａ３へ、１２０
２ｈに格納されている４１２番地の命令は命令バッファ
１１０６ａ４へと読み出される。The instruction at address 400 stored in entry 1201e of the instruction cache is read into the instruction buffer 1106a1, the instruction at address 404 stored in 1201f into the instruction buffer 1106a2, the instruction at address 408 stored in 1201g into the instruction buffer 1106a3, and the instruction at address 412 stored in 1201h into the instruction buffer 1106a4.

【０１２８】分岐履歴バッファ１１０３の１２０２ｅに
格納されている４００番地の分岐履歴情報は、ヒストリ
バッファ１１０７ａ１へ、１２０２ｆに格納されている
４０４番地の分岐履歴情報はヒストリバッファ１１０７
ａ２へ、１２０２ｇに格納されている４０４番地の分岐
履歴情報はヒストリバッファ１１０７ａ３へ、１２０２
ｈに格納されている４０４番地の分岐履歴情報はヒスト
リバッファ１１０７ａ４へ読み出される。The branch history information for address 400 stored in entry 1202e of the branch history buffer 1103 is read into the history buffer 1107a1, that for address 404 stored in 1202f into the history buffer 1107a2, that for address 408 stored in 1202g into the history buffer 1107a3, and that for address 412 stored in 1202h into the history buffer 1107a4.

【０１２９】Ｔ２サイクルにおいて、プロセッサ１（１
１０４ａ）は、ＰＣレジスタ１１０５に４００番地を保
持している。In the T2 cycle, processor 1 (1104a) holds address 400 in its PC register 1105a.

【０１３０】命令バッファ１１０６ａに読み出された４
００番地と４０４番地の命令をデコードする。また、同
時に、ヒストリバッファ１１０７ａに格納されたこれら
の分岐履歴情報に基づき分岐予測を行う。The instructions at addresses 400 and 404 that were read into the instruction buffer 1106a are decoded. At the same time, branch prediction is performed based on their branch history information stored in the history buffer 1107a.

【０１３１】ここで４００番地の条件付分岐命令は「不
成立」と予測され、４０４番地の条件付分岐命令は「成
立」と予測される。Here, the conditional branch instruction at address 400 is predicted as “not taken”, and the conditional branch instruction at address 404 is predicted as “taken”.

【０１３２】条件付分岐命令が「不成立」である場合に
は、プログラム・フローは変更されない。４０４番地の
条件付分岐命令が「成立」と予測されたため、プログラ
ム・フローは、４０４番地の命令内で指定された１００
番地先の５０４番地の命令に分岐する。すなわち、プロ
グラムフロー変更に伴い、４０８番地、４０８番地の命
令がキャンセルされる。When a conditional branch instruction is "not taken", the program flow is not changed. Since the conditional branch instruction at address 404 is predicted "taken", the program flow branches to the instruction at address 504, which is 100 beyond address 404 as specified in that instruction. With this change in program flow, the instructions at addresses 408 and 412 are canceled.

【０１３３】同じＴ２サイクルにおいて、パイプライン
レジスタ１１０１はプロセッサ２により決定されたアド
レス３８４番地を保持している。このアドレスを先頭と
する４命令を、命令キャッシュ１１０２からプロセッサ
２（１１０４ｂ）の命令バッファ１１０６ｂに読み出
す。In the same T2 cycle, the pipeline register 1101 holds the address 384 determined by the processor 2. The four instructions starting with this address are read from the instruction cache 1102 to the instruction buffer 1106b of the processor 2 (1104b).

【０１３４】また分岐履歴バッファ１１０３から分岐履
歴情報をヒストリバッファ１１０７ｂに読み出す。The branch history information is read from the branch history buffer 1103 to the history buffer 1107b.

【０１３５】Ｔ３サイクルにおいて４０４番地の条件付
分岐命令の予測に従い５０４番地を先頭とする４命令
が、命令キャッシュ１１０２から読み出されプロセッサ
１（１１０４ａ）の命令バッファ１１０６ａに格納され
る。In the T3 cycle, four instructions starting at address 504 are read from the instruction cache 1102 and stored in the instruction buffer 1106a of the processor 1 (1104a) according to the prediction of the conditional branch instruction at address 404.

【０１３６】ＥＸステージでは、４００番地と４０４番
地が命令実行ユニット１１１１ａにおいて処理される。
４００番地の条件付分岐命令は「不成立」で予測が正し
いが、４０４番地の条件付分岐命令は「不成立」であ
り、予測が間違っていることが検証により分かる。In the EX stage, addresses 400 and 404 are processed in the instruction execution unit 1111a.
The conditional branch instruction at address 400 is "not taken" and the prediction is correct, but the conditional branch instruction at address 404 is "not taken" and the verification shows that the prediction is wrong.

【０１３７】このため、このサイクルのＩＦステージで
読み出された５０４番地と５０８番地の２つの命令は、
キャンセルされる。また４００番地、４０４番地の分岐
履歴情報は更新される。すなわち、２命令とも「不成
立」の履歴情報が、分岐履歴バッファ１１０３に書き込
まれる。Therefore, the two instructions at addresses 504 and 508 read in the IF stage of this cycle are canceled. The branch history information for addresses 400 and 404 is also updated; that is, "not taken" history information is written to the branch history buffer 1103 for both instructions.

【０１３８】一方、プロセッサ２（１１０４ａ）では、
Ｔ３サイクルにおいて３８４番地、３８８番地の命令
が、ＩＤステージで処理される。Meanwhile, in processor 2 (1104b), the instructions at addresses 384 and 388 are processed in the ID stage in the T3 cycle.

【０１３９】Ｔ４サイクルにおいて、プロセッサ１（１
１０４ｂ）では、４００番地、４０４番地の命令がＭＥ
Ｍステージにおいて処理される。プロセッサ２（１１０
４ｂ）のＥＸステージでは、３８４番地、３８８番地の
命令の実行が行われ、ＩＤステージでは、３９２、３９
６番地の命令のデコードが行われる。また、ＩＦステー
ジにおいて、４００番地を先頭とする４命令のフェッチ
が行われる。In the T4 cycle, in processor 1 (1104a), the instructions at addresses 400 and 404 are processed in the MEM stage. In processor 2 (1104b), the instructions at addresses 384 and 388 are executed in the EX stage, and the instructions at addresses 392 and 396 are decoded in the ID stage. In the IF stage, four instructions starting from address 400 are fetched.

【０１４０】Ｔ５サイクルにおいて、プロセッサ１（１
１０４ａ）では、４０８番地を先頭とする４命令のフェ
ッチが行われる。一方、プロセッサ２（１１０４ｂ）で
は、４００番地と４０４番地の条件付分岐命令の予測が
行われる。ヒストリバッファに読み出された分岐履歴情
報により、４００番地は「不成立」、また４０４番地も
Ｔ３サイクルで更新された「不成立」の履歴情報に基づ
き、プログラム・フローは変更されずに実行される。In the T5 cycle, processor 1 (1104a) fetches four instructions starting from address 408. Meanwhile, processor 2 (1104b) predicts the conditional branch instructions at addresses 400 and 404. According to the branch history information read into its history buffer, address 400 is "not taken", and address 404 is also "not taken" based on the history information updated in the T3 cycle, so execution proceeds without changing the program flow.

【０１４１】このように本発明の第２の実施例では、複
数のプロセッサを持つ情報処理装置において、プロセッ
サ間で分岐履歴バッファを共有し、複数の命令の分岐履
歴情報を一度に読み出す構成とされており、このため、
複数の分岐命令を例えば同一サイクル内で予測すること
ができる。As described above, in the second embodiment of the present invention, in an information processing apparatus having a plurality of processors, the branch history buffer is shared between the processors and the branch history information of a plurality of instructions is read at one time. A plurality of branch instructions can therefore be predicted, for example, within the same cycle.

【０１４２】また同一の条件付分岐命令を複数のプロセ
ッサで実行した場合、複数のプロセッサ間で履歴情報を
共有化することにより、予測のヒット率を向上すること
ができる。When the same conditional branch instruction is executed by a plurality of processors, by sharing the history information among the plurality of processors, the prediction hit rate can be improved.
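The benefit of sharing can be illustrated with a toy model in which two processors consult one history table. The class `SharedBranchHistory`, the dictionary storage, and the 1-bit "last outcome" history are assumptions made for this sketch; the values replay the 404-address example from the second embodiment.

```python
class SharedBranchHistory:
    """Toy shared branch history: one table consulted by every processor."""

    def __init__(self):
        self._taken = {}                      # branch address -> last outcome

    def predict(self, addr):
        return self._taken.get(addr, False)   # default "not taken" is an assumption

    def update(self, addr, taken):
        self._taken[addr] = taken


shared = SharedBranchHistory()
shared.update(400, False)          # initial history: 400 "not taken"
shared.update(404, True)           # initial history: 404 "taken"

p1_prediction = shared.predict(404)   # processor 1 predicts "taken" (T2)
shared.update(404, False)             # processor 1 finds it was not taken (T3)
p2_prediction = shared.predict(404)   # processor 2 now predicts "not taken" (T5)
```

Processor 2 never executed the branch at 404 itself, yet its prediction already reflects the outcome observed by processor 1, which is the effect the paragraph above describes.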

【０１４３】従来のプロセッサ毎個別に設けられた分岐
履歴バッファと比べ、本発明の第２の実施例では、分岐
履歴バッファを共有リソースとし、ハードウェア量を増
やすことなく、各プロセッサが使用できる情報量を増加
することが可能となる。Compared with the conventional arrangement in which a branch history buffer is provided individually for each processor, the second embodiment of the present invention makes the branch history buffer a shared resource, which increases the amount of information available to each processor without increasing the amount of hardware.

【０１４４】［実施例３］次に本発明の第３の実施例に
ついて説明する。図７は、本発明の第３の実施例の構成
を示すブロック図であり、２ウエイセットアソシアティ
ブ方式で構成された分岐履歴バッファの構成を示してい
る。図７を参照すると、分岐履歴バッファは、２つのブ
ロック１４０２ａ、１４０２ｂから構成されている。各
ブロック１４０２ａ、１４０２ｂは、分岐履歴情報のア
ドレスの一部を持つタグバッファ１４０３ａ、１４０３
ｂと、分岐履歴情報を格納するバッファ１４０４ａ、１
４０４ｂを備えている。バッファ１４０５は、命令デコ
ードの際に、その命令のアドレスを格納するためのもの
である。[Embodiment 3] Next, a third embodiment of the present invention will be described. FIG. 7 is a block diagram showing the configuration of the third embodiment, namely a branch history buffer organized as a two-way set-associative structure. Referring to FIG. 7, the branch history buffer consists of two blocks 1402a and 1402b. Each of the blocks 1402a and 1402b includes a tag buffer (1403a, 1403b) holding part of the address of the branch history information, and a buffer (1404a, 1404b) storing the branch history information. The buffer 1405 stores the address of an instruction while that instruction is being decoded.

【０１４５】ヒストリバッファ１４０６ａ、１４０６ｂ
は、分岐履歴バッファ１４０２ａ、１４０２ｂから読み
出した履歴情報とタグを格納するためのパイプラインレ
ジスタである。なお、タグ情報は、アドレスの所定の下
位ビットが用いられ、分岐履歴バッファの各エントリに
付加情報（タグ）として格納される。The history buffers 1406a and 1406b are pipeline registers that store the history information and tags read from the branch history buffers 1402a and 1402b. The tag information consists of predetermined lower bits of the address and is stored as additional information (a tag) in each entry of the branch history buffer.

【０１４６】コンパレータ１４０７は、ヒストリバッフ
ァ１４０６ａ、１４０６ｂのタグ情報とバッファ１４０
５の値を比較する。The comparator 1407 compares the tag information in the history buffers 1406a and 1406b with the value of the buffer 1405.

【０１４７】マルチプレクサ１４０８は、コンパレータ
１４０７の比較結果を選択制御信号として、２つのブロ
ックから読み出された分岐情報を選択する。バッファ１
４０５は、分岐命令のアドレスを格納する。The multiplexer 1408 selects the branch information read from the two blocks, using the comparison result of the comparator 1407 as the selection control signal. The buffer 1405 stores the address of the branch instruction.

【０１４８】本発明の第３の実施例の動作について説明
する。分岐履歴バッファ１４０２ａ、１４０２ｂの読み
出しは、ＩＦステージにおいて行われる。読み出された
情報はパイプラインレジスタ１４０６ａ、１４０６ｂに
格納され、ＩＤステージで、コンパレータ１４０７に
て、条件付分岐命令のアドレスとタグ情報とが比較され
た後、条件付分岐命令の予測に使用される。The operation of the third embodiment of the present invention will now be described. The branch history buffers 1402a and 1402b are read in the IF stage. The information read is stored in the pipeline registers 1406a and 1406b; in the ID stage, the comparator 1407 compares the address of the conditional branch instruction with the tag information, after which the information is used for predicting the conditional branch instruction.

【０１４９】ＩＦステージにおいて、命令キャッシュか
らの命令フェッチと同時に分岐履歴バッファからは、条
件付分岐命令の分岐履歴情報が読み出される。At the IF stage, the branch history information of the conditional branch instruction is read from the branch history buffer simultaneously with the instruction fetch from the instruction cache.

【０１５０】命令フェッチを行う命令のアドレスは、Ｐ
Ｃレジスタ１４０１に格納されている。The address of the instruction to be fetched is stored in the PC register 1401.

【０１５１】ＰＣレジスタ１４０１で指定されたエント
リの分岐履歴情報とタグが各ブロック１４０２ａ、１４
０２ｂから読み出され、ヒストリバッファ１４０６ａ、
１４０６ｂにそれぞれ格納される。The branch history information and the tag of the entry specified by the PC register 1401 are read from each of the blocks 1402a and 1402b and stored in the history buffers 1406a and 1406b, respectively.

【０１５２】ＩＤステージにおいて、命令キャッシュか
らフェッチされた命令が条件付分岐命令だった場合に
は、不図示の分岐予測ユニットにおいて分岐予測がなさ
れる。In the ID stage, if the instruction fetched from the instruction cache is a conditional branch instruction, a branch prediction unit (not shown) performs branch prediction.

【０１５３】バッファ１４０５のアドレスと、ヒストリ
バッファの分岐情報のタグ１４０６ａ、１４０６ｂをコ
ンパレータ１４０７で比較し、タグが一致した場合に
は、一致したブロック方の分岐履歴情報を、マルチプレ
クサ１４０８で選択し、条件付分岐命令の予測に用い
る。The comparator 1407 compares the address in the buffer 1405 with the tags of the branch information held in the history buffers 1406a and 1406b. If a tag matches, the multiplexer 1408 selects the branch history information of the matching block, and that information is used for predicting the conditional branch instruction.
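The IF-stage read and the ID-stage compare-and-select described above can be sketched in software as follows. This is only an illustrative model, not the circuit of FIG. 7: the bit widths, the split of the address into index and tag, the one-bit history value, and all class and method names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

INDEX_BITS = 4  # assumed: 16 entries per block
TAG_BITS = 8    # assumed width of the partial address kept as the tag


def split_address(addr: int) -> Tuple[int, int]:
    """Split an address into (index, tag); the bit layout is an assumption."""
    index = addr & ((1 << INDEX_BITS) - 1)
    tag = (addr >> INDEX_BITS) & ((1 << TAG_BITS) - 1)
    return index, tag


@dataclass
class Entry:
    tag: Optional[int] = None  # None marks an empty entry
    history: int = 0           # assumed encoding: 1 = taken, 0 = not taken


class TwoWayBranchHistoryBuffer:
    def __init__(self) -> None:
        # Two blocks, corresponding to 1402a and 1402b.
        self.blocks = [[Entry() for _ in range(1 << INDEX_BITS)] for _ in range(2)]

    def write(self, branch_addr: int, history: int, way: int) -> None:
        index, tag = split_address(branch_addr)
        self.blocks[way][index] = Entry(tag, history)

    def read_if_stage(self, pc: int) -> List[Tuple[Optional[int], int]]:
        """IF stage: read tag and history from both blocks.

        The returned list plays the role of history buffers 1406a/1406b.
        """
        index, _ = split_address(pc)
        return [(blk[index].tag, blk[index].history) for blk in self.blocks]

    def select_id_stage(self, branch_addr: int, latched) -> Optional[int]:
        """ID stage: compare tags (1407) and select the matching block (1408)."""
        _, tag = split_address(branch_addr)
        for way_tag, history in latched:
            if way_tag == tag:
                return history
        return None  # no block holds history for this branch


bhb = TwoWayBranchHistoryBuffer()
bhb.write(0x123, history=1, way=0)  # both addresses map to index 3
bhb.write(0x223, history=0, way=1)
latched = bhb.read_if_stage(0x123)
```

Because both blocks are read with the same index, two branches that share an index can coexist, and the tag comparison decides which block's history is passed on to the prediction.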

【０１５４】本発明の第３の実施例は、１サイクルで複
数の命令をフェッチする情報処理装置において、分岐履
歴バッファから１サイクルに複数の分岐履歴情報を読み
出すことができ、フェッチした命令列中の複数の条件付
分岐命令の予測を行うことを可能とする。In an information processing apparatus that fetches a plurality of instructions in one cycle, the third embodiment of the present invention allows a plurality of pieces of branch history information to be read from the branch history buffer in one cycle, making it possible to predict a plurality of conditional branch instructions in the fetched instruction sequence.
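Reading several history entries together with one fetch can be sketched as below. The sketch deliberately ignores tags and ways and assumes four instructions per line, a one-bit history per entry, and the mnemonic "bcond" for a conditional branch; these details and the function names are hypothetical.

```python
LINE_ENTRIES = 4  # assumed number of instructions per instruction-cache line


def fetch_line(icache, bhb, pc):
    """Read one instruction-cache line and the matching history line together."""
    line = pc // LINE_ENTRIES
    # Entry i of the history line belongs to instruction i of the cache line.
    return list(zip(icache[line], bhb[line]))


def predict_branches(fetched):
    """Return {slot: predicted_taken} for every conditional branch in the line."""
    return {
        slot: history == 1
        for slot, (insn, history) in enumerate(fetched)
        if insn == "bcond"
    }


icache = [["add", "bcond", "sub", "bcond"]]
bhb = [[0, 1, 0, 0]]
predictions = predict_branches(fetch_line(icache, bhb, pc=0))
```

Both conditional branches in the fetched line receive a prediction from a single read of the history line, which is the property the paragraph above describes.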

【０１５５】また、タグにアドレス情報を持つことによ
り、分岐履歴情報と分岐命令の関連性が明確となり、こ
のため異なる分岐命令の情報が予測に用いられるという
事態の発生を防ぎ、分岐予測時の予測のヒット率を向上
することができる。In addition, holding address information in the tag makes the association between branch history information and branch instruction explicit. This prevents the information of a different branch instruction from being used for prediction and improves the prediction hit rate.
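The effect of the tag can be seen in a small comparison between an untagged and a tagged lookup. The index width, the dictionary-based tables, and the function names are assumptions chosen only to keep the example short.

```python
INDEX_BITS = 4  # assumed index width


def index_of(addr):
    return addr & ((1 << INDEX_BITS) - 1)


def tag_of(addr):
    return addr >> INDEX_BITS


def lookup_untagged(table, addr):
    """Return whatever history sits at the index, whoever wrote it."""
    return table.get(index_of(addr))


def lookup_tagged(table, addr):
    """Return the history only if the stored tag matches this address."""
    entry = table.get(index_of(addr))
    if entry is None:
        return None
    tag, history = entry
    return history if tag == tag_of(addr) else None


branch_a, branch_b = 0x13, 0x23  # different branches, same index (3)

untagged = {index_of(branch_a): True}                    # history of A: taken
tagged = {index_of(branch_a): (tag_of(branch_a), True)}  # same, with A's tag

aliased = lookup_untagged(untagged, branch_b)  # A's history is used for B
guarded = lookup_tagged(tagged, branch_b)      # mismatch is detected
```

Without the tag, branch B silently inherits the history of branch A; with the tag, the mismatch is visible and the unrelated history is not used.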

【０１５６】[0156]

【発明の効果】以上説明したように、本発明によれば下
記記載の効果を奏する。As described above, according to the present invention, the following effects can be obtained.

【０１５７】本発明の第１の効果は、複数の条件付分岐
命令の分岐予測を行うことができ、分岐予測のヒット率
を向上し、処理性能を向上する、ということである。並
列度が増加した情報処理装置においては１サイクルで複
数の条件付分岐命令を処理する必要が生じる。従来の情
報処理装置においては、複数の条件付分岐命令を予測す
ることが出来なかったが、本発明は、この問題を解決し
ている。A first effect of the present invention is that branch prediction of a plurality of conditional branch instructions can be performed, and the hit rate of branch prediction is improved, thereby improving processing performance. In an information processing device having an increased degree of parallelism, it is necessary to process a plurality of conditional branch instructions in one cycle. In a conventional information processing apparatus, a plurality of conditional branch instructions could not be predicted, but the present invention has solved this problem.

【０１５８】また情報処理装置において性能向上のため
には条件分岐命令の予測ヒット率をあげることが必須で
ある。本発明によれば、複数の条件付分岐命令の予測が
可能となることにより予測できない条件付分岐命令をな
くすことができ、予測のヒット率の向上が望める。In order to improve the performance of the information processing apparatus, it is essential to increase the prediction hit rate of the conditional branch instruction. According to the present invention, since a plurality of conditional branch instructions can be predicted, unpredictable conditional branch instructions can be eliminated, and an improvement in prediction hit rate can be expected.

【０１５９】本発明の第２の効果は、同一の条件付分岐
命令を複数のプロセッサで実行した場合、複数のプロセ
ッサで分岐履歴バッファを共有することにより、さらに
予測のヒット率を向上することができる、ということで
ある。A second effect of the present invention is that, when the same conditional branch instruction is executed by a plurality of processors, sharing the branch history buffer among the processors further improves the prediction hit rate.

【０１６０】本発明の第３の効果は、分岐履歴バッファ
を共有とすることにより、ハードウェア量を増やすこと
なく、各プロセッサが使用する履歴情報量を増やすこと
ができる、ということである。A third effect of the present invention is that by sharing the branch history buffer, the amount of history information used by each processor can be increased without increasing the amount of hardware.

[Brief description of the drawings]

【図１】本発明の第１の実施例の構成を示すブロック図
である。FIG. 1 is a block diagram showing a configuration of a first exemplary embodiment of the present invention.

【図２】本発明の第１の実施例の詳細構成を示すブロッ
ク図である。FIG. 2 is a block diagram showing a detailed configuration of the first embodiment of the present invention.

【図３】本発明の第１の実施例の分岐予測動作を説明す
るための図であり、プログラムと各ステージで処理され
る命令を示す図である。FIG. 3 is a diagram for explaining a branch prediction operation according to the first embodiment of the present invention, and is a diagram showing a program and instructions processed in each stage.

【図４】本発明の第２の実施例の構成を示すブロック図
である。FIG. 4 is a block diagram showing a configuration of a second exemplary embodiment of the present invention.

【図５】本発明の第２の実施例の詳細構成を示すブロッ
ク図である。FIG. 5 is a block diagram showing a detailed configuration of the second embodiment of the present invention.

【図６】本発明の第２の実施例の分岐予測動作を説明す
るための図であり、プログラムと各ステージで処理され
る命令を示す図である。FIG. 6 is a diagram for explaining a branch prediction operation according to the second embodiment of the present invention, and is a diagram illustrating a program and instructions processed in each stage.

【図７】本発明の第３の実施例の構成を示すブロック図
である。FIG. 7 is a block diagram showing a configuration of a third exemplary embodiment of the present invention.

【図８】パイプライン処理の動作原理を説明するための
図である。FIG. 8 is a diagram for explaining the operation principle of pipeline processing.

【図９】パイプライン方式マイクロプロセッサの構成の
一例を示すブロック図である。FIG. 9 is a block diagram illustrating an example of a configuration of a pipelined microprocessor.

【図１０】分岐予測ユニットを有するマイクロプロセッ
サの構成の一例を示すブロック図である。FIG. 10 is a block diagram illustrating an example of a configuration of a microprocessor having a branch prediction unit.

【図１１】分岐履歴バッファを有するマイクロプロセッ
サの構成の一例を示すブロック図である。FIG. 11 is a block diagram illustrating an example of a configuration of a microprocessor having a branch history buffer.

【図１２】複数のプロセッサを並列接続した従来の情報
処理装置の構成の一例を示すブロック図である。FIG. 12 is a block diagram illustrating an example of a configuration of a conventional information processing device in which a plurality of processors are connected in parallel.

【図１３】分岐予測機構を有し、オンチップマルチプロ
セッサにおける共有キャッシュを有する、従来の情報処
理装置の構成の一例を示すブロック図である。FIG. 13 is a block diagram showing an example of a configuration of a conventional information processing apparatus having a branch prediction mechanism and having a shared cache in an on-chip multiprocessor.

【図１４】スーパースカラ方式のマイクロプロセッサの
構成の一例を示すブロック図である。FIG. 14 is a block diagram illustrating an example of a configuration of a superscalar microprocessor.

[Description of reference numerals]

２０１ＰＣレジスタ２０２命令キャッシュユニット２０３命令デコード／レジスタユニット２０４命令実行ユニット２０５データキャッシュユニット２０６アドレス加算器２０７パイプラインレジスタ２０８パイプラインレジスタ２０９パイプラインレジスタ２１０パイプラインレジスタ２１１データ線２１２データ線２１３アドレス線３０１ＰＣレジスタ３０２命令キャッシュユニット３０３命令デコード／レジスタユニット３０４命令実行ユニット３０５データキャッシュユニット３０６アドレス加算器３０７パイプラインレジスタ３０８パイプラインレジスタ３０９パイプラインレジスタ３１０パイプラインレジスタ３１１アドレス線３１２分岐予測ユニット３１３アドレス線４０１ＰＣレジスタ４０２命令キャッシュユニット４０３命令デコード／レジスタユニット４０４命令実行ユニット４０５データキャッシュユニット４０６アドレス加算器４０７パイプラインレジスタ４０８パイプラインレジスタ４０９パイプラインレジスタ４１０パイプラインレジスタ４１１アドレス線４１２分岐予測ユニット４１３アドレス線４１４分岐履歴バッファ４１５アドレス線４１６信号線５０１外部バス５０２ａプロセッサａ５０２ｂプロセッサｂ５０３ａＰＣレジスタ５０３ｂＰＣレジスタ５０４ａ命令キャッシュユニット５０４ｂ命令キャッシュユニット５０５ａアドレス加算器５０５ｂアドレス加算器５０６ａ分岐履歴バッファ５０６ｂ分岐履歴バッファ５０７ａ命令デコード／レジスタユニット５０７ｂ命令デコード／レジスタユニット５０８ａ分岐予測ユニット５０８ｂ分岐予測ユニット５０９ａ命令実行ユニット５０９ｂ命令実行ユニット５１０ａデータキャッシュユニット５１０ｂデータキャッシュユニット５１１ａパイプラインレジスタ５１１ｂパイプラインレジスタ５１２ａパイプラインレジスタ５１２ｂパイプラインレジスタ５１３ａパイプラインレジスタ５１３ｂパイプラインレジスタ５１４ａパイプラインレジスタ５１４ｂパイプラインレジスタ６０１パイプラインレジスタ６０２命令キャッシュユニット６０３ａプロセッサａ６０３ｂプロセッサｂ６０４ａＰＣレジスタ６０４ｂＰＣレジスタ６０５ａパイプラインレジスタ６０５ｂパイプラインレジスタ６０６ａアドレス加算器６０６ｂアドレス加算器６０７ａ命令デコード／レジスタユニット６０７ｂ命令デコード／レジスタユニット６０８ａ分岐予測ユニット６０８ｂ分岐予測ユニット６０９ａパイプラインレジスタ６０９ｂパイプラインレジスタ６１０ａ命令実行ユニット６１０ｂ命令実行ユニット６１１ａパイプラインレジスタ６１１ｂパイプラインレジスタ６１２ａデータキャッシュユニット６１２ｂデータキャッシュユニット６１３ａパイプラインレジスタ６１３ｂパイプラインレジスタ７０１ＰＣレジスタ７０２命令キャッシュユニット７０３アドレス加算器７０４分岐履歴バッファ７０５パイプラインレジスタ７０６命令デコード／レジスタユニット７０７分岐予測ユニット７０８パイプラインレジスタ７０９命令実行ユニット７０９ａ演算器７０９ｂ演算器７１０パイプラインレジスタ７１１データキャッシュユニット７１２パイプラインレジスタ８０１ＰＣレジスタ８０２命令キャッシュユニット８０３アドレス加算器８０４分岐履歴バッファ８０５ａ命令バッファ８０５ｂパイプラインレジスタ８０５ｃヒストリバッファ８０６命令デコード／レジスタユニット８０７分岐予測ユニット８０８パイプラインレジスタ８０９命令実行ユニット８１０パイプラインレジスタ８１１データキャッシュユニット８１２パイプラインレジスタ８１３アドレス線８１４アドレス線８１５アドレス線８１６信号線８１７信号線８１８信号線８１９データ線９０１命令キャッシュ内ライン９０２分岐履歴バッファ内ライン１１０１パイプラインレジスタ１１０２命令キャッシュユニット１１０３分岐履歴バッファ１１０４ａプロセッサａ１１０４ｂプロセッサｂ１１０５ａＰＣレジスタ１１０５ｂＰＣレジスタ１１０６ａ命令バッファ１１０６ｂ命令バッファ１１０７ａヒストリバッファ１１０７ｂヒストリバッファ１１０８ａ命令デコード／レジスタユニット１１０８ｂ命令デコード／レジスタユニット１１０９ａ分岐予測ユニット１１０９ｂ分岐予測ユニット１１１０ａパイプラインレジスタ１１１０ｂパイプラインレジスタ１１１１ａ命令実行ユニット１１１１ｂ命令実行ユニット１１１２ａパイプラインレジスタ１１１２ｂパイプラインレジスタ１１１３ａデータキャッシュユニット１１１３ｂデータキャッシュユニット１１１４ａパイプラインレジスタ１１１４ｂパイプラインレジスタ１２０１命令キャッシュ内ライン１２０２分岐履歴バッファ内ライン１４０１ＰＣレジスタ１４０２ａ分岐履歴バッファ内ブロック１４０２ｂ分岐履歴バッファ内ブ
ロック１４０３ａタグバッファ１４０３ｂタグバッファ１４０４ａ履歴バッファ１４０４ｂ履歴バッファ１４０５アドレスバッファ１４０６ａヒストリバッファ１４０６ｂヒストリバッファ１４０７コンパレータ１４０８マルチプレクサ
201 PC register
202 Instruction cache unit
203 Instruction decode/register unit
204 Instruction execution unit
205 Data cache unit
206 Address adder
207 Pipeline register
208 Pipeline register
209 Pipeline register
210 Pipeline register
211 Data line
212 Data line
213 Address line
301 PC register
302 Instruction cache unit
303 Instruction decode/register unit
304 Instruction execution unit
305 Data cache unit
306 Address adder
307 Pipeline register
308 Pipeline register
309 Pipeline register
310 Pipeline register
311 Address line
312 Branch prediction unit
313 Address line
401 PC register
402 Instruction cache unit
403 Instruction decode/register unit
404 Instruction execution unit
405 Data cache unit
406 Address adder
407 Pipeline register
408 Pipeline register
409 Pipeline register
410 Pipeline register
411 Address line
412 Branch prediction unit
413 Address line
414 Branch history buffer
415 Address line
416 Signal line
501 External bus
502a Processor a
502b Processor b
503a PC register
503b PC register
504a Instruction cache unit
504b Instruction cache unit
505a Address adder
505b Address adder
506a Branch history buffer
506b Branch history buffer
507a Instruction decode/register unit
507b Instruction decode/register unit
508a Branch prediction unit
508b Branch prediction unit
509a Instruction execution unit
509b Instruction execution unit
510a Data cache unit
510b Data cache unit
511a Pipeline register
511b Pipeline register
512a Pipeline register
512b Pipeline register
513a Pipeline register
513b Pipeline register
514a Pipeline register
514b Pipeline register
601 Pipeline register
602 Instruction cache unit
603a Processor a
603b Processor b
604a PC register
604b PC register
605a Pipeline register
605b Pipeline register
606a Address adder
606b Address adder
607a Instruction decode/register unit
607b Instruction decode/register unit
608a Branch prediction unit
608b Branch prediction unit
609a Pipeline register
609b Pipeline register
610a Instruction execution unit
610b Instruction execution unit
611a Pipeline register
611b Pipeline register
612a Data cache unit
612b Data cache unit
613a Pipeline register
613b Pipeline register
701 PC register
702 Instruction cache unit
703 Address adder
704 Branch history buffer
705 Pipeline register
706 Instruction decode/register unit
707 Branch prediction unit
708 Pipeline register
709 Instruction execution unit
709a Arithmetic unit
709b Arithmetic unit
710 Pipeline register
711 Data cache unit
712 Pipeline register
801 PC register
802 Instruction cache unit
803 Address adder
804 Branch history buffer
805a Instruction buffer
805b Pipeline register
805c History buffer
806 Instruction decode/register unit
807 Branch prediction unit
808 Pipeline register
809 Instruction execution unit
810 Pipeline register
811 Data cache unit
812 Pipeline register
813 Address line
814 Address line
815 Address line
816 Signal line
817 Signal line
818 Signal line
819 Data line
901 Line in instruction cache
902 Line in branch history buffer
1101 Pipeline register
1102 Instruction cache unit
1103 Branch history buffer
1104a Processor a
1104b Processor b
1105a PC register
1105b PC register
1106a Instruction buffer
1106b Instruction buffer
1107a History buffer
1107b History buffer
1108a Instruction decode/register unit
1108b Instruction decode/register unit
1109a Branch prediction unit
1109b Branch prediction unit
1110a Pipeline register
1110b Pipeline register
1111a Instruction execution unit
1111b Instruction execution unit
1112a Pipeline register
1112b Pipeline register
1113a Data cache unit
1113b Data cache unit
1114a Pipeline register
1114b Pipeline register
1201 Line in instruction cache
1202 Line in branch history buffer
1401 PC register
1402a Block in branch history buffer
1402b Block in branch history buffer
1403a Tag buffer
1403b Tag buffer
1404a History buffer
1404b History buffer
1405 Address buffer
1406a History buffer
1406b History buffer
1407 Comparator
1408 Multiplexer

Claims

[Claims]

1. An information processing apparatus that simultaneously reads a plurality of instructions from an instruction cache in the same cycle, comprising means for referring, in the same cycle, to a plurality of pieces of branch history information respectively corresponding to a plurality of branch instructions among the plurality of read instructions, and for predicting each of the branch instructions in the same cycle.

2. The information processing apparatus according to claim 1, further comprising a branch history buffer for storing branch history information of a plurality of branch instructions in one line.

3. The information processing apparatus according to claim 1, further comprising a branch history buffer that stores, in one line, a plurality of pieces of branch history information corresponding to the instructions in a line of the instruction cache, whereby a plurality of branch instructions executed in the processor can be predicted.

4. An information processing apparatus including a plurality of processors, wherein a means for predicting a branch instruction is shared among the plurality of processors.

5. The information processing apparatus according to claim 4, wherein a branch history buffer that stores the taken/not-taken history of a branch instruction at execution time and the instruction address used when the branch is taken is shared by the plurality of processors.

6. An information processing apparatus including a plurality of processors, comprising: an instruction cache unit capable of simultaneously reading a plurality of instructions from an instruction cache in the same cycle; and means for referring to a plurality of pieces of branch history information in the same cycle and predicting each instruction in the same cycle.

7. The information processing apparatus according to claim 4, wherein a branch history buffer for storing branch history information of a plurality of branch instructions in one line is shared by the plurality of processors.

8. The information processing apparatus according to claim 4, further comprising a branch history buffer that stores, in one line, a plurality of pieces of branch history information corresponding to the instructions in a line of the instruction cache, wherein the branch history buffer is shared by the plurality of processors, whereby a plurality of branch instructions executed within the apparatus can be predicted.

9. An information processing apparatus that simultaneously reads a plurality of instructions from an instruction cache in the same cycle, comprising a branch history buffer that stores branch history information of a plurality of branch instructions in one line, the branch history buffer including one or more blocks each having a tag holding a part of the address of the branch history information, and a comparator that compares the address of the tag of each block with the address of a conditional branch instruction, whereby a plurality of conditional branch instructions among the instructions can be predicted.

10. An information processing apparatus including a plurality of processors, comprising a branch history buffer that stores branch history information of a plurality of branch instructions in one line, the branch history buffer including one or more blocks each having a tag holding a part of the address of the branch history information, and a comparator that compares the address of the tag of each block with the address of a conditional branch instruction, wherein the plurality of processors share the branch history buffer, whereby all branch instructions executed in the plurality of processors can be predicted.

11. An information processing apparatus including a plurality of processors, comprising a branch history buffer that stores, in one line, a plurality of pieces of branch history information corresponding to the instructions in a line of an instruction cache, wherein the plurality of processors share the branch history buffer, whereby all branch instructions executed in the plurality of processors can be predicted.

12. An information processing apparatus including a pipelined processor, comprising: an instruction cache unit capable of simultaneously reading a plurality of instructions from an instruction cache in the same cycle; and a branch history buffer that stores a plurality of pieces of branch history information in one line, wherein one line containing a plurality of pieces of branch history information is read from the branch history buffer when instructions are fetched from the instruction cache, and, when the fetched instructions include a plurality of conditional branch instructions, each of the plurality of conditional branch instructions is branch-predicted using the plurality of pieces of branch history information read from the branch history buffer.

13. The information processing apparatus according to claim 12, wherein entries in a line of the branch history buffer are made to correspond to entries in a line of the instruction cache, so that the plurality of pieces of branch history information contained in one line of the branch history buffer are associated with the addresses of conditional branch instructions.

14. The information processing apparatus according to item 2, wherein the branch history buffer comprises: a plurality of blocks each including a tag having a part of an address of branch history information; and a comparator for comparing the address of the tag in the line accessed in each block with the address of a conditional branch instruction.

15. The information processing apparatus according to claim 14, further comprising a multiplexer that, according to the comparison result of the comparator, selectively outputs the branch history information contained in the line of the block whose tag address matches the address of the conditional branch instruction, the selected branch history information being used for predicting the branch instruction.

16. An information processing apparatus provided with a plurality of pipelined processors, wherein: the plurality of processors share a branch history buffer that stores branch history information of branch instructions; when a plurality of instructions are fetched from an instruction cache shared by the plurality of processors, the branch history information of the plurality of instructions is read from the branch history buffer at one time, enabling prediction of a plurality of conditional branch instructions; and when a conditional branch instruction is executed by a plurality of processors, branch history information can be shared among the plurality of processors.

17. An information processing apparatus including a pipelined processor, comprising: an instruction cache unit capable of simultaneously reading a plurality of instructions from an instruction cache in the same cycle; and a branch history buffer storing a plurality of pieces of branch history information, wherein: in the instruction fetch (IF) stage, the instruction cache unit accesses a line of the instruction cache according to the address specified by a PC register, reads the instructions of a plurality of entries from the line, and stores them in an instruction buffer, and a line of the branch history buffer is accessed according to the address specified by the PC register, and the branch history information of a plurality of entries, corresponding to the instructions in the entries of the instruction cache, is read from the line and stored in a history buffer; in the instruction decode (ID) stage, an instruction decode unit decodes the instruction stored in the instruction buffer, determines whether it is a conditional branch instruction, and sends the result of the determination to a branch prediction unit, and, when the instruction stored in the instruction buffer is a conditional branch instruction, the branch prediction unit predicts whether the conditional branch will be taken based on the branch history information stored in the history buffer, and, if the branch is predicted to be taken, calculates the branch destination address, transfers the result to the PC register, and sends the result of the branch prediction to an instruction execution unit; and in the instruction execution (EX) stage, if the branch prediction made by the branch prediction unit for the conditional branch instruction was correct, the instruction execution unit updates the branch history information and writes the updated history information to the branch history buffer, and, if the branch prediction was incorrect, calculates the address for fetching the correct instruction, writes it to the PC register, updates the branch history information, and writes the updated history information to the branch history buffer.

18. An information processing apparatus including a pipelined processor, comprising: an instruction cache unit capable of simultaneously reading a plurality of instructions from an instruction cache in the same cycle; and a branch history buffer composed of a plurality of blocks, each block having, in one line, a tag holding a part of the address of the branch history information, wherein: an entry of the branch history buffer is accessed according to the address specified by the PC register, and the branch history information of the entry is read together with the tag address and stored in a history buffer; and in the ID stage, a comparator compares the tag address in the history buffer with the address of a conditional branch instruction, a multiplexer selects the branch history information of the matching block and sends it to a branch prediction unit, and the branch prediction unit uses the branch history information for predicting the conditional branch instruction.

19. The information processing apparatus according to claim 1, wherein the branch history buffer holds information in at least one of the following forms: (a) branch history information; (b) branch history information and the address used when the branch is taken; (c) branch history information, the address used when the branch is taken, and the instruction at the branch destination when the branch is taken.