JP2000056971A

JP2000056971A - High-speed arithmetic processor and recording medium

Info

Publication number: JP2000056971A
Application number: JP10220329A
Authority: JP
Inventors: Yasushi Iwata; 靖岩田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1998-08-04
Filing date: 1998-08-04
Publication date: 2000-02-25

Abstract

PROBLEM TO BE SOLVED: To execute programs containing data dependencies at high speed by extracting data dependency information and performing static prediction when the source program is compiled, and by dynamically predicting and selectively executing, at run time, instructions whose dependencies are still unresolved and whose prediction probability is high. SOLUTION: A compiler 2 performs morphological analysis and syntax analysis on C source code 1 to generate binary code. A simulator 4 registers in a profile image table 5, for each data-dependent instruction that is a prediction target, a pair consisting of the instruction's line number and its dependent instruction. The profile image table 5 thus holds the prediction-target instructions among the instructions having data dependencies. A compiler 6 generates an executable binary 7, in which the instruction code and the dependent instructions of each data dependency are registered in association with line numbers (addresses). This enables analysis and prediction of the source program and high-speed execution of the compiled executable instructions.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a high-speed arithmetic processing device and a recording medium that analyze a source program to perform prediction and that execute, at high speed, the executable-format instructions obtained by compiling the source program.

[0002]

[Description of the Related Art] Conventionally, to improve processor execution performance, when a branch instruction is encountered the branch destination is predicted and executed speculatively. If the prediction is correct, execution continues from the point following the speculatively executed instructions; if the prediction is wrong, the speculative execution is canceled and the other branch destination is executed.

[0003]

[Problems to Be Solved by the Invention] As described above, prediction of branch destinations, that is, branch prediction of control flow, has conventionally been performed. However, no data prediction has been performed for so-called data dependencies, in which data required for execution in one task cannot be used until it has been computed and finalized in another task. As a result, high-speed execution could not be achieved.

[0004] To solve these problems, an object of the present invention is to realize high-speed execution of programs with data dependencies by extracting data dependency information and performing static prediction when the source program is compiled and, at run time, dynamically predicting and selectively executing instructions that have data dependencies not resolved by the time of execution and that have a high prediction probability.

[0005]

[Means for Solving the Problems] Means for solving the problems will be described with reference to FIG. 1. In FIG. 1, a compiler 2 performs morphological analysis, syntax analysis, and the like on C source code to generate a binary.

[0006] The simulator 4 selects the instructions to be predicted from among the instructions having data dependencies and registers them in the profile image table 5. The profile image table 5 registers the prediction-target instructions among the instructions having data dependencies (see FIG. 4, described later).

[0007] The executable binary 7 is a binary in executable format (see FIG. 5, described later). Next, the operation will be described. The compiler 2 analyzes the C source program. Based on the analysis result, the simulator 4 registers the instructions having data dependencies in the profile image table 5 in association with their line numbers, and sets to ON the prediction flag of those registered instructions that are targets of data prediction.

[0008] Here, the instructions whose prediction flag is set to ON exclude instructions whose data is determined within the same task. They also exclude instructions whose data is determined in another task but is already finalized by the time the instruction is executed.

[0009] In addition, when the executable-format instructions are executed, the table in which the data-dependent instructions are registered is consulted, the data value is predicted, and the instruction is executed speculatively. The results of this dynamic prediction are tallied, and speculative execution is suppressed for instructions whose prediction accuracy rate is low.

[0010] To tally the dynamic prediction results, the correct flag is incremented by 1 when a prediction is correct and the incorrect flag is incremented by 1 when it is incorrect, and the accuracy rate is computed from these counts.

[0011] The instructions having data dependencies may also be limited to memory access instructions. Thus, by extracting data dependency information and performing static prediction when the source program is compiled and, at run time, dynamically predicting and selectively executing instructions that have data dependencies not resolved by the time of execution and that have a high prediction probability, programs with data dependencies can be executed at high speed.

[0012]

[Embodiments of the Invention] Next, embodiments and the operation of the present invention will be described in detail, in order, with reference to FIGS. 1 to 13.

[0013] FIG. 1 shows a system configuration diagram of the present invention. In FIG. 1, the C source code is an example of source code; here, it is source code written in the C language.

[0014] The compiler 2 performs morphological analysis, syntax analysis, and the like on the C source code 1 to generate a binary 3. The binary 3 is executable-format code generated by the compiler 2.

[0015] The simulator 4 selects the instructions to be predicted from among the instructions having data dependencies and, among other things, registers them in the profile image table 5. The profile image table 5 registers the prediction-target instructions among the instructions having data dependencies. As shown in FIG. 4 (described later), it holds pairs consisting of the line number of an instruction having a data dependency and its dependent instructions.

[0016] The compiler 6 generates an executable binary 7. The executable binary 7 is a binary in executable format in which, as shown in FIG. 5 (described later), the instruction code and the dependent instructions of each data dependency are registered in association with line numbers (addresses). Details are described in order below.

[0017] FIG. 2 is an explanatory diagram (part 1) of the present invention. In FIG. 2, PE (A) and PE (B) are processor elements that run tasks A and B. Here, task A runs on PE (A) and task B runs on PE (B). The dash-dot arrows indicate data dependencies (1), (2), and (3); of these, only data dependency (1) is a target for data prediction and speculative execution.

[0018] PC (line number) is the program counter, which holds (generates) the line number (address) about to be executed. Next, the operation of FIG. 2 will be described.

[0019] (1) Task B running on PE (B) computes z = y + 4 at line number PC = 001. There is a data dependency here: task B must wait until y has been computed and finalized by task A running on PE (A), which computes y = c + 2 at line number PC = 101.

[0020] (2) Task B running on PE (B) computes z = z + 2 at line number PC = 002. There is a data dependency here, but z has already been computed and finalized by the same task B at line number PC = 001, so there is no need to wait.

[0021] (3) Task B running on PE (B) computes x = x + 5 at line number PC = 003. There is a data dependency here, but x has already been computed and finalized by task A running on PE (A), which computes x = a + 1 at line number PC = 100, so there is no need to wait.

[0022] Therefore, of the data dependencies (1), (2), and (3) above, only (1) requires waiting. The instruction in (1) is the one whose data value the present invention predicts and executes speculatively, and the prediction flag at the corresponding address (line number) of the executable binary in FIG. 5 (described later) is set to 1 (predict). For instructions that do not need prediction, the prediction flag is set to 0 (no prediction). In this way, static prediction is performed at compile time among the instructions having data dependencies. The above can be summarized as follows.

[0023] (A) Instructions whose data is defined within the same task are not prediction targets (case (2) in FIG. 2 is not a prediction target).
(B) Instructions whose data is defined in another task and is already finalized by the time the instruction is executed are not prediction targets (case (3) in FIG. 2 is not a prediction target).

[0024] (C) Instructions whose data is defined in another task and is still not finalized when the instruction is executed are prediction targets (case (1) in FIG. 2 is a prediction target).

[0025] (D) As will be described later with reference to FIG. 8, the data of the instructions in (C) may be predicted and actually executed in simulation, so that only instructions whose predictions have a high accuracy rate are kept as prediction targets and instructions with a low accuracy rate are excluded.
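The static rules (A) to (C) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Dependency` record, its field names, and the function name are hypothetical and do not appear in the patent; the three sample entries restate cases (1) to (3) of FIG. 2 as described in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dependency:
    """Hypothetical record of one instruction's data dependency."""
    consumer_task: str              # task that executes the instruction
    producer_task: Optional[str]    # task that defines the data (None: no dependency)
    resolved_before_use: bool       # True if the data is finalized before it is needed


def static_prediction_flag(dep: Dependency) -> int:
    """Return 1 (predict) or 0 (no prediction) following rules (A) to (C)."""
    if dep.producer_task is None:
        return 0    # no data dependency at all
    if dep.producer_task == dep.consumer_task:
        return 0    # rule (A): data defined within the same task
    if dep.resolved_before_use:
        return 0    # rule (B): defined in another task, but already finalized
    return 1        # rule (C): defined in another task and not yet finalized


# Cases (1) to (3) of FIG. 2, keyed by the line number executed in task B.
FIG2_CASES = {
    "001": Dependency("B", "A", resolved_before_use=False),  # z = y + 4
    "002": Dependency("B", "B", resolved_before_use=True),   # z = z + 2
    "003": Dependency("B", "A", resolved_before_use=True),   # x = x + 5
}

flags = {pc: static_prediction_flag(dep) for pc, dep in FIG2_CASES.items()}
```

With these inputs only line number 001 receives flag 1, which matches the conclusion in the text that case (1) alone is a prediction target.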

[0026] FIG. 3 is an explanatory diagram (part 2) of the present invention. It shows an example of the source image program that runs as task B in FIG. 2. Here, for example, line number 001 executed in task B of FIG. 2 becomes "001 add %rz,%ry,4", where 001 is the line number, add is the addition instruction, and %rz,%ry,4 specifies that 4 is added to the contents (y) of register ry and the result is stored in register rz. Similarly, line numbers 121 and 200 describe an ST (store) instruction and an ld (load) instruction, respectively.

[0027] FIG. 4 shows an example of the profile image table of the present invention. Instructions having data dependencies are extracted from the source program and registered in it; here, the entries for line numbers 001, 002, and 003 of FIG. 2 are registered. For example, the entry

Program counter   Dependent instruction 1   Dependent instruction 2   Dependent instruction 3
001               101                       -                         -

describes the instruction at line number 001 executed by task B in FIG. 2 and indicates that its first data dependency is line number 101. It has no second or third data dependency. The instruction at line number 200 has three data dependencies: the first (dependent instruction 1) is 150 (register r2), the second (dependent instruction 2) is 130 (register r3), and the third (dependent instruction 3) is 121 (register r8).

[0028] As described above, by extracting and registering at compile time, for each instruction's line number, the line numbers of the other instructions on which it has data dependencies, data values can be predicted and executed speculatively only for instructions that have data dependencies.

[0029] FIG. 5 shows an example of the executable binary of the present invention. It is a binary in executable format in which the following items, shown in the figure, are registered.

Address   Instruction code   Dependent instruction 1   Dependent instruction 2   Dependent instruction 3   Prediction flag
001       add %rz,%ry,4      101                       -                         -                         1 (predict)
002       add %rz,%rz,z      -                         -                         -                         0 (no prediction)
003       add %rx,%rx,5      100                       -                         -                         0 (no prediction)

As described above, the address (line number), instruction code, dependent instructions 1 to 3, and prediction flag f of each instruction having a data dependency are registered at compile time. At run time, data values are predicted and executed speculatively only for instructions whose prediction flag is 1 (predict). Data-dependent instructions are thus executed speculatively and efficiently at run time on the basis of the selective static prediction, which enables high-speed execution on the CPU.
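As a rough illustration of how the FIG. 5 entries might be held and filtered at run time, the following Python sketch stores the three rows given in the text and selects the addresses whose prediction flag is 1. The `BinaryEntry` type, its field names, and `addresses_to_predict` are hypothetical names introduced for this sketch; the patent does not specify a concrete encoding.

```python
from typing import List, NamedTuple, Tuple


class BinaryEntry(NamedTuple):
    """Hypothetical in-memory form of one row of the FIG. 5 executable binary."""
    address: str                 # line number (address)
    instruction: str             # instruction code as printed in the text
    dependents: Tuple[str, ...]  # line numbers of dependent instructions 1 to 3
    predict: int                 # prediction flag: 1 (predict) or 0 (no prediction)


# The three rows listed for FIG. 5, copied as they appear in the text.
EXECUTABLE_BINARY = [
    BinaryEntry("001", "add %rz,%ry,4", ("101",), 1),
    BinaryEntry("002", "add %rz,%rz,z", (), 0),
    BinaryEntry("003", "add %rx,%rx,5", ("100",), 0),
]


def addresses_to_predict(entries: List[BinaryEntry]) -> List[str]:
    """Return the addresses whose data values should be predicted at run time."""
    return [entry.address for entry in entries if entry.predict == 1]


selected = addresses_to_predict(EXECUTABLE_BINARY)
```

Only address 001 is selected, so value prediction would be attempted for that instruction alone.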

[0030] FIG. 6 shows another system configuration diagram of the present invention. It is a hardware image that, at run time and as shown in FIG. 8 (described later), compares the address (line number) set in the PC (program counter) with the tags, outputs a hit when a tag matches and a miss when no matching tag is found, and on a hit further reads out the value and the difference and operates on them in an arithmetic unit (here, for example, by addition) to calculate the predicted value.
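The hit/miss lookup described for FIG. 6 can be modeled in software as a tagged table. The sketch below is an assumption-laden illustration, not the hardware itself: the dictionary stands in for the tag comparison, the function name `lookup_prediction` is hypothetical, and the sample entry (value 8, difference 8) is borrowed from the FIG. 7 example.

```python
from typing import Dict, Optional, Tuple

# tag (line number) -> (last value, difference); a software stand-in for the
# tag/value/difference storage that FIG. 6 is described as providing.
PredictionTable = Dict[str, Tuple[int, int]]


def lookup_prediction(table: PredictionTable, pc: str) -> Optional[int]:
    """Compare the PC with the tags; on a hit return value + difference, on a miss None."""
    entry = table.get(pc)
    if entry is None:
        return None                 # miss: no tag matches the PC
    value, difference = entry
    return value + difference       # hit: the arithmetic unit adds value and difference


table: PredictionTable = {"001": (8, 8)}   # sample entry taken from the FIG. 7 example
hit_result = lookup_prediction(table, "001")
miss_result = lookup_prediction(table, "002")
```

Here the lookup for PC 001 hits and yields 16, while PC 002 misses and yields no prediction.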

[0031] With this configuration, the address of an instruction whose prediction flag is 1 (predict) in FIG. 5, the value of its variable's data at that time, and the difference between the previous value and the one before it can be read out at high speed, so that the value of a data-dependent variable can be predicted and executed speculatively. The run-time operation is described in detail, in order, below.

[0032] FIG. 7 is an explanatory diagram (part 3, at run time) of the present invention. It shows the situation when line number 001 is set in the PC (program counter) and task B running on PE (B) in FIG. 2 is about to execute z = y + 4 at line number 001.

[0033] The initial value indicates that the initial value of variable y of the data-dependent instruction at line number 001 is 0. The first execution indicates that the execution result for variable y of the data-dependent instruction at line number 001 was 8.

[0034] The second execution shows an example of predicting variable y of the data-dependent instruction at line number 001. (a) shows the case where the value predicted for the second execution, y = 16 (the first value plus the difference: 8 + 8 = 16), matches the actually executed value of y, 16, so the prediction is correct. In this case, the table is updated as shown in the figure, as follows.

[0035]

PC    Value   Difference
001   16      8

(b) shows the case where the value predicted for the second execution, y = 16 (the first value plus the difference: 8 + 8 = 16), does not match the actually executed value of y, 8, so the prediction is incorrect. In this case, the table is updated as shown in the figure, as follows.

[0036]

PC    Value   Difference
001   8       0

As described above, the first value is predicted from the initial value, and when a prediction matches the actual execution result, the value at that time and its difference from the previous value are registered. Repeating this builds a table such as the one shown in FIG. 8 (described later). When the prediction is correct, the speculatively obtained result is valid, so the subsequent processing can proceed. When it is incorrect, the speculative result is canceled and the instruction is re-executed with the finalized value of the variable. In this way, the values and differences that prove correct at run time are registered as shown in the table of FIG. 8. By predicting only instructions whose proportion of correct predictions meets a predetermined threshold (for example, 60% or more) and suppressing prediction for all other instructions, the prediction accuracy is raised, the cancellation rate is lowered, and the overall execution speed can be greatly improved.
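The FIG. 7 walk-through (initial value 0, first result 8, second prediction 8 + 8 = 16) behaves like what is commonly called a stride predictor. The sketch below is a minimal illustration under an assumption of my own: every outcome updates the entry as "value = actual result, difference = actual result minus previous value". That single rule reproduces both tables in the text, (16, 8) for the correct case and (8, 0) for the incorrect case, but the patent does not state the update rule in this form. The class and method names are hypothetical.

```python
class StrideEntry:
    """Hypothetical table entry: last value plus difference from the value before it."""

    def __init__(self, initial_value: int = 0) -> None:
        self.value = initial_value
        self.difference = 0

    def predict(self) -> int:
        # Predicted next value: last value plus the last observed difference.
        return self.value + self.difference

    def update(self, actual: int) -> bool:
        """Record the actual result; return True if the prediction was correct."""
        correct = self.predict() == actual
        self.difference = actual - self.value   # assumed rule, see lead-in
        self.value = actual
        return correct


def after_first_execution() -> StrideEntry:
    entry = StrideEntry(initial_value=0)   # initial value of y is 0
    entry.update(8)                        # first execution yields y = 8
    return entry


# Case (a): the second execution actually yields 16, so the prediction is correct.
case_a = after_first_execution()
predicted_a = case_a.predict()
correct_a = case_a.update(16)

# Case (b): the second execution actually yields 8, so the prediction is incorrect.
case_b = after_first_execution()
correct_b = case_b.update(8)
```

After case (a) the entry holds value 16 and difference 8; after case (b) it holds value 8 and difference 0, in line with the two tables above.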

[0037] FIG. 8 is an explanatory diagram (part 4, at run time) of the present invention. It is a list in which, for each prediction-target instruction (instruction line number), the value of the variable at the time of a correct prediction and its difference from the previous value are registered by the procedure of FIG. 7 described above. The numbers of correct and incorrect predictions are tallied; only instructions whose accuracy rate meets a predetermined threshold (for example, 60% or more) remain prediction targets, and the prediction flags of the other instructions are set to 0 (no prediction). Statistics are thus gathered dynamically, which makes it possible to control execution dynamically so that the accuracy of data prediction for data-dependent instructions improves and execution is accelerated. Here, PC = 001 (line number 001) was executed n times with 13974 correct and 64 incorrect predictions; because the accuracy rate is 60% or more, data prediction is performed for this data-dependent instruction (the instruction at line number 001). In contrast, PC = 002 (line number 002) was executed n times with 21 correct and 17365 incorrect predictions; because the accuracy rate is below 60%, data prediction for this data-dependent instruction (the instruction at line number 002) is suppressed. The loss of processing speed caused by the unnecessary load of cancellations is thereby avoided dynamically at run time.
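The tally and threshold test described for FIG. 8 amount to a simple ratio check. The following Python sketch applies it to the two counts quoted in the text; the function names are hypothetical, and the 0.60 default only mirrors the "for example, 60% or more" threshold mentioned above.

```python
def accuracy_rate(correct: int, incorrect: int) -> float:
    """Fraction of predictions that were correct (0.0 when nothing was tallied)."""
    total = correct + incorrect
    return correct / total if total else 0.0


def keep_predicting(correct: int, incorrect: int, threshold: float = 0.60) -> bool:
    """True if the instruction stays a prediction target, False if its flag becomes 0."""
    return accuracy_rate(correct, incorrect) >= threshold


# Counts quoted for FIG. 8.
pc_001 = keep_predicting(correct=13974, incorrect=64)
pc_002 = keep_predicting(correct=21, incorrect=17365)
```

For PC = 001 the rate is roughly 99.5%, so prediction continues; for PC = 002 it is roughly 0.1%, so prediction is suppressed.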

[0038] FIG. 9 shows a configuration diagram of one embodiment of the present invention. FIG. 9(a) shows how the execution object is allocated to PE #1 through PE #4. As shown, the execution object is divided among the four PEs #1 to #4, and at that time the addresses (line numbers) of the data-dependent instructions described above are registered as dependent instructions 1, 2, and 3 in association with each address (line number); they are collected and registered at compile time. The execution objects allocated to the four PEs #1 to #4 are then executed in turn by PEs #1 to #4 as shown in (b).

[0039] FIG. 9(b) shows the processing performed when the four PEs #1 to #4 each execute their execution object. In each of PEs #1 to #4, when "data dependency present" is YES and "dependency resolved" is NO, the value of the variable is predicted and the execution object is executed on the basis of the predicted value.

[0040] As described above, the execution object is allocated to PEs #1 to #4, and in each of PEs #1 to #4, when "data dependency present" is YES and "dependency resolved" is NO, the value of the variable is predicted and the execution object is executed on the basis of the predicted value. Data-dependent instructions can thus be executed speculatively with a high accuracy rate, and the processing speed can be increased.

[0041] FIG. 10 shows a flowchart explaining another operation of the present invention. It is the flowchart for dynamically deciding whether to predict the value of a variable of a data-dependent instruction, as described above.

[0042] In FIG. 10, S1 determines whether there is a dependency. For example, it determines whether any of dependent instructions 1, 2, and 3 is set in the execution object allocated to each PE in FIG. 9(a) described above, that is, whether the instruction about to be executed has a data dependency. If YES, a data dependency exists, and the process proceeds to S2. If NO, no data dependency exists, and the prediction flag is set to 0 (no prediction) in S5.

[0043] S2 determines whether the data dependency that existed has been resolved. If YES, the data dependency has been resolved, so the prediction flag is set to 0 (no prediction) in S6. If NO, the data dependency has not been resolved, and the process proceeds to S3.

[0044] S3 determines whether the accuracy rate of the predicted variable values is high. The values of the instruction's variable are predicted and the results are collected as shown in FIG. 8 described above. If YES, that is, the accuracy rate is high (for example, 60% or more), data prediction can speed up execution, so the value of the data of the instruction's variable is predicted in S4. If NO, the accuracy rate is low, cancellation processing would occur, and prediction would lower the processing speed, so the prediction flag is set to 0 (no prediction) in S7.

[0045] As described above, at run time the data value of an instruction's variable is predicted only when a data dependency exists, the dependency has not been resolved, and the accuracy rate of data prediction is high; data prediction is suppressed for all other instructions. The execution of instructions is thus analyzed dynamically, and efficient speed-up becomes possible.
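The decision flow of FIG. 10 (steps S1 to S7) reduces to three nested checks. The sketch below is one possible rendering in Python; the function and parameter names are hypothetical, and the 0.60 default again only mirrors the example threshold given in the text.

```python
def decide_prediction_flag(has_dependency: bool,
                           dependency_resolved: bool,
                           accuracy: float,
                           threshold: float = 0.60) -> int:
    """Return 1 to predict the variable's value (S4) or 0 to suppress prediction."""
    if not has_dependency:
        return 0    # S1 is NO  -> S5: no data dependency
    if dependency_resolved:
        return 0    # S2 is YES -> S6: dependency already resolved
    if accuracy >= threshold:
        return 1    # S3 is YES -> S4: predict the data value
    return 0        # S3 is NO  -> S7: accuracy too low, cancellations would dominate


# One call per path through the flowchart.
path_s5 = decide_prediction_flag(False, False, 0.99)
path_s6 = decide_prediction_flag(True, True, 0.99)
path_s4 = decide_prediction_flag(True, False, 0.99)
path_s7 = decide_prediction_flag(True, False, 0.10)
```

Only the path through S4 returns 1; every other path leaves the prediction flag at 0.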

[0046] FIG. 11 shows an example of the prediction accuracy rates of the present invention. The data were obtained by simulating execution of the benchmark program SC (SPECint92, System Performance Evaluation Cooperative). The horizontal axis represents the programs performing various kinds of processing, and the vertical axis represents the prediction accuracy rate. In the bar graph, the portion with horizontal lines represents LOAD instructions, the portion with horizontal lines represents STORE instructions, and the dotted portion represents OTHERS (instructions other than memory access instructions such as LOAD and STORE). The symbols in the figure have the following meanings.

[0047]
comp: compress (compression program)
eqn: eqntott (C program)
esp: espresso
gcc: compiler
li: lisp interpreter solving the 9-queens problem
sc: spreadsheet
ave: average

FIG. 12 shows an example (%) of the target and non-target instructions for the selective value prediction of the present invention. It classifies the benchmark program SC already described for FIG. 11 into the following items, shown in the figure.

[0048]
・No dependency (no data dependency exists)
・Resolved (a data dependency exists but has been resolved by execution time)
・Prediction target (a data dependency exists and has not been resolved): instructions whose data values are predicted by the present invention
・load: instruction that loads data from memory into a register
・store: instruction that stores data from a register to memory
・others: instructions other than the load and store memory access instructions

Here, (1) shown on the left side corresponds to (1) in FIG. 2 and is the proportion (%) of instructions whose data dependencies are prediction targets.

[0049] (2) corresponds to (2) in FIG. 2 and is the proportion (%) of instructions with no data dependency. (3) corresponds to (3) in FIG. 2 and is the proportion (%) of instructions that have a data dependency but whose data value has been resolved and finalized by the time the instruction is actually executed.

[0050] Therefore, in the benchmark program SC, prediction targets account for between 9.2% and 11.6% of instructions across comp, eqn, esp, gcc, li, sc, and ave. By selectively predicting the values of instruction variables only for these, the loss of performance from cancellations caused by indiscriminate prediction is eliminated, and only the variable values of data-dependent instructions that are likely to be accelerated by prediction are predicted and executed speculatively. Furthermore, because memory accesses by load and store instructions execute slowly and therefore benefit greatly from speculative execution, the memory access instructions (load and store instructions) may be selected as the prediction targets, and the values of the variables of those data-dependent instructions may be predicted and executed speculatively.

【００５１】[0051] FIG. 13 shows an example of execution times according to the present invention. It shows the time taken when each program of the benchmark program SC described above was executed in simulation; the horizontal axis shows the programs, which perform various kinds of processing, and the vertical axis shows the execution time (K cycles). In each group of bars, the left bar shows no prediction, the center bar shows prediction, and the right bar shows perfect prediction (when every prediction is correct). For sc, for example, performing the prediction of the present invention was found to shorten the execution time (number of cycles) by 20.9% compared with no prediction. In more detail, when the 9.2% of the instructions of sc (spreadsheet) in the benchmark program SC were made prediction targets, the number of cycles was reduced by about 20.9%. If every prediction is assumed to be correct, the simulation showed that the number of cycles could be reduced by about 34.5%. The other programs in the benchmark program can likewise be sped up to the number of cycles shown by the center (with prediction) bars of FIG. 13, for their respective prediction-target ratios shown in FIG. 12.
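The arithmetic behind the sc figures can be worked through as below. This is only a sketch: it assumes that "sped up by 20.9%" means the cycle count falls by 20.9% of the no-prediction count, and `BASELINE_K_CYCLES` is a hypothetical placeholder, since the actual cycle counts are given only in FIG. 13.

```python
def cycles_after_speedup(baseline_cycles: float, reduction_pct: float) -> float:
    """Cycle count after removing reduction_pct percent of the baseline."""
    return baseline_cycles * (1.0 - reduction_pct / 100.0)


BASELINE_K_CYCLES = 1000.0  # hypothetical; the real value is read from FIG. 13

with_prediction = cycles_after_speedup(BASELINE_K_CYCLES, 20.9)
perfect_prediction = cycles_after_speedup(BASELINE_K_CYCLES, 34.5)

# Share of the perfect-prediction gain that the actual prediction achieves.
achieved_share = 20.9 / 34.5

print(f"with prediction:     {with_prediction:.0f} K cycles")
print(f"perfect prediction:  {perfect_prediction:.0f} K cycles")
print(f"share of ideal gain: {achieved_share:.1%}")
```

Under this reading, the prediction for sc captures roughly 61% of the reduction that perfect prediction would give.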

【００５２】[0052]

[Effects of the Invention] As described above, the present invention adopts a configuration in which information on data dependencies is extracted when the source program is compiled and used for static prediction, and in which, at execution time, instructions that have a data dependency not resolved by the time of execution and that have a high prediction probability are dynamically predicted and selectively executed. A device that executes programs having data dependencies at high speed can therefore be realized.
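The run-time side of this configuration, in which prediction results are tallied and advance execution is suppressed for instructions with a low accuracy rate, might look like the following sketch. Several parts are assumptions made here for illustration and are not taken from the specification: the last-value prediction scheme, the `threshold` and `min_samples` parameters, and all class and function names.

```python
class PredictionStats:
    """Per-instruction tally of correct and incorrect predictions."""

    def __init__(self):
        self.correct = 0
        self.incorrect = 0

    def record(self, was_correct: bool) -> None:
        if was_correct:
            self.correct += 1
        else:
            self.incorrect += 1

    def accuracy(self) -> float:
        total = self.correct + self.incorrect
        return self.correct / total if total else 1.0


class DynamicPredictor:
    def __init__(self, threshold=0.5, min_samples=4):
        self.threshold = threshold      # assumed accuracy cut-off
        self.min_samples = min_samples  # assumed warm-up before suppressing
        self.stats = {}                 # line number -> PredictionStats
        self.last_value = {}            # line number -> last observed value

    def should_predict(self, line, prediction_flag):
        """Predict only flagged instructions whose accuracy is not low."""
        if not prediction_flag:
            return False
        s = self.stats.get(line)
        if s is None or s.correct + s.incorrect < self.min_samples:
            return True
        return s.accuracy() >= self.threshold

    def predict(self, line):
        # Last-value prediction: an assumed, deliberately simple scheme.
        return self.last_value.get(line)

    def update(self, line, predicted, actual):
        if predicted is not None:
            self.stats.setdefault(line, PredictionStats()).record(predicted == actual)
        self.last_value[line] = actual


def run(values, line):
    """Feed a sequence of actual values for one flagged instruction."""
    p = DynamicPredictor()
    for actual in values:
        guess = p.predict(line) if p.should_predict(line, True) else None
        p.update(line, guess, actual)
    return p


changing = run([1, 2, 3, 4, 5, 6], line=20)  # value changes on every execution
stable = run([7, 7, 7, 7, 7, 7], line=30)    # value never changes
print(changing.should_predict(20, True), stable.should_predict(30, True))
```

In this sketch an instruction that has been suppressed gathers no further samples, so it stays suppressed; a practical design would need some way to re-enable it.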

[Brief description of the drawings]

FIG. 1 is a system configuration diagram of the present invention.

FIG. 2 is an explanatory diagram of the present invention (part 1).

FIG. 3 is an explanatory diagram of the present invention (part 2).

FIG. 4 is an example of a profile image table of the present invention.

FIG. 5 is an example of an executable binary of the present invention.

FIG. 6 is another system configuration diagram of the present invention.

FIG. 7 is an explanatory diagram of the present invention (part 3, at execution time).

FIG. 8 is an explanatory diagram of the present invention (part 4, at execution time).

FIG. 9 is a configuration diagram of one embodiment of the present invention.

FIG. 10 is another flowchart explaining the operation of the present invention.

FIG. 11 is an example of prediction accuracy rates of the present invention.

FIG. 12 is an example of target and non-target instructions for the selective value prediction of the present invention.

FIG. 13 is an example of execution times of the present invention.

[Explanation of symbols]

1: Source code
2, 6: Compiler
3: Binary
4: Simulator
5: Profile image table
7: Executable binary

Claims

[Claims]

1. A high-speed arithmetic processing device that analyzes a source program and makes predictions, comprising: means for analyzing the source program; a table in which instructions having a data dependency are set in association with line numbers based on the result of the analysis; and means for setting to ON the prediction flag of an instruction that is a data prediction target among the instructions having a data dependency set in the table.

2. The high-speed arithmetic processing device according to claim 1, wherein the instructions whose prediction flag is set to ON exclude instructions whose data is determined within their own task.

3. The high-speed arithmetic processing device according to claim 1 or claim 2, wherein the instructions whose prediction flag is set to ON exclude instructions whose data is determined in another task but is determined by the time the instruction is executed.

4. A high-speed arithmetic processing device that executes, at high speed, executable-format instructions obtained by compiling a source program, comprising: means for, when an executable-format instruction is executed, referring to a table in which instructions having a data dependency are set, predicting the value of the data, and executing in advance; and means for tallying the results of the dynamic prediction and suppressing advance execution of instructions whose prediction accuracy rate is low.

5. The high-speed arithmetic processing device according to claim 4, wherein, in tallying the results of the dynamic prediction, a correct-answer flag is incremented by 1 when the predicted result is correct, an incorrect-answer flag is incremented by 1 when the predicted result is incorrect, and the accuracy rate is totaled.

6. The high-speed arithmetic processing device according to claim 1, wherein the instruction having the data dependency is a memory access instruction.

7. A computer-readable recording medium on which is recorded a program for causing a computer to function as: means for analyzing a source program; means for setting instructions having a data dependency in a table in association with line numbers based on the result of the analysis; and means for setting to ON the prediction flag of an instruction that is a data prediction target among the instructions having a data dependency set in the table.

8. A computer-readable recording medium on which is recorded a program for causing a computer to function as: means for, when an executable-format instruction is executed, referring to a table in which instructions having a data dependency are set, predicting the value of the data, and executing in advance; and means for suppressing advance execution of instructions whose prediction accuracy rate is low.