JP2000194566A

JP2000194566A - System and method for compilation and storage medium stored with compiler

Info

Publication number: JP2000194566A
Application number: JP10368190A
Authority: JP
Inventors: Shinichi Okano; 進一岡野
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1998-12-24
Filing date: 1998-12-24
Publication date: 2000-07-14
Anticipated expiration: 2018-12-24
Also published as: JP3156688B2

Abstract

PROBLEM TO BE SOLVED: To reduce wait-time overhead during program execution and improve execution performance by equalizing the load distributed across the loops produced by loop parallelization. SOLUTION: A branch pattern analyzing means 6 analyzes the branch pattern of a conditional statement in a loop by referring to branch history information obtained by executing an object program, and a loop parallelizing means 7 parallelizes the loop with optimum load distribution according to the analysis result. The branch pattern analyzing means 6 then re-analyzes the branch pattern, which has changed owing to the parallelization, and a loop optimizing means 8 generates optimum code for each of the loops divided by the parallelization according to that result.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a compiling system, a compiling method, and a recording medium storing a compiler, each of which generates an object program that efficiently executes loops containing conditional branch instructions.

[0002]

[Prior Art] A conventional compiling system is described in, for example, Japanese Patent Application Laid-Open No. 10-240573. That system comprises a source program to be compiled; a compiling device that operates under program control and compiles the source program as input; an object program generated by the compilation; and a branch history file. The branch history file records, as obtained by executing the object program, the iteration count for each execution of each loop (a process that repeats a given instruction stream multiple times) and the branch history of the conditional branch instructions within that loop.

[0003] In that compiling device, a branch history information collection code generating means generates an object program in which code is embedded to collect the number of loop executions, the iteration count for each execution of the loop, and the branch history of the conditional statements in the loop. Executing this object program collects the branch history of each conditional statement in the loop into a branch history information file. A branch pattern analyzing means takes the collected branch history information as input and analyzes the branch patterns, and the same program is then recompiled on the basis of the analysis result to generate code best suited to the loop.

[0004]

[Problems to Be Solved by the Invention] However, such a conventional compiling device gives no consideration to parallelization with load distribution. As a result, the branch pattern analysis results obtained from the branch history file cannot be used for parallelization, nor can they be used for load distribution. In addition, because the branch pattern analyzing means cannot obtain information about the parallelization and optimization applied to a loop, when a parallelization or optimization changes the loop's branch pattern, the branch pattern cannot be used in subsequent optimization of that loop.

[0005] The present invention solves these problems. Its object is to provide a compiling system, a compiling method, and a recording medium storing a compiler that equalize the load distribution among the parallelized loops, thereby reducing wait-time overhead during program execution and improving execution performance, and that can generate optimum code for each parallelized loop, thereby making the program faster still.

[0006]

[Means for Solving the Problems] To achieve this object, the compiling system according to claim 1 comprises: a compiling device that generates an object program from a source program; a branch history information file in which the iteration count for each execution of a loop and the branch history of the conditional statements in the loop are recorded by executing the object program; and a branch history information collection code generating means, provided in the compiling device, that embeds in the object program code for collecting the iteration count for each execution of each loop and the branch history of the conditional statements in the loop, in order to create the branch history information file. A branch pattern analyzing means analyzes the branch pattern of a loop, taking as input the branch history information from the branch history information file and parallelization information. A loop parallelizing means parallelizes the target loop with optimum load distribution on the basis of the branch pattern analysis result obtained by the branch pattern analyzing means. For a loop whose branch pattern has been changed by that parallelization, a loop optimizing means generates optimum code for the loop on the basis of the result of the branch pattern analyzing means re-analyzing the branch pattern.

[0007] In the compiling method according to claim 2, a branch history information collection code generating means embeds code for collecting the branch history of the conditional statements in a loop into an object program generated by compiling a source program. A branch pattern analyzing means refers to the branch history information obtained by executing the object program and analyzes the branch pattern of the conditional statements in the loop. On the basis of this analysis result, a loop parallelizing means parallelizes the loop with optimum load distribution. The branch pattern analyzing means then analyzes once more the branch pattern that has changed as a result of the parallelization, and in accordance with that result a loop optimizing means generates optimum code for each of the loops divided by the parallelization.

[0008] The recording medium storing a compiler according to claim 3 records a program that causes a computer to perform: a branch history information collection code generating process that embeds code for collecting the branch history of the conditional statements in a loop into an object program generated by compiling a source program; a branch pattern analyzing process that refers to the branch history information obtained by executing the object program and analyzes the branch pattern of the conditional statements in the loop; a loop parallelizing process that, on the basis of this analysis result, parallelizes the loop with optimum load distribution; and a loop optimizing process that analyzes once more the branch pattern changed by the parallelization and, in accordance with that result, generates optimum code for each of the loops divided by the parallelization.

[0009]

[Embodiments of the Invention] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the compiling system of the invention. In the figure, 1 is a source program to be compiled; 2 is a compiling device that takes the source program 1 as input and compiles it; 3 is an object program generated by the compiling device 2; and 4 is a branch history information file in which the iteration count for each execution of a loop and the branch history of the conditional statements in the loop, obtained by executing the object program 3, are recorded. Within the compiling device 2, 5 is a branch history information collection code generating means that, in order to create the branch history information file 4, embeds in the object program 3 code for collecting the iteration count for each execution of each loop and the branch history of the conditional statements in the loop.

[0010] Further, 6 is a branch pattern analyzing means that analyzes the branch pattern of a target loop, taking as input the branch history information file 4 and information from the loop parallelizing means described below. 7 is that loop parallelizing means, which parallelizes the target loop with optimum load distribution on the basis of the branch pattern analysis result obtained by the branch pattern analyzing means 6. 8 is a loop optimizing means that, for a loop whose branch pattern has changed as a result of parallelization by the loop parallelizing means 7, generates optimum code for the loop on the basis of the result of the branch pattern analyzing means 6 re-analyzing the branch pattern.

[0011] Next, the operation is described in detail with reference to FIG. 1 and the contents of the source program 1 shown in FIG. 2. The description takes as an example a multiprocessor of ten processors capable of vector operations, in which the communication overhead arising from data distribution during parallelization is negligible. The source program 1 of FIG. 2 is a program that, for an array A of 1000 elements, assigns 1.0 to A(I) only when X(I) is not 0.0. First, when the compiling device 2 generates the object program 3 from the source program 1 of FIG. 2, the branch history information collection code generating means 5 embeds in the object program 3 code for collecting the branch history information of the conditional statement in the loop.

[0012] The information to be obtained by the branch history collection code embedded in the object program 3 is the following: for which loop in the source program 1, on which execution of that loop, in which iteration, which conditional statement in the loop was true or false. Executing the object program 3 with such collection code embedded yields the branch history information file 4, in which the branch history of the conditional statements in the loop is recorded. When the same source program 1 is recompiled, the branch pattern analyzing means 6 takes the branch history information file 4 as input and analyzes the branch pattern of the conditional statements in the loop, and on the basis of this result the loop parallelizing means 7 parallelizes the loop with optimum load distribution. On the basis of information about the parallelization performed by the loop parallelizing means 7, the branch pattern analyzing means 6 re-analyzes the changed branch pattern in the loop, and on the basis of that result the loop optimizing means 8 generates an object program 3 containing optimum code for the target loop.
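The kind of record described in paragraph [0012] can be illustrated with a short Python sketch. This is not the patent's implementation: the `BranchRecord` type, the `run_instrumented_loop` function, the in-memory `history` list standing in for the branch history information file 4, and the initial value 0.0 for the array A are all assumptions made for illustration. The loop body follows the textual description of FIG. 2 (assign 1.0 to A(I) only when X(I) is not 0.0).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchRecord:
    """One hypothetical entry of the branch history (layout is an assumption)."""
    loop_id: int     # which loop in the source program
    execution: int   # which execution of that loop (1-based)
    iteration: int   # which iteration within that execution (1-based)
    cond_id: int     # which conditional statement inside the loop
    taken: bool      # True if the condition evaluated to true


def run_instrumented_loop(x, history, loop_id=1, execution=1):
    """Loop described for FIG. 2, with illustrative collection code added."""
    a = [0.0] * len(x)  # initial contents of A are assumed, not given in the text
    for i in range(1, len(x) + 1):
        taken = x[i - 1] != 0.0
        # Collection code: record the outcome of conditional statement 1.
        history.append(BranchRecord(loop_id, execution, i, 1, taken))
        if taken:
            a[i - 1] = 1.0
    return a


# Input for which the condition holds in the first 500 iterations only.
x = [1.0] * 500 + [0.0] * 500
history = []
a = run_instrumented_loop(x, history)
```

A real compiler would emit the collection code into the object program and write the records to a file; the list is used here only to keep the sketch self-contained.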

[0013] For example, suppose that the loop in the source program 1 of FIG. 2 is executed only once during program execution, that the loop iterates 1000 times, and that the conditional statement in the loop is true for every iteration in which the variable I runs from 1 to 500 and false for every iteration thereafter. The first compilation and execution of this source program 1 then produce a branch history information file 4 storing the information that the loop of the source program 1 was executed once, that it iterated 1000 times, and that the conditional statement in the loop was true in iterations 1 through 500 and false in iterations 501 through 1000.
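One way to picture the summary stored for this example is as run lengths of true and false outcomes. The sketch below is only an illustration: the text does not specify how the file encodes the pattern, so the run-length representation and the `summarize` helper are assumptions.

```python
from itertools import groupby


def summarize(outcomes):
    """Collapse a sequence of true/false outcomes into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(outcomes)]


# Outcomes for iterations 1..1000: true for I = 1..500, false afterwards.
outcomes = [i <= 500 for i in range(1, 1001)]
pattern = summarize(outcomes)
print(pattern)  # [(True, 500), (False, 500)]
```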

[0014] When this source program 1 is recompiled, the branch pattern analyzing means 6 analyzes the branch pattern with the branch history information file 4 as input, and on the basis of this result the loop parallelizing means 7 parallelizes the loop so that the true rate of the conditional statement is equal across the loops produced by the division. In the case above, a block distribution that assigns iterations 1 through 100 to the first processor, and so on, would leave five of the ten parallelized loops with a conditional statement that never becomes true; those loops would iterate without doing any work, and the load distribution would be uneven. The loop parallelizing means 7 therefore selects a cyclic distribution, in which the first iteration goes to the first processor, the second iteration to the second processor, and so on, and parallelizes accordingly.
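The comparison between block and cyclic distribution in this paragraph can be checked with a small Python sketch. The helper names (`block`, `cyclic`, `true_counts`, `choose_distribution`) and the selection rule (smallest spread between the busiest and idlest processor) are assumptions for illustration; the text states only that the distribution is chosen so that the true rates are equal.

```python
def block(n_iter, n_proc):
    """Contiguous chunks: iterations 1..100 to processor 0, 101..200 to processor 1, ..."""
    size = n_iter // n_proc  # assumes n_iter is divisible by n_proc
    return [list(range(p * size + 1, (p + 1) * size + 1)) for p in range(n_proc)]


def cyclic(n_iter, n_proc):
    """Round robin: iteration 1 to processor 0, iteration 2 to processor 1, ..."""
    return [list(range(p + 1, n_iter + 1, n_proc)) for p in range(n_proc)]


def true_counts(assignment, is_true):
    """Number of iterations per processor in which the condition is true."""
    return [sum(1 for i in iters if is_true(i)) for iters in assignment]


def choose_distribution(n_iter, n_proc, is_true):
    """Pick the candidate with the smallest spread of true counts (assumed rule)."""
    candidates = {"block": block(n_iter, n_proc), "cyclic": cyclic(n_iter, n_proc)}

    def imbalance(name):
        counts = true_counts(candidates[name], is_true)
        return max(counts) - min(counts)

    return min(candidates, key=imbalance)


def first_half_true(i):
    # Branch pattern of the example: true for iterations 1..500.
    return i <= 500


block_counts = true_counts(block(1000, 10), first_half_true)
cyclic_counts = true_counts(cyclic(1000, 10), first_half_true)
chosen = choose_distribution(1000, 10, first_half_true)
print(block_counts)   # [100, 100, 100, 100, 100, 0, 0, 0, 0, 0]
print(cyclic_counts)  # [50, 50, 50, 50, 50, 50, 50, 50, 50, 50]
print(chosen)         # cyclic
```

With block distribution five processors do all the work and five do none, whereas cyclic distribution gives every processor 50 true iterations, which matches the choice described in the text.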

[0015] Thereafter, on the basis of the parallelization information from the loop parallelizing means 7, the branch pattern analyzing means 6 re-analyzes the branch pattern of the loop whose branch pattern has changed, and on the basis of that result the loop optimizing means 8 generates code that is optimum for the target loop. In this case, the analysis shows that each parallelized loop iterates 100 times and that the conditional statement in the loop is true for the first 50 iterations and false for the remaining 50. Because a vector operation can be expected to outperform a scalar operation here, the loop optimizing means 8 generates an object program 3 containing masked vector operation instructions.
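The re-analysis step can be sketched as follows: after cyclic distribution, each processor sees its own subsequence of the original iterations, and the branch pattern is recomputed on that subsequence. The function names are hypothetical, and `masked_assign` is only a scalar emulation of what a masked vector store does, not actual vector code.

```python
from itertools import groupby


def cyclic_iterations(proc, n_iter=1000, n_proc=10):
    """Original iteration numbers handled by processor `proc` (0-based)."""
    return range(proc + 1, n_iter + 1, n_proc)


def reanalyze(proc, is_true):
    """Run-length branch pattern of one sub-loop after parallelization."""
    outcomes = [is_true(i) for i in cyclic_iterations(proc)]
    return [(value, sum(1 for _ in run)) for value, run in groupby(outcomes)]


def masked_assign(a, mask, value):
    """Emulate a masked store: elements whose mask bit is False keep their value."""
    return [value if m else old for old, m in zip(a, mask)]


def is_true(i):
    # Branch pattern of the example: true for iterations 1..500.
    return i <= 500


patterns = [reanalyze(p, is_true) for p in range(10)]
print(patterns[0])  # [(True, 50), (False, 50)]

# Sub-loop of processor 0: build the mask once, then apply it to the whole vector.
mask = [is_true(i) for i in cyclic_iterations(0)]
sub_a = masked_assign([0.0] * 100, mask, 1.0)
```

Every sub-loop yields the same pattern of 50 true iterations followed by 50 false ones, which is the analysis result the text relies on when it selects masked vector instructions.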

[0016] As a result, the true rate is equal across the parallelized loops, so parallelization with even load distribution becomes possible, and the wait-time overhead during program execution caused by uneven load distribution can be reduced. Moreover, by analyzing the branch pattern again for each parallelized loop, optimum code can be generated for each parallelized loop, which improves the execution performance of the object program 3.

[0017] Next, another application example of the invention is described. For the source program 1 of FIG. 2, consider the case where the loop in the source program 1 is executed once during program execution and iterates 1000 times, and the conditional statement in the loop follows a pattern of true for the 1st through 5th iterations and false for the 6th through 100th iterations, repeated ten times. In this case, if a cyclic distribution over the ten processors is used (first iteration to the first processor, second iteration to the second processor, and so on), five of the ten parallelized loops have a conditional statement that never becomes true; they iterate without doing any work, and the load distribution is uneven. Parallelization over ten processors is therefore no longer efficient.

[0018] In the present invention, however, the branch pattern analyzing means 6 analyzes the branch pattern recorded in the branch history information file 4, and using that result the loop parallelizing means 7 parallelizes with a block distribution, in which iterations 1 through 100 go to the first processor and so on, so the load is distributed evenly over the ten processors. Further, on the basis of the parallelization information from the loop parallelizing means 7, the branch pattern analyzing means 6 analyzes the branch pattern again. Because that result shows the conditional statement to be true in only the first 5 of the 100 iterations of each loop produced by the division, the loop optimizing means 8 generates code that uses scalar operations or compress/expand instructions. This reduces both the wait-time overhead during program execution caused by uneven load distribution and the wasted overhead that masked vector operations would incur, so the execution performance of the program can be improved.
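The second example can be worked through in the same way. In the sketch below, the 0.25 threshold in `choose_codegen` is purely an assumption: the text gives only two data points (a 50% true rate leads to masked vector instructions, a 5% true rate leads to scalar or compress/expand code) and no decision rule. `compress_expand_assign` is a scalar emulation of the compress/expand idea, not real vector code, and all function names are hypothetical.

```python
def is_true(i):
    # Pattern of the second example: true for iterations 1..5 of every 100.
    return (i - 1) % 100 < 5


def block(n_iter, n_proc):
    size = n_iter // n_proc  # assumes n_iter is divisible by n_proc
    return [list(range(p * size + 1, (p + 1) * size + 1)) for p in range(n_proc)]


def cyclic(n_iter, n_proc):
    return [list(range(p + 1, n_iter + 1, n_proc)) for p in range(n_proc)]


def true_counts(assignment):
    return [sum(1 for i in iters if is_true(i)) for iters in assignment]


def choose_codegen(true_rate, threshold=0.25):
    """Hypothetical rule: mask when many lanes are active, otherwise avoid masking."""
    if true_rate >= threshold:
        return "masked_vector"
    return "scalar_or_compress_expand"


def compress_expand_assign(a, mask, value):
    """Gather the active indices, operate only on them, scatter the results back."""
    active = [k for k, m in enumerate(mask) if m]  # compress step
    out = list(a)
    for k in active:                               # expand step
        out[k] = value
    return out


block_counts = true_counts(block(1000, 10))
cyclic_counts = true_counts(cyclic(1000, 10))
print(block_counts)   # [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
print(cyclic_counts)  # [10, 10, 10, 10, 10, 0, 0, 0, 0, 0]

# Re-analysis of one block-distributed sub-loop: 5 true iterations out of 100.
sub_mask = [is_true(i) for i in block(1000, 10)[0]]
true_rate = sum(sub_mask) / len(sub_mask)
strategy = choose_codegen(true_rate)
sub_a = compress_expand_assign([0.0] * 100, sub_mask, 1.0)
```

Here block distribution balances the work while cyclic distribution leaves five processors idle, the reverse of the first example, and the low true rate steers code generation away from masked vector instructions.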

[0019]

[Effects of the Invention] As described above, according to the present invention, a branch history information collection code generating means embeds code for collecting the branch history of the conditional statements in a loop into an object program generated by compiling a source program; a branch pattern analyzing means refers to the branch history information obtained by executing this object program and analyzes the branch pattern of the conditional statements in the loop; on the basis of this result, a loop parallelizing means parallelizes the loop with optimum load distribution; the branch pattern analyzing means analyzes once more the branch pattern changed by the parallelization; and in accordance with that result a loop optimizing means generates optimum code for each of the loops divided by the parallelization. The load distribution among the parallelized loops is thereby equalized, which reduces wait-time overhead during program execution and improves execution performance, and optimum code can be generated for each parallelized loop, which makes the program faster still.

[Brief description of the drawings]

FIG. 1 is a block diagram showing a compiling system according to an embodiment of the present invention.

FIG. 2 is an explanatory diagram showing the contents of the source program in FIG. 1.

[Explanation of symbols]

1: Source program
2: Compiling device
3: Object program
4: Branch history information file
5: Branch history information collection code generating means
6: Branch pattern analyzing means
7: Loop parallelizing means
8: Loop optimizing means

Claims

[Claims]

1. A compiling system comprising: a compiling device that generates an object program from a source program; a branch history information file in which the iteration count for each execution of a loop and the branch history of the conditional statements in the loop are recorded by executing the object program; a branch history information collection code generating means, provided in the compiling device, that embeds in the object program code for collecting the iteration count for each execution of each loop and the branch history of the conditional statements in the loop, in order to create the branch history information file; a branch pattern analyzing means that analyzes the branch pattern of a loop, taking as input the branch history information from the branch history information file and parallelization information; a loop parallelizing means that parallelizes the target loop with optimum load distribution on the basis of the branch pattern analysis result obtained by the branch pattern analyzing means; and a loop optimizing means that, for a loop whose branch pattern has been changed by the parallelization performed by the loop parallelizing means, generates optimum code for the loop on the basis of the result of the branch pattern analyzing means re-analyzing the branch pattern.

2. A compiling method characterized in that: a branch history information collection code generating means embeds code for collecting the branch history of the conditional statements in a loop into an object program generated by compiling a source program; a branch pattern analyzing means refers to the branch history information obtained by executing the object program and analyzes the branch pattern of the conditional statements in the loop; on the basis of this analysis result, a loop parallelizing means parallelizes the loop with optimum load distribution; the branch pattern analyzing means analyzes once more the branch pattern changed by the parallelization; and in accordance with that result, a loop optimizing means generates optimum code for each of the loops divided by the parallelization.

3. A recording medium storing a compiler, the recording medium recording a program for causing a computer to perform: a branch history information collection code generating process that embeds code for collecting the branch history of the conditional statements in a loop into an object program generated by compiling a source program; a branch pattern analyzing process that refers to the branch history information obtained by executing the object program and analyzes the branch pattern of the conditional statements in the loop; a loop parallelizing process that, on the basis of this analysis result, parallelizes the loop with optimum load distribution; and a loop optimizing process that analyzes once more the branch pattern changed by the parallelization and, in accordance with that result, generates optimum code for each of the loops divided by the parallelization.