JP2003323304A

JP2003323304A - Method and device for speculative task generation

Info

Publication number: JP2003323304A
Application number: JP2002128535A
Authority: JP
Inventors: Toshihiro Ozawa; 年弘小沢; Akira Yasusato; 彰安里
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2002-04-30
Filing date: 2002-04-30
Publication date: 2003-11-14

Abstract

PROBLEM TO BE SOLVED: To provide a method and device that perform speculative execution over a wide range, but only when a reduction in execution time can be expected, and that reduce the overhead incurred when speculative execution fails. SOLUTION: A basic block extraction part 11 extracts basic blocks from the source code or intermediate representation received from a compiler 2, and a data dependence analyzing part 12 and a control dependence analyzing part 13 obtain a data preparation finishing point, an execution determination edge, a non-execution determination edge, and an output data amount for every basic block. A parallel task selection part 14 finds chains of basic blocks having characteristics desirable for parallel processing and designates each chain as a parallel task. Among the parallel tasks, a task for which (1) the number of instructions between the data preparation finishing point and the execution determination edge of a basic block group is greater than a predetermined number, (2) the number of instructions between the data preparation finishing point and the non-execution determination edge is less than a predetermined number, or (3) the amount of data generated within the basic block group and used outside it is less than a predetermined number, is designated a speculative task. COPYRIGHT: (C)2004,JPO

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a speculative task generation method and apparatus for a system in which a program is divided into a plurality of threads that are executed in parallel by a plurality of processors, whereby a thread begins executing before it has been determined whether its execution is necessary. In recent years, systems that contain a plurality of processors and can execute a plurality of threads simultaneously have come into practical use. In such a system, executing as many processes or threads simultaneously as there are processors is essential for using the system efficiently. Meanwhile, to obtain the result of a given program in a short time, programs are divided into a plurality of threads and executed in parallel to speed them up. In this case, when decomposing the program into threads, it is important to generate enough threads that can run in parallel to match the number of processors. However, depending on the nature of the program, a sufficient number of threads cannot always be generated if threads are created only from computations known to be definitely required. For this situation, a speculative execution scheme has been proposed in which threads that begin executing before it is determined whether they will be needed (speculative threads) are generated, thereby executing the program faster and raising the effective utilization of the system. The present invention relates to a method and apparatus for generating speculative tasks in such a speculative execution scheme.

[0002]

[Prior Art] In conventional speculative execution, methods have been realized in which, based on hardware prediction of the direction of a conditional branch, execution of the instructions at the branch target begins before the result of the conditional branch instruction is determined. These methods aim to improve execution speed by speculatively executing the instructions in the predicted branch direction. However, because only a narrow range can be analyzed, an instruction may not be executed for some time even after it becomes executable, and instructions (threads) to execute speculatively cannot be extracted from a wide range. In addition, because speculative execution is basically applied to every branch instruction, it is performed even in cases where it is not effective. As a technique for speculating over a wider range, methods that speculate at the task level have been proposed (see Japanese Unexamined Patent Publication No. 2001-75802; Paper 1: Yamana, Yasue, Ishii, Muraoka, "Pre-evaluation method between macro tasks in parallel processing system", IEICE Transactions D-I, Vol. J77-D-I, No. 5, pp. 343-353, 1994.5; Paper 2: Yamana, Sato, Kodama, Sakane, Sakai, Yamaguchi, "Distributed control method of speculative execution between macro tasks in parallel computer EM-4", IPSJ Journal, Vol. 36, No. 7, pp. 1578-1588, 1995.7).

[0003]

[Problems to Be Solved by the Invention] The scheme disclosed in Japanese Unexamined Patent Publication No. 2001-75802 predicts and uses the values of variables whose values have not yet been determined, so the overhead when speculation fails can become large. Moreover, it does not specify which speculative tasks should be generated in which cases. Paper 1 proposes a method of generating speculative tasks based on the control dependences and data dependences between basic blocks. That method speculates aggressively by copying basic tasks so that the data dependences between basic tasks become unique regardless of control flow, but the copying can cause problems such as code growth. The present invention has been made in view of the above circumstances, and its object is to provide a speculative task generation method and apparatus that perform speculative execution over a wide range, and only when a reduction in execution time can be expected if the speculation succeeds, and that can reduce the overhead when speculation fails.

[0004]

[Means for Solving the Problems] To reduce both the probability that speculation fails and the overhead when it does, the present invention uses a speculative execution scheme in which speculative execution of a task begins only after all the data the task requires is available, and solves the above problems as follows. As shown in FIG. 1, source code or its intermediate language representation is received from a compiler 2, and a basic block extraction unit 11 extracts the basic blocks and constructs the control relationships among them. A data dependence analysis unit 12 and a control dependence analysis unit 13 then obtain the data listed in (1) below for each basic block. A parallel task selection unit 14 then selects parallel tasks and speculative tasks as described in (2) below. A parallel code generation unit 15 generates the code and other items required for parallel execution.
(1) The following data are obtained for each basic block.
・Data-ready point: the point at which all the data used by the basic block has become available.
・Data-use point: the point at which data generated by the basic block is used or redefined.
・Execution-determined edge: the point at which it is determined that the basic block will be executed.
・Non-execution-determined edge: the point at which it is determined that the basic block will not be executed.
・OutputData amount: the amount of data generated by the basic block that is used by other blocks.
(2) From the above data, chains of basic blocks that satisfy any or all of the properties in (i) below are found and designated parallel tasks; among them, those that satisfy any or all of the properties in (ii) below are designated speculative tasks.
(i) Properties desirable for parallel execution
(a) The sum (c1 + c2) of the number of instructions c1 from the data-ready point of a basic block group to that basic block group and the number of instructions c2 from the end of the basic block group to the data-use point is large.
(b) The above c1 + c2, or c2, is greater than the number of instructions in the chain of the basic block group.
(c) The number of instructions d in the chain of the basic block group in the program is greater than a predetermined value.
(ii) Properties desirable for speculative execution
(a) The number of instructions a between the data-ready point of a basic block group and its execution-determined edge is greater than a predetermined value.
(b) The number of instructions b between the data-ready point of a basic block group and its non-execution-determined edge is less than a predetermined value.
(c) For a basic block group, the amount of data e generated within the group and used outside it (the OutputData amount) is less than a predetermined value.
(3) A speculative task generated as above is started at its data-ready point; at an execution-determined edge it transitions from the speculative state to the normal execution state, and the data generated within the task is made accessible to other tasks. If control passes to a non-execution-determined edge, the speculative task is cancelled.
Because the present invention designates chains that satisfy property (i) as parallel tasks and, among them, those that satisfy property (ii) as speculative tasks, task-level speculative execution can be performed over a wide range, only when a reduction in execution time can be expected if speculation succeeds, and with small overhead when speculation fails, so that faster execution can be achieved.
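The selection rule in (2) can be sketched as follows. This is only an illustration under stated assumptions: the class names, the function names, and every threshold value are hypothetical, since the text says only "predetermined value", and the `require_all` flag is one way to model the "any or all" wording.

```python
from dataclasses import dataclass


@dataclass
class GroupMetrics:
    """Counts for one chain of basic blocks (field names follow the text)."""
    c1: int  # instructions from the data-ready point to the start of the group
    c2: int  # instructions from the end of the group to the data-use point
    d: int   # instructions inside the group
    a: int   # instructions from the data-ready point to the execution-determined edge
    b: int   # instructions from the data-ready point to the non-execution-determined edge
    e: int   # OutputData amount (data produced in the group, used outside it)


@dataclass
class Thresholds:
    """Hypothetical 'predetermined values'; the source gives no numbers."""
    min_slack: int = 20  # bound for "c1 + c2 is large", property (i)(a)
    min_d: int = 50      # property (i)(c)
    min_a: int = 30      # property (ii)(a)
    max_b: int = 10      # property (ii)(b)
    max_e: int = 4       # property (ii)(c)


def _combine(checks, require_all):
    # "any or all" in the text: the caller picks which reading to apply.
    return all(checks) if require_all else any(checks)


def is_parallel(m, t, require_all=False):
    slack = m.c1 + m.c2
    checks = [slack > t.min_slack, slack > m.d or m.c2 > m.d, m.d > t.min_d]
    return _combine(checks, require_all)


def is_speculative(m, t, require_all=False):
    # Speculative tasks are chosen only from among the parallel tasks.
    if not is_parallel(m, t, require_all):
        return False
    checks = [m.a > t.min_a, m.b < t.max_b, m.e < t.max_e]
    return _combine(checks, require_all)


def classify(m, t, require_all=False):
    if is_speculative(m, t, require_all):
        return "speculative"
    if is_parallel(m, t, require_all):
        return "parallel"
    return "sequential"


limits = Thresholds()
good = GroupMetrics(c1=40, c2=30, d=60, a=50, b=5, e=2)
mixed = GroupMetrics(c1=40, c2=30, d=60, a=50, b=100, e=100)
small = GroupMetrics(c1=1, c2=1, d=5, a=0, b=100, e=100)
```

With the strict reading (`require_all=True`), a chain such as `mixed`, which has a long lead before its execution-determined edge but a large failure cost, stays a plain parallel task.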

[0005] FIG. 2 shows the operation when speculation succeeds and when it fails during execution of a speculative task. FIG. 2(a) shows the operation on successful speculation: the speculative task is started at the data-ready point described above and runs in parallel with the normal task. When an execution-determined edge is passed, an "Output globalization OK notification" causes the Output to be globalized (the Output data is written from the work area to the regular area). When the task finishes, an "Output globalization complete notification" is issued, and the next task can use the data. FIG. 2(b) shows the operation on failed speculation: the speculative task is started at the data-ready point, and when a non-execution-determined edge is passed, the started speculative task is cancelled.
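The life cycle described for FIG. 2 can be modeled as a small state machine. The sketch below is an assumption-laden illustration, not the patented implementation: the class and method names are hypothetical, dictionaries stand in for the work area and the regular (global) area, and the two notifications are reduced to method calls.

```python
from enum import Enum


class State(Enum):
    SPECULATIVE = "speculative"
    NORMAL = "normal"
    CANCELLED = "cancelled"


class SpeculativeTask:
    """Hypothetical model of a task started at its data-ready point."""

    def __init__(self, global_area):
        self.state = State.SPECULATIVE
        self.work_area = {}             # private buffer used while speculative
        self.global_area = global_area  # regular area visible to other tasks

    def write(self, name, value):
        if self.state is State.SPECULATIVE:
            self.work_area[name] = value    # not yet visible to other tasks
        elif self.state is State.NORMAL:
            self.global_area[name] = value  # execution is confirmed
        # Writes after cancellation are dropped.

    def on_execution_determined_edge(self):
        # Corresponds to the "Output globalization OK" step: publish the
        # buffered Output and continue as a normal task.
        if self.state is State.SPECULATIVE:
            self.global_area.update(self.work_area)
            self.work_area.clear()
            self.state = State.NORMAL

    def on_non_execution_determined_edge(self):
        # Speculation failed: discard the buffered Output.
        if self.state is State.SPECULATIVE:
            self.work_area.clear()
            self.state = State.CANCELLED


shared = {}
winner = SpeculativeTask(shared)
winner.write("x", 1)
visible_before_commit = dict(shared)
winner.on_execution_determined_edge()
winner.write("y", 2)

loser = SpeculativeTask(shared)
loser.write("z", 3)
loser.on_non_execution_determined_edge()
```

Buffering in a private work area is what keeps cancellation cheap in this model: nothing written speculatively has to be undone in the regular area.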

[0006] The properties desirable for parallel execution, listed in (i) above, are as follows.
(a) The sum of the number of instructions c1 from the data-ready point to the basic block and the number of instructions c2 from the basic block to the data-use point or data-redefinition point is large.
(b) The above c1 + c2 is approximately equal to or greater than the number of instructions d in the basic block.
(c) The number of instructions d in the basic block is large.
That is, as is clear from FIG. 2, when tasks that satisfy conditions (a) and (b) are designated parallel tasks, a task that uses the data generated by such a task can proceed without being kept waiting, which reduces processing delay. When tasks that satisfy (c) are designated parallel tasks, a reduction in execution time can be expected. The properties desirable for speculative execution, listed in (ii) above, are as follows.
(a) The number of instructions a from the data-ready point to the execution-determined edge is large.
(b) The number of instructions b from each data-ready point to its non-execution-determined edge is small.
(c) The amount e of data (OutputData) generated in the basic block and used elsewhere is small.
As is clear from FIG. 2, when tasks that satisfy condition (a) are designated speculative tasks, processing is sped up by the a instructions, so a reduction in execution time can be expected. When tasks that satisfy condition (b) are designated speculative tasks, the amount of wasted processing is small even if speculation fails. Furthermore, when tasks that satisfy condition (c) are designated speculative tasks, the storage area for the data generated by the task stays small even if speculation fails, which simplifies post-processing and reduces overhead.
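Condition (b) for parallel execution can be read as a simple timing argument. The sketch below assumes, purely for illustration, that every instruction takes one time unit, that the parallel task starts exactly at the data-ready point, and that the main thread skips the d instructions that now run elsewhere; the function name is hypothetical.

```python
def consumer_wait(c1, c2, d):
    """Time units the consumer of the task's data would have to wait.

    The main thread reaches the data-use point after c1 + c2 units,
    while the parallel task needs d units to produce its data.
    """
    return max(0, d - (c1 + c2))


no_wait = consumer_wait(c1=40, c2=30, d=60)  # c1 + c2 >= d
stalled = consumer_wait(c1=10, c2=5, d=60)   # c1 + c2 < d
```

Under these assumptions the consumer never stalls when c1 + c2 is at least d, which is the situation condition (b) selects for.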

[0007]

[Embodiments of the Invention] FIG. 1 is a diagram showing the configuration of a speculative task generation unit according to an embodiment of the present invention. In the figure, 1 is a source program, 2 is a compiler, and 3 is the code generated by the compiler. The basic block extraction unit 11 receives source code or its intermediate language representation from the compiler 2, extracts the basic blocks, and constructs the control relationships among them. The data dependence analysis unit 12 obtains the InputData, the OutputData, and the data-ready point for each extracted basic block. The control dependence analysis unit 13 obtains the execution-determined edges and non-execution-determined edges for each basic block. As described later, the parallel task selection unit 14 determines the tasks to be executed in parallel or speculatively based on (1) the number of instructions from the data-ready point to the basic block, (2) the number of instructions from the data-ready point to its execution-determined edge, (3) the number of instructions from the data-ready point to its non-execution-determined edge, (4) the number of instructions from the end of the basic block to the data-use point or data-redefinition point, (5) the number of instructions in the basic block, and (6) the amount of data (OutputData) generated in the basic block and used elsewhere. The parallel code generation unit 15 generates the code needed for parallel execution (scheduling tasks, synchronization waits, and so on). As described later, a speculative task generated in this way is started at its data-ready point; at an execution-determined edge it transitions from the speculative state to the normal execution state, and the data generated within the task is made accessible to other tasks.

[0008] FIG. 3 is a conceptual diagram showing the relationships among the data-ready point, the execution-determined edges, the non-execution-determined edges, and so on. As shown in the figure, the data-ready point is the last instruction that determines the InputData of a task, and it is defined for each execution-determined edge. The execution-determined edges are the set of edges at which it is first determined that a task will be executed, and the non-execution-determined edges are the set of edges at which it is first determined that a task will not be executed. A speculative task is started at the data-ready point and is cancelled if control passes to a non-execution-determined edge. At an execution-determined edge, the task transitions from the speculative state to the normal execution state, and the OutputData generated by the task becomes accessible to other tasks.

[0009] The processing of each unit shown in FIG. 1 is described below.
(1) Process for obtaining the data-ready point
FIG. 4 shows an algorithm for obtaining the data-ready point. FIGS. 5 and 6 illustrate the processing performed by this algorithm; each circle in FIGS. 5 and 6 represents a basic block, and FIG. 7 shows the meaning of each color applied to the basic blocks. FIG. 8 shows the operations used to obtain the data-ready point. The process by which the data dependence analysis unit 12 obtains the data-ready point is described below. The description here covers the process of obtaining the data-ready points of basic block F shown in FIGS. 5 and 6, but the same process is performed for every basic block. The InputData output points of basic block F are assumed to be basic blocks A, B, and C. In the following description, basic block B1 in the algorithm of FIG. 4 corresponds to basic block F, and basic block B2 corresponds to basic blocks A, B, and C.

[0010] (i) Preprocessing
The following processing is performed for each basic block.
(1-1) In FIG. 5(a), starting from every basic block that is an InputData output point of basic block F, the flow graph is traversed in the forward direction, and every basic block that can be reached is given mark X. As a result, mark X is attached to the basic blocks as shown in FIG. 5(b).
(1-2) Starting from basic block F, the flow graph is traversed in the reverse direction, and each basic block reached is given mark Y. As a result, mark Y is attached to the basic blocks as shown in FIG. 5(c).
(1-3) Basic blocks carrying both mark X and mark Y are painted blue, and all others are painted white. As a result, basic blocks A, B, C, D, E, and F are painted blue as shown in FIG. 5(d).
(1-4) The basic blocks that hold the InputData of basic block F are painted black. As a result, basic blocks A, B, and C are painted black as shown in FIG. 6(e).
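Preprocessing steps (1-1) to (1-4) amount to a forward reachability pass, a backward reachability pass, and an intersection. The following sketch illustrates this; the function names are hypothetical, the flow graph is an invented example because FIG. 5 is not reproduced here, and treating the start blocks as marked is an assumption made so that the result agrees with the colors stated for FIG. 5(d) and FIG. 6(e).

```python
from collections import deque


def reachable(starts, edges):
    """Blocks reachable from `starts` along `edges` (start blocks included)."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        block = queue.popleft()
        for nxt in edges.get(block, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def preprocess_colors(succ, target, producers):
    """Color blocks for `target`, whose InputData comes from `producers`."""
    # Build the reversed flow graph for the backward traversal.
    pred = {}
    for block, nexts in succ.items():
        for nxt in nexts:
            pred.setdefault(nxt, []).append(block)

    marked_x = reachable(producers, succ)  # step (1-1): forward from producers
    marked_y = reachable([target], pred)   # step (1-2): backward from target

    blocks = set(succ) | set(pred) | {target} | set(producers)
    colors = {}
    for block in blocks:                   # step (1-3): both marks -> blue
        both = block in marked_x and block in marked_y
        colors[block] = "blue" if both else "white"
    for block in producers:                # step (1-4): producers -> black
        colors[block] = "black"
    return colors


# Invented flow graph, not the one in FIG. 5.
flow = {
    "A": ["B"], "B": ["C", "D"], "C": ["F", "W"],
    "D": ["E", "Y"], "E": ["F", "U"], "F": ["G"],
}
colors = preprocess_colors(flow, target="F", producers=["A", "B", "C"])
```

In this example graph the blocks on some path from a producer to F end up blue or black, and blocks that only lead away from F (W, Y, U, G) stay white and are ignored by the main processing.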

[0011] (ii) Main processing
In the following processing, white blocks are ignored as if they did not exist. For each non-white basic block, as long as any color changes, information is propagated backward along the control flow, and the operation shown in FIG. 8 is performed based on the color of the basic block itself and the colors of its successor blocks. When processing has finished for all basic blocks other than the one whose data-ready point is being obtained (basic block F in this case), the black and yellow basic blocks are taken as the data-ready points for basic block F. When this processing is performed on basic block A, basic block A is black and there is no blue basic block among its non-white successor blocks (B), so basic block A is changed to red as shown in FIG. 6(f).

[0012] Similarly, when this processing is performed on basic block B, basic block B is black and a blue basic block (D) exists among its non-white successor blocks (B, C), so basic block B is changed to red as shown in FIG. 6(g), and basic block D is changed to yellow. Basic blocks C, D, and E do not change, because all of their successors are blue. Basic block F matches the condition if (basic block B3 != ...) in the algorithm and therefore does not change, so basic block B is made red and basic block D is made yellow. The while statement of the algorithm is then executed once, but no change occurs, so the processing ends here. As a result of this processing, basic block C is black and basic block D is yellow as shown in FIG. 6(i), so basic blocks C and D are the data-ready points.

[0013] FIG. 9 shows the processing flow for obtaining the data-ready point. Steps S1 to S6 in the figure correspond to "(i) Preprocessing" above, and steps S7 to S11 correspond to "(ii) Main processing". In step S1, it is checked whether any basic block remains whose data-ready point has not been obtained. If none remains, the processing ends. If one remains, a basic block whose data-ready point has not been obtained is selected in step S2. In step S3, as shown in FIG. 5(b), the flow graph is traversed forward from every basic block that is an InputData output point of the selected basic block, and each reachable basic block is given mark X. In step S4, as shown in FIG. 5(c), the flow graph is traversed in reverse from the selected basic block, and mark Y is attached. In step S5, as shown in FIG. 5(d), basic blocks carrying both mark X and mark Y are painted blue, and the others are painted white. In step S6, as shown in FIG. 6(e), the basic blocks that hold the InputData of the selected basic block are painted black.

[0014] Next, in step S7, it is checked whether the above processing changed the color of any basic block. If there was no color change, the flow returns to S1, where it is checked whether any basic block remains whose data-ready point has not been obtained; if none remains, the processing ends, and if one remains, steps S1 to S6 are performed. If there was a color change, it is checked in step S8 whether there is a basic block that is not white and has not yet been processed. If there is none, the flow returns to step S7 and the above processing is performed. If there is such a block, one unprocessed basic block is selected in step S9, and it is checked whether it is the same as the basic block whose data-ready point is being obtained; if it is the same, the flow returns to step S8. If it is not the same, the processing based on FIG. 8 is performed as shown in FIGS. 6(f) and 6(g), and the flow returns to step S8. Through the above processing, the data-ready points can be obtained as shown in FIG. 6(h).

[0015] (2) Process for obtaining execution-determined edges and non-execution-determined edges
FIG. 10 shows an algorithm for obtaining the execution-determined edges and non-execution-determined edges, and FIG. 11 illustrates the processing performed by this algorithm. The process by which the control dependence analysis unit 13 obtains these edges is described below. The description here covers the process of obtaining the execution-determined edges and non-execution-determined edges (there may be several of each) for the data-ready points of basic block F, but the same process is performed for every basic block. As described above, the data-ready points of basic block F are basic blocks C and D. In the following description, basic block B1 in the algorithm of FIG. 10 corresponds to basic block F, and basic block B2 corresponds to basic blocks C and D.
(i) Preprocessing
(1-1) Starting from basic blocks C and D, which are the data-ready points of basic block F, the flow graph is traversed in the forward direction, and every basic block that can be reached is given mark X.
(1-2) Starting from basic block F, the flow graph is traversed in the reverse direction, and mark Y is attached.
(1-3) Basic blocks carrying both mark X and mark Y are painted blue, and the others are painted white. As a result, basic blocks E and F are painted blue as shown in FIG. 11.

[0016] (ii) Main processing
The following processing is performed for each of the basic blocks C and D, which are the data-ready points of basic block F. The successor blocks of the basic block (C or D) are traversed in order, following only blue basic blocks; each place visited is marked so that it is not visited more than once. If a successor block is white, the edge leading to that white basic block is added to the non-execution-determined edges for that data-ready point (C or D) of basic block F. If a traversed edge is a control-dependence edge of basic block F (an edge at which it is first determined that control will reach basic block F), that edge is taken as an execution-determined edge for that data-ready point of basic block F. For example, for basic block C in FIG. 11, edge C-F is an edge at which it is first determined that control will reach basic block F, so it is a control-dependence edge and therefore an execution-determined edge. Through the above processing, as shown in FIG. 11, the execution-determined edge for basic block C is C-F and the non-execution-determined edge is C-W. For basic block D, the execution-determined edges are D-F and E-F, and the non-execution-determined edges are D-Y and E-U.
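The traversal in the main processing can be sketched as a depth-first walk over blue blocks. Several things here are assumptions made for illustration: the function name is hypothetical, the set of control-dependence edges of F is supplied as an input rather than computed, the walk does not continue past F itself, and the flow graph is invented so that it reproduces the edges listed for FIG. 11, which is not available here.

```python
def determined_edges(succ, blue, target, ready_block, control_dep_edges):
    """Return (execution-determined, non-execution-determined) edge sets
    for one data-ready block of `target`."""
    executed, not_executed = set(), set()
    visited = {ready_block}  # mark visited blocks so none is expanded twice
    stack = [ready_block]
    while stack:
        block = stack.pop()
        for nxt in succ.get(block, ()):
            edge = (block, nxt)
            if nxt not in blue:
                # Leaving the blue region: the target can no longer be reached.
                not_executed.add(edge)
                continue
            if edge in control_dep_edges:
                executed.add(edge)
            # Assumption: stop at the target instead of walking past it.
            if nxt != target and nxt not in visited:
                visited.add(nxt)
                stack.append(nxt)
    return executed, not_executed


# Invented graph chosen to match the edges named in the text.
flow = {"C": ["F", "W"], "D": ["E", "F", "Y"], "E": ["F", "U"], "F": []}
blue = {"E", "F"}
ctrl_dep = {("C", "F"), ("D", "F"), ("E", "F")}

exec_c, nonexec_c = determined_edges(flow, blue, "F", "C", ctrl_dep)
exec_d, nonexec_d = determined_edges(flow, blue, "F", "D", ctrl_dep)
```

For this graph the sketch yields C-F and C-W for block C, and D-F, E-F, D-Y, and E-U for block D, the same edges the text reports.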

【００１７】[0017]

FIG. 12 shows the processing flow for obtaining the execution-determined edges and non-execution-determined edges. Steps S1 to S6 in the figure correspond to "(i) Pre-processing" above, and steps S7 to S11 correspond to "(ii) Main processing". In step S1, it is checked whether any basic block remains for which the execution-determined edges have not yet been obtained. If none remains, the processing ends. Otherwise, in step S2, one such basic block is selected. In step S3, the flow graph is traced forward from every basic block that is a data preparation completion point of the selected basic block, and each basic block reached is marked with X. In step S4, the flow graph is traced backward from the selected basic block, and each basic block reached is marked with Y. In step S5, the basic blocks carrying both mark X and mark Y are painted blue and all others white.

【００１８】[0018]

Next, in step S6, it is checked whether any data preparation completion point remains unprocessed; if not, the processing returns to step S1. If one remains, in step S7 one basic block that is an unprocessed data preparation completion point is selected. In step S8, the successors of the selected basic block are traced in order through the blue basic blocks, and whenever a traced block has a white successor, the edge leading to that white basic block is added to the non-execution-determined edges. In step S9, the successors of the selected basic block are again traced in order through the blue basic blocks, and any traced edge that is a control-dependent edge of the basic block selected in step S2 is added to the execution-determined edges for this data preparation completion point. The processing then returns to step S6 and repeats. This processing yields the execution-determined edges and non-execution-determined edges shown in FIG. 11.

【００１９】[0019]

(3) Process for obtaining tasks to be executed in parallel or speculatively

FIG. 13 shows an algorithm for obtaining the tasks to be executed in parallel or speculatively, and FIG. 14 is a diagram explaining the fusion of basic blocks and basic block groups. The processing by which the parallel task selection unit 14 obtains these tasks is described below.

(i) For each basic block, obtain the following basic data:
- a: the number of instructions from each data preparation completion point to its execution-determined edge
- b: the number of instructions from each data preparation completion point to its non-execution-determined edge
- c1: the number of instructions from the data preparation completion point to the basic block. When the control flow contains more than one path, the count is taken along a suitable path (for example, the shortest path, or a path on which each loop is assumed to iterate 10 times).
- c2: the number of instructions from the basic block to its data use points and data redefinition points
- d: the number of instructions in the basic block
- e: the amount of data generated in the basic block and used outside it (output data)
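One way to obtain an instruction count such as c1 along the shortest path is sketched below in Python. This is a hedged illustration only. The text does not specify exactly which blocks are counted, so the sketch assumes the count covers the blocks strictly between the source and the destination. The function name `instruction_count`, the example graph, and the block sizes are all hypothetical. The alternative mentioned in the text, treating each loop as 10 iterations, is not modelled here.

```python
import heapq

def instruction_count(succ, size, src, dst):
    """Fewest instructions executed after leaving src and before entering dst.

    succ maps a block to its successors; size maps a block to its instruction
    count. Returns None when dst cannot be reached from src.
    """
    best = {src: 0}
    heap = [(0, src)]
    while heap:
        cost, block = heapq.heappop(heap)
        if block == dst:
            return cost
        if cost > best.get(block, float("inf")):
            continue  # stale heap entry
        for nxt in succ.get(block, ()):
            # Entering dst adds nothing; any intermediate block adds its size.
            new_cost = cost + (0 if nxt == dst else size[nxt])
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None

# Hypothetical example: two paths from C to F, through X1 or through X2.
succ = {"C": ["X1", "X2"], "X1": ["F"], "X2": ["F"]}
size = {"C": 4, "X1": 10, "X2": 3, "F": 6}
c1 = instruction_count(succ, size, "C", "F")
print(c1)
```

In this example the path through X2 is shorter, so the count is 3.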

【００２０】[0020]

The following are regarded as properties that make parallel execution desirable and properties that make speculative execution desirable.
(1) Properties desirable for parallel execution
- c1 + c2 is large
- d is large
- c1 + c2 is approximately equal to or greater than d, or c2 is large
(2) Properties desirable for speculative execution
- a is large
- b is small
- e is small
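The two groups of properties can be expressed as simple predicates, as in the Python sketch below. The text gives no numeric thresholds and does not say how the individual properties are combined, so every threshold here is a hypothetical value and the combination rules are assumptions: the parallel test requires all three properties, while the speculative test accepts any one of them, in line with the claims, which treat the conditions on a, on b, and on e individually. The names `BlockMetrics`, `parallel_desirable`, and `speculation_desirable` are illustrative.

```python
from dataclasses import dataclass

@dataclass
class BlockMetrics:
    a: int   # instructions from data preparation completion point to execution-determined edge
    b: int   # instructions from data preparation completion point to non-execution-determined edge
    c1: int  # instructions from data preparation completion point to the block
    c2: int  # instructions from the block to its data use / redefinition points
    d: int   # instructions in the block (or block group)
    e: int   # amount of data generated inside and used outside

def parallel_desirable(m, min_gap=20, min_size=10, slack=0.9):
    """Assumed conjunction of the three parallel-execution properties."""
    gap = m.c1 + m.c2
    return (gap >= min_gap
            and m.d >= min_size
            and (gap >= slack * m.d or m.c2 >= min_gap))

def speculation_desirable(m, min_a=30, max_b=5, max_e=4):
    """Assumed disjunction: any one speculative property is enough."""
    return m.a >= min_a or m.b <= max_b or m.e <= max_e

good = BlockMetrics(a=50, b=3, c1=15, c2=10, d=20, e=1)
poor = BlockMetrics(a=5, b=40, c1=2, c2=1, d=3, e=16)
print(parallel_desirable(good), speculation_desirable(good))
print(parallel_desirable(poor), speculation_desirable(poor))
```

A weighted score compared against a single threshold would be an equally plausible reading of "at or above a certain value"; the predicates above are just the simplest form.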

【００２１】[0021]

(ii) To attempt to generate a task headed by a given basic block, each basic block is taken in turn as the starting point of a task, in descending order of c1, and the following processing is performed for every successor basic block of the task.
- Case 1: in the control flow, the successor basic block B2 has the same determined edge as basic block B1. For each successor basic block B2 that has the same determined edge as basic block B1, the data preparation completion point, data use points, and data redefinition points that would result from fusing basic blocks B1 and B2 are obtained, the values a to e above are computed, and the blocks are fused if the properties desirable for a parallel task improve. For example, in FIG. 14(a), the execution-determined edge of both basic block A and basic block B is the edge from basic block Z to basic block A. In this case, fusion of block A and block B is attempted, and the two are fused if the above condition is satisfied.

【００２２】[0022]

- Case 2: in the control flow, the successor basic block B2 has, as its execution-determined edge, an edge inside the group of basic blocks fused so far. Processing goes back to B3, the predecessor basic block of B2. The data preparation completion point, data use points, and data redefinition points that would result from fusing all basic blocks that are control-dependent on the edges between B3 and its successor basic blocks are obtained, the values a to e are computed, and the blocks are fused if the properties desirable for a parallel task improve. FIG. 14(b) shows an example. In FIG. 14(b), basic blocks A and B are assumed to have been fused already. When fusion of basic block C is then attempted, the execution-determined edge B-C of basic block C differs from the execution-determined edge Z-A of the fused basic blocks A and B. In this case, fusion is attempted for basic blocks C, D, and E, which are control-dependent on the edges (B-C, B-E) between basic block B, the predecessor of basic block C, and its successor basic blocks C and E; if the above condition is satisfied, basic blocks C, D, and E are fused.

(iii) Among the finally fused basic blocks, those whose properties desirable for a parallel task are at or above a certain value are made parallel tasks. Among these, those whose properties desirable for speculative execution are at or above a certain value are made speculative tasks.
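The fusion procedure of Case 1 and Case 2 can be sketched as a greedy loop, shown below in Python. This is a simplified illustration under several assumptions. The execution-determined edge of each block and the blocks that are control-dependent on each edge are supplied as precomputed inputs. The `score` callback stands in for recomputing a to e and judging whether the properties desirable for a parallel task improve; here it is simply the total instruction count, which is a hypothetical choice. The function name `grow_task` and the example graph, modelled loosely on the description of FIG. 14(b), are also assumptions.

```python
def grow_task(seed, succ, exec_edge, dependents_of, score):
    """Greedily fuse successor blocks into the task that starts at seed."""
    task = [seed]
    examined = {seed}
    worklist = list(succ.get(seed, ()))
    while worklist:
        block = worklist.pop(0)
        if block in examined:
            continue
        examined.add(block)
        edge = exec_edge.get(block)
        if edge is None:
            continue
        if edge == exec_edge.get(seed):
            # Case 1: same execution-determined edge as the seed block.
            candidates = [block]
        elif edge[0] in task:
            # Case 2: the edge leaves a block already in the task. Go back to
            # that predecessor and gather every block that is control-dependent
            # on one of its outgoing edges.
            candidates = []
            for nxt in succ.get(edge[0], ()):
                for dep in dependents_of.get((edge[0], nxt), ()):
                    if dep not in task and dep not in candidates:
                        candidates.append(dep)
        else:
            continue
        # Fuse only if the (assumed) desirability score improves.
        if candidates and score(task + candidates) > score(task):
            task.extend(candidates)
            examined.update(candidates)
            for merged in candidates:
                worklist.extend(succ.get(merged, ()))
    return task

# Hypothetical inputs modelled on the description of FIG. 14(b).
succ = {"Z": ["A"], "A": ["B"], "B": ["C", "E"], "C": ["D"]}
exec_edge = {"A": ("Z", "A"), "B": ("Z", "A"),
             "C": ("B", "C"), "D": ("B", "C"), "E": ("B", "E")}
dependents_of = {("B", "C"): ["C", "D"], ("B", "E"): ["E"]}
size = {"A": 5, "B": 5, "C": 5, "D": 5, "E": 5}

task = grow_task("A", succ, exec_edge, dependents_of,
                 score=lambda blocks: sum(size[b] for b in blocks))
print(task)
```

With this score every attempted fusion is accepted, so A and B are fused under Case 1 and then C, D, and E are fused together under Case 2. A real score would weigh c1 + c2 against d and could reject a fusion.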

【００２３】[0023]

FIG. 15 shows the processing flow for obtaining parallel tasks and speculative tasks. First, in step S1, c1 is obtained and the basic blocks are sorted in descending order of c1. In step S2, it is checked whether any basic block remains unexamined; if not, the processing ends. If one remains, in step S3 one basic block is taken from the head of the list and used as the starting point of a task. In step S4, it is checked whether the task has any unexamined successor basic block. If not, the processing proceeds to step S10. In step S10, fused basic blocks whose properties desirable for a parallel task are at or above a certain value are made parallel tasks; among these, those whose properties desirable for speculative execution are at or above a certain value are made speculative tasks, and the processing returns to step S2. If step S4 finds an unexamined successor basic block, the processing proceeds to step S5, where an unexamined successor basic block is selected. In step S6, it is checked whether the successor basic block has the same execution-determined edge as the seed basic block (the test for Case 1). If so, the processing proceeds to step S8; if not, to step S7.

【００２４】[0024]

In step S8, the data preparation completion point, data use points, and data redefinition points that would result from fusing the successor block with the task are obtained, the values a to e are computed, and the block is fused if the properties desirable for a parallel task improve. In step S7, it is checked whether the successor basic block has, as its determined edge, an edge inside the group of basic blocks already fused into the task (the test for Case 2). If it does, the processing proceeds to step S9; if not, it returns to step S4. In step S9, processing goes back to one predecessor basic block; the data preparation completion point, data use points, and data redefinition points that would result from fusing all basic blocks that are control-dependent on the edges between that block and its successor basic blocks are obtained, the values a to e are computed, the blocks are fused if the properties desirable for a parallel task improve, and the processing returns to step S4. This processing generates both parallel tasks and speculative tasks.

(4) Parallel code generation
Once the parallel tasks and speculative tasks have been generated as described above, the parallel code generation unit 15 generates the code needed to execute the tasks in parallel, such as task scheduling and synchronization waits. This processing can be implemented with conventional techniques.

【００２５】[0025]

The coded parallel tasks and speculative tasks generated in this way are assigned to the processor elements of a parallel computer system, where they are executed in parallel or speculatively. FIG. 16 schematically shows the operation when a speculative task is executed. In the figure, a speculative task start signal from the task start point creates the task and starts speculative execution. If control passes to a non-execution-determined edge, the speculative execution is cancelled and the task terminates; a globalization signal is also issued (in this case the speculation has failed, so no data is written). If control passes to an execution-determined edge, the task moves from the speculative state to the normal execution state, the output data generated by the task is written, and the globalization signal is issued. The data generated within the speculative task thereby becomes accessible to other tasks.
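The run-time behaviour described for FIG. 16 amounts to a small state machine, which the Python sketch below illustrates. It is a conceptual model only, not the hardware or runtime mechanism of the patent. The class `SpeculativeTask`, the edge-kind strings, the dictionary used as shared memory, and the `notify` callback standing in for the globalization signal are all hypothetical names.

```python
from enum import Enum

class State(Enum):
    SPECULATIVE = "speculative"
    COMMITTED = "committed"
    CANCELLED = "cancelled"

class SpeculativeTask:
    def __init__(self, compute):
        # The task is started speculatively; its results stay in a private buffer.
        self.state = State.SPECULATIVE
        self._buffer = compute()

    def on_edge(self, kind, shared_memory, notify):
        """React to control reaching a determined edge."""
        if self.state is not State.SPECULATIVE:
            return  # already resolved
        if kind == "execution_determined":
            # Speculation succeeded: write the output data so others can see it.
            shared_memory.update(self._buffer)
            self.state = State.COMMITTED
        elif kind == "non_execution_determined":
            # Speculation failed: discard the results without writing anything.
            self._buffer = {}
            self.state = State.CANCELLED
        else:
            raise ValueError(f"unknown edge kind: {kind}")
        notify(self.state)  # stands in for the globalization signal

shared, signals = {}, []
failed = SpeculativeTask(lambda: {"x": 1})
failed.on_edge("non_execution_determined", shared, signals.append)
ok = SpeculativeTask(lambda: {"y": 42})
ok.on_edge("execution_determined", shared, signals.append)
print(shared, signals)
```

Buffering the output until the execution-determined edge is reached is what keeps a failed speculation cheap: nothing has to be undone in shared state.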

【００２６】[0026]

(Supplementary Note 1) A method of generating speculative tasks for a program, characterized by: analyzing, from the data dependences of each basic block in the program, the point at which all data used by that basic block has become available, and taking that point as a data preparation completion point; analyzing, with respect to the data preparation completion point, the point at which execution of the basic block becomes certain, and taking that point as an execution-determined edge; and, when the number of instructions between the data preparation completion point of a basic block group and the execution-determined edge is greater than a predetermined value, making that basic block group a speculative task.

(Supplementary Note 2) A method of generating speculative tasks for a program, characterized by: when, for a basic block group, the amount of data generated within the group and used outside it is less than a predetermined value, making that basic block group a speculative task.

(Supplementary Note 3) A method of generating speculative tasks for a program, characterized by: analyzing, from the data dependences of each basic block in the program, the point at which all data used by that basic block has become available, and taking that point as a data preparation completion point; analyzing, with respect to the data preparation completion point, the point at which it becomes certain that the basic block will not be executed, and taking that point as a non-execution-determined edge; and, when the number of instructions between the data preparation completion point of a basic block group and the non-execution-determined edge is less than a predetermined value, making that basic block group a speculative task.

(Supplementary Note 4) The speculative task generation method of Supplementary Note 1, 2, or 3, characterized by: analyzing, from the data dependences of each basic block in the program, the point at which data generated in each basic block is used, and taking that point as a data use point; and, when the sum of the number of instructions from the data preparation completion point of a basic block group to that basic block group and the number of instructions from the end of the basic block group to the data use point is large, making that basic block group a speculative task.

(Supplementary Note 5) The speculative task generation method of Supplementary Note 1, 2, 3, or 4, characterized by: analyzing, from the data dependences of each basic block in the program, the point at which data generated in each basic block is used, and taking that point as a data use point; and, when either the sum of the number of instructions from the data preparation completion point of a basic block group to that basic block group and the number of instructions from the end of the basic block group to the data use point, or the number of instructions from the end of the basic block group to the data use point, is greater than the number of instructions in the chain of basic blocks forming the group, making that basic block group a speculative task.

(Supplementary Note 6) The speculative task generation method of Supplementary Note 1, 2, 3, 4, or 5, characterized by: when the number of instructions in a chain of basic blocks in the program is greater than a predetermined value, making that basic block group a speculative task.

(Supplementary Note 7) A device for generating speculative tasks for a program, characterized by comprising: means for extracting basic blocks from the program; means for analyzing, from the data dependences of each basic block in the program, the point at which all data used by that basic block has become available, and obtaining a data preparation completion point; means for analyzing, with respect to the data preparation completion point, the point at which execution of the basic block becomes certain, and obtaining an execution-determined edge; and means for determining whether the number of instructions between the data preparation completion point of a basic block group and the execution-determined edge is greater than a predetermined value and, when it is, selecting that basic block group as a speculative task.

(Supplementary Note 8) A speculative task generation program for generating speculative tasks for a program, characterized by causing a computer to execute: processing that analyzes, from the data dependences of each basic block in the target program, the point at which all data used by that basic block has become available, and obtains a data preparation completion point; processing that analyzes, with respect to the data preparation completion point, the point at which execution of the basic block becomes certain, and takes that point as an execution-determined edge; and processing that, when the number of instructions between the data preparation completion point of a basic block group and the execution-determined edge is greater than a predetermined value, selects that basic block group as a speculative task.

(Supplementary Note 9) A speculative task generation program for generating speculative tasks for a program, characterized by causing a computer to execute: processing that obtains, for a basic block group, the amount of data generated within the group and used outside it; and processing that, when that amount of data is less than a predetermined value, selects that basic block group as a speculative task.

(Supplementary Note 10) A speculative task generation program for generating speculative tasks for a program, characterized by causing a computer to execute: processing that analyzes, from the data dependences of each basic block in the program, the point at which all data used by that basic block has become available, and obtains a data preparation completion point; processing that analyzes, with respect to the data preparation completion point, the point at which it becomes certain that the basic block will not be executed, and takes that point as a non-execution-determined edge; and processing that, when the number of instructions between the data preparation completion point of a basic block group and the non-execution-determined edge is less than a predetermined value, makes that basic block group a speculative task.

【００２７】[0027]

[Effect of the Invention]

As described above, the present invention performs speculative execution over a wide range, and only when a reduction in execution time can be expected if the speculation succeeds, thereby enabling task-level speculative execution with small overhead when speculation fails. Execution can therefore be made faster.

[Brief description of drawings]

FIG. 1 is a diagram showing the configuration of the speculative task generation unit according to an embodiment of the present invention.

FIG. 2 is a diagram showing the operation when speculation succeeds and the operation when speculation fails.

FIG. 3 is a conceptual diagram showing the relationship among the data preparation completion point, the execution-determined edge, the non-execution-determined edge, and related items.

FIG. 4 is a diagram showing an algorithm for obtaining the data preparation completion point.

FIG. 5 is a diagram (1) explaining the process of obtaining the data preparation completion point.

FIG. 6 is a diagram (2) explaining the process of obtaining the data preparation completion point.

FIG. 7 is a diagram explaining the meaning of each color applied to the basic blocks.

FIG. 8 is a diagram showing the operation of obtaining the data preparation completion point.

FIG. 9 is a diagram showing the processing flow for obtaining the data preparation completion point.

FIG. 10 is a diagram showing an algorithm for obtaining the execution-determined edges and non-execution-determined edges.

FIG. 11 is a diagram explaining the process of obtaining the execution-determined edges and non-execution-determined edges.

FIG. 12 is a diagram showing the processing flow for obtaining the execution-determined edges and non-execution-determined edges.

FIG. 13 is a diagram showing an algorithm for obtaining the tasks to be executed in parallel or speculatively.

FIG. 14 is a diagram explaining the fusion of basic blocks and basic block groups.

FIG. 15 is a diagram showing the processing flow for obtaining parallel tasks and speculative tasks.

FIG. 16 is a diagram schematically showing the operation when a speculative task is executed.

[Explanation of symbols]

1 Source program
2 Compiler
3 Code
11 Basic block extraction unit
12 Data dependence analysis unit
13 Control dependence analysis unit
14 Parallel task selection unit
15 Parallel code generation unit

Claims

[Claims]

1. A method of generating speculative tasks for a program, characterized by: analyzing, from the data dependences of each basic block in the program, the point at which all data used by that basic block has become available, and taking that point as a data preparation completion point; analyzing, with respect to the data preparation completion point, the point at which execution of the basic block becomes certain, and taking that point as an execution-determined edge; and, when the number of instructions between the data preparation completion point of a basic block group and the execution-determined edge is greater than a predetermined value, making that basic block group a speculative task.

2. A method of generating speculative tasks for a program, characterized by: when, for a basic block group, the amount of data generated within the group and used outside it is less than a predetermined value, making that basic block group a speculative task.

3. A method of generating speculative tasks for a program, characterized by: analyzing, from the data dependences of each basic block in the program, the point at which all data used by that basic block has become available, and taking that point as a data preparation completion point; analyzing, with respect to the data preparation completion point, the point at which it becomes certain that the basic block will not be executed, and taking that point as a non-execution-determined edge; and, when the number of instructions between the data preparation completion point of a basic block group and the non-execution-determined edge is less than a predetermined value, making that basic block group a speculative task.

4. A device for generating speculative tasks for a program, characterized by comprising: means for extracting basic blocks from the program; means for analyzing, from the data dependences of each basic block in the program, the point at which all data used by that basic block has become available, and obtaining a data preparation completion point; means for analyzing, with respect to the data preparation completion point, the point at which execution of the basic block becomes certain, and obtaining an execution-determined edge; and means for determining whether the number of instructions between the data preparation completion point of a basic block group and the execution-determined edge is greater than a predetermined value and, when it is, selecting that basic block group as a speculative task.

5. A speculative task generation program for generating speculative tasks for a program, characterized by causing a computer to execute: processing that analyzes, from the data dependences of each basic block in the target program, the point at which all data used by that basic block has become available, and obtains a data preparation completion point; processing that analyzes, with respect to the data preparation completion point, the point at which execution of the basic block becomes certain, and takes that point as an execution-determined edge; and processing that, when the number of instructions between the data preparation completion point of a basic block group and the execution-determined edge is greater than a predetermined value, selects that basic block group as a speculative task.