JPH06250988A

JPH06250988A - Automatic parallel processing method

Info

Publication number: JPH06250988A
Application number: JP3770693A
Authority: JP
Inventors: Chiharu Kori; 千治郡
Original assignee: NEC Solution Innovators Ltd
Current assignee: NEC Solution Innovators Ltd
Priority date: 1993-02-26
Filing date: 1993-02-26
Publication date: 1994-09-09

Abstract

PURPOSE: To suppress the generation of superfluous tasks when a loop containing a conditional expression is executed in parallel by the automatic parallelization function of a compiler, by determining the execution conditions under which the conditional expression holds and generating parallel code that is executed under those conditions. CONSTITUTION: A data dependence analysis procedure 22 analyzes the data dependences of a loop recognized by a loop recognition procedure 21. For a loop that a parallelism determination procedure 23 recognizes as parallel-executable and that contains a conditional expression, an execution condition determination procedure 24 determines the execution conditions under which the conditional expression holds and transforms the loop into one that can be executed in parallel under those conditions. A memory allocation means 25 determines the data hierarchy and allocates memory according to that hierarchy. A parallel code generation means 26 generates code for parallel execution.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an automatic parallelization processing method that automatically converts an object program into a parallel processing program and executes it.

[0002]

[Prior Art] In scientific and engineering computation, such as fluid analysis, meteorological analysis, plasma analysis, structural analysis, resource exploration, and optimization problems, large arrays are often processed with several loops. If such operations are executed sequentially with scalar instructions, fetching data from the array one element at a time according to the loop structure, a run can take several days to several tens of days even with exclusive use of a high-performance very large computer, which is impractical in terms of cost performance. Executing the operations in parallel with vector instructions on a supercomputer equipped with a vector processor also has its limits: with a single processor, the speed of the processor itself can only be raised so far, and a dramatic improvement in processing speed cannot be expected. For this reason, multitasking, which allows parallel processing on multiple processors, is used. Parallel processing comes in two forms: macrotasking, which treats a subroutine or a group of subroutines as the unit of processing, and microtasking, which treats a loop, a statement, or a group of statements as the unit.

[0003] When a user employs macrotasking, the user must review, top-down, whether the program's algorithm is suited to parallel processing and analyze the parallelism of the processing, the independence of the data, and so on, which demands a very high level of skill. When microtasking is used, attention must be paid, bottom-up, to local parts of the program, and the parallelism and data independence of each loop iteration and each group of statements must be analyzed. Because this requires extremely meticulous attention, automatic parallelization by the compiler is indispensable.

[0004] In automatic parallelization, the compiler analyzes control structures such as loops in the source program and information such as the definition/reference relationships of data, identifies the parts that can be executed in parallel because they have no data dependences, and generates code for their parallel execution. Parts that do have data dependences can also be executed in parallel by applying exclusion/synchronization control or by transforming the source program. In general, because a microtask is a small unit of instruction processing, overhead has a large effect on the processing time of the whole program. It is therefore necessary to eliminate the generation of wasted tasks as far as possible and to suppress overhead. However, tasks are also generated for a parallel-executable loop that contains a conditional expression and has no executable statements other than the part executed when the conditional expression holds. If the conditional expression does not hold at run time, tasks are generated, executed, and terminated to no purpose, and the efficiency of parallelization falls.

Conventional parallelization is explained with reference to FIGS. 2 to 5. As illustrated in FIG. 2, the source program 1 examines the value of each element of a given array A in turn within the loop 27 of executable statements and, only when the value is positive, computes its product with the corresponding element of array Y and stores the result in the corresponding element of array X. When this program is translated and executed by the conventional technique, as illustrated in FIGS. 3 to 5, the RESERVE routine (31 in FIG. 3) is called so that the task scheduler secures the available processors 1 to N (51 in FIG. 5). The PARALLEL-DO routine (32 in FIG. 3) is then called to assign each loop iteration that has been made into a task (33 in FIG. 3) to one of the N processors and execute the tasks in parallel (52 in FIG. 5). As a result, when an element of array A (41 in FIG. 4) has a positive value, the product of the corresponding elements of array A (41 in FIG. 4) and array Y (44 in FIG. 4) is stored in the corresponding element of array X (43 in FIG. 4). At the end of the loop (33 in FIG. 3), the completion of parallel execution of all tasks is awaited (53 in FIG. 5); in this example, one task is generated for each iteration of the loop structure and executed on one of the processors. Finally, the RELEASE routine (34 in FIG. 3) is called to release the processors (54 in FIG. 5).
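The cost described above can be made concrete with a small sketch. The Python below is illustrative only: it is not the FORTRAN of FIG. 2 or FIG. 3 (the figures are not reproduced in this text), and the function name `conventional_parallel_loop` and the sample arrays are assumptions. It models the conventional scheme of one task per loop iteration and counts how many of those tasks do no useful work; the tasks are simply run one after another, because only the task count is of interest here.

```python
def conventional_parallel_loop(a, y, x):
    """Model the conventional scheme: one task per loop iteration.

    Returns (tasks_created, tasks_wasted). Tasks are run sequentially
    here; only the number of tasks is modeled, not real parallelism.
    """
    tasks_created = 0
    tasks_wasted = 0
    for i in range(len(a)):
        tasks_created += 1        # a task is generated for every iteration
        if a[i] > 0:              # the conditional expression inside the loop
            x[i] = a[i] * y[i]
        else:
            tasks_wasted += 1     # task created, run, and ended for nothing
    return tasks_created, tasks_wasted


# Hypothetical sample data: only two of the five elements of A are positive.
a = [3.0, -1.0, 0.0, 2.0, -5.0]
y = [2.0, 2.0, 2.0, 2.0, 2.0]
x = [0.0] * len(a)
created, wasted = conventional_parallel_loop(a, y, x)
print(created, wasted, x)
```

With this sample data, three of the five tasks evaluate the condition, find it false, and terminate without storing anything, which is the overhead the invention aims to remove.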

[0005]

[Problem to be Solved by the Invention] For a parallel-executable loop that contains a conditional expression, the method of the present invention determines the execution conditions under which the conditional expression holds before generating tasks for parallel execution, and transforms the original parallel-executable loop into a loop that is executed in parallel under those conditions. In this way it generates optimal parallel code that does not generate, execute, or terminate tasks to no purpose, and improves the processing efficiency of the program as a whole.

[0006]

[Means for Solving the Problem] The automatic parallelization processing method of the present invention is used in a shared-memory multiprocessor computer system that runs an operating system including a compiler that translates a source program to generate an object program for parallel execution, a linker that combines the object program with run-time routines to generate an executable program of multitask structure, and a task scheduler that manages the tasks for parallel execution.

The compiler includes: loop recognition means that, after the source program to be translated has been input, recognizes the control structure of loops of executable statements that are executed repeatedly in the source program; data dependence analysis means that analyzes the order in which data in the recognized loop are defined and referenced, and recognizes whether the data depend on one another; parallelism determination means that, on the basis of the control structure of the recognized loop and the data dependences, determines which loops can be executed in parallel and recognizes the exclusion/synchronization control needed to guarantee the order in which data are defined and referenced during parallel execution; execution condition determination means that recognizes a conditional expression in a parallel-executable loop, determines that the parallel-executable part contains no executable statements other than the part executed when the conditional expression holds, and carries out the generation of a parallel-executable loop, the transformation of the loop, and the determination of the execution conditions; memory allocation means that, on the basis of the declaration information of the source program and the information on data dependences during parallel execution, classifies the data into a data hierarchy of inter-task shared data generated by the linker and data private to each task, and allocates them to memory; and parallel code generation means that generates, for the parallel-executable loop obtained by the parallelism determination means, code for parallel execution, including code for exclusion/synchronization control.

The task scheduler includes: task generation means that generates tasks for parallel execution on the basis of the object program containing the parallel code generated by the compiler and of the information generated by the compiler for executing that object program; task allocation means that secures the processors available at the time parallel execution is requested and allocates the tasks for parallel execution to those processors; task control means that schedules the parallel execution tasks and performs exclusion/synchronization control between the tasks; and task release means that releases the tasks and the processors when parallel execution of the tasks ends.

The method thus performs the generation of parallel execution code by the compiler, and the handling of parallel execution tasks by the task scheduler together with their automatic memory allocation.

[0007]

[Operation] The automatic parallelization processing method of the present invention recognizes a conditional expression in a loop that has been judged parallel-executable on the basis of the loop's control structure and data dependences, determines the execution conditions under which the conditional expression holds before generating tasks for parallel execution, and transforms the loop into a loop that contains no conditional expression and can be executed in parallel under those conditions. Optimal parallel code generation and memory allocation are thereby performed automatically.
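The transformation can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the generated FORTRAN of FIG. 6: the function names are hypothetical, the variables `n_cond` and `im` loosely mirror the storage areas that the embodiment calls IN (number of execution conditions) and IM (execution condition values), and the condition "element of A is positive" is taken from the example program described in this document.

```python
def collect_execution_conditions(a):
    """Pre-pass: record every index for which the conditional holds."""
    im = []                       # execution condition values (indices)
    for i in range(len(a)):
        if a[i] > 0:              # the conditional expression of the original loop
            im.append(i)
    return len(im), im            # (number of conditions, their values)


def transformed_loop(a, y, x):
    """Condition-free loop that visits only the recorded indices.

    Each iteration is independent, so each could become one task.
    Returns the number of iterations, i.e. the number of tasks needed.
    """
    n_cond, im = collect_execution_conditions(a)
    for k in range(n_cond):
        i = im[k]
        x[i] = a[i] * y[i]        # no conditional left inside the loop body
    return n_cond


# Hypothetical sample data.
a = [3.0, -1.0, 0.0, 2.0, -5.0]
y = [2.0, 2.0, 2.0, 2.0, 2.0]
x = [0.0] * len(a)
tasks_needed = transformed_loop(a, y, x)
print(tasks_needed, x)
```

The pre-pass costs one extra sequential sweep over the array, so the scheme presumably pays off when the loop body is expensive relative to the test, or when the condition holds for only a small fraction of the iterations.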

[0008]

[Embodiments] An embodiment of the present invention will now be described with reference to the drawings.

[0009] FIG. 1 shows the configuration of an embodiment of the automatic parallelization processing method of the present invention. FIG. 2 shows a program in the FORTRAN language as an example of the source program 1. FIG. 3 shows the program obtained when the program of FIG. 2 is parallelized by the conventional technique. FIG. 4 shows the data hierarchy of shared data and task-private data, and the state of memory allocation, when the program of FIG. 2 is parallelized by the conventional technique. FIG. 5 shows the execution status of each task when the program of FIG. 2 is parallelized by the conventional technique. FIG. 6 shows the program obtained when the program of FIG. 2 is parallelized by the method of the present invention. FIG. 7 shows the data hierarchy of shared data and task-private data, and the state of memory allocation, when the program of FIG. 2 is parallelized by the method of the present invention. FIG. 8 shows the execution status of each task when the program of FIG. 2 is parallelized by the method of the present invention. FIG. 9 shows the flow of operation of the execution condition determination means 24 included in the compiler, which is the characteristic feature of the present invention.

[0010] Referring to FIG. 1, this embodiment includes a compiler 2, a linker 5, and a task scheduler 7. The source program 1 is translated by the compiler 2 to generate an object program 3 for parallel execution; the object program 3 and the run-time routines 4 are combined by the linker 5 to generate an executable object program 6 of multitask structure; and the object program 6 is executed in parallel under the control of the task scheduler 7. In the present invention, processing is performed as follows in order to reduce the number of wasted tasks executed in parallel and to raise the execution efficiency of the program as a whole.

As shown in FIG. 1, the compiler 2 includes a loop recognition procedure 21, a data dependence analysis procedure 22, a parallelism determination procedure 23, an execution condition determination procedure 24, a memory allocation procedure 25, and a parallel code generation procedure 26. These procedures are explained using the source program illustrated in FIG. 2. The loop recognition procedure 21 analyzes the control structure of the input source program 1 and recognizes the loop (27 in FIG. 2). The data dependence analysis procedure 22 analyzes the order in which data in the loop 27 are defined and referenced, and recognizes that the elements of arrays A, X, and Y have no dependences that span loop iterations. The parallelism determination procedure 23 recognizes that the loop 27 is parallel-executable because none of the data in the loop 27 are dependent.

The execution condition determination procedure 24 follows the flow chart of FIG. 9. It takes the loop 27 (step 90 in FIG. 9), determines that it is a parallel-executable loop (91), determines whether the loop contains a conditional expression (92), and determines that the parallel-executable loop has no executable statements other than the part executed when the conditional expression holds (93). It thereby recognizes that the loop 27 needs to execute its executable statement only when the conditional expression holds. Next, it generates a declaration statement (61 in FIG. 6) that reserves, in the inter-task shared data space (77), the storage area IN (76 in FIG. 7) for the number of execution conditions under which the conditional expression holds and the storage area IM (75 in FIG. 7) for the execution condition values, together with an assignment statement 63 that sets the initial value of the number of execution conditions to 0 (94). It then generates, ahead of the loop 27, a loop 64 that determines the execution conditions (95), and transforms the loop 27 into a loop 66 that is executed in parallel under those execution conditions (96).

The memory allocation procedure 25 classifies each data item appearing in the program of FIG. 6 into the data hierarchy of inter-task shared data 77 and task-private data 78, as illustrated in FIG. 7, and allocates it accordingly. The parallel code generation procedure 26 generates code for parallel execution, covering the parallel-executable parts recognized by the parallelism determination procedure 23 as well as the parts generated or transformed by the execution condition determination procedure 24.
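The three checks of steps 91 to 93 can be pictured as a simple predicate over a summary of the loop. The sketch below is a hypothetical Python model, not the compiler's internal representation: the `LoopInfo` record, its field names, and the `should_transform` function are all assumptions introduced for illustration, and the condition is stored as a descriptive string rather than real FORTRAN syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopInfo:
    """Hypothetical summary of what earlier compiler procedures found."""
    parallelizable: bool           # outcome of the parallelism determination
    condition: Optional[str]       # conditional expression in the body, if any
    unconditional_statements: int  # executable statements outside the conditional part


def should_transform(loop: LoopInfo) -> bool:
    """Return True when the loop qualifies for the execution-condition transformation."""
    if not loop.parallelizable:        # step 91: must be parallel-executable
        return False
    if loop.condition is None:         # step 92: must contain a conditional expression
        return False
    # step 93: nothing may execute apart from the part guarded by the condition
    return loop.unconditional_statements == 0


# Loop 27 of the example: parallel-executable, guarded by "A(I) is positive",
# with no statements outside the guarded part.
loop_27 = LoopInfo(parallelizable=True, condition="A(I) is positive",
                   unconditional_statements=0)
# A loop that also does unconditional work on every iteration does not qualify.
mixed_loop = LoopInfo(parallelizable=True, condition="A(I) is positive",
                      unconditional_statements=1)
print(should_transform(loop_27), should_transform(mixed_loop))
```

Only when all three checks pass would steps 94 to 96 (declaring IN and IM, generating the pre-pass loop, and rewriting the original loop) be carried out.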

[0011] As described above, when the program illustrated in FIG. 2 is translated and executed by the method of the present invention, the loop to be executed (27 in FIG. 2) is transformed so that its conditional expression is separated from its executable statement, as illustrated in FIGS. 6 to 8, and a loop (64 in FIG. 6) and a loop (66) are generated. The loop (64) is executed as a single task by CALL RESERVE (62). For the loop (66), the RESERVE routine (62) is called so that the task scheduler (7 in FIG. 1) secures the available processors 1 to N (81 in FIG. 8), and the PARALLEL-DO routine (65 in FIG. 6) is called to execute the loop 66 in parallel (82 in FIG. 8). As a result, for each element of array A (71 in FIG. 7) that has a positive value, the product of the corresponding elements of array A (71 in FIG. 7) and array Y (74 in FIG. 7) is stored in the corresponding element of array X (73 in FIG. 7), and at the end of the loop (66 in FIG. 6) the completion of parallel execution of all tasks is awaited (83 in FIG. 8). Because the loop executed in parallel (66 in FIG. 6) runs only under the execution conditions for which the conditional expression of the loop (27 in FIG. 2) holds, no tasks are generated, executed, or terminated to no purpose. The RELEASE routine (67 in FIG. 6) is then called to release the processors (84 in FIG. 8).

[0012]

[Effect of the Invention] As described above, according to the present invention, when a loop containing a conditional expression is executed in parallel by the automatic parallelization function of the compiler, the execution conditions under which the conditional expression holds are determined before tasks for parallel execution are generated, and the original parallel-executable loop is transformed into a loop that is executed in parallel under those conditions. Parallel code that does not generate, execute, or terminate tasks to no purpose is thereby generated automatically, and the processing efficiency of the program as a whole can be improved.

[Brief description of drawings]

FIG. 1 is a diagram showing the functional configuration of an embodiment of the automatic parallelization processing method of the present invention.

FIG. 2 is a diagram illustrating a program in the FORTRAN language as an example of the source program 1.

FIG. 3 is a diagram illustrating the program obtained when the program of FIG. 2 is parallelized by the conventional technique.

FIG. 4 is a diagram illustrating the data hierarchy and the state of memory allocation when the program of FIG. 2 is parallelized by the conventional technique.

FIG. 5 is a diagram illustrating the execution status of each task when the program of FIG. 2 is parallelized by the conventional technique.

FIG. 6 is a diagram illustrating the program obtained when the program of FIG. 2 is parallelized by the method of the present invention.

FIG. 7 is a diagram illustrating the data hierarchy and the state of memory allocation when the program of FIG. 2 is parallelized by the method of the present invention.

FIG. 8 is a diagram illustrating the execution status of each task when the program of FIG. 2 is parallelized by the method of the present invention.

FIG. 9 is a diagram showing the flow of the execution condition determination procedure 24 provided in the compiler 2, which is the characteristic feature of the present invention.

[Explanation of symbols]

1 Source program
2 Compiler
3 Object program
4 Run-time routines
5 Linker
6 Executable program
7 Task scheduler
21 Loop recognition procedure
22 Data dependence analysis procedure
23 Parallelism determination procedure
24 Execution condition determination procedure
25 Memory allocation procedure
26 Parallel code generation procedure

Claims

[Claims]

1. An automatic parallelization processing method for a shared-memory multiprocessor computer system that runs an operating system including a compiler that translates a source program to generate an object program for parallel execution, a linker that combines the object program with run-time routines to generate an executable program of multitask structure, and a task scheduler that manages tasks for parallel execution, wherein the compiler includes: loop recognition means that, after the source program to be translated has been input, recognizes the control structure of loops of executable statements that are executed repeatedly in the source program; data dependence analysis means that analyzes the order in which data in the recognized loop are defined and referenced, and recognizes whether the data depend on one another; parallelism determination means that, on the basis of the control structure of the recognized loop and the data dependences, determines which loops can be executed in parallel and recognizes the exclusion/synchronization control that guarantees the order in which data are defined and referenced during parallel execution; execution condition determination means that recognizes a conditional expression in a parallel-executable loop, determines that the parallel-executable part contains no executable statements other than the part executed when the conditional expression holds, and carries out the generation of a parallel-executable loop, the transformation of the loop, and the determination of the execution conditions; memory allocation means that, on the basis of the declaration information of the source program and the information on data dependences during parallel execution, classifies the data into a data hierarchy of inter-task shared data generated by the linker and data private to each task, and allocates them to memory; and parallel code generation means that generates, for the parallel-executable loop obtained by the parallelism determination means, code for parallel execution, including code for exclusion/synchronization control; and wherein the task scheduler includes: task generation means that generates tasks for parallel execution on the basis of the object program containing the parallel code generated by the compiler and of the information generated by the compiler for executing that object program; task allocation means that secures the processors available at the time parallel execution is requested and allocates the tasks for parallel execution to those processors; task control means that schedules the parallel execution tasks and performs exclusion/synchronization control between the tasks; and task release means that releases the tasks and the processors when parallel execution of the tasks ends; the method being characterized by performing the generation of parallel execution code by the compiler, and the handling of parallel execution tasks by the task scheduler together with their automatic memory allocation.