JPH07244647A

JPH07244647A - Method and device for parallelization for indirect reference loop, and method and device for parallel execution

Info

Publication number: JPH07244647A
Application number: JP6062045A
Authority: JP
Inventors: Hiroshi Ota; 寛太田; Tetsuro Saito; 鉄郎斉藤; Masahiro Uminaga; 正博海永; Yasuhiko Saito; 靖彦斎藤
Original assignee: Hitachi Microcomputer System Ltd; Hitachi Ltd
Current assignee: Hitachi Microcomputer System Ltd; Hitachi Ltd
Priority date: 1994-03-07
Filing date: 1994-03-07
Publication date: 1995-09-19

Abstract

PURPOSE: To enable the parallel execution and parallelization of general indirect reference loops, in particular when an indirect reference appears on the left-hand side, by inserting into the indirect reference loop a statement that decides whether the array element referenced in the loop resides on the own processor or on another processor. CONSTITUTION: An initialization statement for a final iteration array last is inserted (121) immediately before the indirect reference loop. A statement that decides whether the indirect reference is a local reference or a remote reference is inserted (122) before the assignment statement in the loop of the program. A statement that records the iteration index in the final iteration array is inserted (123) into the part executed for a local reference. A statement that registers assignment information about the remote reference in an assignment list is inserted (124) into the part executed for a remote reference. Statements for post-processing the remote references are added (125, 126) after the loop.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a program parallelization method and apparatus for converting a program that contains an indirect reference loop and was written for a sequential computer into a program executable on a distributed memory parallel computer, and to a method and apparatus for the parallel execution of indirect reference loops.

[0002]

[Prior Art] A parallel computer system composed of a plurality of processors in which each processor has its own private memory is called a distributed memory parallel computer. When the processing of large arrays that arise in scientific and engineering computation is executed on a distributed memory parallel computer, the usual approach is to partition the array elements among the memories of the processors and to have each processor process its elements in parallel. One way to create a program that implements this approach is to use a parallelizing compiler.

[0003] A parallelizing compiler is a language processor that converts a program written in a sequential language into a program for a parallel computer. A parallelizing compiler for a distributed memory parallel computer partitions array elements and loop iterations among the processors (hereinafter, one repetition of a loop is called an iteration; for example, "the iteration i = 2"), inserts data transfer statements and synchronization statements where necessary, and generates a program for each processor. When the array element referenced in an iteration is assigned to the own processor, the reference is called a "local reference"; when it is assigned to another processor, it is called a "remote reference". A remote reference requires the value of the array element to be transferred by interprocessor communication. What must be taken into account here is that starting an interprocessor communication on a distributed memory parallel computer takes a very long time: the startup normally costs the equivalent of several hundred or more CPU operation steps. If a communication were performed for every individual remote reference, the startup overhead would therefore degrade performance severely. For this reason, it is very important on a distributed memory parallel computer to reduce the number of communications by transferring multiple array elements together.

[0004] If an array element reference has a simple linear subscript such as a[i+1], the range of array elements that are remotely referenced can be determined at compile time, and statements that transfer those elements together can be generated relatively easily. When the subscript is more complex, generating the transfer statements becomes harder. The case of indirect reference, described next, is particularly problematic. An indirect reference is an array element reference whose subscript is itself an array element, such as a[p[i]]. A loop containing an indirect reference is called an indirect reference loop. Indirect reference loops appear frequently in real problems, such as the sparse matrix computations that arise in the finite element method and Gaussian elimination with pivoting. The defining characteristic of an indirect reference loop is that the subscript values are unknown until run time, so it is impossible at compile time to gather the remotely referenced elements and generate data transfer statements for them. Indirect reference loops are therefore harder to parallelize than loops with linear subscripts. One prior art technique for parallelizing indirect reference loops is Koelbel and Mehrotra, "Compiling Global Name-Space Parallel Loops for Distributed Execution", IEEE Trans. on Parallel and Distributed Systems, Vol. 2, No. 4, pp. 440-451, 1991. There, an indirect reference loop is parallelized into a loop based on the following parallel execution method: an "inspector loop", which builds a list containing the position information of the remotely referenced array elements, is executed in advance at the beginning of the program. If the values of the subscript array do not change throughout the program, the inspector loop needs to be executed only once for the whole program. Afterwards, each time the indirect reference loop is executed, the remotely referenced elements can be transferred together on the basis of that list. Another prior art technique is Kubota, Miyoshi, Ohno, Mori, Nakajima, and Tomita, "Automatic Parallelizing Compiler for Distributed Memory Parallel Computers: Speeding Up the Inspector/Executor Algorithm", Parallel Processing Symposium JSPP '93, pp. 47-54. Building on the technique above, it proposes a method for speeding up the inspector loop in special cases, such as when the subscript array is a permutation.

[0005]

[Problems to Be Solved by the Invention] The conventional methods impose special conditions on the indirect reference loops they can parallelize: for example, the values of the subscript array must not change throughout the program, or the subscript array must be a permutation. In addition, although the prior art above shows the form of the parallelized indirect reference loop, it does not clearly state the parallelization procedure for converting an indirect reference loop into that form. Furthermore, the conventional methods do not address the problem that arises when the indirect reference is on the left-hand side of an assignment statement. Consider a loop such as for (i = 0; i < N; i++) a[p[i]] = b[i]; (this statement is written in C, and i++ means that i is increased step by step; the right-hand side of the statement is a known value, and the left-hand side is, so to speak, an unknown value that is determined by the known value of the right-hand side). If p[i] takes the same value for different i, different iterations overwrite the same element of array a, and only the value written by the last of those iterations remains after the loop ends. For example, if p[2] = 12 in a[p[2]] = b[2] and p[6] = 12 in a[p[6]] = b[6], then b[6] remains as the value of a[12]. When this loop is executed in parallel, every array element must have, after the loop ends, the same value as it would have under sequential execution, but the conventional methods do not address this problem. An object of the present invention is to enable the parallel execution and parallelization of general indirect reference loops, and in particular to enable parallel execution and parallelization when an indirect reference is on the left-hand side.
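The last-writer problem described above can be reproduced in a few lines of C. The sketch below is illustrative only: the function name scatter_in_order and its order parameter are hypothetical and do not appear in the patent; they simply let the same loop body run in different iteration orders, with a non-ascending order standing in for an uncontrolled parallel schedule.

```c
#include <stddef.h>

/* Executes a[p[i]] = b[i] for the n iterations listed in `order`.
 * Passing order = 0, 1, ..., n-1 gives the sequential semantics;
 * any other order models an uncontrolled parallel schedule.
 * (Hypothetical helper, not part of the patent text.) */
void scatter_in_order(double *a, const int *p, const double *b,
                      const int *order, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        int i = order[k];
        a[p[i]] = b[i];   /* later writes to the same a[p[i]] win */
    }
}
```

With p[2] = p[6] = 12, the ascending order leaves b[6] in a[12], whereas an order that runs iteration 6 before iteration 2 leaves b[2] there, which is exactly the discrepancy a parallel execution has to avoid.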

[0006]

[Means for Solving the Problems] In a program parallelization method for converting a sequential program or a program for a shared memory parallel computer into a program for a distributed memory parallel computer, the method includes, for a loop containing an indirect reference in which an array subscript is itself an array: a step of inserting into the indirect reference loop a statement that determines whether the referenced array element resides on the own processor or on another processor; a step of inserting into the indirect reference loop a statement that registers information about the array element in a list when the referenced array element resides on another processor; a step of inserting after the indirect reference loop a list exchange statement that exchanges the list between processors; and a step of inserting after the list exchange statement a statement that uses the exchanged list to assign the values of array elements on other processors to array elements on the own processor.

In a method for executing in parallel, on a distributed memory parallel computer, an indirect reference loop in which an array subscript is itself an array, the method includes: a step of determining within the indirect reference loop whether the referenced array element resides on the own processor or on another processor; a step of performing the assignment to the array element when it resides on the own processor; a step of registering information about the array element in a list within the indirect reference loop when it resides on another processor; a step of exchanging the list between processors after the indirect reference loop ends; and a step of using the exchanged list to assign the values of array elements on other processors to array elements on the own processor.

In a program parallelization apparatus for converting a sequential program or a program for a shared memory parallel computer into a program for a distributed memory parallel computer, the apparatus comprises, for a loop containing an indirect reference in which an array subscript is itself an array: means for inserting into the indirect reference loop a statement that determines whether the referenced array element resides on the own processor or on another processor; means for inserting into the indirect reference loop a statement that registers information about the array element in a list when the referenced array element resides on another processor; means for inserting after the indirect reference loop a list exchange statement that exchanges the list between processors; and means for inserting after the list exchange statement a statement that uses the exchanged list to assign the values of array elements on other processors to array elements on the own processor.

In an apparatus for executing in parallel, on a distributed memory parallel computer, an indirect reference loop in which an array subscript is itself an array, the apparatus comprises: means for determining whether the referenced array element resides on the own processor or on another processor; means for performing the assignment to the array element when it resides on the own processor; means for registering information about the array element in a list when it resides on another processor; means for exchanging the list between processors; and means for using the exchanged list to assign the values of array elements on other processors to array elements on the own processor.

In the program parallelization method, for a loop in which an indirect reference whose array subscript is an array appears on the right-hand side of an assignment statement, the method includes: a step of inserting into the indirect reference loop a statement that determines whether the referenced array element resides on the own processor or on another processor; a step of inserting into the indirect reference loop a statement that registers position information about the array element in a position list when it resides on another processor; a step of inserting after the indirect reference loop a position list exchange statement that exchanges the position list between processors; a step of inserting after the position list exchange statement a statement that uses the exchanged position list to create a value list containing the values of the array elements referenced from other processors; a step of inserting after the value list creation statement a value list exchange statement that exchanges the value list between processors; and a step of inserting after the value list exchange statement a statement that uses the exchanged value list to assign the values of array elements on other processors to array elements on the own processor.

In the parallel execution method, the method includes: a step of determining within the indirect reference loop whether the referenced array element resides on the own processor or on another processor; a step of performing the assignment to the array element when it resides on the own processor; a step of registering position information about the array element in a position list within the indirect reference loop when it resides on another processor; a step of exchanging the position list between processors after the indirect reference loop ends; a step of using the exchanged position list to create a value list containing the values of the array elements referenced from other processors; a step of exchanging the value list between processors; and a step of using the exchanged value list to assign the values of array elements on other processors to array elements on the own processor.

In the program parallelization method, for a loop in which an indirect reference whose array subscript is an array appears on the left-hand side of an assignment statement, the method includes: a step of inserting into the program a declaration statement for a final iteration array, which records the index of the iteration that last rewrote each element of the indirectly referenced array; a step of inserting an initialization statement for the final iteration array before the indirect reference loop; a step of inserting into the indirect reference loop a statement that determines whether the referenced array element resides on the own processor or on another processor; a step of inserting into the indirect reference loop a statement that records the iteration index in the final iteration array; a step of inserting into the indirect reference loop a statement that registers assignment information about the array element in an assignment list when it resides on another processor; a step of inserting after the indirect reference loop an assignment list exchange statement that exchanges the assignment list between processors; and a step of inserting after the assignment list exchange statement a statement that uses the exchanged assignment list to assign the values of array elements on other processors to array elements on the own processor.

In the program parallelization method, for a loop in which an indirect reference whose array subscript is an array appears on the left-hand side of an addition assignment statement, the method includes: a step of inserting into the indirect reference loop a statement that determines whether the referenced array element resides on the own processor or on another processor; a step of inserting into the indirect reference loop a statement that registers assignment information about the array element in an assignment list when it resides on another processor; a step of inserting after the indirect reference loop an assignment list exchange statement that exchanges the assignment list between processors; and a step of inserting after the assignment list exchange statement a statement that uses the exchanged assignment list to add the values of array elements on other processors to array elements on the own processor.

In the parallel execution method, the method includes: a step of determining within the indirect reference loop whether the referenced array element resides on the own processor or on another processor; a step of performing the addition assignment to the array element when it resides on the own processor; a step of registering assignment information about the array element in an assignment list within the indirect reference loop when it resides on another processor; a step of exchanging the assignment list between processors after the indirect reference loop ends; and a step of using the exchanged assignment list to add the values of array elements on other processors to array elements on the own processor.
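The left-hand-side variant (a[p[i]] = b[i]) with a final iteration array can be sketched as a single-process simulation. This is a hedged illustration, not the patent's generated code: the processors are simulated by an outer loop, processor numbers start at 0, the block sizes, the AssignEntry layout, and all identifiers are assumptions, and the list exchange is modeled by simply reading another simulated processor's list. In particular, the rule "apply a remote assignment only if its iteration index is larger than the recorded one" is an assumed reading of how the recorded index is used; the text here only states that the index is recorded so that the last iteration's value survives.

```c
enum { NPROC = 4, BLOCK = 4, N = NPROC * BLOCK };

/* Assumed layout of one piece of assignment information. */
typedef struct { int owner_idx; int iter; double value; } AssignEntry;

/* lst[r][o]: assignments that referrer r wants owner o to perform. */
static AssignEntry lst[NPROC][NPROC][BLOCK];
static int cnt[NPROC][NPROC];

/* Simulates a[p[i]] = b[i], 0 <= i < N, with 0 <= p[i] < N. */
void scatter_parallel_sim(double *a, const double *b, const int *p)
{
    int last[N];                       /* final iteration array */
    for (int j = 0; j < N; j++) last[j] = -1;   /* initialization */

    for (int me = 0; me < NPROC; me++) {        /* each processor's loop */
        for (int o = 0; o < NPROC; o++) cnt[me][o] = 0;
        for (int i = me * BLOCK; i < (me + 1) * BLOCK; i++) {
            int j = p[i];
            int o = j / BLOCK;         /* owner of a[j] */
            if (o == me) {             /* local: assign and record i */
                a[j] = b[i];
                last[j] = i;
            } else {                   /* remote: register in the list */
                AssignEntry e = { j, i, b[i] };
                lst[me][o][cnt[me][o]++] = e;
            }
        }
    }

    /* After the (simulated) exchange, each owner applies the entries
     * it received, keeping only writes from later iterations. */
    for (int o = 0; o < NPROC; o++)
        for (int r = 0; r < NPROC; r++)
            for (int k = 0; k < cnt[r][o]; k++) {
                AssignEntry e = lst[r][o][k];
                if (e.iter > last[e.owner_idx]) {
                    a[e.owner_idx] = e.value;
                    last[e.owner_idx] = e.iter;
                }
            }
}
```

Under these assumptions the result matches the sequential loop even when several iterations on different simulated processors target the same element of a.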

[0007]

[Operation] According to the parallel execution method of the present invention, the list of information about remote references is built while the indirect reference loop is executing, so parallel execution is possible even when the subscript array changes within the program. Even when the indirect reference is on the left-hand side, the index of the iteration that rewrote each array element is recorded, so only the value written by the last iteration is left after the loop ends. According to the parallelization method of the present invention, a program that implements the above parallel execution method can be generated.

[0008]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a flowchart showing the procedure of a program parallelization method according to one embodiment of the present invention. Before program parallelization is described with reference to FIG. 1, the configuration of the parallel computer on which a program parallelized by the present invention runs, and the method by which the present invention executes an indirect reference loop in parallel on that computer, are described. FIG. 2 shows an example of the configuration of a distributed memory parallel computer to which the present invention applies. This parallel computer executes programs parallelized by the program parallelization method of the present invention. It consists of a plurality of processors 201 to 20n, local memories 211 to 21n attached to the respective processors, and an interconnection network 22 that connects them. Data in each local memory can be referenced directly by the processor to which that memory is attached, but cannot be referenced directly by the other processors. For data attached to one processor to be referenced by another processor, the data must be transferred through the interconnection network 22.

[0009] FIG. 3 shows an example of an indirect reference loop. Line 1 declares arrays a and b, each consisting of 10000 elements of real type, and line 2 declares array p, consisting of 10000 elements of integer type. Lines 3 to 5 are the indirect reference loop. The subscript of array b on the right-hand side of the assignment statement on line 4 is the array p. When this loop is executed on a distributed memory parallel computer, the loop iterations and the array elements are partitioned and assigned to the processors, for example as follows:

Processor 1: iterations 0 to 99, array elements a[0] to a[99], b[0] to b[99], p[0] to p[99]
Processor 2: iterations 100 to 199, array elements a[100] to a[199], b[100] to b[199], p[100] to p[199]

and so on for the remaining processors. Here the iteration number is the value of the variable i. With this assignment, when an iteration i is executed, the array element references a[i] and p[i] are always assigned to the own processor; that is, they are local references. The array element reference b[p[i]], however, is not necessarily assigned to the own processor; it may be a remote reference. Whether b[p[i]] is a local reference or a remote reference depends on the value of p[i] and is not determined until the program runs. The parallel execution method for this indirect reference loop is described below with this situation in mind. In what follows, the processor to which an array element is assigned is called the "owner", and the processor to which the iteration referencing that element is assigned is called the "referrer" (for example, when b[p[2]] = b[120], the iteration i = 2 that references element b[120] is assigned to processor 1, so processor 1 is the referrer in this case). The index of the indirectly referenced array element, that is, the value of p[i] (120, the value of p[2], in the example above), is called the "owner index", and the iteration index i (i = 2 in the example above) is called the "referrer index".
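The local/remote decision for the block distribution in this example reduces to an integer division. The following is a minimal sketch under the example's assumptions (100 consecutive elements per processor, processors numbered from 1); the names BLOCK_SIZE, owner_of, and is_local are hypothetical and are not taken from the patent.

```c
/* Block distribution from the example: 100 consecutive elements per
 * processor, processors numbered from 1. All names are assumed. */
enum { BLOCK_SIZE = 100 };

/* Processor number that owns array index idx (idx >= 0). */
int owner_of(int idx)
{
    return idx / BLOCK_SIZE + 1;
}

/* Nonzero when a reference to index idx made on processor `me`
 * is a local reference, zero when it is a remote reference. */
int is_local(int me, int idx)
{
    return owner_of(idx) == me;
}
```

For the example in the text, iteration i = 2 runs on processor owner_of(2) = 1, while p[2] = 120 gives owner_of(120) = 2, so b[p[2]] is a remote reference with processor 1 as the referrer and processor 2 as the owner.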

【００１０】図５は、図２の並列計算機上での、図３の
間接参照ループの本発明による並列実行方法の手順を示
すフローチャートである。本手順は、並列計算機の各プ
ロセッサ２０１から２０ｎが実行するものである。ま
ず、各プロセッサは、自身に割り当てられたイタレーシ
ョンの各々について、ステップ４００からステップ４０
３の処理を行う。全イタレーションについての処理が終
了した後で、ステップ４０４からステップ４０７の処理
を実行する。以下では各ステップの処理の詳細を述べ
る。ステップ４００は自身に割り当てられたイタレーシ
ョンのうち、未処理のものがまだあるかどうかの判定で
ある。未処理のイタレーションがあればステップ４０１
に進み、以下、そのイタレーションについての処理を行
う。ステップ４０１で間接参照ｂ［ｐ［ｉ］］がｌｏｃ
ａｌ参照かｒｅｍｏｔｅ参照かを判定する。ループの実
行時には添字配列ｐ［ｉ］の値は確定しているのでこの
判定が可能である。もしｌｏｃａｌ参照ならばステップ
４０２に進み、ｒｅｍｏｔｅ参照ならばステップ４０３
に進む。ステップ４０２でｌｏｃａｌ参照についての代
入を実行する。ｌｏｃａｌ参照されている配列要素は自
プロセッサ上にあるので、この代入に際してプロセッサ
間通信は必要ない。ステップ４０３ではｒｅｍｏｔｅ参
照についての位置情報を位置リストに登録する。ここで
位置情報とは、要素ｂ［ｐ［ｉ］］の所有者のプロセッ
サ番号（例えば、ｂ［ｐ［２］］＝ｂ［１２０］であれ
ば、所有者のプロセッサ番号は２である。）、所有者イ
ンデックスｐ［ｉ］の値（例えば、ｐ［２］＝１２０な
ら値は１２０である。）、および参照者インデックスｉ
の値（例えば、ｂ［ｐ［２］］なら、この値は２であ
る。）である。位置リストの構造は図６を用いて後述す
る。ステップ４００で未処理のイタレーションがもうな
ければステップ４０４に進み、以下、位置リストに基づ
いてｒｅｍｏｔｅ参照の後処理をする。ステップ４０４
に進んだ時点では、各プロセッサの位置リストには、自
身が参照者であるようなｒｅｍｏｔｅ参照についての位
置情報が登録されている。これを「参照者側位置リス
ト」と呼ぶ。ステップ４０４では、ｒｅｍｏｔｅ参照の
位置情報が参照者から所有者に渡るように、全プロセッ
サ間で位置リストを交換する。このとき参照者情報も渡
される。交換方法としては、例えば、Ｊｏｈｎｓｓｏｎ
ａｎｄＨｏ， ”ＯｐｔｉｍｕｍＢｒｏａｄｃａ
ｓｔｉｎｇａｎｄＰｅｒｓｏｎａｌｉｚｅｄＣｏ
ｍｍｕｎｉｃａｔｉｏｎｉｎＨｙｐｅｒｃｕｂｅ
ｓ”，ＩＥＥＥＴｒａｎｓ．ｏｎＣｏｍｐｕｔ
ｅｒｓ，Ｖｏｌ．３８，Ｎｏ．９，ｐｐ．１２４
９−１２６８，１９８９年に述べられている全対全
通信などの方法を用いれば良い。この交換により、各プ
ロセッサは、自身が所有者であるようなｒｅｍｏｔｅ参
照についての位置リストを持つようになる。これを「所
有者側位置リスト」と呼ぶ。ステップ４０５では、所有
者側位置リストに基づいて、自身が所有者となっている
ｒｅｍｏｔｅ参照の値を含む値リスト（これを「所有者
側値リスト」と呼ぶ）を作成する。値リストの構造は図
７を用いて後述する。ステップ４０６で、ｒｅｍｏｔｅ
参照の値が所有者から参照者に渡るように、値リストを
全プロセッサ間で交換する。この交換により、各プロセ
ッサは、自身が参照者であるようなｒｅｍｏｔｅ参照に
ついての値リストを持つようになる。これを「参照者側
値リスト」と呼ぶ。ステップ４０７で、参照者側値リス
トに基づいて、自身が参照者であるｒｅｍｏｔｅ参照に
ついて、ｂ［ｐ［ｉ］］の値を左辺の配列要素ａ［ｉ］
に代入する。以上の処理により、図３の間接参照ループ
の並列実行が完了した。FIG. 5 is a flow chart showing the procedure of the parallel execution method according to the present invention of the indirect reference loop of FIG. 3 on the parallel computer of FIG. This procedure is executed by each of the processors 201 to 20n of the parallel computer. First, each processor performs steps 400 to 40 for each iteration assigned to itself.
Process 3 is performed. After the processing for all iterations is completed, the processing from step 404 to step 407 is executed. The details of each step are described below. Step 400 determines whether any unprocessed iterations assigned to the processor remain. If so, the process proceeds to step 401 and handles that iteration. Step 401 determines whether the indirect reference b[p[i]] is a local reference or a remote reference. This determination is possible because the value of the subscript array element p[i] is fixed when the loop is executed. If it is a local reference, the process proceeds to step 402; if it is a remote reference, it proceeds to step 403. In step 402, the assignment for the local reference is executed. Since the locally referenced array element is on the processor itself, no interprocessor communication is needed for this assignment. In step 403, the position information about the remote reference is registered in the position list. Here, the position information consists of the processor number of the owner of element b[p[i]] (for example, if b[p[2]] = b[120] is owned by processor 2, this number is 2), the value of the owner index p[i] (for example, if p[2] = 120, the value is 120), and the value of the referencer index i (for example, for b[p[2]], this value is 2). The structure of the position list is described later with reference to the drawings. If step 400 finds no unprocessed iteration, the process proceeds to step 404, and post-processing of the remote references is then performed based on the position list. On reaching step 404, the position list of each processor holds the position information for the remote references of which that processor is the referencer. This is called the "referencer-side position list". In step 404, the position lists are exchanged among all processors so that the position information for each remote reference passes from the referencer to the owner. The referencer information is passed along as well. As the exchange method, an all-to-all communication scheme such as the one described in Johnsson and Ho, "Optimum Broadcasting and Personalized Communication in Hypercubes", IEEE Trans. on Computers, Vol. 38, No. 9, pp. 1249-1268, 1989 may be used. After this exchange, each processor holds a position list for the remote references of which it is the owner. This is called the "owner-side position list". In step 405, based on the owner-side position list, each processor creates a value list containing the values of the remote references it owns (called the "owner-side value list"). The structure of the value list is described later with reference to the drawings. In step 406, the value lists are exchanged among all processors so that the value of each remote reference passes from the owner to the referencer. After this exchange, each processor holds a value list for the remote references of which it is the referencer. This is called the "referencer-side value list". In step 407, based on the referencer-side value list, the value of b[p[i]] is assigned to the left-side array element a[i] for each remote reference of which the processor is the referencer. This completes the parallel execution of the indirect reference loop of FIG. 3.
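Steps 400 to 407 can be sketched as a single-process Python simulation. This is only an illustration under stated assumptions, not the patent's implementation: the block distribution in `owner`, the requirement that `b` and `p` have the same length, and all function and variable names are hypothetical, and the all-to-all exchanges of steps 404 and 406 are modelled as plain list transposition.

```python
def owner(index, block):
    """Processor that owns array element `index` (assumed block distribution)."""
    return index // block


def run_gather_loop(b, p, nproc):
    """Simulate the parallel execution of `a[i] = b[p[i]]` on `nproc` processors."""
    n = len(p)                       # assumption: len(b) == len(p)
    block = (n + nproc - 1) // nproc  # assumed block size, ceil(n / nproc)
    a = [None] * n

    # Steps 400-403: each processor walks the iterations assigned to it.
    # ref_lists[me][own] holds (owner index, referencer index) pairs.
    ref_lists = [[[] for _ in range(nproc)] for _ in range(nproc)]
    for me in range(nproc):
        for i in range(me * block, min((me + 1) * block, n)):
            own = owner(p[i], block)
            if own == me:
                a[i] = b[p[i]]                        # step 402: local assignment
            else:
                ref_lists[me][own].append((p[i], i))  # step 403: register position

    # Step 404: exchange position lists (referencer -> owner).
    owner_lists = [[ref_lists[me][own] for me in range(nproc)]
                   for own in range(nproc)]

    # Step 405: each owner builds value lists in position-list order.
    owner_values = [[[b[j] for (j, _) in owner_lists[own][me]]
                     for me in range(nproc)]
                    for own in range(nproc)]

    # Step 406: exchange value lists (owner -> referencer).
    ref_values = [[owner_values[own][me] for own in range(nproc)]
                  for me in range(nproc)]

    # Step 407: each referencer pairs values with its own position list.
    for me in range(nproc):
        for own in range(nproc):
            for (_, i), value in zip(ref_lists[me][own], ref_values[me][own]):
                a[i] = value
    return a
```

For any valid `p`, the returned list should equal the serial result `[b[k] for k in p]`, regardless of the number of simulated processors.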

【００１１】Next, the structure of the position list and the value list used in this parallel execution method is described. FIG. 6 shows the structure of the referencer-side position list. It consists of a header 500, which has one entry per owner, and a list body 501 linked from each header entry by a pointer. One entry of the list body 501 corresponds to each remote reference. The list body 501 consists of two fields, 504 and 505, which hold the owner index and the referencer index. The header 500 contains a field 502 holding the owner's processor number and a field 503 holding the number of pairs registered in the list body (the registration count). For example, the first item (hatched portion) of the position list in FIG. 6 indicates that the value of array element b[66], held by processor 4, is to be assigned to array element a[5] of this processor. The structure of the owner-side position list is almost the same as that of the referencer-side position list; the only difference is that the owner field 502 in the header 500 is replaced by a field holding the referencer.
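One possible in-memory shape for the position list is sketched below in Python. The dictionary-of-lists layout stands in for the header and pointer-linked list body of FIG. 6; the explicit list argument of `put_locinfo` and the helpers `make_position_list`, `header`, and `to_owner_side` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def make_position_list():
    """Header keyed by processor number; each key maps to a list body."""
    return defaultdict(list)


def put_locinfo(position_list, owner_proc, owner_index, referencer_index):
    """Register one remote reference as an (owner index, referencer index) pair."""
    position_list[owner_proc].append((owner_index, referencer_index))


def header(position_list):
    """Return (processor number, registration count) rows, like fields 502 and 503."""
    return [(proc, len(body)) for proc, body in sorted(position_list.items())]


def to_owner_side(referencer_lists):
    """Turn {referencer: position list} into {owner: {referencer: list body}}.

    This models the exchange of step 404: the header key changes from
    the owner's processor number to the referencer's processor number.
    """
    owner_side = defaultdict(make_position_list)
    for referencer, position_list in referencer_lists.items():
        for owner_proc, body in position_list.items():
            owner_side[owner_proc][referencer] = list(body)
    return owner_side
```

Registering `put_locinfo(plist, 4, 66, 5)` reproduces the example entry in the text: b[66] on processor 4 is to be assigned to a[5] on the registering processor.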

【００１２】FIG. 7 shows the structure of the owner-side value list. Like the position list, it consists of a header 510 and a list body 511. The header 510 has one entry per referencer. The list body 511 holds the values of the array elements to be transferred. The values are placed in the same order as the indexes in the position list. For example, the shaded portion of the value list in FIG. 7 indicates that the value 13.2 is referenced by processor 5. Which array element on the referencer side receives the value 13.2 is determined, once the value list has been sent to the referencer, by its positional correspondence with the referencer-side position list. The structure of the referencer-side value list is almost the same as that of the owner-side value list; the only difference is that the referencer field 512 in the header 510 is replaced by a field holding the owner. This concludes the description of the parallel execution method for an indirect reference loop according to the present invention.
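The order correspondence between the value list and the position list can be illustrated with a short Python sketch. The function names `build_value_list` and `assign_from_values`, the dictionary representation of the locally held array parts, and the sample numbers other than 13.2 are assumptions made for the example.

```python
def build_value_list(b_local, owner_position_list):
    """Step 405 sketch: for each referencer, collect values in position-list order.

    b_local: {owner index: value} for the elements this processor owns.
    owner_position_list: {referencer: [(owner index, referencer index), ...]}.
    """
    return {
        referencer: [b_local[owner_index] for (owner_index, _) in body]
        for referencer, body in owner_position_list.items()
    }


def assign_from_values(a_local, referencer_body, values):
    """Step 407 sketch: the k-th value goes to the k-th referencer index."""
    for (_, referencer_index), value in zip(referencer_body, values):
        a_local[referencer_index] = value
```

Because the values carry no index of their own, the referencer can place them only if both lists keep the same order, which is why the text requires the order of the values to match the order of the indexes.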

【００１３】Next, a parallel execution apparatus for the parallel execution method of the indirect reference loop according to the present invention is described. FIG. 8 shows an example of such an apparatus. Each processor 20 includes an arithmetic unit 230, a remote reference determination unit 231, a position information registration unit 232, a network control unit 233, a value list creation unit 234, a remote reference assignment unit 235, a referencer-side position list 240, an owner-side position list 241, an owner-side value list 242, and a referencer-side value list 243. The lists 240 to 243 may also be placed in the local memory 21. The remote reference determination unit 231 performs step 401 of FIG. 5: it receives the owner index p[i] of the indirect reference from the arithmetic unit 230 and determines whether the reference is local or remote. The position information registration unit 232 performs step 403 of FIG. 5: it registers the position information about a remote reference in the referencer-side position list 240. The network control unit 233 performs steps 404 and 406 of FIG. 5: it exchanges the position lists 240 and 241 and the value lists 242 and 243 between processors through the interconnection network 22. The value list creation unit 234 performs step 405 of FIG. 5: it creates the owner-side value list 242 based on the owner-side position list 241. The arithmetic unit 230 performs ordinary arithmetic processing, including the assignment for local references in step 402 of FIG. 5. The remote reference assignment unit 235 performs step 407 of FIG. 5: using the referencer-side value list 243, it executes the assignments for the remote references of which the processor is the referencer.

【００１４】Returning to FIG. 1, an embodiment of the program parallelization method of the present invention is described in detail. The parallelization method of this embodiment targets indirect reference loops of the form shown in FIG. 3, that is, loops that satisfy the following conditions.
(1) The loop body is a single assignment statement.
(2) Each side of the assignment statement is a single array element reference, and the arrays on the two sides are different.
(3) The subscript of the left-side array is the loop iteration index i.
(4) The subscript of the right-side array is another array; that is, the right side is an indirect reference.
By this parallelization method, the indirect reference loop of FIG. 3 is converted into the parallel execution program 301 shown in FIG. 4. The program of FIG. 4 implements the parallel execution method 40 shown in FIG. 5.

【００１５】The parallelization method of FIG. 1 is described in detail below with reference to the program of FIG. 4. In step 100, a statement that determines whether the indirect reference is a local reference or a remote reference is inserted before the assignment statement in the loop. This corresponds to line 4 of FIG. 4. Here, owner(p[i]) is a library function that returns the owner processor number for the index value p[i], and _self is a variable holding the processor's own number. The else on line 6 is also inserted in this step. In step 101, a statement that registers the position information about the remote reference in the position list is inserted after the else, that is, in the part executed for a remote reference. The registration statement is the call to the library procedure put_locinfo() on line 7 of FIG. 4 (corresponding to step 403 of FIG. 5). The arguments of the procedure are the information to be registered: the owner processor number owner(p[i]), the owner index p[i], and the referencer index i. In steps 102 to 105, statements that perform the post-processing of the remote references are inserted after the loop. The post-processing takes the form of library procedure calls. In step 102, a statement that exchanges the position lists, namely a call to the library procedure exchange_loclist() (corresponding to step 404 of FIG. 5), is inserted. In step 103, a statement that creates the value list, namely a call to the library procedure make_vallist(b) (corresponding to step 405 of FIG. 5), is inserted. The indirectly referenced array b is passed as the argument. In step 104, a statement that exchanges the value lists, namely a call to the library procedure exchange_vallist() (corresponding to step 406 of FIG. 5), is inserted. In step 105, a statement that executes the assignments for the remote references on the referencer side, namely a call to the library procedure remote_assign(a) (corresponding to step 407 of FIG. 5), is inserted. The assignment target array a is passed as the argument. These library procedure calls are inserted on lines 9 to 12 of the parallelized program of FIG. 4. Note that ep1 and ep2 on line 3 of FIG. 7 are variables representing the range of iterations assigned to each processor, and a different value is set for each processor. Also, the range of the array assigned to each processor is actually from ep1 to ep2, but for simplicity the array sizes in the declarations on lines 1 and 2 of FIG. 4 are left as 10000.
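How owner(p[i]), _self, ep1, and ep2 might fit together can be sketched in Python. The text does not say how arrays and iterations are distributed, so the block distribution, the inclusive upper bound ep2, and the extra `n` and `nproc` parameters are all assumptions of this sketch; `classify_iterations` is a hypothetical helper that mirrors the inserted if/else test.

```python
def block_size(n, nproc):
    """Ceiling of n / nproc: elements per processor under a block distribution."""
    return -(-n // nproc)


def iteration_range(self_id, n, nproc):
    """Return (ep1, ep2), the inclusive iteration range of processor `self_id`."""
    block = block_size(n, nproc)
    ep1 = self_id * block
    ep2 = min(ep1 + block, n) - 1
    return ep1, ep2


def owner(index, n, nproc):
    """Processor number that owns array element `index`."""
    return index // block_size(n, nproc)


def classify_iterations(p, self_id, nproc):
    """Split this processor's iterations into local and remote references."""
    n = len(p)
    ep1, ep2 = iteration_range(self_id, n, nproc)
    local, remote = [], []
    for i in range(ep1, ep2 + 1):
        if owner(p[i], n, nproc) == self_id:
            local.append(i)    # handled by the plain assignment
        else:
            remote.append(i)   # would be registered in the position list
    return local, remote
```

With an array size of 10000 and four processors, this distribution gives processor 1 the iterations 2500 to 4999.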

【００１６】This concludes the description of the method for parallelizing an indirect reference loop according to the present invention. In this embodiment and the following ones, the parallelized program is shown in source program form, but the present invention applies equally when the parallelized program is in object program form. Likewise, this embodiment and the following ones use a serial program such as that of FIG. 3 as the example of the program before parallelization, but the present invention applies equally when that program is written for a shared-memory parallel computer. Finally, this embodiment and the following ones describe the case of a one-dimensional array and a single loop, but multidimensional arrays and nested loops can be parallelized in the same way.

【００１７】Next, as another embodiment of the present invention, a parallel execution method and a parallelization method for a loop with an indirect reference on the left side of the assignment statement are described. FIG. 9 shows an example of such a loop. The subscript of array a on the left side of the assignment statement in the loop is the array p. In this loop, when p[i] takes the same value for different values of i, different iterations overwrite the same element of array a. Of the values written, only the one written by the last iteration remains after the loop ends. For example, if a[p[2]] = b[2] with p[2] = 12 and a[p[6]] = b[6] with p[6] = 12, then b[6] remains as the value of a[12]. When this loop is executed in parallel, each array element must hold the same value after the loop as it would after serial execution. To solve this problem, the parallel execution method of the present invention introduces a new integer array of the same size as the left-side array a. This array is called the "final iteration array". The elements of the final iteration array are distributed among the processors in the same way as those of the left-side array a. Each element holds the index of the iteration that last overwrote the corresponding element of array a (6 in the example above).
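The ordering problem and the role of the final iteration array can be seen in a small Python sketch. The helper names `scatter` and `last_writer` and the sample data are hypothetical; only p[2] = p[6] = 12 and the initial value of -1 are taken from the text.

```python
def scatter(a, p, b, order):
    """Execute a[p[i]] = b[i] for the iterations i in the given order."""
    a = list(a)
    for i in order:
        a[p[i]] = b[i]
    return a


def last_writer(p, size):
    """Final iteration array: index of the last iteration that writes each element.

    Elements that are never written keep the initial value -1.
    """
    last = [-1] * size
    for i, target in enumerate(p):
        last[target] = i
    return last


# Iterations 2 and 6 both write a[12].
p = [0, 1, 12, 3, 4, 5, 12, 7]
b = [10.0 * i for i in range(8)]
a0 = [0.0] * 16

serial = scatter(a0, p, b, range(8))               # a[12] ends up as b[6]
reordered = scatter(a0, p, b, reversed(range(8)))  # a[12] ends up as b[2]
```

The two runs disagree on a[12], which is why a parallel execution needs the recorded iteration index to decide which write survives.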

【００１８】FIG. 11 is a flowchart showing the procedure of the parallel execution method of the present invention for the indirect reference loop of FIG. 9. In step 420, an initial value is set in each element of the final iteration array. The initial value is one less than the minimum iteration index; for the loop of FIG. 9, the initial value is -1. Step 421 determines whether any unprocessed iterations assigned to the processor remain. If so, the process proceeds to step 422 and handles that iteration. Step 422 determines whether the indirect reference a[p[i]] is a local reference or a remote reference. If it is a local reference, the process proceeds to step 423; if it is a remote reference, it proceeds to step 425. In step 423, the assignment for the local reference is executed. For the iteration whose assignment was executed, step 424 records its index i in the final iteration array. In step 425, the substitution information about the remote reference is registered in the "substitution list". The substitution information registered here consists of the processor number of the owner of element a[p[i]], the value of the owner index p[i], the value of the referencer index i, and the value of the right-side array element b[i].
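The per-processor part of this procedure (steps 420 to 425) might look as follows in Python. The function name `local_phase`, the dictionary representation of the locally owned part of a, and the `owner` callable passed in by the caller are assumptions of this sketch, not names from the patent.

```python
def local_phase(self_id, iterations, p, b, a_local, owner):
    """Run steps 420-425 for one processor.

    a_local: {index: value} for the elements of a owned by this processor
             (updated in place for local references).
    owner:   callable mapping an array index to its owner processor number.
    Returns (last, substitution_list), where substitution_list maps an owner
    processor number to a list of (value, owner index, referencer index).
    """
    last = {index: -1 for index in a_local}   # step 420: initialise to -1
    substitution_list = {}
    for i in iterations:                      # step 421: next unprocessed iteration
        target = p[i]
        dest = owner(target)
        if dest == self_id:                   # step 422: local or remote?
            a_local[target] = b[i]            # step 423: local assignment
            last[target] = i                  # step 424: record iteration index
        else:                                 # step 425: register for the owner
            substitution_list.setdefault(dest, []).append((b[i], target, i))
    return last, substitution_list
```

For instance, with an assumed distribution of eight elements per processor, iteration 3 writing the value 26.5 to a[35] would be registered under processor 4 as `(26.5, 35, 3)`.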

【００１９】The structure of the substitution list is now described. FIG. 12 shows the structure of the substitution list. Like the position list, it consists of a header 520, which has one entry per owner, and a list body 521 linked from each header entry by a pointer. The list body 521 consists of three fields, 524, 525, and 526, which hold the value of the right-side array element b[i], the owner index p[i], and the referencer index i, respectively. For example, the first item (hatched portion) of the substitution list in FIG. 12 indicates that the value 26.5 is to be assigned by iteration 3 to array element a[35], held by processor 4.

【００２０】Returning to FIG. 11, if step 421 finds no unprocessed iteration, the process proceeds to step 426, and post-processing of the remote references is then performed based on the substitution list. In step 426, the substitution lists are exchanged among all processors. After this exchange, each processor holds a substitution list for the remote references of which it is the owner. Steps 427 to 430 are performed for each entry of the list body 521 of the substitution list. Step 427 determines whether any unprocessed entry remains in the list body 521. If none remains, the parallel execution ends. If one remains, the process proceeds to step 428 and handles that entry. Step 428 compares the referencer index 526 in the entry with the element of the final iteration array that corresponds to the owner index 525 in the entry. For example, if the referencer index in the entry is i = 2, the owner index is p[2] = 12, and last[12] = 6 in the final iteration array last, then 2 and 6 are compared. If the former is less than or equal to the latter, the entry represents an assignment by an iteration earlier than the one recorded in the final iteration array. This assignment must therefore not be executed, and the process returns to step 427 to handle the next entry. In the example just given, 2 (the former) < 6 (the latter), so the assignment is not executed. If the former is greater than the latter, the process proceeds to step 429. In step 429, the assignment for the remote reference is executed. For example, if the referencer index in the entry is i = 6, the owner index is p[6] = 12, and last[12] = 2 in the final iteration array last, then 6 (the former) > 2 (the latter), so the assignment is executed: the content of the value field 524 of the substitution list entry is assigned to the element of the left-side array a that corresponds to the owner index 525. Then, in step 430, the referencer index 526 is assigned to the element of the final iteration array that corresponds to the owner index 525.
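The owner-side loop of steps 427 to 430 reduces to a compare-and-update per entry. The Python sketch below assumes entries are tuples in the field order 524, 525, 526 (value, owner index, referencer index); the function name and the dictionary containers are hypothetical.

```python
def apply_substitution_entries(a_local, last, entries):
    """Apply received substitution list entries on the owner side.

    a_local: {index: value} for the owned elements of a (updated in place).
    last:    {index: iteration} final iteration array (updated in place).
    entries: iterable of (value, owner_index, referencer_index).
    """
    for value, owner_index, referencer_index in entries:   # step 427
        if referencer_index <= last[owner_index]:           # step 428
            continue                  # an equal or later iteration already wrote
        a_local[owner_index] = value                         # step 429
        last[owner_index] = referencer_index                 # step 430
```

Because every applied entry raises the recorded index, entries from different processors can arrive in any order and the element still ends up with the value from the highest-numbered iteration, matching serial execution.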

【００２１】Next, with reference to FIG. 13, an embodiment of the program parallelization method of the present invention for a loop with an indirect reference on the left side is described in detail. The parallelization method of this embodiment targets indirect reference loops of the form shown in FIG. 9, that is, loops that satisfy the following conditions.
(1) The loop body is a single assignment statement.
(2) Each side of the assignment statement is a single array element reference, and the arrays on the two sides are different.
(3) The subscript of the right-side array is the loop iteration index i.
(4) The subscript of the left-side array is another array; that is, the left side is an indirect reference.
By this parallelization method, the indirect reference loop of FIG. 9 is converted into the parallel execution program 311 shown in FIG. 10. The program of FIG. 10 implements the parallel execution method shown in the figure. The parallelization method of FIG. 13 is described in detail below with reference to the program 311 of FIG. 10. In step 120, a declaration of the final iteration array last is inserted; it is line 3 of FIG. 10. In step 121, an initialization statement for the final iteration array last is inserted immediately before the indirect reference loop. The initialization statement is the call to the library procedure initialize() on line 4 of FIG. 10 (corresponding to step 420 of FIG. 11). The arguments of the procedure are the array last to be initialized and the initial value -1. In step 122, a statement that determines whether the indirect reference is a local reference or a remote reference is inserted before the assignment statement in the loop. This corresponds to line 6 of FIG. 10. The else on line 9 is also inserted in this step. In step 123, a statement that records the iteration index in the final iteration array is inserted before the else, that is, in the part executed for a local reference. It is line 8 of the figure. In step 124, a statement that registers the substitution information about the remote reference in the substitution list is inserted after the else, that is, in the part executed for a remote reference. The registration statement is the call to the library procedure put_asgninfo() on line 10 of FIG. 10 (corresponding to step 425 of FIG. 11). The arguments of the procedure are the information to be registered: the owner processor number owner(p[i]), the owner index p[i], the referencer index i, and the right-side array element b[i]. In steps 125 and 126, statements that perform the post-processing of the remote references are inserted after the loop. In step 125, a statement that exchanges the substitution lists, namely a call to the library procedure exchange_asgnlist() (corresponding to step 426 of FIG. 11), is inserted. It is inserted on line 12 of FIG. 10. In step 126, a statement that executes the assignments for the remote references on the owner side, namely a call to the library procedure remote_assign_1(a), is inserted. The assignment target array a is passed as the argument. It is inserted on line 13 of FIG. 10. This library procedure performs steps 427 to 430 of FIG. 11: for each entry of the substitution list, it compares the referencer index with the element of the final iteration array and executes the assignment only when the former is greater than the latter. When it executes the assignment, it also records the index in the final iteration array. This concludes the description of the parallelization method for a loop with an indirect reference on the left side according to the present invention.

【００２２】Next, as yet another embodiment of the present invention, a parallel execution method and a parallelization method for the case where the left side of an addition assignment statement (explained below) is an indirect reference are described. The loops targeted by this embodiment satisfy the following conditions.
(1) The loop body is a single addition assignment statement.
(2) Each side of the assignment statement is a single array element reference, and the arrays on the two sides are different.
(3) The subscript of the right-side array is the loop iteration index i.
(4) The subscript of the left-side array is another array; that is, the left side is an indirect reference.
FIG. 14 shows an example of a loop targeted by this embodiment. Line 4 is the addition assignment statement. The addition assignment operator '+=' in it means that the right-side element is added into the left-side element; that is, the statement is equivalent to a[p[i]] = a[p[i]] + b[i]. Processing of this kind, in which values are added into array elements one after another, appears frequently in numerical programs. Parallel execution of this loop is therefore of great practical importance.

【００２３】When a loop of this form is executed in parallel, the commutative and associative laws of addition apply, so the iterations may be executed in any order. Unlike the previous embodiment, therefore, no final iteration array is needed. Nor is the referencer index needed in the body of the substitution list. The same holds when the operation contained in the assignment operator is, for example, multiplication. The parallel execution method for this loop is basically the same as that shown in FIG. 11, but for the reasons above many steps can be omitted: the initialization of the final iteration array in step 420, the recording of the index in the final iteration array in steps 424 and 430, and the index comparison in step 428. The parallelization method for this loop is likewise basically the same as that shown in FIG. 13, but several steps can be omitted: the insertion of the declaration of the final iteration array in step 120, the insertion of its initialization statement in step 121, and the insertion of the index recording statement in step 123.
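The simplified scheme for `a[p[i]] += b[i]` can be sketched as a single-process Python simulation. The block distributions of elements and iterations, the function name, and the modelling of the list exchange as direct appends to per-owner lists are assumptions of this sketch. Note also that floating-point addition is not exactly associative, so with real-valued data the parallel result may differ from the serial one in rounding; the example therefore uses integers.

```python
def accumulate_parallel(a, p, b, nproc):
    """Simulate a[p[i]] += b[i] on `nproc` processors without a final iteration array."""
    result = list(a)
    elem_block = -(-len(a) // nproc)   # assumed block size for array elements
    iter_block = -(-len(p) // nproc)   # assumed block size for iterations

    # Per-owner substitution lists: (value, owner index) only, no referencer index.
    substitution_lists = [[] for _ in range(nproc)]
    for me in range(nproc):
        for i in range(me * iter_block, min((me + 1) * iter_block, len(p))):
            dest = p[i] // elem_block
            if dest == me:
                result[p[i]] += b[i]                       # local accumulation
            else:
                substitution_lists[dest].append((b[i], p[i]))  # defer to the owner

    # Owner side: apply every received entry; no index comparison is needed.
    for dest in range(nproc):
        for value, index in substitution_lists[dest]:
            result[index] += value
    return result
```

Each entry is applied unconditionally, which is the point of the simplification: the order in which contributions reach the owner does not affect the sum.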

【００２４】FIG. 15 shows the parallel execution program obtained by parallelizing the indirect reference loop of FIG. 14 with this parallelization method. Compared with the program of FIG. 10 in the previous embodiment, the statements concerning the final iteration array are gone. In addition, the arguments of put_asgninfo_2 on line 7, the library procedure that registers an entry in the substitution list (corresponding to step 425 of FIG. 11), do not include the referencer index i. exchange_asnglist_2 on line 9 is a library procedure that exchanges substitution lists without referencer indexes between processors (corresponding to step 426 of FIG. 11). remote_assign_2 on line 10 is a library procedure that executes the remote assignments on the owner side (corresponding to steps 427 and 429 of FIG. 11); unlike remote_assign_1 of FIG. 10, it includes neither the index comparison of step 428 nor the index recording of step 430 of FIG. 11.

【００２５】Next, as yet another embodiment of the present invention, a method for parallelizing a loop containing multiple statements and multiple indirect references is described. This parallelization method converts such a loop into a combination of loops each containing a single indirect reference, as described so far, and then parallelizes them. The parallelization method of this embodiment targets loops that satisfy the following conditions.
(1) The loop contains only assignment statements or addition assignment statements. There may be more than one statement.
(2) The loop contains one or more indirect references.
(3) Control neither jumps out of the loop nor jumps into it.
(4) The right side of each assignment statement is an expression composed of array elements or scalar variables.
(5) The subscript array of an indirect reference is not defined within the loop.
(6) An indirectly referenced array that appears on the left side of an assignment statement is not referenced anywhere else in the loop.
A loop that satisfies these conditions is hereinafter called a "general indirect reference loop". The loops targeted in the embodiments so far are called "basic indirect reference loops". For simplicity, this embodiment assumes that the loop index starts at 0 and increases by 1, but the same method applies otherwise. FIG. 16 shows an example of a loop targeted by this parallelization method.

【００２６】FIG. 18 shows the procedure of the parallelization method of this embodiment. Steps 140 through 145 of FIG. 18 decompose the general indirect reference loop 330 of FIG. 16 into the combination 331 of basic indirect reference loops shown in FIG. 17.

【００２７】The parallelization method of FIG. 18 is described in detail below with reference to programs 330 and 331 of FIGS. 16 and 17. Steps 140 through 145 are performed for each indirect reference in the loop. Step 140 determines whether an indirect reference remains in the original loop. If none remains, the original loop has already been decomposed, and processing proceeds to step 146. If one remains, processing proceeds to step 141, and the following steps handle that indirect reference. Step 141 generates a temporary array to hold the value of the indirect reference. The size of the temporary array is the number of loop iterations, and its type is the type of the indirectly referenced array. This temporary array is written _tmp[] below. Step 142 determines whether the indirect reference is on the right side or the left side; processing proceeds to step 143 if it is on the right side and to step 144 if it is on the left side.
Step 143 inserts the following basic indirect reference loop before the original loop:
for (i = ...) _tmp[i] = indirect reference;
Here i is the loop control variable. For example, for the indirect reference b[q[i]] on line 2 of FIG. 16, the loop on lines 1 and 2 of FIG. 17 is inserted, and for the indirect reference e[t[i]] on line 3 of FIG. 16, the basic indirect reference loop on lines 3 and 4 of that figure is inserted.
Step 144 inserts the following basic indirect reference loop after the original loop:
for (i = ...) indirect reference = _tmp[i];
If the original assignment statement uses the addition-assignment operator '+=', the assignment statement of the inserted basic indirect reference loop also uses '+='. For example, for the indirect reference a[p[i]] on line 2 of FIG. 16, the loop on lines 9 and 10 of FIG. 17 is inserted, and for the indirect reference d[s[i]] on line 3 of FIG. 16, the basic indirect reference loop on lines 11 and 12 of FIG. 17 is inserted.
Step 145 replaces the indirect reference in the original loop with the temporary array reference _tmp[i]. If the indirect reference is on the left side, its assignment operator is replaced with '=', whatever it was. Through step 145, the original loop of FIG. 16 is transformed into lines 5 through 8 of FIG. 17. The transformations up to step 145 eliminate the indirect references from the original loop; in their place, steps 143 and 144 have generated indirect reference loops. The generated loops are basic indirect reference loops of the kind targeted in the preceding embodiments and can be parallelized by the methods described above. Step 146 parallelizes each of the generated basic indirect reference loops according to those methods. This completes the parallelization of a loop that contains multiple indirect references.
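
The decomposition can be illustrated with a small hypothetical loop. The sketch below is not necessarily the program of FIG. 16 or FIG. 17; the array names a, b, d, e and the subscript arrays p, q, s, t follow the names used in the text, while the function names, the exact statements, the separate temporary arrays (one per indirect reference), and the bound MAX_ITER are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_ITER = 16 };  /* illustrative bound on the iteration count */

/* A hypothetical general indirect reference loop: two statements,
 * four indirect references. */
void general_loop(size_t n, double *a, const size_t *p, const double *b,
                  const size_t *q, double *d, const size_t *s,
                  const double *e, const size_t *t)
{
    for (size_t i = 0; i < n; i++) {
        a[p[i]] = b[q[i]];
        d[s[i]] += e[t[i]];
    }
}

/* The same computation after the decomposition: every loop except the
 * middle one contains exactly one indirect reference. */
void decomposed_loops(size_t n, double *a, const size_t *p, const double *b,
                      const size_t *q, double *d, const size_t *s,
                      const double *e, const size_t *t)
{
    double tmp_b[MAX_ITER], tmp_e[MAX_ITER], tmp_a[MAX_ITER], tmp_d[MAX_ITER];
    size_t i;

    assert(n <= MAX_ITER);

    /* Step 143: right-side references become loops before the original loop. */
    for (i = 0; i < n; i++) tmp_b[i] = b[q[i]];
    for (i = 0; i < n; i++) tmp_e[i] = e[t[i]];

    /* Step 145: the original loop now touches only temporary arrays;
     * the left-side operator is '=' regardless of the original operator. */
    for (i = 0; i < n; i++) {
        tmp_a[i] = tmp_b[i];
        tmp_d[i] = tmp_e[i];
    }

    /* Step 144: left-side references become loops after the original loop,
     * keeping '+=' where the original statement used it. */
    for (i = 0; i < n; i++) a[p[i]] = tmp_a[i];
    for (i = 0; i < n; i++) d[s[i]] += tmp_d[i];
}
```

Both functions leave a and d in the same state for the same inputs. Condition (6) is what makes the reordering safe: because a and d are not referenced anywhere else in the loop, delaying their updates until after the original loop cannot change any value that the loop reads.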

【００２８】FIG. 19 shows the configuration of a parallelizing compiler 6 that carries out the parallelization method of the present invention. The parallelizing compiler 6 includes a syntax analysis unit 60, a general indirect reference loop decomposition unit 61, a basic indirect reference loop parallelization unit 62, a normal loop parallelization unit 63, and a code generation unit 64. The general indirect reference loop decomposition unit 61 includes a temporary array generation unit 610, a basic indirect reference loop generation unit 611, and an indirect reference replacement unit 612. The basic indirect reference loop parallelization unit 62 includes a remote/local determination statement insertion unit 620, an information registration statement insertion unit 621, a list exchange statement insertion unit 622, a remote assignment execution statement insertion unit 623, a value list creation statement insertion unit 624, and a final iteration array generation unit 625. The syntax analysis unit 60 reads the pre-parallelization program 30 and generates the intermediate language 70. The intermediate language 70 is the compiler's internal representation of the program; its form does not differ from that used in ordinary compilers, so it is not described in detail here.

【００２９】The general indirect reference loop decomposition unit 61 performs steps 140 through 145 of FIG. 18; that is, it decomposes a general indirect reference loop into a combination of basic indirect reference loops. Within it, the temporary array generation unit 610 performs step 141 of FIG. 18; that is, it generates a temporary array for each indirect reference in the original loop. The basic indirect reference loop generation unit 611 performs steps 143 and 144 of FIG. 18; that is, it generates a basic indirect reference loop before or after the original loop, depending on whether the indirect reference is on the right side or the left side. The indirect reference replacement unit 612 performs step 145 of FIG. 18; that is, it replaces each indirect reference in the original loop with a reference to the temporary array generated by the temporary array generation unit 610.

【００３０】The basic indirect reference loop parallelization unit 62 performs step 146 of FIG. 18; that is, it performs the parallelization shown in FIG. 1 or FIG. 13, depending on the kind of basic indirect reference loop. Within it, the remote/local determination statement insertion unit 620 performs step 100 of FIG. 1 and step 122 of FIG. 13; that is, it inserts into the loop a statement that determines whether the indirect reference is a remote reference or a local reference. The information registration statement insertion unit 621 performs step 101 of FIG. 1 and step 124 of FIG. 13; that is, it inserts into the loop a statement that registers information in the position list or the assignment list. The list exchange statement insertion unit 622 performs step 102 of FIG. 1 and step 125 of FIG. 13; that is, it inserts into the program a statement that exchanges the position list or the assignment list between processors. The remote assignment execution statement insertion unit 623 performs step 105 of FIG. 1 and step 126 of FIG. 13; that is, it inserts a statement that uses the exchanged list to execute the assignments for remote references. The value list creation statement insertion unit 624 performs step 103 of FIG. 1; that is, for a right-side indirect reference loop, it inserts a statement that creates the value list after the position list exchange statement. The final iteration array generation unit 625 performs steps 120 and 121 of FIG. 13; that is, for a left-side indirect reference loop, it inserts the declaration statement and the initialization statement of the final iteration array. The normal loop parallelization unit 63 parallelizes loops that are not indirect reference loops. The code generation unit 64 reads the intermediate language 70 and generates the post-parallelization program 31. These processes do not differ from those of conventional parallelizing compilers, so they are not described in detail here.
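
For a left-side indirect reference loop, the run-time behavior that the inserted statements produce can be sketched roughly as follows. The sketch simulates the processors inside a single address space and is illustrative only: the names AsgnEntry and simulate_left_side_loop, the block distribution of both array elements and iterations, the list capacity, and the initial value -1 of the final iteration array are all assumptions, not details taken from the figures.

```c
#include <assert.h>
#include <stddef.h>

enum { NPROC = 2, BLOCK = 4, NELEM = NPROC * BLOCK, MAX_LIST = 32 };

/* Hypothetical assignment-list entry: target element, referrer iteration, value. */
typedef struct {
    size_t index;  /* global index of the target element */
    long iter;     /* loop iteration that performed the assignment */
    double value;  /* value to assign */
} AsgnEntry;

/* Simulates the parallel execution of
 *     for (i = 0; i < n; i++) a[p[i]] = b[i];
 * with a[] block-distributed over NPROC processors and iteration i
 * executed by processor i / BLOCK (an assumed distribution). */
void simulate_left_side_loop(double a[NELEM], const size_t *p,
                             const double *b, size_t n)
{
    long last[NELEM];                   /* final iteration array */
    AsgnEntry inbox[NPROC][MAX_LIST];   /* lists as held by each owner after the exchange */
    size_t inbox_len[NPROC] = {0};

    assert(n <= NELEM);
    for (size_t g = 0; g < NELEM; g++)
        last[g] = -1;                   /* "not yet written" (assumed initial value) */

    /* Loop phase: each processor runs its own iterations. */
    for (size_t me = 0; me < NPROC; me++) {
        for (size_t i = me * BLOCK; i < (me + 1) * BLOCK && i < n; i++) {
            size_t owner = p[i] / BLOCK;            /* remote/local determination */
            assert(owner < NPROC);
            if (owner == me) {                      /* local: assign, record iteration */
                a[p[i]] = b[i];
                last[p[i]] = (long)i;
            } else {                                /* remote: register in the list */
                assert(inbox_len[owner] < MAX_LIST);
                inbox[owner][inbox_len[owner]++] =
                    (AsgnEntry){ p[i], (long)i, b[i] };
            }
        }
    }

    /* Post-processing on each owner: apply an entry only if it comes from a
     * later iteration than the one that last wrote the element. */
    for (size_t owner = 0; owner < NPROC; owner++) {
        for (size_t k = 0; k < inbox_len[owner]; k++) {
            const AsgnEntry *e = &inbox[owner][k];
            if (last[e->index] < e->iter) {
                a[e->index] = e->value;
                last[e->index] = e->iter;
            }
        }
    }
}
```

The comparison against the final iteration array is what keeps the result equal to that of the sequential loop when several iterations, possibly on different processors, write the same element: the value from the highest-numbered iteration always wins, regardless of the order in which list entries arrive.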

【００３１】[0031]

[Effects of the Invention] According to the present invention, a program containing a general indirect reference loop can be parallelized for a distributed memory parallel computer. A program containing an indirect reference loop in which the indirect reference is on the left side can likewise be parallelized for a distributed memory parallel computer. Furthermore, a general indirect reference loop can be executed in parallel on a distributed memory parallel computer, as can an indirect reference loop in which the indirect reference is on the left side.

[Brief description of drawings]

FIG. 1 is a flowchart of an embodiment of the method for parallelizing an indirect reference loop according to the present invention.

FIG. 2 is a diagram showing a configuration example of a distributed memory parallel computer that executes a program parallelized by the parallelization method of the present invention.

FIG. 3 is a diagram showing an example of a right-side indirect reference loop before parallelization.

FIG. 4 is a diagram showing the indirect reference loop that results from applying the parallelization method of FIG. 1 to the indirect reference loop of FIG. 3.

FIG. 5 is a flowchart of an embodiment of the method for parallel execution of a right-side indirect reference loop according to the present invention.

FIG. 6 is a diagram showing the structure of the position list used in the parallel execution method of FIG. 5.

FIG. 7 is a diagram showing the structure of the value list used in the parallel execution method of FIG. 5.

FIG. 8 is a diagram showing a configuration example of an apparatus that implements the parallel execution method of FIG. 5.

FIG. 9 is a diagram showing an example of a left-side indirect reference loop before parallelization.

FIG. 10 is a diagram showing the program that results from applying the parallelization method of the present invention to the left-side indirect reference loop of FIG. 9.

FIG. 11 is a flowchart of an embodiment of the method for parallel execution of a left-side indirect reference loop according to the present invention.

FIG. 12 is a diagram showing the structure of the assignment list used in the parallel execution method of FIG. 11.

FIG. 13 is a flowchart of an embodiment of the method for parallelizing a left-side indirect reference loop according to the present invention.

FIG. 14 is a diagram showing an example of an addition-assignment indirect reference loop before parallelization.

FIG. 15 is a diagram showing the program that results from applying the parallelization method of the present invention to the addition-assignment indirect reference loop of FIG. 14.

FIG. 16 is a diagram showing an example of a general indirect reference loop.

FIG. 17 is a diagram showing the program that results from applying the method of the present invention to the general indirect reference loop of FIG. 16 and decomposing it into a combination of basic indirect reference loops.

FIG. 18 is a flowchart of an embodiment of the method for parallelizing a general indirect reference loop according to the present invention.

FIG. 19 is a diagram showing an example of a parallelizing compiler that carries out the indirect reference loop parallelization method of the present invention.

[Explanation of symbols]

22: Interconnection network
20, 201 to 20n: Processors
21, 211 to 21n: Local memories
230: Arithmetic unit
231: Remote reference determination unit
232: Position information registration unit
233: Network control unit
234: Value list creation unit
235: Remote reference assignment unit
240: Referrer-side position list
241: Owner-side position list
242: Owner-side value list
243: Referrer-side value list
500, 510, 520: Headers
501, 511, 521: List bodies

Continuation of front page
(72) Inventor: Masahiro Uminaga, c/o Systems Development Laboratory, Hitachi, Ltd., 1099 Ozenji, Asao-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Yasuhiko Saito, c/o Systems Development Laboratory, Hitachi, Ltd., 1099 Ozenji, Asao-ku, Kawasaki-shi, Kanagawa

Claims

[Claims]

1. A program parallelization method for converting a serial processing program or a shared memory parallel computer program into a distributed memory parallel computer program, the method comprising, for a loop including an indirect reference whose array subscript is itself an array: a step of inserting into the indirect reference loop a statement that determines whether the referenced array element resides on its own processor or on another processor; a step of inserting into the indirect reference loop a statement that, when the referenced array element resides on another processor, registers information about the array element in a list; a step of inserting, after the indirect reference loop, a list exchange statement that exchanges the list between processors; and a step of inserting, after the list exchange statement, a statement that uses the exchanged list to assign the value of an array element of another processor to an array element of its own processor.

2. A method for executing, in parallel on a distributed memory parallel computer, an indirect reference loop whose array subscript is itself an array, the method comprising: a step of determining, within the indirect reference loop, whether the referenced array element resides on its own processor or on another processor; a step of performing the assignment to the array element when it resides on its own processor; a step of registering, within the indirect reference loop, information about the array element in a list when it resides on another processor; a step of exchanging the list between processors after the indirect reference loop ends; and a step of using the exchanged list to assign the value of an array element of another processor to an array element of its own processor.

3. A program parallelizing apparatus for converting a serial processing program or a shared memory parallel computer program into a distributed memory parallel computer program, the apparatus comprising, for a loop including an indirect reference whose array subscript is itself an array: means for inserting into the indirect reference loop a statement that determines whether the referenced array element resides on its own processor or on another processor; means for inserting into the indirect reference loop a statement that, when the referenced array element resides on another processor, registers information about the array element in a list; means for inserting, after the indirect reference loop, a list exchange statement that exchanges the list between processors; and means for inserting, after the list exchange statement, a statement that uses the exchanged list to assign the value of an array element of another processor to an array element of its own processor.

4. An apparatus for executing, in parallel on a distributed memory parallel computer, an indirect reference loop whose array subscript is itself an array, the apparatus comprising: means for determining whether the referenced array element resides on its own processor or on another processor; means for performing the assignment to the array element when it resides on its own processor; means for registering information about the array element in a list when it resides on another processor; means for exchanging the list between processors; and means for using the exchanged list to assign the value of an array element of another processor to an array element of its own processor.

5. A program parallelization method for converting a serial processing program or a shared memory parallel computer program into a distributed memory parallel computer program, the method comprising, for a loop including a plurality of indirect references whose array subscripts are arrays: a step of decomposing the loop and converting it into a combination of a plurality of loops each including one indirect reference; and a step of parallelizing each of the decomposed loops according to the parallelization method of claim 1.

6. The program parallelization method according to claim 5, wherein the step of decomposing the indirect reference loop and converting it into a combination of a plurality of loops each including one indirect reference comprises, for each indirect reference: a step of generating a temporary array; a step of replacing the indirect reference in the indirect reference loop with a reference to the temporary array; and a step of generating, before or after the indirect reference loop, a new loop including an assignment statement that has the indirect reference and the temporary array element on its two sides.

7. A program parallelization method for converting a serial processing program or a shared memory parallel computer program into a distributed memory parallel computer program, the method comprising, for a loop in which an indirect reference whose array subscript is an array appears on the right side of an assignment statement: a step of inserting into the indirect reference loop a statement that determines whether the referenced array element resides on its own processor or on another processor; a step of inserting into the indirect reference loop a statement that, when the array element resides on another processor, registers position information about the array element in a position list; a step of inserting, after the indirect reference loop, a position list exchange statement that exchanges the position list between processors; a step of inserting, after the position list exchange statement, a statement that uses the position list to create a value list containing the values of the array elements referenced by other processors; a step of inserting, after the value list creation statement, a value list exchange statement that exchanges the value list between processors; and a step of inserting, after the value list exchange statement, a statement that uses the exchanged value list to assign the value of an array element of another processor to an array element of its own processor.

8. The program parallelization method according to claim 7, wherein the position information registered in the position list includes the processor number of the processor that owns the referenced array element, the index of the array element, and the index of the loop iteration that references the array element.

9. A method for executing, in parallel on a distributed memory parallel computer, a loop in which an indirect reference whose array subscript is an array appears on the right side of an assignment statement, the method comprising: a step of determining, within the indirect reference loop, whether the referenced array element resides on its own processor or on another processor; a step of executing the assignment for the array element when it resides on its own processor; a step of registering, within the indirect reference loop, position information about the array element in a position list when it resides on another processor; a step of exchanging the position list between processors after the indirect reference loop; a step of using the exchanged position list to create a value list containing the values of the array elements referenced by other processors; a step of exchanging the value list between processors; and a step of using the exchanged value list to assign the value of an array element of another processor to an array element of its own processor.

10. The parallel execution method of an indirect reference loop according to claim 9, wherein the position information registered in the position list includes the processor number of the processor that owns the referenced array element, the index of the array element, and the index of the loop iteration that references the array element.

11. A program parallelization method for converting a serial processing program or a shared memory parallel computer program into a distributed memory parallel computer program, the method comprising, for a loop in which an indirect reference whose array subscript is an array appears on the left side of an assignment statement: a step of inserting, before the indirect reference loop, a declaration statement of a final iteration array, which records the index of the iteration in which each element of the indirectly referenced array was last rewritten, and an initialization statement for the final iteration array; a step of inserting into the indirect reference loop a statement that determines whether the referenced array element resides on its own processor or on another processor; a step of inserting into the indirect reference loop a statement that records the iteration index in the final iteration array; a step of inserting into the indirect reference loop a statement that, when the array element resides on another processor, registers assignment information about the array element in an assignment list; a step of inserting, after the indirect reference loop, an assignment list exchange statement that exchanges the assignment list between processors; and a step of inserting, after the assignment list exchange statement, a statement that uses the exchanged assignment list to assign the array element value of another processor to an array element of its own processor.

12. The program parallelization method according to claim 11, wherein the assignment information registered in the assignment list includes the processor number of the processor that owns the referenced array element, the index of the array element, the index of the loop iteration that references the array element, and the value to be assigned to the array element.

13. A method for executing, in parallel on a distributed memory parallel computer, a loop in which an indirect reference whose array subscript is an array appears on the left side of an assignment statement, the method comprising: a step of initializing a final iteration array that records the index of the iteration that last rewrote each array element; a step of determining, within the indirect reference loop, whether the referenced array element resides on its own processor or on another processor; a step of, when the array element resides on its own processor, executing the assignment to the array element and recording the iteration index in the final iteration array; a step of, when the array element resides on another processor, registering assignment information about the array element in an assignment list within the indirect reference loop; a step of exchanging the assignment list between processors after the indirect reference loop ends; and a step of using the exchanged assignment list to assign the array element value of another processor to an array element of its own processor.

14. The parallel execution method of an indirect reference loop according to claim 13, wherein the assignment information registered in the assignment list includes the processor number of the processor that owns the referenced array element, the index of the array element, the index of the loop iteration that references the array element, and the value to be assigned to the array element.

15. The parallel execution method of an indirect reference loop according to claim 13, wherein the step of using the exchanged assignment list to assign the array element value of another processor to an array element of its own processor comprises: a step of comparing the index stored in the final iteration array with the loop iteration index in the assignment list; and a step of, when the former is less than the latter, executing the assignment and storing the latter in the final iteration array.

16. A parallel execution method of an indirect reference loop for executing, in parallel on a distributed memory parallel computer, a loop in which an indirect reference whose array subscript is an array appears on the left side of an assignment statement, wherein an array is provided that records the index of the iteration in which each element of the indirectly referenced array was last rewritten.

17. A program parallelization method for converting a serial processing program or a shared memory parallel computer program into a distributed memory parallel computer program, the method comprising, for a loop in which an indirect reference whose array subscript is an array appears on the left side of an addition-assignment statement: a step of inserting into the indirect reference loop a statement that determines whether the referenced array element resides on its own processor or on another processor; a step of inserting into the indirect reference loop a statement that, when the array element resides on another processor, registers assignment information about the array element in an assignment list; a step of inserting, after the indirect reference loop, an assignment list exchange statement that exchanges the assignment list between processors; and a step of inserting, after the assignment list exchange statement, a statement that uses the exchanged assignment list to add and assign the value of an array element of another processor to an array element of its own processor.

18. The program parallelization method according to claim 17, wherein the assignment information registered in the assignment list includes the processor number of the processor that owns the referenced array element, the index of the array element, and the value to be assigned to the array element.

19. A method for executing, in parallel on a distributed memory parallel computer, a loop in which an indirect reference whose array subscript is an array appears on the left side of an addition-assignment statement, the method comprising: a step of determining, within the indirect reference loop, whether the referenced array element resides on its own processor or on another processor; a step of performing the addition assignment to the array element when it resides on its own processor; a step of registering, within the indirect reference loop, assignment information about the array element in an assignment list when it resides on another processor; a step of exchanging the assignment list between processors after the indirect reference loop; and a step of using the exchanged assignment list to add and assign the value of an array element of another processor to an array element of its own processor.

20. The parallel execution method of an indirect reference loop according to claim 19, wherein the assignment information registered in the assignment list includes the processor number of the processor that owns the referenced array element, the index of the array element, and the value to be assigned to the array element.