JPH03172958A

JPH03172958A - Synchronous processing method, system and method for parallel processing, and parallel program generator

Info

Publication number: JPH03172958A
Application number: JP31274689A
Authority: JP
Inventors: Yuji Onishi; 裕二大西; Keisuke Toyama; 圭介十山
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1989-11-30
Filing date: 1989-11-30
Publication date: 1991-07-26

Abstract

PURPOSE: To execute a program efficiently by having each processor wait for and synchronize with the other processors once its processing reaches a prescribed waiting position, and by having each processor select the position at which execution restarts after the wait according to its processing result. CONSTITUTION: Each of processors 1 to 4 waits for and synchronizes with the other processors based on first information indicating whether the processing of each of processors 1 to 4 has reached a prescribed waiting position. In addition, each of processors 1 to 4 selects its execution restart position after the wait based on second information corresponding to the processing results in processors 1 to 4. That is, processors 1 to 4 are synchronized with one another and branch to different processing according to the processing results. The system can therefore handle not only for-type loops but also while-type loops. Because the first and second information are held outside the processors, the information can be written and referenced efficiently.

Description

[Detailed Description of the Invention] [Industrial Application Field] The present invention relates to a synchronous processing method, a parallel processing system, a parallel processing method, and a parallelized program generation device. More specifically, it relates to a synchronous processing method, a parallel processing system, a parallel processing method, and a parallelized program generation device that allow processing to proceed while the processors are synchronized with one another when a plurality of processors perform processing in parallel.

[Prior Art] Parallel processing systems perform processing in parallel on a plurality of processors in order to improve the effective processing speed of a program.

In a parallel processing system, a program is divided into several parts, and the divided programs are executed in parallel, each on a different processor. If there is an order relationship between the definition and the reference of data (a data dependence) among the divided programs, that order relationship must be guaranteed in order to obtain a correct result. In other words, because the result differs depending on whether data is referenced after it is defined or defined after it is referenced, each processor must execute its divided program with the guarantee that the order of data definition and reference is the same as in the program before division.
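The point about definition and reference order can be illustrated with a minimal sketch. The function names and values below are hypothetical and are not taken from the patent; they only show that swapping a definition and a reference of the same variable changes the result.

```python
def define_then_reference():
    """The reference to x happens after x is redefined."""
    x = 1          # initial value
    x = 10         # definition of x
    y = x + 1      # reference to x sees the new value
    return y


def reference_then_define():
    """The reference to x happens before x is redefined."""
    x = 1          # initial value
    y = x + 1      # reference to x sees the old value
    x = 10         # definition of x comes too late to affect y
    return y


print(define_then_reference(), reference_then_define())
```

If the two statements were placed on different processors without synchronization, either ordering could occur, which is why the original order has to be enforced.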

For this reason, processors in charge of programs that have data dependences execute their programs while synchronizing with one another so that the order of data definition and reference remains correct.

One known conventional technique for this kind of synchronization is a synchronous processing method called barrier synchronization.

In barrier synchronization, a certain point is set as a barrier in every divided program. When a processor's processing reaches the barrier, it does not move on to the next processing but waits for the processing of the other processors to reach the barrier. Synchronization is achieved by moving on to the next processing only after the processing of all processors has reached the barrier.
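As a software analogy of the barrier described above, the following sketch uses Python's standard `threading.Barrier`. It is only an illustration of the general idea, not the hardware mechanism of the patent; the function name `run_with_barrier` and the log format are assumptions made for this example.

```python
import threading


def run_with_barrier(num_workers=4):
    """Run num_workers threads that all wait at one barrier point."""
    barrier = threading.Barrier(num_workers)
    log = []
    lock = threading.Lock()

    def worker(pid):
        with lock:
            log.append(("before", pid))   # work done before the barrier
        barrier.wait()                    # wait until every worker arrives
        with lock:
            log.append(("after", pid))    # work done after the barrier

    threads = [threading.Thread(target=worker, args=(pid,))
               for pid in range(1, num_workers + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


log = run_with_barrier(4)
print(log)
```

Because no thread can pass `barrier.wait()` until all four have reached it, every "before" entry is recorded ahead of every "after" entry, whatever the thread scheduling.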

General descriptions of barrier synchronization can be found in, for example, "Parallel Processing Machines" by Shinji Tomita et al. (Ohmsha) and IEEE Software, January 1988, pp. 34-42. A concrete implementation method is described in, for example, IEEE Transactions on Computers, August 1988, pp. 991-1004.

[Problem to Be Solved by the Invention] The concrete implementation of barrier synchronization described in the above IEEE Transactions on Computers, August 1988, pp. 991-1004 targets iterative processing in which the step of the variable controlling the iteration is a constant and the number of iterations is fixed at compile time. Because this corresponds to the for statement in the C language, it is referred to below as a for-type loop.

In general, the target of parallel processing is iterative processing, which accounts for most of a program's execution time, so the above conventional technique, which is applicable to for-type loops, is useful.

However, there is also iterative processing in which the step of the variable controlling the iteration is not a constant, or in which the number of iterations is not fixed at compile time. Because this corresponds to the while statement in the C language, it is referred to below as a while-type loop. The problem with the above conventional technique is that it cannot be applied to such while-type loops.
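The distinction between the two loop types can be sketched as follows. Both functions are hypothetical helpers written for this illustration, and the first assumes a positive constant step: for a for-type loop the trip count follows from the bounds alone, whereas for a while-type loop it can only be found by inspecting the data.

```python
def for_type_trip_count(start, stop, step):
    """Trip count of 'for (i = start; i < stop; i += step)'.

    Assumes step > 0. The count is known without running the loop.
    """
    return max(0, (stop - start + step - 1) // step)


def while_type_trip_count(a, start=1):
    """Trip count of 'while (a[i] != 0) i++', starting at index start.

    The count depends on the contents of a, so it cannot be fixed in advance.
    Assumes a contains a 0 at or after index start.
    """
    i = start
    count = 0
    while a[i] != 0:
        count += 1
        i += 1
    return count


print(for_type_trip_count(0, 10, 3))
print(while_type_trip_count([None, 5, 7, 0, 9]))
```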

Accordingly, an object of the present invention is to provide a synchronous processing method that is applicable not only to for-type loops but also to while-type loops. Further objects are to provide a parallel processing system that executes this synchronous processing method, a parallel processing method that uses this synchronous processing method, and a parallelized program generation device that generates a program executing this synchronous processing method.

[Means for Solving the Problem] In a first aspect, the present invention provides a synchronous processing method in which processors are made to wait for one another in order to coordinate the progress of parallel processing by a plurality of processors, characterized in that first information indicating whether the processing of each processor has reached a predetermined waiting location, and second information corresponding to the processing result in each processor, are held outside the processors on a per-processor basis; each processor waits for and synchronizes with the other processors based on the first information; and each processor selects its execution restart location after the wait based on the second information.

In a second aspect, the present invention provides a parallel processing system characterized by comprising: a plurality of processors; first register means having synchronization detection bits, each assigned to one of the processors and set by the corresponding processor; second register means having restart location designation bits, each assigned to one of the processors and set by the corresponding processor; synchronization determination means that determines synchronization from the state of the synchronization detection bits of the first register means and instructs each processor to wait or to resume execution; and execution restart location designation means that designates an execution restart location to each processor based on the state of the restart location designation bits of the second register means. The first register means and the second register means may physically be a single register.

In a third aspect, the present invention provides a parallel processing method in which a plurality of processors execute processing in parallel, characterized by comprising the steps of: outputting, to the outside of the processor, first information indicating that the processing of each processor has reached a predetermined waiting location; outputting, to the outside of the processor, second information corresponding to the processing result of each processor; waiting or resuming execution based on the first information output from the other processors; and selecting an execution restart location based on the second information output from the other processors.

In a fourth aspect, the present invention provides a parallelized program generation device that generates a parallel processing program from a sequential processing program, characterized by comprising: means for adding to the parallel processing program a step of outputting, to the outside of the processor, first information indicating that processing has reached a predetermined waiting location in the program; means for adding to the parallel processing program a step of outputting, to the outside of the processor, second information corresponding to a processing result in the program; means for adding to the parallel processing program a step of waiting or resuming execution based on the first information output from other processors; and means for adding to the parallel processing program a step of selecting an execution restart location based on the second information output from other processors. This parallelized program generation device may take the form of a compiler.

[Operation] In the synchronous processing method according to the first aspect of the invention, each processor waits for and synchronizes with the other processors based on the first information, which indicates whether the processing of each processor has reached a predetermined waiting location. In addition, each processor selects its execution restart location after the wait based on the second information, which corresponds to the processing result in each processor. That is, the processors are synchronized and can also be made to branch to different processing according to the processing results. As a result, not only for-type loops but also while-type loops can be handled. Furthermore, because the first and second information are held outside the processors, the information can be written and referenced efficiently.

In the parallel processing system according to the second aspect of the invention, each processor sets its assigned synchronization detection bit in the first register means according to whether its processing has reached a predetermined waiting location. The synchronization determination means determines synchronization from the state of the synchronization detection bits of the first register means and instructs each processor to wait or to resume execution. In addition, each processor sets its assigned restart location designation bit in the second register means according to its processing result.

The execution restart location designation means then designates an execution restart location to each processor based on the state of the restart location designation bits of the second register means. As a result, the processors are synchronized and can also be made to branch to different processing according to the processing results, so not only for-type loops but also while-type loops can be handled.

In the parallel processing method according to the third aspect of the invention, a processor outputs, to the outside of the processor, first information indicating that its processing has reached a predetermined waiting location and second information corresponding to its processing result. It then resumes processing at a timing based on the first information output from the other processors, and from an execution restart location based on the second information output from the other processors. The processor can thus synchronize with the other processors and branch to different processing according to the processing results, so not only for-type loops but also while-type loops can be handled.

In the parallelized program generation device according to the fourth aspect of the invention, when a parallel processing program is generated from a sequential processing program, the following steps are added to the parallel processing program: a step of outputting, to the outside of the processor, first information indicating that processing has reached a predetermined waiting location; a step of outputting, to the outside of the processor, second information corresponding to the processing result; a step of waiting or resuming execution based on the first information output from other processors; and a step of selecting an execution restart location based on the second information output from other processors. The generated parallel processing program can therefore handle not only for-type loops but also while-type loops.

[Embodiment] The present invention is described in more detail below with reference to the embodiment shown in the drawings. The invention is not limited to this embodiment.

FIG. 1 is a block diagram of a parallel processing system 100 that implements the synchronous processing method of the present invention.

This parallel processing system 100 comprises: four processors 1, 2, 3, and 4; a shared main storage device 10; a register A 101 holding bits A1, A2, A3, and A4, which indicate whether each of processors 1 to 4 is in wait processing or is executing processing other than wait processing; a register B 102 holding bits B1, B2, B3, and B4, which designate the location at which each of processors 1 to 4 resumes program execution after wait processing ends; a register A determination circuit 103, which evaluates bits A1 to A4 of register A 101 and issues an execution restart instruction 109 to each of processors 1 to 4; and a register B determination circuit 104, which evaluates bits B1 to B4 of register B 102 and issues processor control signals 105a, 105b, 106a, 106b, 107a, 107b, 108a, and 108b designating the execution restart location to each of processors 1 to 4.

The register A determination circuit 103 takes the logical AND of all bits A1 to A4 of register A 101 and outputs the result to each of processors 1 to 4 as the execution restart instruction 109.

The register B determination circuit 104 encodes bits B1 to B4 of register B 102 and outputs the result to each of processors 1 to 4 as processor control signals 105a to 108b. FIG. 2 shows the relationship between the values of bits B1 to B4 of register B 102 and processor control signals 105a to 108b. An asterisk (*) in FIG. 2 indicates that the value may be either '0' or '1'.
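The two determination circuits can be modeled in a few lines. This is a sketch under stated assumptions, not the circuit itself: the function names are hypothetical, and because FIG. 2 is not reproduced in this text, the encoding rule in `register_b_decision` is inferred only from the two cases worked through in the description (all B bits '0' gives '00' for every processor; (B1, B2, B3, B4) = (0, 1, 1, 0) gives '01', '11', '10', '10'). The actual table in FIG. 2 may differ for other inputs.

```python
def register_a_decision(bits_a):
    """Model of circuit 103: logical AND of all bits of register A.

    Returns 1 (execution restart instruction asserted) only when every
    processor has set its bit, otherwise 0.
    """
    return 1 if all(bits_a) else 0


def register_b_decision(bits_b):
    """Model of circuit 104: encode register B into one 2-bit code per processor.

    Assumed rule (inferred from the worked example, not from FIG. 2):
      no bit set                       -> '00' for every processor
      processors before first set bit  -> '01'
      processor at the first set bit   -> '11'
      processors after first set bit   -> '10' (their own bits are ignored)
    """
    if 1 not in bits_b:
        return ["00"] * len(bits_b)
    first = bits_b.index(1)
    codes = []
    for position in range(len(bits_b)):
        if position < first:
            codes.append("01")
        elif position == first:
            codes.append("11")
        else:
            codes.append("10")
    return codes


print(register_a_decision([1, 1, 1, 1]))
print(register_b_decision([0, 1, 1, 0]))
```

Under this assumed rule the bits after the first set bit do not affect the output, which would be consistent with the "either '0' or '1'" entries mentioned for FIG. 2.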

FIG. 3 shows an example program containing a while-type loop to be processed by the parallel processing system 100.

This program 400 assigns '1' to i and then, as long as array element a[i] is not '0', performs some processing, assigns '0' to array element a[i], and increments i. It ends when it encounters an array element a[i] that is '0'.

For parallel execution on processors 1 to 4, program 400 is divided into programs 401, 402, 403, and 404. The contents of programs 403 and 404 are not shown, but they are similar to those of programs 401 and 402.

The following statements are added to the divided programs 401 to 404 to control the synchronization of processing.

clear register A [x] ... Clears bit Ax of register A 101.

clear register B [x] ... Clears bit Bx of register B 102.

set register A [x] ... Sets bit Ax of register A 101.

set register B [x] ... Sets bit Bx of register B 102.

wait processing ... Waits for synchronization. Several concrete examples are described later.

Programs 401, 402, 403, and 404 are generated from program 400 by a compiler that has a parallelizing conversion function and that collects the necessary information either automatically or through dialogue with the programmer.

The compiler first generates statements that clear the bits of register A 101 and register B 102 used for synchronization, and then begins converting program 400.

Because the loop control variable i is used temporarily in each of processors 1 to 4, it is allocated to a register reg-j and initialized. For processor 1 the initial value is the same as in program 400; for each subsequent processor 2 to 4 it increases by the increment of variable i in program 400, giving 2 for processor 2, 3 for processor 3, and 4 for processor 4.

In converting the loop body, the compiler generates, starting from the top, a statement that sets the bit of register A 101, a wait processing statement, the label 'addr 0' marking the execution restart location used when processor control signals 105a to 108b are '00', and a statement that clears the bit of register A 101. It then places the instruction sequence of the loop body of program 400.

However, the instruction that updates variable i in program 400 is converted into an instruction that updates register reg-j, and the increment of register reg-j is converted to (number of processors × increment of variable i in program 400), that is, 4.
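The effect of these initial values and of the stride of 4 is that the iterations are dealt out to the processors in round-robin fashion. The sketch below lists which indices each processor visits; `indices_for_processor` and its parameters are hypothetical names introduced for this illustration, and `last_index` exists only to bound the listing.

```python
def indices_for_processor(pid, num_processors, last_index, start=1, step=1):
    """Indices visited by processor pid (1-based) up to last_index.

    The initial value is start + (pid - 1) * step and the stride is
    num_processors * step, as described for register reg-j.
    """
    first = start + (pid - 1) * step
    stride = num_processors * step
    return list(range(first, last_index + 1, stride))


for pid in range(1, 5):
    print(pid, indices_for_processor(pid, 4, 12))
```

With four processors, processor 1 handles i = 1, 5, 9, ... and processor 4 handles i = 4, 8, 12, ..., so together they cover every index exactly once.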

After this, statements that set the bits of register B 102 and register A 101, and a wait processing statement, are generated.

Next, the compiler generates the label 'addr 1' marking the execution restart location used when processor control signals 105a to 108b are '01', the instruction sequence of the loop body of program 400 with the update of variable i removed, and a branch to the label 'exit', which marks the end of all processing.

It then generates the label 'addr 3' marking the execution restart location used when processor control signals 105a to 108b are '11', and an instruction that stores register reg-j into variable i of program 400.

Finally, it generates the label 'addr 2' marking the execution restart location used when processor control signals 105a to 108b are '10', and the label 'exit' marking the end of all processing, and the conversion ends.

Next, the operation of the parallel processing system 100 is described. The array a[i] shown in FIG. 4 is assumed to be given as data.

Processor 1, under program 401, clears the first bit A1 of register A 101 and the first bit B1 of register B 102. Processors 2, 3, and 4 likewise clear the second bits A2 and B2, the third bits A3 and B3, and the fourth bits A4 and B4 of registers 101 and 102, respectively.

Next, register reg-j of each of processors 1 to 4 is initialized. The initial value is '1' for processor 1 and '2', '3', and '4' for processors 2, 3, and 4, respectively.

Next, in the while condition test, processor 1 tests array element a[1], processor 2 tests a[2], processor 3 tests a[3], and processor 4 tests a[4].

As shown in FIG. 4, none of the array elements a[1] to a[4] is '0', so the loop body is executed.

In the loop body, processor 1 sets the first bit A1 of register A 101, and processors 2, 3, and 4 set the second bit A2, the third bit A3, and the fourth bit A4, respectively.

Each of processors 1 to 4 then enters wait processing.

The wait processing uses the interrupt function of the processor; FIG. 5 shows its flowchart.

First, the processor waits in an infinite loop until an interrupt occurs (501).

When the register A determination circuit 103 outputs the execution restart instruction 109 during this wait and an interrupt thereby occurs in each of processors 1 to 4, the interrupt handling routine is started and processing moves to step 502.

In step 502, the processor examines processor control signals 105a to 108b and resumes execution by jumping to 'addr 0' if the value is '00', to 'addr 1' if it is '01', to 'addr 2' if it is '10', and to 'addr 3' if it is '11'.
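Steps 501 and 502 can be sketched as a polling loop followed by a table lookup. This is a software stand-in for the interrupt-driven flow of FIG. 5, written under assumptions: `wait_process`, `RESTART_LABELS`, and the representation of the inputs as a sequence of (restart instruction, control signal) samples are all hypothetical choices for this example.

```python
# Mapping from 2-bit processor control signal to restart label (step 502).
RESTART_LABELS = {
    "00": "addr 0",
    "01": "addr 1",
    "10": "addr 2",
    "11": "addr 3",
}


def wait_process(samples):
    """Return the restart label chosen once the restart instruction is seen.

    samples is an iterable of (restart_instruction, control_signal) pairs
    observed over time. Step 501: keep waiting while restart_instruction is 0.
    Step 502: on the first sample with restart_instruction == 1, select the
    label from the control signal. Returns None if no restart ever arrives.
    """
    for restart_instruction, control_signal in samples:
        if restart_instruction:                      # interrupt raised
            return RESTART_LABELS[control_signal]    # jump target
    return None


print(wait_process([(0, "00"), (0, "00"), (1, "11")]))
```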

Once the first bit A1 through the fourth bit A4 of register A 101 have all been set, the register A determination circuit 103 issues the execution restart instruction 109 to each of processors 1 to 4.

Because the execution restart instruction 109 is issued to processors 1 to 4 only after all bits A1 to A4 of register A 101 have been set, processors 1 to 4 can be synchronized (barrier synchronization).

The register B determination circuit 104 encodes the first bit B1 through the fourth bit B4 of register B 102 according to the table of FIG. 2 and outputs processor control signals 105a to 108b. At this point all bits B1 to B4 are '0', so '00' is output to all of processors 1 to 4 as the processor control signals 105a to 108b indicating the execution restart location.

When the processor control signals 105a to 108b indicating the execution restart location and the execution restart instruction 109 are input to processors 1 to 4, each processor leaves the wait processing described above and resumes execution from the location indicated by its own processor control signals: 105a and 105b, 106a and 106b, 107a and 107b, or 108a and 108b. At this point, execution resumes from the label 'addr 0'.

When processor 1 resumes execution from 'addr 0', it clears the first bit A1 of register A 101 and then executes the processing of the loop body of program 400. It then assigns '0' to array element a[reg-j]. After adding '4', which is (number of processors × increment of variable i in program 400), to register reg-j, it returns to the while condition test.

Processors 2, 3, and 4 resume execution from 'addr 0' in the same way.

In the second loop iteration, processors 1 to 4 test array elements a[5] to a[8]. As shown in FIG. 4, none of a[5] to a[8] is '0' either, so the operation is the same as in the first iteration.

Thereafter, processors 1 to 4 continue the loop processing according to the contents of the array a[i] shown in FIG. 4.

In the 2500th loop iteration, processor 1 finds in the while condition test that array element a[9997] is not '0', so it enters the loop body, sets the first bit A1 of register A 101, and performs wait processing.

In processors 2 and 3, however, array elements a[9998] and a[9999] are '0', so these processors do not enter the loop body. They set the second and third bits of register B 102 and register A 101, respectively, and enter wait processing.

In processor 4, array element a[10000] is not '0', so it enters the loop body, sets the fourth bit A4 of register A 101, and performs wait processing.

When all bits A1 to A4 of register A 101 have been set, the register A determination circuit 103 issues the execution restart instruction 109 to each of processors 1 to 4.

The register B determination circuit 104 encodes the values of bits B1 to B4 of register B 102 according to the table of FIG. 2. At this point (B1, B2, B3, B4) = (0, 1, 1, 0), so it outputs processor control signals with the value '01' to processor 1, '11' to processor 2, and '10' to processors 3 and 4.

Because '01' is input to processor 1 as processor control signals 105a and 105b, processor 1 resumes execution from 'addr 1', executes the instructions of the loop body of program 400, then branches to the label 'exit', which marks the end of all processing, and finishes.

Because '11' is input to processor 2 as processor control signals 106a and 106b, processor 2 resumes execution from 'addr 3', stores register reg-j into the area of the main storage device 10 that corresponds to variable i of program 400, and finishes.

Because '10' is input to processors 3 and 4 as processor control signals 107a and 107b and 108a and 108b, respectively, they resume execution from 'addr 2' and finish.

As the execution result, array elements a[1] to a[9999] become '0' and array element a[10000] becomes '10000'. Variable i becomes '9998'.

Next, other examples of the waiting process are described with reference to FIGS. 6 to 10.

The first alternative is to provide a special instruction.

The instruction format is shown in FIG. 6, and the instruction processing flowchart is shown in FIG. 7.

As shown in FIG. 6, instruction 600 consists of an operation 601 and the execution restart locations addr 0, addr 1, addr 2, and addr 3 (602, 603, 604, 605). A processor that executes instruction 600 waits until the execution restart instruction 109 is input.

As shown in FIG. 7, when the execution restart instruction 109 is input (701), the processor examines the processor control signals 105a to 108b and branches accordingly (702), transfers the address of the selected execution restart location to the PC (program counter), and resumes execution from that address.

The second alternative also provides a special instruction, as in the example above, but specifies the execution restart location differently. The instruction format is shown in FIG. 8, and the processing flowchart is shown in FIG. 9.

As shown in FIG. 8, instruction 800 consists of an operation 801 and a group of offsets (802, 803, 804, 805). Offset 0, offset 1, offset 2, and offset 3 are the offsets to the execution restart locations addr 0, addr 1, addr 2, and addr 3. A processor that executes this instruction waits until the execution restart instruction 109 is input.

As shown in FIG. 9, when the execution restart instruction 109 is input (901), the processor examines the processor control signals 105a to 108b and branches accordingly (902), adds the offset to the selected execution restart location to the PC (program counter), and resumes execution from the resulting address.
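
The difference between the two special instructions lies only in how the new program counter is formed. The sketch below contrasts them; the function names, the example addresses, and the example PC value are all hypothetical, and the 2-bit control signal is assumed to select operand n for 'addr n', as in the examples in the description ('01' for addr 1, '10' for addr 2, '11' for addr 3).

```python
def restart_pc_absolute(operands, code):
    """FIG. 6/7 style: the operand selected by the control signal is the new PC."""
    return operands[int(code, 2)]


def restart_pc_relative(pc, offsets, code):
    """FIG. 8/9 style: the selected offset is added to the current PC."""
    return pc + offsets[int(code, 2)]


# Hypothetical addresses for addr 0 .. addr 3 and a hypothetical current PC.
addrs = [0x100, 0x140, 0x180, 0x1C0]
pc = 0x0F0
offsets = [a - pc for a in addrs]

absolute_target = restart_pc_absolute(addrs, "10")
relative_target = restart_pc_relative(pc, offsets, "10")
```

Both forms reach the same restart location. The offset form keeps the instruction independent of where the program is loaded, which is a common reason for preferring relative addressing, although the patent text does not state its motivation.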

The third alternative is a software method; its flowchart is shown in FIG. 10.

First, the processor waits until the execution restart instruction 109 is input.

When the execution restart instruction 109 is input (1001), the processor examines the processor control signals 105a to 108b and branches accordingly (1002), jumps to the address of the selected execution restart location, and resumes execution.
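
The software method amounts to a wait followed by a table-driven jump. A minimal sketch, assuming a `threading.Event` stands in for execution restart instruction 109 and a dictionary of callables stands in for the restart locations (both are illustrative choices, not part of the patent):

```python
import threading


def wait_and_dispatch(restart, read_code, handlers):
    """Block until the restart instruction arrives, then branch on the code."""
    restart.wait()              # step 1001: wait for restart instruction 109
    code = read_code()          # step 1002: examine the control signal
    return handlers[code]()     # jump to the selected restart location


restart = threading.Event()
handlers = {
    "00": lambda: "addr 0",
    "01": lambda: "addr 1",
    "10": lambda: "addr 2",
    "11": lambda: "addr 3",
}
restart.set()  # stands in for determination circuit 103 raising the signal
result = wait_and_dispatch(restart, lambda: "01", handlers)
```

Here `read_code` is a hypothetical callback that returns the two control-signal bits for the calling processor.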

Next, FIG. 11 shows an example of a program containing a for-type loop that is processed by the parallel processing system 100.

This program 1100 performs the following assignments:

a[4] = b[2], b[4] = a[1]
a[5] = b[3], b[5] = a[2]
a[6] = b[4], b[6] = a[3]
...
a[1000] = b[998], b[1000] = a[997]
a[1001] = b[999], b[1001] = a[998]

There is a data dependency between b[k] = a[k-3] at j = k and a[k+2] = b[k] at j = k+2, so a[k+2] = b[k] must be executed after b[k] = a[k-3]. The loop iterations for j = k and j = k+2 therefore cannot be executed in parallel. Accordingly, the two statements of the iteration j = k and the two statements of the iteration j = k+1 are executed in parallel.
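
The dependency argument can be checked numerically. The sketch below assumes the loop body is `a[j] = b[j-2]; b[j] = a[j-3]` for j = 4 to 1001, as the listed assignments suggest, and models "parallel" execution by reading all operands of a group of iterations before writing any result. The initial array contents and function names are arbitrary illustrative choices.

```python
def make_arrays(size=1002):
    # Arbitrary distinct initial values so that stale reads are detectable.
    a = list(range(size))
    b = [1000 + i for i in range(size)]
    return a, b


def run_sequential(first=4, last=1001):
    a, b = make_arrays()
    for j in range(first, last + 1):
        a[j] = b[j - 2]
        b[j] = a[j - 3]
    return a, b


def run_grouped(group, first=4, last=1001):
    """Execute `group` consecutive iterations at a time, as if simultaneously."""
    a, b = make_arrays()
    for start in range(first, last + 1, group):
        js = range(start, min(start + group, last + 1))
        # Read every operand before any write, modelling parallel execution.
        new_a = {j: b[j - 2] for j in js}
        new_b = {j: a[j - 3] for j in js}
        for j in js:
            a[j] = new_a[j]
            b[j] = new_b[j]
    return a, b


pairs_ok = run_grouped(2) == run_sequential()
triples_ok = run_grouped(3) == run_sequential()
```

Grouping two iterations reproduces the sequential result, while grouping three does not, because the third iteration reads b[k] before the first iteration of the same group has written it.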

Because the iterations for j = k and j = k+1 are executed in parallel as described above, the inner loop of program 1100 is no longer needed in the programs divided among processors 1 to 4.

Programs 1101 to 1104 for processors 1 to 4 are generated by a compiler, as in the embodiment described earlier.

Programs 1101 to 1104 are executed on processors 1 to 4 as follows.

First, processors 1, 2, 3, and 4 clear, respectively, the first bits A1 and B1, the second bits A2 and B2, the third bits A3 and B3, and the fourth bits A4 and B4 of register A 101 and register B 102.

Next, processors 1, 2, 3, and 4 enter the loop, set, respectively, the first bit A1, the second bit A2, the third bit A3, and the fourth bit A4 of register A 101, and enter the waiting process.

As described for the while-type loop embodiment, the register A determination circuit 103 issues an execution restart instruction 109 to each of processors 1 to 4 once all bits A1 to A4 of register A 101 have been set.

The register B determination circuit 104 examines bits B1 to B4 of register B 102 and outputs processor control signals 105a to 108b. In the first loop pass, bits B1 to B4 of register B 102 are all '0', so all processors 1 to 4 are instructed to resume execution from 'addr 0'.

Processors 1, 2, 3, and 4 resume execution from 'addr 0' and clear, respectively, the first bit A1, the second bit A2, the third bit A3, and the fourth bit A4 of register A 101.

Then processor 1 executes a[4] = b[2], processor 2 executes b[4] = a[1], processor 3 executes a[5] = b[3], and processor 4 executes b[5] = a[2], after which they enter the second loop pass.

In the second loop pass, as in the first, processor 1 executes a[6] = b[4], processor 2 executes b[6] = a[3], processor 3 executes a[7] = b[5], and processor 4 executes b[7] = a[4].

Processing continues in the same way. When the loop control variable j of programs 1101 and 1102 reaches 1002 and the loop control variable j of programs 1103 and 1104 reaches 1003, processors 1, 2, 3, and 4 leave the loop, set, respectively, the first bits B1 and A1, the second bits B2 and A2, the third bits B3 and A3, and the fourth bits B4 and A4 of register B 102 and register A 101, and enter the waiting process.
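
The division of work among the four processors can be written out pass by pass. This sketch assumes each processor advances its control variable j by 2 per pass, which is inferred from the statements listed for the first two passes and from the end values 1002 and 1003; the function names are hypothetical.

```python
def statements_for_pass(n):
    """Statements run by processors 1..4 in loop pass n (n = 0 is the first)."""
    j_even = 4 + 2 * n   # control variable j in programs 1101 and 1102
    j_odd = 5 + 2 * n    # control variable j in programs 1103 and 1104
    return [
        f"a[{j_even}] = b[{j_even - 2}]",   # processor 1
        f"b[{j_even}] = a[{j_even - 3}]",   # processor 2
        f"a[{j_odd}] = b[{j_odd - 2}]",     # processor 3
        f"b[{j_odd}] = a[{j_odd - 3}]",     # processor 4
    ]


def number_of_passes(first_j=4, last_j=1001):
    """Passes needed before j reaches 1002 (1101/1102) and 1003 (1103/1104)."""
    return (last_j - first_j + 1) // 2


second_pass = statements_for_pass(1)
last_pass = statements_for_pass(number_of_passes() - 1)
```

With these assumptions the loop takes 499 synchronized passes, and the last pass covers j = 1000 and j = 1001.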

Because all bits A1 to A4 of register A 101 have been set, the register A determination circuit 103 issues an execution restart instruction 109 to each of processors 1 to 4.

The register B determination circuit 104 encodes the values of bits B1 to B4 of register B 102 according to the table of FIG. 2. Since (B1, B2, B3, B4) = (1, 1, 1, 1), it outputs processor control signals 105a to 108b with the content '11' to processor 1 and '10' to processors 2, 3, and 4.

Because '11' is designated as its execution restart location, processor 1 resumes execution from 'addr 3' and finishes. Because '10' is designated as their execution restart location, processors 2, 3, and 4 resume execution from 'addr 2' and finish.

As described above, the parallel processing system 100 of the present invention makes it possible to execute both while-type loops and for-type loops in parallel.

[Effect of the invention] According to the present invention, not only for-type loops but also while-type loops can be processed in parallel while the processors are synchronized with one another. The processing speed gained from parallel program execution can therefore be improved beyond what was previously possible.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a parallel processing system according to an embodiment of the present invention.
FIG. 2 is a table showing the relationship between the input and output of the register B determination circuit shown in FIG. 1.
FIG. 3 is an explanatory diagram of a program containing a while-type loop and of the programs obtained by parallelizing it.
FIG. 4 is a diagram illustrating example array data.
FIG. 5 is a flow diagram of one example of the waiting process.
FIG. 6 is a format diagram of one example of a waiting instruction.
FIG. 7 is a flow diagram of another example of the waiting process.
FIG. 8 is a format diagram of another example of a waiting instruction.
FIG. 9 is a flow diagram of yet another example of the waiting process.
FIG. 10 is a flow diagram of still another example of the waiting process.
FIG. 11 is an explanatory diagram of a program containing a for-type loop and of the programs obtained by parallelizing it.

(Explanation of symbols)
100: parallel processing system
101: register A
102: register B
103: register A determination circuit
104: register B determination circuit
105a, 105b: processor control signals to processor 1
106a, 106b: processor control signals to processor 2
107a, 107b: processor control signals to processor 3
108a, 108b: processor control signals to processor 4
109: execution restart instruction

Claims

[Claims]

1. A synchronous processing method in which processors wait for one another in order to coordinate the progress of parallel processing by a plurality of processors, characterized in that first information indicating whether the processing of each processor has reached a predetermined waiting position and second information corresponding to the processing result of each processor are held outside the processors for each processor, each processor waits for and synchronizes with the other processors based on the first information, and each processor selects an execution restart location after waiting based on the second information.

2. A parallel processing system comprising: a plurality of processors; first register means having synchronization detection bits, each assigned to one of the processors and set by the corresponding processor; second register means having restart location designation bits, each assigned to one of the processors and set by the corresponding processor; synchronization determination means for determining synchronization from the state of the synchronization detection bits of the first register means and instructing each processor to wait or to resume execution; and execution restart location designating means for designating a restart location to each processor based on the state of the restart location designation bits of the second register means.

3. A parallel processing method in which processing is executed in parallel by a plurality of processors, comprising the steps of: outputting, to the outside of the processor, first information indicating that the processing of each processor has reached a predetermined waiting position; outputting, to the outside of the processor, second information corresponding to the processing result; waiting or resuming processing based on the first information output from other processors; and selecting a location at which to resume execution based on the second information.

4. A parallelization program generation device that generates a program for parallel processing from a program for sequential processing, comprising: means for adding to the parallel processing program a step of outputting, to the outside of the processor, first information indicating that processing has reached a predetermined waiting position in the program; means for adding to the parallel processing program a step of outputting, to the outside of the processor, second information corresponding to a processing result in the program; means for adding to the parallel processing program a step of waiting or resuming execution based on the first information; and means for adding to the parallel processing program a step of selecting a location at which to resume execution based on the second information output from other processors.