JP2517859B2

JP2517859B2 - Parallel process management method

Info

Publication number: JP2517859B2
Application number: JP3301292A
Authority: JP
Inventors: 貴之中川; 真知子朝家; 衛杉江
Original assignee: Agency of Industrial Science and Technology
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 1991-10-22
Filing date: 1991-10-22
Publication date: 1996-07-24
Anticipated expiration: 2011-07-24
Also published as: JPH05113960A

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a process management method for a shared-memory parallel computer system in which processes are distributed to idle processors. An idle processor is a processor whose per-processor pending process table no longer holds any process to execute.

[0002]

[Prior Art] A process distribution scheme for shared-memory parallel computer systems is discussed in Nakagawa et al., "Slit check mechanism for speeding up inter-processor software interrupt processing" (IPSJ Research Report 89-ARC-77, 1989, p. 17).

[0003] In the above prior art, when a processor distributes a process to another processor, it distributes the pending process registered at the head of its own pending process table.

[0004]

[Problems to Be Solved by the Invention] It is generally known that programmers tend to write highly interdependent processes close to one another. Moreover, they tend to write a process that writes shared data immediately before the process that reads that data. Therefore, to avoid needless waiting for data, processes are generally registered so that they execute in the order in which they are written.

[0005] Now suppose that process a writes first data to address 100 of the shared memory, process b reads this data, process b writes second data to address 200, and process c reads this data.

[0006] If these processes are registered in this order in the pending process table of the same processor, needless waiting is avoided when they are executed by a single processor.

[0007] In this case, the following problem arises if each processor has a store-in cache.

[0008] Here, a store-in cache is a cache that, when data is written, stores the data in the cache belonging to its own processor without transferring it to the shared main memory; later, when another processor requests to read that data, the data is transferred from that cache to the cache in the requesting processor.

[0009] In the above prior art, when a processor distributes a process to another processor, it distributes the pending process registered at the head of its process table. For example, suppose that process a, while executing on processor PE1, writes some data to the cache of PE1, and that another processor PE3 then issues a process distribution request; the next process, b, is distributed. When this distribution occurs, process b, which uses the data generated by process a, is executed on a processor different from the one that executed process a. Consequently, the data must be transferred from the cache belonging to processor PE1 to the cache belonging to processor PE3.

[0010] An object of the present invention is to solve the above problem and to provide a parallel process management method that reduces data transfer between caches.

[0011]

[Means for Solving the Problems] To achieve the above object, in the present invention, a pending process table for registering pending processes is provided in the shared memory for each processor. A first pointer and a second pointer are also provided for each processor: the first pointer designates the first of the pending processes, and its designation is changed sequentially from the first pending process toward the last; the second pointer designates the last of the pending processes, and its designation is changed sequentially from the last pending process toward the first. A processor that has pending processes registered in its pending process table refers to the first pointer to take out and execute the first pending process. When there is a processor requesting distribution of a process and distribution is possible, the processor refers to the second pointer, takes out the last pending process, and distributes that process to the requesting processor.

[0012]

[Operation] When processes a, b, and c are registered in the pending process table, the owning processor executes them in the order a, b, the same order as in single-processor execution, while distribution to another processor targets c first. As a result, process b is executed on the same processor as process a, so no data transfer between caches occurs for the data that process a generates and process b uses.
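The two-ended access pattern described here can be illustrated with a double-ended queue. The following is only a minimal sketch: the names `take_for_self` and `take_for_distribution` are hypothetical, and the rule that distribution requires at least two registered processes is taken from the embodiment rather than from this paragraph.

```python
from collections import deque

# Hypothetical pending process table for one processor, oldest entry first.
pending = deque(["a", "b", "c"])

def take_for_self(table):
    """The owning processor takes the first pending process (first pointer side)."""
    return table.popleft() if table else None

def take_for_distribution(table):
    """Hand out the last pending process (second pointer side).

    Assumption from the embodiment: distribution is allowed only when
    at least two processes are registered.
    """
    return table.pop() if len(table) >= 2 else None

executed = [take_for_self(pending)]           # the owner runs a
given_away = take_for_distribution(pending)   # c goes to the requesting processor
executed.append(take_for_self(pending))       # the owner runs b next
```

Because a and b stay on the same processor, the data that a produces for b never has to leave that processor's cache in this sketch.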

[0013]

[Embodiment] An embodiment of the present invention is described below with reference to FIGS. 1 to 5. FIG. 1(a) is a configuration diagram of an embodiment of the present invention. Three processors PE1 to PE3, which form a shared-memory multiprocessor, are connected through a shared memory 1.

[0014] For simplicity, this embodiment is described using a shared-memory multiprocessor consisting of three processors PE1 to PE3; when the number of processors increases, it suffices to add more of the per-processor distributed process address storage areas 141 to 143 described later. Each of the processors PE1, PE2, and PE3 can be implemented by, for example, a microcomputer having a plurality of registers.

[0015] The shared memory 1 contains a shared process distribution request processor number area 13, distributed process address areas 141 to 143, one for each processor, and pending process tables 161 to 163, one for each processor. Each pending process table consists of a queue of process execution environment records (hereinafter simply called records), each representing a pending process. In this embodiment, each record 151 consists of four data items, as shown in FIG. 1(b): a next record address, which gives the address of the next record; a previous record address, which gives the address of the previous record; an instruction code address, which gives the address of the first instruction of the program (not shown) stored in the shared memory 1 that performs the processing of the corresponding process; and an argument list address, which gives the address of the head of the list of arguments (not shown) stored in the shared memory 1 and used by that program.
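The four-field record and the queue it forms can be sketched as a small data structure. This is an illustrative model only: the class name `Record`, the helper names `link` and `walk`, the dictionary standing in for the shared memory, and all concrete addresses are assumptions, not part of the patent text.

```python
from dataclasses import dataclass

NULL = 0  # address 0 means "no record", as in the embodiment

@dataclass
class Record:
    next_addr: int   # address of the next record, or NULL
    prev_addr: int   # address of the previous record, or NULL
    code_addr: int   # address of the first instruction of the program
    args_addr: int   # address of the head of the argument list

def link(memory, addrs):
    """Chain the records at the given addresses into one queue, in order."""
    for i, addr in enumerate(addrs):
        memory[addr].prev_addr = addrs[i - 1] if i > 0 else NULL
        memory[addr].next_addr = addrs[i + 1] if i + 1 < len(addrs) else NULL

def walk(memory, first_addr):
    """Follow next_addr links from the first record and list the addresses visited."""
    out, addr = [], first_addr
    while addr != NULL:
        out.append(addr)
        addr = memory[addr].next_addr
    return out

# Hypothetical addresses for records 151a, 151b, 151c and their programs.
memory = {
    0x10: Record(NULL, NULL, 0x1000, 0x2000),
    0x20: Record(NULL, NULL, 0x1100, 0x2100),
    0x30: Record(NULL, NULL, 0x1200, 0x2200),
}
link(memory, [0x10, 0x20, 0x30])
order = walk(memory, 0x10)
```

Keeping both a next and a previous address in each record is what lets a processor unlink a record from either end of the queue in constant time.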

[0016] Each of the processors PE1, PE2, and PE3 has a cache 41 to 43 that stores address and data pairs, a register 21 to 23 that stores the address of the first record of its process table (also called the first pointer, representing the first record pointer), and a register 31 to 33 that stores the address of the last record of its process table (also called the second pointer, representing the last record pointer).

[0017] Each processor operates according to the process control flow 100 of FIG. 2.

[0018] (1) In process determination processing c1, each processor, for example PE1, checks whether the address held in its register 21 is 0, and thereby determines whether an executable process exists. In this embodiment, when no record is registered in the process table 161, that is, when PE1 has no pending process, the address in register 21 is set to 0. The same applies to the other registers 22 and 23.

[0019] Now suppose that PE1 has three records 151a, 151b, and 151c. Registers 21 and 31 then hold the addresses of records 151a and 151c, respectively.

[0020] (2) Under this assumption, process determination processing c1 finds that PE1 has a pending process, so in process take-out processing c2 the first record of the processor's own pending process table 161 is taken out according to the address in register 21. That is, this record is removed from the table and stored in a work area for PE1 (not shown) in the shared memory 1. At this time, the address in register 21 is changed to the address of record 151b. In addition, the previous record address in record 151b is changed to 0, which marks record 151b as the first record.
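The determination (c1) and take-out (c2) steps can be sketched as a single function that unlinks the first record. The dictionaries standing in for the shared memory and for registers 21 and 31, the function name `take_first`, and the addresses are hypothetical. Clearing the last-record register when the table becomes empty is an assumption of this sketch; the text only states that register 21 holds 0 when no record is registered.

```python
NULL = 0  # address 0 means "none", as in the embodiment

def take_first(records, regs):
    """Unlink the first record and return its address, or NULL if the table is empty.

    records: dict mapping address -> {"next": addr, "prev": addr}
    regs:    {"first": addr, "last": addr}, standing in for registers 21 and 31
    """
    addr = regs["first"]
    if addr == NULL:                  # c1: no pending process
        return NULL
    nxt = records[addr]["next"]
    regs["first"] = nxt               # c2: the first pointer moves to the next record
    if nxt != NULL:
        records[nxt]["prev"] = NULL   # mark the new first record
    else:
        regs["last"] = NULL           # assumption: an empty table clears both pointers
    records[addr]["next"] = NULL
    return addr

# Hypothetical addresses for records 151a, 151b, 151c.
records = {
    0x10: {"next": 0x20, "prev": NULL},
    0x20: {"next": 0x30, "prev": 0x10},
    0x30: {"next": NULL, "prev": 0x20},
}
regs = {"first": 0x10, "last": 0x30}
taken = take_first(records, regs)
```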

[0021] (3) Next, in process execution processing c3, the process corresponding to the retrieved record 151a is executed. That is, the processing of the corresponding process is performed using the instruction code address and the argument list address in this record.

[0022] (4) Next, PE1 executes process distribution control processing c4.

[0023] As shown in FIG. 3, this processing first reads, in step b0, the process distribution request processor number 13 in the shared memory 1 and checks whether its value is "0". As described later, a processor requesting distribution of a process stores its own processor number in this area 13. Checking whether the value of the process distribution request processor number 13 is "0" therefore reveals whether another processor has issued a process distribution request.

[0024] Now suppose that the value of this number 13 is "3". Step b0 then determines that a processor is requesting distribution of a process.

[0025] In this case, step b1 determines whether there is at least one process that can be distributed. In this embodiment, a distributable process is deemed to exist when two or more processes are registered in the corresponding process table 161.

[0026] Under the present assumption, PE1 holds two records, 151b and 151c, so distribution of a process is judged to be possible.

[0027] In the next step b2, based on the value of the process distribution request processor number 13 ("3" under the present assumption), the address in register 31, which is the last record pointer of the pending process table 161 (in this example, the address of the last record 151c of table 161), is written to the distributed process address 143 for PE3. Process 151c has thereby been distributed to PE3. Following the distribution of this last record 151c, the address in register 31 is changed to the address of the preceding record 151b. In addition, the next record address in record 151b is changed to 0, which marks this record as the last record.

[0028] Then, in step b3, the original process distribution request is cleared by writing the value "0" to the process distribution request processor number 13.
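Steps b0 to b3 can be sketched as one function. This is a simplified, single-threaded model: the dictionary layout for area 13 and areas 141 to 143, the function name `distribution_control`, and the addresses are hypothetical. The text does not say how a processor counts its registered processes, so comparing the first and last pointers is an assumption of this sketch.

```python
NULL = 0  # address 0 means "none", as in the embodiment

def distribution_control(shared, records, regs):
    """Steps b0 to b3 for one processor; returns the address handed over, or NULL."""
    requester = shared["request"]                 # b0: read area 13
    if requester == 0:
        return NULL                               # no processor is asking
    # b1: at least two pending processes (assumed check: first != last)
    if regs["first"] == NULL or regs["first"] == regs["last"]:
        return NULL
    last = regs["last"]                           # b2: hand over the last record
    shared["dist_addr"][requester] = last
    prev = records[last]["prev"]
    regs["last"] = prev
    records[prev]["next"] = NULL                  # prev is now the last record
    shared["request"] = 0                         # b3: clear the request
    return last

# Hypothetical addresses: 0x20 for record 151b, 0x30 for record 151c.
records = {0x20: {"next": 0x30, "prev": NULL}, 0x30: {"next": NULL, "prev": 0x20}}
regs = {"first": 0x20, "last": 0x30}              # registers 21 and 31 of PE1
shared = {"request": 3, "dist_addr": {1: NULL, 2: NULL, 3: NULL}}
handed = distribution_control(shared, records, regs)
```

In a real multiprocessor the reads and writes of the shared areas would need to be atomic; this sketch leaves that out.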

[0029] If either of the conditions checked in steps b0 and b1 does not hold, no process is distributed.

[0030] This completes process distribution control processing c4.

[0031] (4) If the determination in step c1 finds that the processor has no pending process, process distribution request control processing c5 and process reception processing c6 are executed.

[0032] These processes are described taking as an example the case where PE3 executes them.

[0033] If PE3 found no pending process when it earlier executed process determination processing c1, then in process distribution request control processing c5 it writes its own processor number, "3" in this example, to the process distribution request processor number 13. At the same time, it sets the addresses in the corresponding registers 23 and 33 to zero.

[0034] (5) Next, in process reception processing c6, PE3 reads the distributed process address 143 for its own processor and checks whether that address is 0. If the address is not 0, a pending process has been distributed to PE3, so the address read is stored in registers 23 and 33 of PE3. In this way, record 151c distributed from PE1 is registered in the pending process table 163 of PE3.
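The idle-processor side, that is, the request (c5) and reception (c6) steps, can be sketched as follows. The function names, the dictionary layout, and the addresses are hypothetical, and atomicity of the shared accesses is ignored here. Resetting the distributed process address to 0 after reading follows the rule the description gives for shared data.

```python
NULL = 0  # address 0 means "none", as in the embodiment

def request_distribution(shared, regs, my_number):
    """c5: post the processor's own number in area 13 and clear both pointers."""
    shared["request"] = my_number
    regs["first"] = NULL
    regs["last"] = NULL

def receive(shared, regs, my_number):
    """c6: pick up a distributed record address, if one has arrived."""
    addr = shared["dist_addr"][my_number]
    if addr == NULL:
        return False                         # nothing distributed yet
    shared["dist_addr"][my_number] = NULL    # write 0 after reading
    regs["first"] = addr                     # a single record is both first and last
    regs["last"] = addr
    return True

shared = {"request": 0, "dist_addr": {1: NULL, 2: NULL, 3: NULL}}
regs3 = {"first": NULL, "last": NULL}        # registers 23 and 33 of PE3
request_distribution(shared, regs3, 3)
got_before = receive(shared, regs3, 3)       # no process has been handed over yet
shared["dist_addr"][3] = 0x30                # another processor distributes record 151c
got_after = receive(shared, regs3, 3)
```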

[0035] In the above processing, reads and writes of the data 13 and 141 to 143, which are subject to write contention, are performed using a compare-and-swap (Compare and Swap) procedure such as that described in the IBM manual "IBM System/370 Extended Architecture Principles of Operation", SA22-7085-0, page A14, in order to avoid erroneous operation caused by overlapping execution on multiple processors; after such data is read, the data "0" is written.
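The read-then-write-0 rule can be illustrated with an emulated compare-and-swap. Python has no hardware compare-and-swap, so in this sketch a lock stands in for the atomic instruction; the class `Cell` and the functions `post_request` and `consume` are hypothetical names. Posting a request only when the cell currently holds 0 is also an assumption about how the procedure would be applied.

```python
import threading

class Cell:
    """A shared word with an emulated compare-and-swap (a lock stands in for the hardware)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Store new only if the current value equals expected; report success."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True

def post_request(cell, my_number):
    """Claim the request area only if no other request is pending (value 0)."""
    return cell.compare_and_swap(0, my_number)

def consume(cell):
    """Read a non-zero value and reset the cell to 0 as one atomic step; return 0 if empty."""
    while True:
        seen = cell.load()
        if seen == 0:
            return 0
        if cell.compare_and_swap(seen, 0):
            return seen

area13 = Cell()
first_ok = post_request(area13, 3)    # PE3 posts its request
second_ok = post_request(area13, 2)   # PE2 finds the area already taken
value = consume(area13)               # the distributing processor reads and clears it
```

The retry loop in `consume` covers the case where another processor changes the cell between the load and the swap.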

[0036] The effects of this embodiment are described with reference to FIGS. 4 and 5.

[0037] In the prior art, when a processor distributes a process, it distributes the process at the head of the records in its process table. For the processes described in the embodiment above, PE1 would distribute process 151b to PE3.

[0038] Now suppose that the process corresponding to record 151a (hereinafter sometimes called process 151a) writes first data to address 100 of the shared memory 1, that this data is read by the process corresponding to record 151b (hereinafter sometimes called process 151b), that process 151b writes second data to address 200, and that this data is read by the process corresponding to record 151c (hereinafter sometimes called process 151c).

[0039] FIG. 4 shows, in chronological order, the processing of PE1 and PE3 under the prior art in such a case.

[0040] That is, the data written to address 100 by process 151a enters the cache 41 in processor PE1, but because this cache is a store-in cache, the data is not immediately transferred to the shared memory 1. The read of address 100 by process 151b is issued to the cache 43 in processor PE3. When cache 43 determines by a known method that this data is not present in cache 43, it requests the main memory 1 to transfer the data to cache 43. The main memory 1 first asks the other caches 41 and 42, by a known method, whether they have updated the data at this address. Cache 41 responds to this inquiry by a known method, so the data written by process 151a is transferred from cache 41 to cache 43 via the main memory 1. Similarly, for the data at address 200, a data transfer occurs between cache 43, to which process 151b writes, and cache 41, from which process 151c reads. In total, two transfer periods occur, as shown by the bold lines in FIG. 4.

[0041] FIG. 5 shows, in chronological order, the processing of PE1 and PE3 under this embodiment for two data items having the same relationship as above.

[0042] In this embodiment, process 151c is distributed to PE3, so the data that process 151a writes to address 100 goes to cache 41, and the read of address 100 by process 151b is also served by cache 41; no data transfer between caches occurs for this data. In the end, the only data transfer between caches is for the data at address 200 passed between process 151b and process 151c. This embodiment therefore reduces data transfer between caches compared with the prior art.
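The comparison between FIG. 4 and FIG. 5 can be reproduced with a small counting model. This is a deliberately simplified sketch: it runs the three processes in program order, treats each address as held by the cache of the processor that last wrote or fetched it, and counts one transfer whenever a process reads data held by the other processor's cache. The function name `count_transfers` and the tuple layout are hypothetical.

```python
def count_transfers(distributed):
    """Count cache-to-cache transfers when one process runs on PE3 and the rest on PE1."""
    # (name, address read or None, address written or None), following the example
    program = [("a", None, 100), ("b", 100, 200), ("c", 200, None)]
    owner = {}       # address -> processor whose cache holds the latest data
    transfers = 0
    for name, read_addr, write_addr in program:
        pe = "PE3" if name == distributed else "PE1"
        if read_addr is not None and owner[read_addr] != pe:
            transfers += 1             # the data must move between caches
            owner[read_addr] = pe
        if write_addr is not None:
            owner[write_addr] = pe     # store-in cache: data stays with the writer
    return transfers

prior_art = count_transfers("b")    # the head of the table is distributed
embodiment = count_transfers("c")   # the tail of the table is distributed
```

Under these assumptions the model gives two transfers for head distribution and one for tail distribution, matching the counts described for FIG. 4 and FIG. 5.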

[0043] In this embodiment, the number of data items passed between processes and the number of data items that can be stored in a cache were each taken to be one; in a system whose caches can store many data items, the difference in performance is clearly even greater.

[0044] Although process distribution requests and process distributions are detected above by checking that the values of the data 13 and 141 to 143 at fixed addresses in the shared memory 1 are non-zero, this detection can be made even faster by using communication registers such as those described in the above-mentioned reference.

[0045]

[Effects of the Invention] According to the present invention, in a parallel computer system consisting of a plurality of processors each having a cache, the number of data transfers between caches can be reduced and high-speed processing can be achieved.

[Brief description of drawings]

FIG. 1 is a configuration diagram of an embodiment of the present invention.

FIG. 2 is a diagram of the process control flow.

FIG. 3 is a schematic flowchart of the process distribution control processing (c4) in FIG. 2.

FIG. 4 is a time chart showing the progress over time of process execution under the prior art.

FIG. 5 is a time chart showing the progress over time of process execution under the embodiment of the present invention.

[Explanation of symbols]

1 ... shared memory, 13 ... process distribution request processor number, 141, 142, 143 ... distributed process addresses, 151a, 151b, 151c ... records of the pending processes of PE1, 161, 162, 163 ... pending process tables, 21, 22, 23 ... first record pointer registers of the pending process tables, 31, 32, 33 ... last record pointer registers of the pending process tables.

Claims

(57) [Claims]

1. A parallel process management method for a parallel computer system comprising a plurality of processors each having a cache and sharing a shared memory, characterized in that: a pending process table for registering pending processes is provided in the shared memory for each processor; a first pointer, which designates the first pending process among the pending processes and whose designation is changed sequentially from the first pending process toward the last pending process, and a second pointer, which designates the last pending process among the pending processes and whose designation is changed sequentially from the last pending process toward the first pending process, are provided for each processor; and each processor that has a pending process registered in its pending process table refers to the first pointer to take out and execute the first pending process and, when there is a processor requesting distribution of a process and distribution is possible, refers to the second pointer to take out the last pending process and distributes that last pending process to the processor requesting the distribution.