JPH04260962A

JPH04260962A - Synchronization control system in parallel computers

Info

Publication number: JPH04260962A
Application number: JP3143355A
Authority: JP
Inventors: Kenji Horie; 堀江　健志; Hiroaki Ishihata; 石畑　宏明; Morio Ikesaka; 守夫池坂
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-06-14
Filing date: 1991-06-14
Publication date: 1992-09-16
Anticipated expiration: 2011-12-04
Also published as: JP2559918B2

Abstract

PURPOSE: In a system that connects many processor elements (PEs) through a network and synchronizes them, to detect the status of all PEs at the moment synchronization of all PEs is established, so that appropriate processing can be executed immediately even when an error occurs in a PE, and so that synchronization among the PEs is achieved quickly. CONSTITUTION: The synchronization control system comprises status request register means 2, provided in each PE, which independently issue status requests and hold the request signals; status deciding means 4, which decides whether requests are present from the status request registers of all PEs; status distributing means 5, which distributes the decided result; and status detecting register means 3, which detects the status from the distributed result and the output of synchronization detecting register means 1. The system is configured to detect the status of all PEs when synchronization of all PEs is established.

Description

[Detailed description of the invention]

[0001]

[Industrial Application Field] The present invention relates to a scheme for detecting synchronization, status, and a stable state in an inter-processor synchronization system for a parallel computer, in which a large number of independently operating computers (hereinafter abbreviated as PE, for processor element) are connected through a communication network and operated in parallel in synchronization with one another.

[0002]

[Background Art] Advances in semiconductor technology have made it possible to build high-performance microprocessors and large-capacity memories cheaply and compactly, and it has become easy to build a parallel computer from a large number of such microprocessors and memories. To process one job in parallel on multiple PEs, the job is divided and its parts are assigned to the PEs for execution. When the parts execute in parallel, attention must be paid to the ordering of the processing. That is, it is sometimes necessary to guarantee that the next process begins only after all PEs have finished a given process. This guarantee requires an efficient inter-processor synchronization function, which in turn makes a high-speed parallel computer possible.

[0003] The first method for parallel computers is the shared memory method. A memory is shared by all PEs, and the inter-processor synchronization function is realized by reading and writing part of that memory exclusively.

[0004] The second method for parallel computers is the synchronization register method. A synchronization register is provided in each PE, the logical product (AND) of the outputs of the synchronization registers of all PEs is computed, and the result is returned to all PEs, whereby synchronization of all PEs is detected.

[0005] The third method for parallel computers is the state detection method. As shown in FIG. 21, each PE of the parallel computer system first executes process 1 internally. After process 1 finishes, the PE sends and receives messages to and from other PEs as process 2, and when the processing of those messages is complete it moves on to the next process. The host processor learns that all PEs are in the message-waiting state and broadcasts a command to each PE, and each PE starts processing the corresponding command. In this way each PE learns the state and moves on to the next step.

[0006]

[Problems to Be Solved by the Invention] The shared memory method has the drawback that memory accesses become frequent as the number of PEs grows.

[0007] In the synchronization register method, each PE accesses its synchronization register independently, so no memory access contention occurs. The synchronization mechanism of this method poses no problem when, as in FIG. 22(a), all PEs complete process 1 without any abnormality: the synchronization registers detect that all PEs have finished process 1 and that synchronization is established, and execution proceeds to process 2. If an abnormal event (for example, a program bug, a division by zero, or an overflow) occurs in one PE during process 1, only two ways of handling it have conventionally been available. One, shown in FIG. 22(b), ignores the error in process 1 of that PE, detects synchronization with all the other synchronization registers, and continues with process 2. The other, shown in FIG. 22(c), suspends process 1 of all PEs at the point where an error is detected in process 1 of one PE. With the former method, the PEs other than the one in which the error occurred continue with process 2 without noticing the error at all, so discovery of the error is delayed and useless processing is executed. The error may even go unnoticed to the end, producing a wrong answer. With the latter method, the error information is not conveyed to the PEs other than the one in which the error occurred, so all the other PEs enter the synchronization wait. Because they never receive the synchronization signal from the faulty PE, the synchronization of all PEs never completes and execution never proceeds to process 2.

[0008] Furthermore, conventional systems did not adequately handle detection of a synchronization request or status request while messages still remained in the network. If a synchronization request or status request is issued when no PE holds a message but messages remain on the communication paths between PEs, processing moves on to the next step. When data left in the network, which should have been handled in the previous process, then reaches a PE, that data is processed in the next step. In other words, even when all PEs had established synchronization, there was a possibility that one PE would learn of the synchronization earlier than the other PEs and move on to the next step.

[0009] An object of the present invention is, in a parallel computer system, to detect the status of all PEs at the same time as the synchronization of all PEs, so that appropriate action can be taken immediately even when an error occurs in one PE. This prevents the other PEs from executing erroneous processing or being left in a synchronization wait, and as a result achieves inter-processor synchronization at high speed and improves computation speed.

[0010] A further object of the present invention is for each PE in the parallel computer system to know accurately that a stable state has been reached. In particular, even if all PEs are in the message-waiting state, the system is not stable while a message exists in the network, because the message will eventually arrive at some PE and that PE will then leave the message-waiting state. The invention therefore detects that no message exists in the network, and thereby detects the stable state among the processors with respect to synchronization requests and status.

[0011]

[Means for Solving the Problems] FIG. 1 is a block diagram showing the principle of the first invention. The invention presupposes the synchronization register method. That is, it presupposes a synchronization scheme for a distributed-memory parallel computer, built by connecting a large number of independently operating computers, that has a synchronization request register and a synchronization detection register in each PE, means for taking the logical product of the synchronization request registers of all PEs, means for distributing the result to all PEs, and synchronization detection means 1 for detecting synchronization of all PEs from the distributed result.

[0012] The status request register 2 is provided in each PE. For example, when the status is normal, the CPU of that PE writes a logic 1 to it. The determination means 4 then determines that the status of all PEs is normal.

[0013] The distribution means 5 distributes the output of the determination means 4 to all PEs. When the status of all PEs is normal and the synchronization establishment detection means 1 detects that synchronization of all PEs has been established, the contents of the status detection register 3 are changed.

[0014] When the status detection register 3 detects that the status of all PEs is normal, execution proceeds to the next process. FIG. 2 is a diagram showing the principle of the second invention, which concerns a distributed-memory parallel computer in which a large number of independently operating computers are connected through a communication network. Each PE has a synchronization request register and a synchronization detection register, and the system has means for taking the logical product of the synchronization request registers of all PEs, means for distributing the result to all PEs, and means 6 for issuing the synchronization request of the own PE and detecting that synchronization of all PEs has been established. In place of this, the second invention has means 7 for receiving a state signal of the communication network, and further comprises synchronization stability request means 8, which uses the signal line carrying that state signal together with the output of the synchronization request register to detect that a synchronization request is present on the own PE and that no message is present in the network.

[0015] FIG. 3 is a diagram showing the principle of the third invention, which likewise concerns a distributed-memory parallel computer in which a large number of independently operating computers are connected through a communication network. Each PE has a status request register and a status detection register, and the system has means for taking the logical product of the status request registers of all PEs, means for distributing the result to all PEs, and means 9 for issuing the status request of the own PE and detecting that the status requests of all PEs have been established. In place of this, the third invention has means 10 for receiving a state signal of the communication network, and further comprises status stability request means 11, which uses the signal line carrying that state signal together with the output of the status request register to detect that a status request is present on the own PE and that no message is present in the network.

[0016]

[Operation] In the first invention, within a given process 1, a signal indicating normal status is output from each PE only when synchronization of all PEs has been established as shown in block 1 and block 4 has further established that the status of all PEs is normal. The status is then cleared and execution proceeds to the next process 2. Therefore, if an error occurs in process 1, execution does not proceed to process 2 even if synchronization of all PEs has been established in process 1. The fact that an error occurred in process 1 can be determined immediately, and appropriate handling can be applied to it, which prevents each PE from performing erroneous operations or being left in a synchronization wait because of an error in a single PE.
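The gating described for the first invention can be sketched as follows. This is a minimal illustration, not the patented circuit: the `PE` class, the function name, and the three outcome labels are hypothetical, and the sketch assumes that a PE which hit an error still raises its synchronization request while leaving its status request at 0.

```python
from dataclasses import dataclass


@dataclass
class PE:
    """One processor element with the two request registers (hypothetical model)."""
    sync_request: bool = False    # set when the PE has finished process 1
    status_request: bool = False  # logic 1 when the PE's status is normal


def barrier_outcome(pes):
    """Classify the barrier from the AND of each register over all PEs."""
    all_synced = all(pe.sync_request for pe in pes)
    all_normal = all(pe.status_request for pe in pes)
    if not all_synced:
        return "waiting"   # some PE has not finished process 1 yet
    if all_normal:
        return "proceed"   # synchronized and every status is normal
    return "error"         # synchronized, but at least one PE is abnormal


# Three PEs reach the barrier; the third one reports an abnormal status.
pes = [PE(True, True), PE(True, True), PE(True, False)]
print(barrier_outcome(pes))
```

Because the status is examined at the same moment synchronization is established, the healthy PEs neither run process 2 on bad data nor wait forever.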

[0017] The second invention detects the stable state by observing the state of the network. The stable state as defined in the present invention is the state in which there are no messages on the communication paths, no PE holds a message, and all PEs are in the message-waiting state. To establish that all PEs are in the message-waiting state, each PE outputs its synchronization request ANDed with a signal indicating that no message is present in the network, and learns for itself that all the other PEs have issued the same synchronization request; that is, it performs synchronization detection. In other words, the present invention adds to conventional synchronization detection the condition that no message is present in the network. Here, "no message in the network" means that no PE holds a message and that no message is on the communication paths connecting the PEs.

[0018] That a message is present in the network means either that at least one PE holds a message, or that no PE holds a message and all PEs are in the message-waiting state but a message is nevertheless on a communication path. Such a message will eventually reach some PE, whereupon that PE leaves the message-waiting state and the system becomes unstable. That is, even if each PE has finished its processing and output a synchronization request, a message left in the network will at some point enter a PE, and that PE will have to process messages again, contradicting the synchronization request it has already output. In the present invention, therefore, the synchronization stability request is output only after the network state has been examined and it has been detected that no message is present in the network. This avoids the situation, contrary to the synchronization stability request, in which a PE must process a message again after having output the request.
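The difference between conventional synchronization detection and the stable-state detection of the second invention can be sketched in a few lines. The function names are hypothetical, and the sketch assumes each PE observes its own copy of the "no message in the network" signal.

```python
def conventional_sync(pes):
    """AND of the raw synchronization requests of all PEs."""
    return all(request for request, _ in pes)


def stable_state(pes):
    """AND of each PE's request gated by its 'network empty' signal.

    pes: list of (sync_request, network_empty) pairs, one per PE.
    """
    return all(request and network_empty for request, network_empty in pes)


# Every PE has requested synchronization, but the second PE still sees
# a message in flight, so the system is not yet stable.
pes = [(True, True), (True, False), (True, True)]
print(conventional_sync(pes), stable_state(pes))
```

With the gating in place, the request simply does not reach the AND network until the in-flight message has been delivered and handled.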

[0019] Furthermore, in the third invention, the network state is also detected when the status is detected.

[0020]

[Embodiments] First, a concrete example of a parallel computer system to which the present invention is applied will be described. The system is a distributed-memory highly parallel computer consisting of 64 to 1024 processor elements (cells) and an engineering workstation (EWS) serving as the host computer. Two kinds of networks, a command bus connecting all cells and the host and a torus network connecting the cells to one another, carry communication between cells and the host and among cells.

[0021] Each cell achieves high computing performance with a 32-bit Reduced Instruction Set Computer (RISC) microprocessor and a high-speed floating-point unit, and achieves fast, high-performance data communication with a high-performance DMA controller. Optional special-purpose hardware matched to the application can be added to each cell, so the system can be customized into a dedicated parallel machine suited to a variety of applications.

[0022] FIG. 4 shows the basic system configuration of the parallel computer system to which the present invention is applied. The system is a distributed, multiple-instruction multiple-data (MIMD) highly parallel processor. Each processor element is called a cell, and all cells have the same configuration. Each cell consists of a high-performance 32-bit microprocessor, a high-speed FPU, a cache memory, a large-capacity main memory, a network interface, and a command bus interface.

[0023] Each cell is connected to its four adjacent cells by a torus network with a two-dimensional torus topology. All cells and the host computer are connected by the command bus. Optional hardware such as an image input/output device, a high-speed I/O interface, a disk interface, expansion memory, or a vector processor can also be attached externally to each cell.

[0024] FIG. 5 shows the hardware configuration of each cell processor in the parallel computer system to which the present invention is applied. The basic part consists of a RISC microprocessor IU, which performs integer operations, logical operations, and control, and a high-speed floating-point unit FPU, and is connected to a high-speed cache memory CM. The cache memory has a capacity of 128 KB and uses the copy-back method to achieve a high hit rate and low memory bus traffic.

[0025] The message controller MSC contains a high-performance DMA controller and a cache controller, and speeds up disk transfers. The DRAM controller controls 4-Mbit or 1-Mbit DRAMs and performs error detection and correction. The command bus interface BIF performs disk transfers with the command bus, and the routing controller RTC performs disk transfers with the torus network.

[0026] The MSC, DRAMC, RTC, and BIF are connected to an internal bus called the LBUS. The LBUS is a 32-bit-wide bus with multiplexed address and data. The LBUS of each cell is brought out through a connector, which makes it possible to add various optional hardware.

[0027] After power-on, the system is initialized from the host computer. System initialization consists of checking the functions of the various resources of the system and setting their initial state. It is carried out by loading, over the command bus, an initialization program broadcast from the host and executing it. The cell id of each cell is set at the same time.

[0028] In each cell of the initialized system, a cell OS (the OS that runs a cell, which consists of a CPU) is running and waits for requests to load and execute a user program. User programs are also broadcast to the cells over the command bus. A user program placed in a cell is executed according to instructions from the host.

[0029] To set initialization data in the cells, the host computer broadcasts the data over the command bus. Broadcasts over the command bus are possible not only from the host to the cells but also from one cell to a group of other cells. Arbitrary cells can also be grouped, and a broadcast can be directed only to the cells whose group id matches.

[0030] As shown in FIG. 6, the torus network consists of routing units RTC connected in a torus-shaped network. The four ports of an RTC are called east, west, south, and north (E, W, S, N); between adjacent cells, N is connected to S and W is connected to E. Each data transfer path of an RTC consists of 16-bit-wide data and control signals. The RTC performs relaying for communication between arbitrary cells at high speed in hardware, without intervention by the CPU of each cell. The maximum configuration is 32 × 32 (1024 cells).
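The wraparound connectivity of the two-dimensional torus can be illustrated with a short sketch. The function name and the coordinate convention (north as decreasing y) are assumptions made for the example; the source only states that each cell has E, W, S, and N ports and that the maximum configuration is 32 × 32.

```python
def torus_neighbors(x, y, width=32, height=32):
    """Return the coordinates reached through each of the four RTC ports.

    The modulo arithmetic provides the wraparound links that turn a
    plain mesh into a torus.
    """
    return {
        "E": ((x + 1) % width, y),
        "W": ((x - 1) % width, y),
        "N": (x, (y - 1) % height),  # assumed orientation: north is y - 1
        "S": (x, (y + 1) % height),
    }


# A corner cell still has four neighbors thanks to the wraparound links.
print(torus_neighbors(0, 0))
```

Leaving a cell through its E port and then taking the W port of the cell reached leads back to the starting cell, which matches the W-to-E pairing of ports between adjacent cells.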

[0031] The routing unit RTC is configured as shown in FIG. 7. A message is routed first in the X direction and then in the Y direction. For a given pair of sending cell and receiving cell the same path is always used (static routing), so messages never overtake one another. Inside the RTC, a routing unit for the N-S direction and a routing unit for the W-E direction exist independently and are connected in series.
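The X-then-Y static routing rule can be sketched as below. The source does not say which way round each ring a message travels, so the choice of the shorter direction (ties going to the positive direction) is an assumption of this sketch, as are the function names.

```python
def step_toward(src, dst, size):
    """Unit step (+1, -1, or 0) along one ring, taking the shorter way round."""
    if src == dst:
        return 0
    forward = (dst - src) % size
    return 1 if forward <= size - forward else -1


def xy_route(src, dst, width=32, height=32):
    """List the cells visited when routing first along X, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                      # X dimension first
        x = (x + step_toward(x, dst[0], width)) % width
        path.append((x, y))
    while y != dst[1]:                      # then the Y dimension
        y = (y + step_toward(y, dst[1], height)) % height
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1), width=4, height=4))
```

Because the path depends only on the source and destination, two messages between the same pair of cells follow the same sequence of links and therefore arrive in the order they were sent.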

[0032] Each routing unit determines the destination of the data and buffers it. To keep data transfer latency low, a routing scheme called wormhole routing is used. In addition, a buffer management algorithm called the structured buffer pool is employed so that neither deadlock nor loss of throughput occurs even when several cells issue transmission requests at the same time.

[0033] Wormhole routing is a routing scheme in which the header of a message is sent straight from the input channel to the output channel, building the relay route as it goes. Each relay node buffers only part of the message.

[0034] As shown in FIG. 8, in ordinary store-and-forward routing a relay node buffers the entire message, whereas in wormhole routing only a few words of data, called a flit, are buffered in a relay node. When a cell receives the head of a message, it selects the channel for the relay route and forwards the flit to that channel. Subsequent flits are forwarded along the same route that the header flit selected.

[0035] Wormhole routing reduces the delay (latency) from the sending of a message to its reception. The latency of store-and-forward routing is proportional to the product of the inter-cell distance and the message size, whereas the latency of wormhole routing is proportional to the sum of the inter-cell distance and the message size.
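The product-versus-sum relationship can be made concrete with a small calculation. The function names and the single `word_time` constant are assumptions for illustration; the source states only the proportionalities, not actual timing figures.

```python
def store_and_forward_latency(hops, message_words, word_time=1.0):
    """Each relay node receives the whole message before forwarding it."""
    return hops * message_words * word_time


def wormhole_latency(hops, message_words, word_time=1.0):
    """The header sets up the route and the body streams in behind it."""
    return (hops + message_words) * word_time


hops, words = 10, 100
print(store_and_forward_latency(hops, words), wormhole_latency(hops, words))
```

Under these assumed unit costs, a 100-word message crossing 10 hops takes 1000 time units with store-and-forward and 110 with wormhole routing, and the gap widens as either the distance or the message size grows.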

[0036] On the other hand, while a message is being transferred it blocks the channels it is using, which can cause deadlock and reduced throughput. By incorporating the structured buffer pool algorithm into wormhole routing, whose distinguishing feature is low latency, the RTC achieves low latency, high throughput, and freedom from deadlock. As a result, when communication is taking place throughout the network, congestion in one part of the network does not degrade the performance of the network as a whole. In addition, as long as transmitted messages are always received by their destination cells, no deadlock occurs in the network.

[0037] The data handled by the RTC has the format shown in FIG. 9 and consists of a header marking the start of the message, the message body, and an end bit marking the end of the message. The header carries various control information. The end bit is added automatically by the message controller.

[0038] The synchronization function according to the present invention in such a system is described below. Parallel processing requires a mechanism that can guarantee that the processing of all cells has finished before execution proceeds to the next process. The synchronization/status system is hardware that performs this synchronization efficiently.

[0039] In FIG. 10, each cell issues a synchronization request when its own process 1 finishes. Each cell then waits for synchronization until it can be confirmed that all cells have issued a synchronization request. Establishment of synchronization is signaled by hardware, after which execution of process 2 begins. In a highly parallel computer such as this system, processing in which all processors wait to reach a certain state before moving to the next state appears frequently, and the synchronization/status system realizes this type of synchronization efficiently.

[0040] The basic operation of the synchronization/status system is as follows. In the synchronization system, each cell has a synchronization request register, and when the logical product (AND) of the outputs of all cells becomes 1, the synchronization detection register is set. The synchronization request register is cleared at the same time. A set synchronization detection register indicates that all cells have requested synchronization, so it can be used to guarantee that all cells have reached a given point in the program.
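The set-and-clear behavior of the two synchronization registers can be sketched as a small software model. The class and function names are hypothetical, and the model evaluates the AND network in discrete "ticks" purely for illustration; the real hardware evaluates it continuously.

```python
class SyncCell:
    """Software stand-in for the synchronization registers of one cell."""

    def __init__(self):
        self.sync_request = False  # written by the cell's CPU
        self.sync_detect = False   # set by the synchronization network


def sync_network_tick(cells):
    """Evaluate the AND over all cells once; latch and clear on success."""
    if all(cell.sync_request for cell in cells):
        for cell in cells:
            cell.sync_detect = True    # every cell has requested
            cell.sync_request = False  # cleared at the same time
        return True
    return False


cells = [SyncCell() for _ in range(4)]
for cell in cells[:3]:
    cell.sync_request = True
print(sync_network_tick(cells))   # one cell has not requested yet
cells[3].sync_request = True
print(sync_network_tick(cells))   # now all four have requested
```

Clearing the request registers on success leaves them ready for the next barrier without any extra action by the CPUs.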

[0041] In the status system, each cell has a status request register, and the logical product (AND) of the outputs of all cells is set in the status detection register. The status detection register thus reflects, from moment to moment, the AND of the status request registers of all cells.

[0042] To take the AND of the register outputs of up to 1024 cells, the synchronization/status system of this system uses a hierarchical network configuration as shown in FIG. 11. It also operates in a time-division manner, factor by factor, so that several factors can be handled concurrently. Through time-division use of the hierarchical network, the synchronization/status network can process 8 factors of high-speed synchronization, 8 factors of high-speed status, 32 factors of low-speed synchronization, and 32 factors of low-speed status. The synchronization/status registers are implemented in the BIF chip.
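To illustrate why a hierarchy helps, the sketch below reduces 1024 one-bit outputs through a tree of AND gates and counts the levels. The fan-in of 4 per gate is purely an assumption for illustration; the text does not state the fan-in used in FIG. 11, and `and_tree` is a hypothetical helper name.

```python
def and_tree(bits, fan_in=4):
    """Reduce bits with AND gates of the given fan-in; return (result, levels)."""
    bits = list(bits)
    levels = 0
    while len(bits) > 1:
        # Each gate at this level ANDs up to fan_in outputs of the level below.
        bits = [all(bits[i:i + fan_in]) for i in range(0, len(bits), fan_in)]
        levels += 1
    return bits[0], levels


result, levels = and_tree([True] * 1024)   # (True, 5) with the assumed fan-in
```

With this assumed fan-in, 1024 inputs pass through only five gate levels, so the depth of the network grows logarithmically with the number of cells.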

[0043] The synchronization system has the following resources: a synchronization request register, a synchronization detection register, a synchronization mask register, and a synchronization interrupt mask register. Each is an 8-bit register in the high-speed synchronization system and a 32-bit register in the low-speed synchronization system.

[0044] The CPU makes a synchronization request by setting the synchronization request register. When synchronization is established, the corresponding bit of the synchronization detection register is set and the synchronization request register is reset. The CPU can learn that synchronization has been established by monitoring the synchronization detection register. If the corresponding bit of the synchronization interrupt mask register is not set when synchronization is established, an interrupt is raised to the CPU, so notification by interrupt is also possible. When the synchronization mask register is set, the cell does not take part in that synchronization: the cell itself does not detect the synchronization, and to the other cells it always appears to be in the synchronization request state. The mask function makes it easy to synchronize only a specific group of cells.

[0045] The status system has the following resources: a status register, a status detection register, and a status mask register. Each is an 8-bit register in the high-speed status system and a 32-bit register in the low-speed status system.

[0046] The CPU makes a status request by setting the status request register. The AND of the status request register values of all cells is written to the status detection register, so the value of the status detection register represents the status of all cells at that moment. When the status mask register is set, the cell does not take part in that status: to the other cells it always appears to be in the status request state. The mask function makes it easy to obtain status from only a specific group of cells.
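The effect of a mask register on the global AND can be sketched as follows. This is an illustrative model only; `group_and` and the list-based representation of request and mask bits are assumptions, not names from the text.

```python
def group_and(requests, masks):
    """AND over all cells, where a masked cell always looks like a requester."""
    return all(req or masked for req, masked in zip(requests, masks))


masks = [False, False, True, True]          # cells 2 and 3 do not participate
both_ready = group_and([True, True, False, False], masks)
one_ready = group_and([True, False, False, False], masks)
```

Here `both_ready` is true although cells 2 and 3 never request, because only the unmasked group (cells 0 and 1) decides the outcome, while `one_ready` is false because cell 1 has not yet requested.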

[0047] Synchronization & status is a function for obtaining the status of each cell at the moment synchronization is established. Setting the synchronization & status mode register couples the synchronization system and the status system so that they perform the synchronization & status operation. The mode can be set individually for each factor. In synchronization & status mode the synchronization system behaves as before, but the behavior of the status system changes. In normal mode the status detection register always holds the AND of the statuses of all cells; in synchronization & status mode the status of all cells is written only at the instant synchronization is detected.

[0048] FIG. 12 shows how synchronization & status is used. When the CPU of each cell reaches the synchronization request point, it first sets the status request register if the processing up to that point has completed normally, or clears the status request register if an error has occurred. It then sets the synchronization request register and waits for synchronization. On detecting that synchronization is complete, the CPU reads the status detection register, confirms that all cells executed the preceding process without error, and proceeds to the next step. If an error has occurred in one or more cells, the status detection register reads 0 and all cells terminate processing.

[0049] With synchronization & status, if an error occurs partway through processing, every cell can check the overall processing state at the moment it detects synchronization. Because all cells can detect an abnormality in any single cell and take appropriate action, cells that finished normally are never left waiting for synchronization.
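The usage pattern of FIG. 12 has a close software analogy, sketched below with threads. This is only an analogy under stated assumptions: `threading.Barrier` stands in for the synchronization network, a shared list stands in for the status request registers, and `run_cells`, `ok`, and `fail` are hypothetical names.

```python
import threading


def run_cells(work_items):
    """Run one 'process 1' per cell, then decide jointly whether to continue."""
    n = len(work_items)
    status_request = [False] * n        # one status request bit per cell
    barrier = threading.Barrier(n)      # stands in for the sync network
    decisions = [None] * n

    def cell(i):
        try:
            work_items[i]()
            status_request[i] = True    # process 1 finished normally
        except Exception:
            status_request[i] = False   # error: clear the status request
        barrier.wait()                  # synchronization request and wait
        # Status as seen at the moment synchronization is established.
        decisions[i] = "proceed" if all(status_request) else "abort"

    threads = [threading.Thread(target=cell, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions


def ok():
    pass


def fail():
    raise RuntimeError("simulated error in process 1")


all_ok = run_cells([ok, ok, ok])
one_failed = run_cells([ok, fail, ok])
```

Every cell reaches the same decision, including the cells whose own work succeeded, which is the property the paragraph above describes: a failure in any one cell is visible to all cells as soon as synchronization completes.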

[0050] FIG. 13 shows one embodiment of the present invention. The procedure for detecting synchronization is described first. The CPU of each PE (see FIG. 14) sets the synchronization request register 21 and waits for synchronization to be detected. The outputs of the synchronization request registers 21 of all PEs are ANDed by the AND circuit 24, and the result is returned to all PEs. When every PE has set its synchronization request register 21, the AND output becomes 1, which sets the synchronization detection register 23 of every PE and at the same time clears the synchronization request register 21.

[0051] The CPU of each PE can learn that synchronization has been established either by polling the synchronization detection register 23 in a loop or through an interrupt triggered by the synchronization detection register 23 being set. A CPU that has detected synchronization proceeds to the next step.

[0052] In the present invention, the following procedure is used to obtain status information in addition to synchronization. In FIG. 13, assume that MODE, the output of the mode register 28, is 0. When the CPU of each PE reaches the synchronization request point, it first sets the status in the status request register 31 if the processing up to that point has completed normally, or clears the status if, for example, an error has occurred. It then sets the synchronization request register 21 and waits for synchronization. The outputs of the synchronization request registers 21 and of the status request registers 31 of all PEs are ANDed by the AND circuits 24 and 34, respectively, and the results are returned to all PEs. When every PE has set its synchronization request register 21, the output of the AND circuit 24 becomes 1 and the synchronization detection register 23 of every PE is set. The "1" output of the AND circuit 24, which at the same time clears the synchronization request register 21, is passed toward the status detection register 33 through the OR circuit 37 on the status line 36 and the AND circuit 38. When the status of every PE is normal and the output of every status request register 31 is logic 1, the AND circuit 34 outputs logic 1, and this output is likewise distributed to all PEs. In each PE, the AND circuit 38 then takes the logical AND of the signal received over the status line 36, which indicates that all PEs are synchronized, and the signal from the AND circuit 34, which indicates that the status requests of all PEs are established, and sets the status detection register 33 accordingly.

[0053] The CPU, which has been waiting for synchronization to complete, reads the status detection register 33, confirms that all PEs executed the preceding process without error, and proceeds to the next step. If an error has occurred in one or more PEs, the output of the AND circuit 34 is logic 0, so even though all PEs are synchronized, the value written to the status detection register 33 through the AND circuit 39 is 0 and no PE executes the next process. The data on the status line 36 is valid, and updates the status detection register 33, only when synchronization is detected, which guarantees that the status detected is the status of all PEs at the moment synchronization was established.

[0054] Because MODE is controlled by a separate mode register 28 that the CPU can read and write, the hardware can be used either as independent synchronization and status systems or as a combined synchronization-and-status system. When MODE, the output of the mode register 28, is 1, the status detection register 33 can be set to 1 whenever the status of all PEs is normal, even if the signal on the status line 36 is 0 because the PEs are not yet all synchronized. The status system can therefore operate independently of the synchronization system.
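The gating performed by the OR circuit 37 and the AND circuit 38 can be written as a small Boolean function. This is a sketch of one reading of the text: the function name is hypothetical, and treating the register as holding its previous value while the gate is closed is an assumption based on the statement that the status line updates the register only when synchronization is detected.

```python
def next_status_detect(current, sync_all, status_all, mode):
    """Next value of status detection register 33 in this simplified model.

    sync_all   -- output of AND circuit 24 (all PEs requested synchronization)
    status_all -- output of AND circuit 34 (all PEs report normal status)
    mode       -- MODE output of mode register 28 (0: combined, 1: independent)
    """
    enable = sync_all or mode == 1   # OR circuit 37
    if not enable:
        return current               # assumption: register keeps its value
    return enable and status_all     # AND circuit 38


# MODE = 0: status is captured only at the moment of synchronization.
held = next_status_detect(False, sync_all=False, status_all=True, mode=0)
captured_ok = next_status_detect(False, sync_all=True, status_all=True, mode=0)
captured_err = next_status_detect(True, sync_all=True, status_all=False, mode=0)
# MODE = 1: status follows the AND regardless of synchronization.
independent = next_status_detect(False, sync_all=False, status_all=True, mode=1)
```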

[0055] FIG. 14 shows the configuration of a PE built according to the embodiment of FIG. 13, together with the overall configuration of the parallel computer system. Parts identical to those in FIG. 13 are given the same reference numerals and are not described again.

[0056] In each PE, the synchronization request register 21, synchronization detection register 23, status request register 31, status detection register 33, mode register 28, and counter 51 are connected to the CPU 41 and the memory MEM 42 by a data bus line. The data bus is connected to the network 44 through the inter-PE communication interface 43, over which the PE communicates with other PEs. The synchronization request register 21 and status request register 31 of each PE are connected to the AND circuits 24 and 34, respectively, and the outputs of the AND circuits 24 and 34 are distributed to the synchronization detection register 23 and status detection register 33 of each PE, respectively.

[0057] In a parallel computer, time spent waiting for synchronization is wasted and should be kept as short as possible. When running a program, it is important to know which PEs wait a long time for synchronization. In the embodiment shown in FIG. 15, each PE is provided with a counter 51 that counts clocks. The counter counts up while the output of that PE's synchronization request register 21 (before the AND over all PEs is taken) is 1, and stops counting when synchronization is detected. When synchronization requests and detections are repeated, the time spent requesting synchronization accumulates.

[0058] The user first clears the counter 51, runs the program, and finally reads the counter value, which gives the synchronization overhead of the program. Such a function could also be implemented in software, but then the program's execution time would differ between measured and unmeasured runs, so the time could not be measured accurately and unexpected bugs might be introduced.
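The behavior of the wait counter can be illustrated with a clock-by-clock simulation of one barrier. The sketch assumes that synchronization is detected in the same clock in which the last PE requests it (network latency is ignored), and `simulate_barrier` and `arrival_clock` are hypothetical names.

```python
def simulate_barrier(arrival_clock):
    """Return each PE's counter value after one synchronization.

    arrival_clock[i] is the clock at which PE i sets its request register.
    """
    n = len(arrival_clock)
    request = [False] * n
    counter = [0] * n                 # cleared by the user before the run
    clock = 0
    while True:
        for i in range(n):
            if clock >= arrival_clock[i]:
                request[i] = True     # PE i has reached the barrier
        if all(request):
            return counter            # synchronization detected: stop counting
        for i in range(n):
            if request[i]:
                counter[i] += 1       # count only while the request is 1
        clock += 1


waits = simulate_barrier([3, 10, 7])  # [7, 0, 3]
```

The PE that arrives last accumulates no wait time, while the earliest PE accumulates the most, so comparing counters across PEs shows how unevenly the load was distributed.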

[0059] If an error occurs partway through processing, all PEs need to learn of the error at the same time as they detect synchronization. According to this embodiment, as shown in FIG. 16, after process 1 finishes in a PE and synchronization is established, normal completion is checked across all PEs. If an abnormality is found, appropriate error handling can be carried out, so PEs that finished normally are never left waiting for synchronization. If all PEs completed normally, execution proceeds to process 2.

[0060] In FIG. 16, instead of checking for normal completion, each PE may check whether its computation has converged to within a predetermined error, and execution may proceed to the next process 2 once convergence is reached. In that case, in FIG. 13, the status in the status request register 31 is set when the computation has converged.
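As a hedged illustration of the convergence variant, the loop below lets each simulated PE run an independent iteration and sets its status bit once its own update falls below a tolerance; all PEs continue until the AND of the status bits is 1. The choice of Newton's iteration for the square root of 2, the tolerance, and the function name are assumptions made only for this example.

```python
def iterate_until_converged(values, tol=1e-6, max_steps=1000):
    """Iterate all PEs in lock step until every PE reports convergence."""
    for step in range(1, max_steps + 1):
        # Each PE performs one local Newton step toward sqrt(2).
        new_values = [0.5 * (v + 2.0 / v) for v in values]
        # Each PE sets its status request bit if its own update was small.
        status = [abs(n - v) < tol for n, v in zip(new_values, values)]
        values = new_values
        # Synchronization point: proceed to process 2 only if the AND is 1.
        if all(status):
            return values, step
    raise RuntimeError("no convergence within max_steps")


final_values, steps = iterate_until_converged([1.0, 3.0, 10.0])
```

PEs that converge early keep iterating until the slowest PE has converged, because the decision to leave the loop is taken on the global AND rather than on any local result.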

[0061] FIG. 17 shows another embodiment of the present invention, which detects the stable state of all PEs as follows. The network status signal 60 is 1 when a message exists in the network, so the output of the inverting circuit 61 is 1 when there is no message in the network. Because the network status signal 60 is a signal line from the network that becomes 1 while a message exists in the network, at least one network status signal line among all the PEs being 1 indicates that there is a message in the network.

[0062] FIG. 18 shows the method of detecting the network state. First, the PE checks whether it holds a message, which it does by detecting whether a predetermined register of the PE contains data. While the PE holds a message, it continues to indicate that a message exists in the network. When the PE holds no message, it checks whether a predetermined time has elapsed; this time corresponds to the propagation time of data to the peer PE. Once the predetermined time has elapsed, the PE indicates that there is no message in the network. In other words, if the PE holds no message and the predetermined time has elapsed, there is no message either in the PE or on the communication path between the PE and its peer, that is, no message in the network.
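The flow of FIG. 18 can be sketched as a small per-PE state machine. The class name, the use of a clock count for the predetermined time, and the restart of the timer whenever a message is seen are assumptions made for this illustration.

```python
class NetworkStateDetector:
    """Per-PE model of the network status signal (1 = message may exist)."""

    def __init__(self, propagation_clocks):
        # Hypothetical parameter: clocks needed for data to reach the peer PE.
        self.propagation_clocks = propagation_clocks
        self.idle_clocks = 0

    def step(self, message_in_own_pe):
        if message_in_own_pe:
            self.idle_clocks = 0      # restart the wait
            return 1                  # keep signalling "message in network"
        self.idle_clocks += 1
        # Quiet only after the predetermined time has elapsed.
        return 0 if self.idle_clocks >= self.propagation_clocks else 1


detector = NetworkStateDetector(propagation_clocks=3)
inputs = [True, False, False, False, False, True, False]
outputs = [detector.step(has_message) for has_message in inputs]
```

With these inputs the signal stays at 1 for two further clocks after the message disappears, drops to 0 on the third quiet clock, and returns to 1 as soon as a new message appears.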

[0063] In FIG. 17, the synchronization request register 62 is set when the synchronization request signal 63 is 1. Setting the register means that the PE has finished the current processing step and is issuing a synchronization request to the other PEs. The message reception signal 64 is a signal line indicating that a message has arrived at the PE from the network. When this signal is 1, the output of the OR circuit 65 is 1, so the synchronization request register 62 is cleared. For example, if a PE issues a synchronization request believing it has finished processing and a message then arrives late at that PE, reception of the message clears the synchronization request register 62, correcting the erroneous synchronization request 63 from that PE. The synchronization request register 62 is also cleared by the AND circuit 68: once the AND circuit 68 has set the synchronization detection register 69, the synchronization request register 62 for that processing step is reset so that it waits for the PE's synchronization request signal 63 in the next process.

[0064] When the PE has no more messages to process, the CPU sets the synchronization request register 62 to output a synchronization request and waits for the stable state. When a message arrives at this PE, the message reception signal 64 becomes 1, and this signal first clears the synchronization request register 62. If instead the CPU clears the synchronization request register 62, it must do so before the whole of the next message has been received from the network. When the CPU finishes processing the message, it sets the synchronization request register 62 again.

[0065] The AND circuit 66 takes the logical AND of the output of the synchronization request register 62, the inverted network status signal 60, and the outputs of the message transmission management unit 71 and the message reception management unit 72 (see the description of FIG. 19), and outputs a synchronization stability request signal. The synchronization stability request signals from all PEs are ANDed by the AND circuit 67, and the result is returned to all PEs. While a message exists in the network, the output of the inverting circuit 61 is 0, so the output of the AND circuit 66 is 0 and no synchronization stability request signal is output even if the synchronization request register 62 of each PE is outputting a synchronization request. Therefore, as long as the network status of at least one PE is 1, the overall AND output (the output of the AND circuit 67) does not become 1.

[0066] When the synchronization request registers 62 of all PEs are set and no message exists in the network, every AND circuit 66 outputs the synchronization stability request signal and the overall AND output becomes 1. This sets the synchronization detection register 69 of every PE through the AND circuit 68 and at the same time clears the synchronization request register 62. The AND circuit 68 receives two inputs: the output of the AND circuit 66, that is, the synchronization stability request signal indicating that there is no message in the network and that the PE itself is requesting synchronization; and the output of the AND circuit 67, which indicates that all PEs are issuing synchronization stability requests. The AND circuit 68 thus confirms that the PE's own synchronization stability request signal is still being output.

[0067] Because the synchronization detection register 69 produces its output only on the additional condition that there is no message in the network, the stable state of all PEs can be known by monitoring the output of the synchronization detection register 69. In the present invention, no PE learns of the stable state earlier than the others and moves on to the next step to send a message; every PE proceeds to the next step only after its synchronization detection register 69 has detected that the stable state is established.
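The stable-state detection of FIG. 17 can be approximated in software as shown below. This is a simplified model under stated assumptions: the `PE` class and its field names are illustrative, the reference numerals in the comments map each expression to the circuit it is meant to imitate, and the whole network is evaluated in a single step.

```python
from dataclasses import dataclass


@dataclass
class PE:
    sync_request: bool = False   # synchronization request register 62
    network_busy: bool = False   # network status signal 60
    send_empty: bool = True      # message transmission management unit 71
    recv_empty: bool = True      # message reception management unit 72
    sync_detect: bool = False    # synchronization detection register 69

    def stable_request(self):
        """AND circuit 66: request is valid only if nothing is in flight."""
        return (self.sync_request and not self.network_busy
                and self.send_empty and self.recv_empty)

    def receive_message(self):
        """Message reception signal 64 clears the pending request."""
        self.sync_request = False
        self.recv_empty = False


def evaluate(pes):
    """One evaluation of the global AND (67) and the per-PE AND (68)."""
    all_stable = all(pe.stable_request() for pe in pes)   # AND circuit 67
    for pe in pes:
        if all_stable and pe.stable_request():            # AND circuit 68
            pe.sync_detect = True
            pe.sync_request = False
    return all_stable


pes = [PE(sync_request=True) for _ in range(3)]
pes[2].receive_message()            # a late message withdraws PE 2's request
before = evaluate(pes)              # not stable yet
pes[2].recv_empty = True            # PE 2 finishes processing the message
pes[2].sync_request = True          # and requests synchronization again
after = evaluate(pes)               # now stable on every PE
```

The late message prevents a premature stable-state detection without any action by the CPU, which is the point of clearing the request register directly from the reception signal.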

[0068] FIG. 19 shows the circuit configuration of another embodiment of the present invention, in which the state of the network is taken into account when detecting the status of each PE. The stable state of all PEs is detected as follows.

[0069] The network status 70 is a signal line from the network. While a message exists in the network, the network status signal line of at least one PE indicates that there is a message in the network. This signal is 1 when there is no message in the network.

[0070] The message transmission management unit 71 (one in each PE) indicates that there are no messages waiting to be sent. The message reception management unit 72 (one in each PE) indicates that there are no received messages. When the CPU of a PE receives a message, it takes the message from the message reception management unit 72.

[0071] In each PE, the network status 70 is a signal indicating whether a message exists anywhere other than in the message transmission management unit 71 and the message reception management unit 72. The message reception 73 is a signal line indicating that a message has arrived at the PE from the network. This signal clears the status request register 75, which had been set by the status request 74.

[0072] When the PE has no more messages to process, the CPU sets the status request register 75 and waits for the stable state. When a message arrives at this PE, the message reception signal line becomes 1 and the status request register 75 is cleared. When the CPU finishes processing the message, it sets the status request register 75 again.

[0073] The AND circuit 76 takes the logical AND of the output of the status request register 75, the network status 70, and the outputs of the message transmission management unit 71 and the message reception management unit 72. The results from all PEs are ANDed by the AND circuit 77 and returned to all PEs. While a message exists in the network, at least one network status is 0, so the overall AND output does not become 1.

[0074] When the status request registers 75 of all PEs are set and no message exists in the network, the overall AND output becomes 1, and the status detection register 78 of every PE is set through the AND circuit 79. When a PE moves from the message-waiting state to the message-processing state (in which the PE is not stable) on receiving a message from the network, and the status request register 75 is cleared by the message reception 73 signal line rather than by the CPU, the status request register 75 must be cleared before the entire message has been received from the network.

[0075] Every PE can learn of the stable state by monitoring the status detection register 78. Even if one PE learns of the stable state earlier than the others, moves on to the next step, and sends a message, the other PEs can still detect from the status detection register 78 that the stable state was established, so they too can proceed to the next step.

[0076] FIG. 20 is a circuit diagram of still another embodiment of the present invention. Parts identical to those in FIGS. 17 and 19 are given the same reference numerals and are not described again. This embodiment applies to the synchronization-and-status scheme of FIG. 13, in which status is detected after synchronization has been established, and additionally detects the network state when determining whether a synchronization request or a status request is present.

[0077]

[Effects of the Invention] According to the first invention, even when synchronization of all PEs has been established, every PE can detect status such as abnormal processing in a single PE and take appropriate action, so the processing time of a parallel computer system can be improved.

[0078] In addition, providing a mode register as the synchronization/status mode control register makes both an independent synchronization/status mode and a combined synchronization-and-status mode available, which widens the range of application of the synchronization/status hardware.

[0079] Furthermore, providing a counter that accumulates the synchronization wait time makes it possible to see how load is distributed across multiple PEs without altering the operation of the PEs. In the second invention, incorporating the network state into the synchronization detection circuit makes it possible to detect the stable state of all PEs, including the state of the network.

[0080] Moreover, when a PE moves from the message-waiting state to the message-processing state (in which the PE is not stable) on receiving a message from the network, the synchronization request register is cleared by the message reception signal line rather than by the CPU, so the stable state can be detected quickly and safely.

[0081] In the third invention, incorporating the network state into the status detection circuit makes it possible to detect the stable state of all PEs, including the state of the network. In addition, because message reception resets the status request register directly, the status request register can output its status request faster when a message is received than if the register were set by a status request from the CPU.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing the principle of the first invention.

FIG. 2 is a block diagram showing the principle of the second invention.

FIG. 3 is a block diagram showing the principle of the third invention.

FIG. 4 is a hardware configuration diagram of a parallel computer system to which the present invention is applied.

FIG. 5 is a diagram showing the configuration of a cell.

FIG. 6 is a configuration diagram of the torus network.

FIG. 7 is a configuration diagram of the RTC.

FIG. 8 is a diagram showing wormhole routing and store-and-forward routing.

FIG. 9 is a diagram illustrating RTC messages.

FIG. 10 is a diagram illustrating synchronization processing.

FIG. 11 is a diagram showing the configuration of the synchronization & status system.

FIG. 12 is a diagram showing an example of the use of synchronization & status processing.

FIG. 13 is a configuration diagram of the synchronization & status circuit according to the first embodiment.

FIG. 14 is a diagram showing the configuration of a PE and the overall configuration of the parallel computer system in the first embodiment.

FIG. 15 is a diagram showing the synchronization wait time counter in the first embodiment.

FIG. 16 is a diagram showing synchronization & status processing according to the first embodiment.

FIG. 17 is a block diagram of the second embodiment.

FIG. 18 is a flowchart of network state detection in the second embodiment.

FIG. 19 is a block diagram of the third embodiment.

FIG. 20 is a block diagram of the fourth embodiment.

FIG. 21 is a flowchart of conventional stable state detection.

FIG. 22 is a diagram showing conventional PE processing and synchronization.

[Explanation of symbols]

Claims

[Claims]

1. An inter-processor synchronization control system for a distributed memory parallel computer in which a plurality of independently operating PEs are connected, comprising: synchronization request register means, provided in each PE, for independently making a synchronization request and holding the synchronization request signal; synchronization determination means for determining that there is a request from the synchronization request registers of all PEs; status request register means (2), provided in each PE, for independently making a status request and holding the status request signal; status determining means (4) for determining whether there is a request from the status request registers of all PEs; status distributing means (5) for distributing the determination result to all PEs; and status detection register means (3) for performing status detection based on the distributed determination result and the output of the synchronization detection register means (1), whereby the status of all PEs can be detected when synchronization is established among all PEs.

2. The inter-processor synchronization control system according to claim 1, further comprising a mode register for controlling connection and disconnection between the synchronization system and the status system, whereby operation can be switched between a combined synchronization and status system and independent synchronization and status systems.

3. A synchronization time measurement method in the parallel computer synchronization system according to claim 1, wherein each PE has a counter for counting clocks and the time from a synchronization request to synchronization detection is accumulated for each PE, whereby the synchronization overhead time of each PE can be measured without affecting program processing time.

4. The inter-processor synchronization control system according to claim 1, wherein the status detection register detects the occurrence of an error in each PE.

5. The inter-processor synchronization control system according to claim 1, wherein the status detection register detects convergence of calculation values in each PE.

6. An inter-processor synchronization control system for a distributed memory parallel computer in which a plurality of PEs operate independently, comprising: means for detecting the establishment of synchronization among all PEs; status request distribution means for distributing the status of each PE to all other PEs; and status detection means for detecting that the status of all PEs has reached a predetermined status and that synchronization has been established among all PEs, wherein the output of the status detection means causes all PEs to proceed to the next step.

7. A stable state detection method for a distributed memory parallel computer in which a large number of independently operating computers are connected via a communication network, comprising: synchronization request register means with which each PE independently makes a synchronization request and which holds the synchronization request signal; means for logically ANDing the outputs of the synchronization request registers and distributing the result to all PEs; synchronization detection register means (6) for performing synchronization detection based on the distributed result; means (7) for receiving a communication network status signal; and synchronization stability request detection means (8) for detecting, from the output of the synchronization request register and the status signal, that there is no message on the network and that all PEs are in a message waiting state.

8. The stable state detection method for a parallel computer according to claim 7, further comprising means for indicating that a message has arrived at a PE from the communication network, and a circuit for clearing the synchronization request register in response to the signal indicating the arrival.

9. The stable state detection method for a parallel computer according to claim 7, wherein the synchronization stability request detection means (8) comprises means for calculating the AND of a signal on a signal line indicating the network state and the output of the synchronization request register.

10. A stable state detection method for a parallel computer, wherein the synchronization detection register is set by taking the logical product of the output of the synchronization stability request detection means (8) and a signal indicating that all PEs are requesting synchronization stability.

11. The stable state detection method according to claim 7, comprising means for indicating that there is no message from the message transmission management section and the reception management section, wherein a logical product is taken between the signal line indicating that state and the output of the synchronization request register.

12. A stable state detection method for a distributed memory parallel computer in which a large number of independently operating computers are connected through a communication network, comprising: a status request register indicating the status of each PE; means for taking the logical product of the request registers of all PEs and distributing the result to all PEs; status detection register means (9) for detecting the status based on the distributed result; and means (10) for indicating the state of the communication network, wherein taking the logical product of the output and a signal indicating the state of the network makes it possible to detect that there is no message on the network and that all PEs are in a message waiting state.

13. The stable state detection method according to claim 12, further comprising means for indicating that a message has arrived at a PE from the communication network, and a circuit that clears the status request register using the signal indicating the arrival, whereby the stable state of all PEs can be detected safely.

14. The stable state detection method according to claim 12, comprising means for indicating that there is a message from the message transmission management unit and the reception management unit, wherein a logical product is taken of the signal line indicating that status, the output of the status request register, and the output indicating the network status.