JPH10124338A

JPH10124338A - Parallel processor

Info

Publication number: JPH10124338A
Application number: JP8280936A
Authority: JP
Inventors: Hiroyuki Miyata (宮田 裕行); Katsumi Takahashi (高橋 勝己)
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1996-10-23
Filing date: 1996-10-23
Publication date: 1998-05-15

Abstract

PROBLEM TO BE SOLVED: To cope not only with a fault of a ring bus or a processor but also with a fault of an input/output processor. SOLUTION: The parallel processor is equipped with processors 2 (P0, P1, P2, and P3); a 1st ring bus in which 1st nodes 1 (0A, 3A, 2A, and 1A), one connected to each processor, are joined in order into a ring through 1:1 unidirectional bus connections; and a 2nd ring bus in which 2nd nodes 1 (0B, 1B, 2B, and 3B), one connected to each processor, are joined into a ring in the reverse order through 1:1 unidirectional bus connections. Some of the processors (P1 and P3) are connected to an input/output bus 4, and these input/output processors (P1 and P3) perform input from and output to the outside.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a parallel processing device that, in a parallel computer composed of a plurality of processors, can continue processing when a single failure occurs and thereby achieves high reliability.

[0002]

[Prior Art] The configuration of a conventional parallel processing device will be described with reference to FIG. 6. FIG. 6 shows a conventional parallel processing device described in, for example, Ole Kjoller et al., "SCI Dual Ring Architecture with Self-Recovery", The Fourth International Workshop on SCI-based-High-Performance Low-Cost Computing, pp. 33-37, 1995, sponsored by SCIzzL.

[0003] In FIG. 6, reference numeral 1 denotes a node (communication node) interposed between a processor and the other processors with which it communicates, 2 denotes a processor, and 3 denotes a unidirectional data transfer line (communication bus) for exchanging data between nodes. Eight nodes 1 are shown: nodes 0A, 1A, 2A, 3A, 0B, 1B, 2B, and 3B. Four processors 2 are shown: P0, P1, P2, and P3.

[0004] Next, the operation of the conventional parallel processing device will be described with reference to FIG. 6.

[0005] FIG. 6 shows a parallel computer composed of four parallel processors P0, P1, P2, and P3. Each processor is connected to two nodes 1, and the nodes 1 are connected in a ring configuration.

[0006] That is, two nodes 1, node 0A and node 0B, are connected to processor P0. Two nodes 1, node 1A and node 1B, are connected to processor P1. Two nodes 1, node 2A and node 2B, are connected to processor P2. Two nodes 1, node 3A and node 3B, are connected to processor P3. The nodes 1 whose labels end in A are connected to one another in a ring in descending order of their numbers (ascending order would also do). The nodes whose labels end in B are connected in the same way, except that the connection order is reversed.
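
The two counter-rotating rings described above can be sketched as successor maps. This is only an illustration: the function names `build_dual_ring` and `walk` are hypothetical and do not appear in the patent, and the data transfer lines 3 are modeled simply as "next node" links.

```python
# Illustrative sketch only; all names are hypothetical, not from the patent.
def build_dual_ring(num_processors):
    """Return successor maps for ring A (descending) and ring B (ascending)."""
    ring_a = {f"{i}A": f"{(i - 1) % num_processors}A" for i in range(num_processors)}
    ring_b = {f"{i}B": f"{(i + 1) % num_processors}B" for i in range(num_processors)}
    return ring_a, ring_b


def walk(ring, start):
    """Follow the unidirectional links from `start` until the ring closes."""
    order, node = [start], ring[start]
    while node != start:
        order.append(node)
        node = ring[node]
    return order


ring_a, ring_b = build_dual_ring(4)
print(walk(ring_a, "0A"))  # ['0A', '3A', '2A', '1A']
print(walk(ring_b, "0B"))  # ['0B', '1B', '2B', '3B']
```

Walking ring A from node 0A visits the nodes in descending order, and ring B visits them in ascending order, which matches the node orders listed in the abstract.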

[0007] If a failure occurs in any node 1 or data transfer line 3, the ring containing that node 1 or data transfer line 3 is invalidated and the other ring is used, so that processing can continue.
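
The ring switchover of the conventional device can be sketched as follows. The function `select_ring` and the representation of each ring as a set of node labels are assumptions made for illustration; the data transfer lines are omitted for brevity.

```python
# Illustrative sketch only; names and data layout are hypothetical.
def select_ring(faulty_elements, rings):
    """Return the name of the first ring containing no faulty element, or None."""
    for name, elements in rings.items():
        if not (elements & faulty_elements):
            return name
    return None


rings = {
    "A": {"0A", "1A", "2A", "3A"},
    "B": {"0B", "1B", "2B", "3B"},
}

print(select_ring(set(), rings))          # A
print(select_ring({"2A"}, rings))         # B
print(select_ring({"2A", "1B"}, rings))   # None
```

The last call shows the limit of this scheme: it tolerates a single faulty ring element, but not simultaneous faults on both rings.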

[0008]

[Problems to Be Solved by the Invention] Because the conventional parallel processing device is configured as described above, it can cope with a failure in a node 1 or a data transfer line 3. However, it cannot cope with a failure in a processor 2 itself, or with a failure in the part that transfers data to and from the outside.

[0009] The present invention has been made to solve the problems described above. Its object is to obtain a parallel processing device that provides a plurality of input/output processors so that it can cope with a failure in the part that transfers data to and from the outside, and that can also continue processing without affecting the processing in progress when a processor fails.

[0010]

[Means for Solving the Problems] A parallel processing device according to the present invention comprises a plurality of processors; a first ring bus in which first nodes, one connected to each processor, are joined in order into a ring by one-to-one unidirectional bus connections; and a second ring bus in which second nodes, one connected to each processor, are joined into a ring in the direction opposite to that order by one-to-one unidirectional bus connections. Some of the plurality of processors are connected to an input/output bus, and these input/output processors perform input from and output to the outside.

[0011] Further, in the parallel processing device according to the present invention, the input/output processors include first and second input/output processors. When status information from the second input/output processor does not arrive, the first input/output processor determines that the second input/output processor has failed and takes over the input/output processing that the second input/output processor had been performing.

[0012] Further, in the parallel processing device according to the present invention, an arbitrary processor among the plurality of processors has a mechanism for monitoring the state of the processor adjacent to it on the ring and stores information on the tasks that the adjacent processor executes. When the adjacent processor fails, the arbitrary processor logically disconnects it and re-executes the task that the adjacent processor was scheduled to execute.

[0013] Further, in the parallel processing device according to the present invention, an arbitrary processor among the plurality of processors has a mechanism for monitoring the state of the processor adjacent to it on the ring and stores in advance information on the high-priority tasks among the tasks that the adjacent processor executes. The arbitrary processor executes the high-priority tasks at the same time as the adjacent processor and, when the adjacent processor fails, logically disconnects it.

[0014] Further, in the parallel processing device according to the present invention, an arbitrary processor among the plurality of processors has a mechanism for monitoring the state of the processor adjacent to it on the ring and stores in advance information on tasks that have functions equivalent to the tasks executed by the adjacent processor but shorter execution times. When the adjacent processor fails, the arbitrary processor logically disconnects it and substitutes the shorter tasks for the tasks that the adjacent processor was scheduled to execute.

[0015] Further, in the parallel processing device according to the present invention, each of the plurality of processors places a token indicating its own state on the ring bus, and a designated master processor among the plurality of processors examines the token and logically disconnects any processor in which a failure has occurred.

[0016] Further, in the parallel processing device according to the present invention, the master processor stores in advance information on tasks that have functions equivalent to the tasks executed by the plurality of processors but shorter execution times. When a processor fails, the master processor substitutes the shorter tasks for the tasks that the failed processor was scheduled to execute.

[0017] Furthermore, in the parallel processing device according to the present invention, the designated master processor among the plurality of processors includes first and second master processors, and the first master processor takes over for the second master processor when the second master processor fails.

[0018]

BEST MODE FOR CARRYING OUT THE INVENTION

Embodiment 1. The configuration of a parallel processing device according to Embodiment 1 of the present invention will be described with reference to FIG. 1. FIG. 1 is a diagram showing the configuration of the parallel processing device according to Embodiment 1 of the present invention. In the drawings, the same reference numerals denote the same or corresponding parts.

[0019] In FIG. 1, reference numeral 1 denotes a node (communication node) interposed between a processor and the other processors with which it communicates, 2 denotes a processor, 2A denotes an input/output processor, and 3 denotes a unidirectional data transfer line (communication bus) for exchanging data between nodes. Reference numeral 4 denotes an input/output bus for inputting and outputting data between the input/output processors 2A and the outside. Eight nodes 1 are shown: nodes 0A, 1A, 2A, 3A, 0B, 1B, 2B, and 3B. Two processors 2 are shown, P0 and P2, and two input/output processors 2A are shown, P1 and P3, for a total of four processors.

[0020] Next, the operation of the parallel processing device according to Embodiment 1 will be described with reference to FIG. 1.

[0021] Apart from input/output, the overall operation is the same as that described for the conventional example. In Embodiment 1, a function dedicated to data input/output is newly added to, for example, processors P1 and P3. Data input from the outside is normally performed only via, for example, processor P3. Data output to the outside likewise goes through processor P3.

[0022] Meanwhile, processor P1 constantly monitors the state of processor P3. That is, processor P3 periodically sends its state to processor P1. If the periodic signal from processor P3 does not arrive, processor P1 determines that processor P3 has gone down. Processor P1 then takes over the input/output processing that processor P3 had been performing.
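
The takeover rule in this paragraph can be sketched as a simple watchdog on the standby input/output processor. The class name `StandbyIOProcessor`, the tick-based logical clock, and the `timeout` parameter are assumptions for illustration; the patent does not specify how the missing periodic signal is timed.

```python
# Illustrative sketch only; names, the logical clock, and the timeout are assumptions.
class StandbyIOProcessor:
    """Models P1 watching the periodic status signal from P3."""

    def __init__(self, timeout):
        self.timeout = timeout    # assumed: longest tolerated silence, in ticks
        self.last_seen = 0        # tick of the most recent status from P3
        self.active = False       # True once P1 has taken over input/output

    def on_status(self, now):
        """Record that a periodic status signal arrived at tick `now`."""
        self.last_seen = now

    def tick(self, now):
        """Take over I/O if the status signal has been silent for too long."""
        if not self.active and now - self.last_seen > self.timeout:
            self.active = True
        return self.active


p1 = StandbyIOProcessor(timeout=3)
p1.on_status(now=1)
p1.on_status(now=2)           # P3 goes silent after tick 2
took_over = [p1.tick(now=t) for t in range(3, 8)]
print(took_over)              # [False, False, False, True, True]
```

In this sketch P1 takes over at tick 6, the first tick at which the silence exceeds the assumed timeout of 3 ticks.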

[0023] Thus, in the configuration shown in FIG. 1, whichever point fails, processing can continue by disconnecting that point.

[0024] According to Embodiment 1, in a parallel processing device having a plurality of processors 2 and 2A, two nodes 1 for connection to the other processors 2 and 2A are attached to each processor 2, 2A. One node of each processor is joined with the corresponding nodes of all the other processors into a ring by one-to-one unidirectional bus connections, and the other nodes are likewise joined by one-to-one unidirectional bus connections. In addition, a plurality of processors 2A that perform input/output with the outside are provided. As a result, whichever part fails, whether ring 3, a node 1, a processor 2, or an input/output processor 2A, that part is automatically disconnected and processing can continue.

[0025] In other words, the parallel processing device according to Embodiment 1 includes a plurality of identical processors that perform processing, with two nodes connected to each processor. One node of each processor is connected in order by one-to-one buses, joining those nodes into a ring. The other node of each processor is also connected in order, in the reverse direction, by separate one-to-one buses and joined into a ring. For example, the commercially available SCI (Scalable Coherent Interface) bus [ANSI/IEEE Std 1596-1992] corresponds to this ring bus. Because the input/output processors 2A and the input/output bus 4 are provided, the device can cope not only with a failure of ring bus 3 but also with a failure of a processor 2 or of an input/output processor 2A.

[0026] Embodiment 2. The configuration of a parallel processing device according to Embodiment 2 of the present invention will be described with reference to FIG. 2. FIG. 2 shows the parallel processing device according to Embodiment 2 of the present invention, as an enlarged view of part of the overall configuration shown in FIG. 1.

[0027] In a ring bus configuration, each processor 2, 2A is connected to its adjacent processors. It is therefore efficient for failure monitoring of each processor also to be performed by an adjacent processor.

[0028] Accordingly, as shown in FIG. 2, processor P0, for example, monitors the state of processor P1. Status information on processor P1 is periodically sent to processor P0; that is, processor P1 runs a self-diagnosis or the like and sends the result to processor P0. Separately, before executing each task, processor P1 always sends information on that task to processor P0.

[0029] Processor P0 examines the status periodically sent from processor P1 and determines whether processor P1 is operating normally. Nothing needs to be done while it is operating normally, but if a failure is reported, or if the status report itself stops arriving within a predetermined time because of an internal failure of processor P1, processor P0 determines that a failure has occurred.

[0030] Processor P0 then takes steps to disconnect processor P1 from the system as a whole. Because processor P1 cannot be physically disconnected, processor P0 notifies the other processors P2 and P3 of the failed processor P1, which is thereafter treated as logically disconnected.

[0031] However, the task that processor P1 was running when the failure occurred must be reprocessed. For this purpose, after the failure, processor P0 re-executes the same task using the task information that was sent to it before the task was executed. In this way the task can be re-executed after processor P1 is disconnected, and processing can continue overall.
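
The monitoring and re-execution steps of Embodiment 2 can be sketched as follows. The class `NeighborMonitor`, its method names, the tick-based timeout, and the representation of task information as a callable plus arguments are all assumptions for illustration; the `isolated` flag merely stands in for notifying processors P2 and P3.

```python
# Illustrative sketch only; names and the task representation are assumptions.
class NeighborMonitor:
    """Models P0's view of its neighbor P1."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_report = 0
        self.current_task = None   # task info sent by P1 before it runs the task
        self.isolated = False      # stands in for notifying P2 and P3

    def on_task_info(self, func, *args):
        """P1 announces the task it is about to execute."""
        self.current_task = (func, args)

    def on_status(self, now, healthy):
        """P1 reports its self-diagnosis result; a reported fault triggers recovery."""
        self.last_report = now
        return None if healthy else self._recover()

    def check(self, now):
        """Trigger recovery if no status report arrived within the timeout."""
        if now - self.last_report > self.timeout:
            return self._recover()
        return None

    def _recover(self):
        """Logically disconnect P1 and re-execute its announced task once."""
        already_isolated = self.isolated
        self.isolated = True
        if already_isolated or self.current_task is None:
            return None
        func, args = self.current_task
        return func(*args)


p0 = NeighborMonitor(timeout=2)
p0.on_status(now=1, healthy=True)
p0.on_task_info(sum, [1, 2, 3])   # P1 is about to run this task
result = p0.check(now=5)          # no report since tick 1, so P0 re-runs the task
print(result, p0.isolated)        # 6 True
```

The sketch keeps only the most recently announced task, which corresponds to the task that was running when the failure occurred.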

[0032] Because each processor can apply this procedure to its adjacent processor, the procedure is distributed, and because the distance along the ring is the shortest possible, failures can be handled efficiently.

[0033] According to Embodiment 2, the parallel processing device of Embodiment 1, having each component of Embodiment 1, is arranged so that an arbitrary processor P0 has a mechanism for monitoring the state of processor P1 adjacent to it on the ring and stores information on the task that processor P1 is executing. When a failure occurs in processor P1, processor P0 detects the failure, logically disconnects processor P1 from the system, and re-executes on processor P0 the task that processor P1 was executing. Processing can therefore continue without interruption even when any processor fails.

[0034] In other words, in addition to the configuration (means) of the parallel processing device according to Embodiment 1, the parallel processing device according to Embodiment 2 provides, within the processor 2 adjacent to a given processor 2A in, for example, the counterclockwise direction, a mechanism for monitoring the state of processor 2A and a mechanism for holding information on the task that processor 2A is executing, together with a mechanism by which the adjacent processor 2 can disconnect processor 2A from the ring when processor 2A fails. In Embodiment 2, adjacent processors 2 and 2A detect failures among themselves in a distributed manner, which is efficient and well suited to a ring bus that links adjacent processors.

[0035] Embodiment 3. A parallel processing device according to Embodiment 3 of the present invention will be described with reference to FIG. 3. FIG. 3 is a diagram showing task lists of processors of the parallel processing device according to Embodiment 3 of the present invention.

[0036] In FIG. 3, reference numeral 5 denotes an example of the task list executed by processor P1, and 6 denotes the task list executed by processor P0.

[0037] The base configuration of Embodiment 3 is the same as that of Embodiment 2, and FIG. 2 applies as it is. That is, processor P0 monitors the state of processor P1 and, if a failure occurs in processor P1, disconnects it.

[0038] In Embodiment 2, after a failure occurs in processor P1, processor P0 attempts to execute the same task again based on the task information. Depending on the application, however, this may be too late. This is the case when processing must be finished by a certain time and re-executing it after the failure of processor P1 would miss that time.

[0039] In Embodiment 3, for such cases, processor P0, which monitors for failures, executes the important tasks among those of processor P1 in advance, in parallel with processor P1. The results are then used when a failure occurs.

[0040] In FIG. 3, it is assumed that processor P1, the processor being monitored for failures, executes six tasks A, B, C, D, E, and F, as shown in task list 5. A threshold is determined in advance. In this example, the threshold is set at the point that separates tasks A and B from tasks C, D, E, and F. In short, tasks A and B, which lie above the threshold, have high priority, and tasks C, D, E, and F have low priority.

[0041] Processor P0 executes the high-priority tasks A and B of processor P1, the processor it monitors, at the same time as its own tasks. This is shown in task list 6 of FIG. 3. Processor P0 was originally scheduled to execute tasks X, Y, and Z; it additionally executes the high-priority tasks A and B of processor P1 at the same time.

[0042] As in Embodiment 2, when a failure occurs in processor P1, processor P0 detects it and disconnects processor P1. The results of the backup tasks A and B that processor P0 has been executing are then used as the execution results of the disconnected processor P1. Thus, even when a failure occurs, execution can continue without compromising real-time processing.
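
The threshold split of task list 5 and the resulting task list 6 can be sketched as follows. The helper names, the index-based threshold, and the placeholder `run` function are assumptions for illustration; the patent does not say how the threshold is represented.

```python
# Illustrative sketch only; names and the threshold representation are assumptions.
def split_by_threshold(task_list, threshold):
    """Split a priority-ordered task list into (high, low) at the threshold index."""
    return task_list[:threshold], task_list[threshold:]


def run(task):
    """Placeholder for real task execution (hypothetical)."""
    return f"result of {task}"


def results_for_failed_neighbor(results, shadowed):
    """Pick out the shadowed tasks' results to stand in for the failed processor."""
    return {task: results[task] for task in shadowed}


p1_tasks = ["A", "B", "C", "D", "E", "F"]     # task list 5 (processor P1)
p0_own_tasks = ["X", "Y", "Z"]                # tasks P0 would run anyway
high, low = split_by_threshold(p1_tasks, threshold=2)
p0_tasks = p0_own_tasks + high                # task list 6 (processor P0)
p0_results = {task: run(task) for task in p0_tasks}

recovered = results_for_failed_neighbor(p0_results, high)
print(p0_tasks)    # ['X', 'Y', 'Z', 'A', 'B']
print(recovered)   # {'A': 'result of A', 'B': 'result of B'}
```

Because the results for A and B already exist on P0 when P1 fails, no re-execution delay is incurred for the high-priority tasks; the low-priority tasks C to F are not covered by this mechanism.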

[0043] According to Embodiment 3, the parallel processing device (parallel computer) of Embodiment 2, having the means of Embodiment 2, is further arranged so that, among the tasks executed by an arbitrary processor P1, those with a priority higher than a predetermined priority are executed at the same time by processor P0 adjacent to processor P1. When a failure occurs in processor P1, processor P0 detects the failure and logically disconnects processor P1 from the system, and the execution results of the high-priority tasks that processor P1 was scheduled to execute are taken from processor P0. Processing can therefore continue without interruption even when any processor fails.

[0044] In other words, in addition to the means of Embodiment 2, the parallel processing device according to Embodiment 3 includes a mechanism by which the high-priority tasks among those executed by processor P1, adjacent in, for example, the clockwise direction, are executed at the same time on processor P0. Processing that must be performed within a predetermined time can therefore be carried out as it is even if processor P1 fails.

[0045] Embodiment 4. A parallel processing device according to Embodiment 4 of the present invention will be described with reference to FIG. 4. FIG. 4 is a diagram showing the flow of tasks on a processor of the parallel processing device according to Embodiment 4 of the present invention.

[0046] In FIG. 4, reference numeral 7 denotes the normal flow of tasks, and 8 denotes the flow of tasks when a failure occurs.

[0047] Task flow 7 in FIG. 4 shows that there are four tasks A, B, C, and D that a processor P1 executes in order, and that they must be finished by time T.

[0048] Now suppose that a failure occurs in processor P1 while it is processing task A, as shown in FIG. 4. Also suppose that, as in Embodiment 2, processor P0, which monitors processor P1, knows in advance all the tasks that processor P1 executes.

[0049] As in Embodiment 2, after a failure occurs in processor P1, processor P0 detects it. Processor P0 then executes, in order, tasks A', B', C', and D', which have been prepared in advance and require less execution time than the original tasks A, B, C, and D.

[0050] Examples of tasks A', B', C', and D' are tasks whose processing content is exactly the same as that of tasks A, B, C, and D but whose execution time is shortened by lowering the processing accuracy. In this way, processing that would miss the fixed time T if re-executed from task A can be completed in time by processing with the same function.
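
The deadline argument can be made concrete with a small calculation. The task durations, the deadline value, and the detection time below are invented for illustration only; the patent gives no numbers.

```python
# Illustrative sketch only; all durations and times are made-up assumptions.
full_tasks = {"A": 4, "B": 3, "C": 3, "D": 2}        # original tasks, time units
fast_tasks = {"A'": 2, "B'": 1, "C'": 1, "D'": 1}    # reduced-accuracy variants


def finish_time(start, durations):
    """Time at which a sequential run of all tasks would complete."""
    return start + sum(durations.values())


T = 12                     # deadline by which A to D must be finished
failure_detected_at = 5    # P0 notices the failure of P1 during task A

meets_full = finish_time(failure_detected_at, full_tasks) <= T
meets_fast = finish_time(failure_detected_at, fast_tasks) <= T
print(meets_full)   # False: 5 + 12 = 17 exceeds T
print(meets_fast)   # True: 5 + 5 = 10 is within T
```

With these assumed numbers, re-running the original tasks from A would finish at 17 and miss T, while the shorter variants finish at 10.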

[0051] According to Embodiment 4, the parallel processing device (parallel computer) of Embodiment 2, having the means of Embodiment 2, is further arranged so that the adjacent processor P0 holds, for each task executed by an arbitrary processor P1, a task that has an equivalent function but a shorter execution time, for example because its processing accuracy is lower. When a failure occurs in processor P1, processor P0 detects the failure, logically disconnects processor P1 from the device (system), and executes in place of the tasks that processor P1 was scheduled to execute the tasks of the same function that have lower accuracy but shorter processing time. Thus, when any processor P1 fails, processing can continue without interruption and can be finished within the predetermined time.

[0052] In other words, in addition to the means of Embodiment 2, the parallel processing device according to Embodiment 4 includes a mechanism that keeps ready, for every task, a fast version whose execution time is shortened by lowering the processing accuracy. Processing that must be performed within a predetermined time can therefore still be carried out functionally even if processor P1 fails, albeit with somewhat lower accuracy.

[0053] Embodiment 5. A parallel processing device according to Embodiment 5 of the present invention will be described with reference to FIG. 5. FIG. 5 is a diagram showing a token in which the failure states reported by the processors of the parallel processing device according to Embodiment 5 of the present invention are recorded.

[0054] A token indicating the state of each of processors P0, P1, P2, and P3 is circulated on the ring bus formed by the communication buses 3 shown in FIG. 1. For example, as shown in FIG. 5, each of processors P0, P1, P2, and P3 checks its internal state, for instance by running a self-diagnosis program, and writes "1" at its own position in the token if no internal abnormality is found, or "0" if an abnormality is found.

[0055] The token circulates continuously on the ring bus. Each time one of the processors P0, P1, P2, and P3 receives it, that processor writes its latest state at its own position and passes the token on.

[0056] One of the processors P0, P1, P2, and P3, for example processor P0, is designated the master. Each time the token arrives, the master processor P0 checks its contents. If it finds a failed processor, say P1, it notifies the other processors P2 and P3 so that P1 is logically disconnected. The same processing is then repeated among the remaining processors.
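The token scheme described in paragraphs [0054] to [0056] can be sketched as a small simulation. This is only an illustration under stated assumptions: the `Processor` class, the list-based token, and the functions `circulate` and `master_inspect` are hypothetical names, and a single boolean stands in for the self-diagnosis program. Only the encoding ("1" for normal, "0" for abnormal) and the choice of P0 as master come from the text.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    name: str
    healthy: bool = True  # stand-in for the outcome of a self-diagnosis run

    def self_check(self) -> int:
        # Encoding from the text: 1 = no abnormality, 0 = abnormality found.
        return 1 if self.healthy else 0


def circulate(token: list, ring: list, active: set) -> None:
    """One lap of the token: each active processor overwrites its own slot."""
    for i, proc in enumerate(ring):
        if i in active:
            token[i] = proc.self_check()


def master_inspect(token: list, active: set, master: int = 0) -> list:
    """The master reads the token and lists the slots that report a failure."""
    return [i for i in sorted(active) if i != master and token[i] == 0]


ring = [Processor(f"P{i}") for i in range(4)]
token = [1, 1, 1, 1]   # one slot per processor, all normal at the start
active = {0, 1, 2, 3}  # processors still logically connected

ring[1].healthy = False        # inject a fault in P1
circulate(token, ring, active)
failed = master_inspect(token, active)
active -= set(failed)          # logical disconnection of the failed processors
```

After one lap, the master sees a "0" in P1's slot and removes P1 from the active set, so later laps involve only P0, P2, and P3.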

[0057] According to Embodiment 5, the parallel processing device (parallel computer) of Embodiment 1 includes the means of Embodiment 1 and, in addition, keeps transferring around the ring bus, via each processor, a token in which each processor can record its state in its corresponding field. A predetermined master processor examines this token and logically disconnects any failed processor from the device (system).

[0058] In other words, the parallel processing device according to Embodiment 5 includes, in addition to the means of Embodiment 1, a token that circulates on the ring bus and in which each processor records whether it has failed, and a master processor having a mechanism that monitors this token and disconnects a failed processor. The master processor can therefore easily detect failures of other processors by exploiting the data-transfer characteristics of the ring bus.

[0059] Embodiment 6. A parallel processing device according to Embodiment 6 of the present invention will be described with reference to FIGS. 4 and 5. The same FIGS. 4 and 5 used for Embodiments 4 and 5 above are used here.

[0060] As described in Embodiment 5, the master processor learns the state of each processor by means of the token shown in FIG. 5. If the failed processor is, for example, processor P1, it is then disconnected.

[0061] Suppose that the failed processor P1 had four tasks A, B, C, and D to execute in order, and that they had to be finished by time T. Suppose also that the master processor, which monitors all the processors, knows in advance every task that every processor executes.

[0062] As in Embodiment 2, after a failure occurs in processor P1, the master processor P0 detects it. Tasks A', B', C', and D', which need less execution time than the original tasks A, B, C, and D, have been prepared in the master processor P0 in advance, and P0 then executes them in order.

[0063] Tasks A', B', C', and D' may be, for example, tasks whose processing content is exactly the same as that of tasks A, B, C, and D but whose execution time has been shortened by lowering the processing accuracy. In this way, processing that could not finish by the fixed time T if it were re-executed from task A can still be completed in time by processing with the same function.
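The timing argument in paragraphs [0061] to [0063] can be made concrete with a short worked example. All numbers below (the deadline, the detection time, and the task durations) are hypothetical values chosen for illustration. The source gives no figures and says only that the primed tasks take less time than the originals.

```python
def finish_time(start: int, durations) -> int:
    """Time at which a sequence of tasks run back to back would complete."""
    return start + sum(durations)


T = 100           # deadline by which tasks A to D must finish (assumed value)
detected_at = 60  # time at which the master detects P1's failure (assumed value)

# Hypothetical execution times for the original and reduced-accuracy tasks.
full = {"A": 20, "B": 20, "C": 20, "D": 20}
fast = {"A'": 8, "B'": 8, "C'": 8, "D'": 8}

full_finish = finish_time(detected_at, full.values())  # restart from task A
fast_finish = finish_time(detected_at, fast.values())  # run A', B', C', D'

meets_full = full_finish <= T
meets_fast = fast_finish <= T
```

With these assumed values, restarting the original tasks from A would finish at 140 and miss the deadline, whereas the shortened tasks finish at 92, within T.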

[0064] According to Embodiment 6, the parallel processing device (parallel computer) of Embodiment 5 includes the means of Embodiment 5 and, in addition, prepares in a predetermined master processor tasks that have the same function as each task executed by each processor but a shorter execution time, for example because their processing accuracy is lower. When a failure occurs in any processor, the master processor detects it, logically disconnects the failed processor from the device, and executes in its place the shorter tasks of the same function, at some cost in accuracy. Thus, when any processor fails, processing can continue without interruption and can be completed by a predetermined time.

[0065] In other words, the parallel processing device according to Embodiment 6 includes, in addition to the means of Embodiment 1, a mechanism that prepares for every task a high-speed version in which the processing accuracy is lowered to shorten the execution time. The master processor can therefore easily detect failures of other processors by exploiting the data-transfer characteristics of the ring bus, and processing that must be performed within a given time remains functionally executable even if a processor fails, although with a slight loss of processing accuracy.

[0066] Embodiment 7. A parallel processing device according to Embodiment 7 of the present invention will be described with reference to FIG. 5. The same FIG. 5 used for Embodiment 5 above is used here.

[0067] Embodiment 5 assumed a single master processor for the entire parallel processing device. Embodiment 7 provides two. If one master processor becomes unusable because of a failure when the token described in Embodiment 5 is to be examined, the other master processor detects this and carries out the token inspection and other work that the first master processor was scheduled to perform.
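The two-master arrangement of paragraph [0067] might be sketched as follows. The source does not say how the surviving master detects that the other has failed. This sketch assumes, purely for illustration, that it reads the other master's slot in the same token. The function name `acting_master` and the choice of P0 and P2 as the two masters are likewise hypothetical.

```python
from typing import Optional, Sequence


def acting_master(token: Sequence[int], masters: Sequence[int]) -> Optional[int]:
    """Return the first designated master whose token slot reads 1 (normal).

    Returns None if every designated master reports a failure.
    """
    for m in masters:
        if token[m] == 1:
            return m
    return None


masters = (0, 2)  # assumed: P0 is the primary master, P2 the alternate

normal_case = acting_master([1, 1, 1, 1], masters)    # primary is healthy
failover_case = acting_master([0, 1, 1, 1], masters)  # P0 failed, P2 takes over
no_master = acting_master([0, 1, 0, 1], masters)      # both masters failed
```

Whichever processor is returned would then perform the token inspection sketched for Embodiment 5. The case in which both masters fail falls outside the single-failure scenario the document addresses.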

[0068] According to Embodiment 7, two master processors are provided in the parallel processing device (parallel computer) of Embodiment 6, so that if one master processor fails, the other master processor can take over its role.

[0069] In other words, the parallel processing device according to Embodiment 7 includes, in addition to the means of Embodiment 5, a processor that serves as an alternate for the master processor. The master processor can therefore easily detect failures of other processors by exploiting the data-transfer characteristics of the ring bus, and processing can continue even if the master processor itself fails.

[0070]

[Effects of the Invention] As described above, the parallel processing device according to the present invention comprises a plurality of processors, a first ring bus in which first nodes connected to the respective processors are joined in order into a ring by one-to-one unidirectional bus connections, and a second ring bus in which second nodes connected to the respective processors are joined into a ring, in the direction opposite to that order, by one-to-one unidirectional bus connections. Some of the plurality of processors are connected to an input/output bus, and these input/output processors perform input/output with the outside. The device can therefore cope not only with a ring bus failure but also with a processor failure and with a failure of an input/output processor.

[0071] Further, as described above, in the parallel processing device according to the present invention, the input/output processors include first and second input/output processors. When status information from the second input/output processor does not arrive, the first input/output processor determines that the second input/output processor has failed and performs in its place the input/output processing that the second input/output processor had been performing. The device can therefore cope not only with a ring bus failure but also with a processor failure and with a failure of an input/output processor.

[0072] Further, as described above, in the parallel processing device according to the present invention, an arbitrary processor among the plurality of processors has a mechanism for monitoring the state of the processor adjacent to it on the ring and stores information on the tasks that the adjacent processor executes. When the adjacent processor fails, the arbitrary processor logically disconnects it and re-executes the tasks that the adjacent processor was scheduled to execute. Failure detection is thus distributed among adjacent processors, which is efficient and well suited to a ring bus that connects neighboring processors.

[0073] Further, as described above, in the parallel processing device according to the present invention, an arbitrary processor among the plurality of processors has a mechanism for monitoring the state of the processor adjacent to it on the ring and stores in advance information on the high-priority tasks among the tasks that the adjacent processor executes. The arbitrary processor executes those high-priority tasks at the same time as the adjacent processor and, when the adjacent processor fails, logically disconnects it. Processing that must be performed within a given time can therefore be carried out unchanged even if a processor fails.

[0074] Further, as described above, in the parallel processing device according to the present invention, an arbitrary processor among the plurality of processors has a mechanism for monitoring the state of the processor adjacent to it on the ring and stores in advance information on tasks that have the same function as the tasks executed by the adjacent processor but a shorter execution time. When the adjacent processor fails, the arbitrary processor logically disconnects it and executes the shorter tasks in place of the tasks that the adjacent processor was scheduled to execute. Processing that must be performed within a given time therefore remains functionally executable even if a processor fails, although with a slight loss of processing accuracy.

[0075] Further, as described above, in the parallel processing device according to the present invention, the plurality of processors circulate on the ring bus a token indicating their respective states, and a predetermined master processor among the plurality of processors examines the token and logically disconnects any failed processor. The master processor can therefore easily detect failures of other processors by exploiting the data-transfer characteristics of the ring bus.

[0076] Further, as described above, in the parallel processing device according to the present invention, the master processor stores in advance information on tasks that have the same function as the tasks executed by the plurality of processors but a shorter execution time. When a processor fails, the master processor executes the shorter tasks in place of the tasks that the failed processor was scheduled to execute. The master processor can therefore easily detect failures of other processors by exploiting the data-transfer characteristics of the ring bus, and processing that must be performed within a given time remains functionally executable even if a processor fails, although with a slight loss of processing accuracy.

[0077] Furthermore, as described above, in the parallel processing device according to the present invention, the predetermined master processor among the plurality of processors includes first and second master processors, and the first master processor takes the place of the second master processor when the second master processor fails. The master processor can therefore easily detect failures of other processors by exploiting the data-transfer characteristics of the ring bus, and processing can continue even if a master processor fails.

[Brief description of the drawings]

FIG. 1 is a diagram showing the configuration of a parallel processing device according to Embodiment 1 of the present invention.

FIG. 2 is a diagram showing part of the configuration of a parallel processing device according to Embodiment 2 of the present invention.

FIG. 3 is a diagram showing a task list of a parallel processing device according to Embodiment 3 of the present invention.

FIG. 4 is a diagram showing the task flow of a parallel processing device according to Embodiment 4 of the present invention.

FIG. 5 is a diagram showing the token of a parallel processing device according to Embodiment 5 of the present invention.

FIG. 6 is a diagram showing the configuration of a conventional parallel processing device.

[Explanation of symbols]

1: node (communication node); 2: processor; 2A: input/output processor; 3: data transfer line (communication bus); 4: input/output bus.

Claims

[Claims]

1. A parallel processing device comprising: a plurality of processors; a first ring bus in which first nodes connected to the respective processors are joined in order into a ring by one-to-one unidirectional bus connections; and a second ring bus in which second nodes connected to the respective processors are joined into a ring, in the direction opposite to said order, by one-to-one unidirectional bus connections, wherein some of the plurality of processors are connected to an input/output bus, and the input/output processors perform input/output with the outside.

2. The parallel processing device according to claim 1, wherein the input/output processors include first and second input/output processors, and the first input/output processor, when status information from the second input/output processor does not arrive, determines that a failure has occurred in the second input/output processor and performs in its place the input/output processing that the second input/output processor had been performing.

3. The parallel processing device according to claim 1, wherein an arbitrary processor among the plurality of processors has a mechanism for monitoring the state of a processor adjacent to it on the ring and stores information on tasks executed by the adjacent processor, and the arbitrary processor, when a failure occurs in the adjacent processor, logically disconnects the adjacent processor and re-executes the tasks that the adjacent processor was scheduled to execute.

4. The parallel processing device according to claim 1, wherein an arbitrary processor among the plurality of processors has a mechanism for monitoring the state of a processor adjacent to it on the ring and stores in advance information on high-priority tasks among the tasks executed by the adjacent processor, and the arbitrary processor executes the high-priority tasks at the same time as the adjacent processor and, when a failure occurs in the adjacent processor, logically disconnects the adjacent processor.

5. The parallel processing device according to claim 1, wherein an arbitrary processor among the plurality of processors has a mechanism for monitoring the state of a processor adjacent to it on the ring and stores in advance information on tasks that have a function equivalent to the tasks executed by the adjacent processor and a shorter execution time than those tasks, and the arbitrary processor, when a failure occurs in the adjacent processor, logically disconnects the adjacent processor and executes the tasks with the shorter execution time in place of the tasks that the adjacent processor was scheduled to execute.

6. The parallel processing device according to claim 1, wherein the plurality of processors circulate on the ring bus a token indicating their respective states, and a predetermined master processor among the plurality of processors examines the token and logically disconnects a failed processor.

7. The parallel processing device according to claim 6, wherein the master processor stores in advance information on tasks that have a function equivalent to the tasks executed by the plurality of processors and a shorter execution time than those tasks, and the master processor, when a failure occurs in one of the processors, executes the tasks with the shorter execution time in place of the tasks that the failed processor was scheduled to execute.

8. The parallel processing device according to claim 6, wherein the predetermined master processor among the plurality of processors includes first and second master processors, and the first master processor takes the place of the second master processor when a failure occurs in the second master processor.