JPH04281535A

JPH04281535A - Stand-by redundant type system

Info

Publication number: JPH04281535A
Application number: JP3045009A
Authority: JP
Inventors: Keiko Nagase; 永瀬　恵子; Toshibumi Seki; 俊文關; Yasukuni Oiyake; 岡宅　泰邦; Shinsuke Tamura; 田村　信介
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1991-03-11
Filing date: 1991-03-11
Publication date: 1992-10-07
Anticipated expiration: 2016-06-18
Also published as: JP3176945B2

Abstract

PURPOSE: To enable efficient internal state transfer between a main system and a standby system in accordance with changes in the internal state of the main system, and to simplify detection of main system faults. CONSTITUTION: A checkpoint, which keeps the internal states of the main system and the standby system identical, is taken at any of the following timings: when the messages stored in the received message buffer of the standby system reach a prescribed amount (401), when no message from the other main systems is received within a fixed time (402), or when no message transmitted by the corresponding main system is received within the fixed time. This reduces the total number of checkpoints and allows them to be set efficiently. In addition, when no response to a checkpoint request arrives even after the fixed time has elapsed, the main system is judged to be abnormal and the standby system carries out the subsequent processing.

Description

[Detailed description of the invention]

[0001] [Object of the invention]

[0002]

[Field of industrial application] The present invention relates to a standby redundant system in which execution units, such as computers or programs, are multiplexed and arranged as a main system and a standby system.

[0003]

[Prior art] In recent years, as requirements in every field of computer systems have diversified and technology has advanced, systems have tended to become larger and more complex. Demand is therefore growing for parallel/distributed processing systems, which process jobs in a distributed manner.

[0004] Such parallel/distributed processing systems use one of two methods. In the parallel multiprocessing method, the same processing is executed in parallel by multiple execution units. In the standby redundancy control method, processing is executed by a single execution unit serving as the main system, and a standby execution unit continues the processing once the main execution unit fails.

[0005] The parallel multiprocessing method requires no operations such as starting up a standby system when the main system fails, so processing can continue without stopping the system. However, because multiple execution units process the same work in parallel, it has the drawback of high computer redundancy.

[0006] In the standby redundancy control method, by contrast, the standby system is idle during normal operation, so the processing load on each computer is small and computer resources can be used effectively.

[0007] This method, however, requires setting checkpoints: points at which the main system reports its internal state to the standby system so that the internal states of the two remain identical. Usually, the system designer sets these checkpoints by writing them explicitly into the program. Alternatively, the main system may periodically send its intermediate progress to the standby system. With either approach, however, intermediate progress is transferred regardless of whether the internal state of the main system has changed, which is far from efficient.

[0008]

[Problems to be solved by the invention] The present invention has been made in view of these circumstances. Its object is to provide a standby redundant system that can transfer internal state efficiently between the main system and the standby system in accordance with changes in the internal state of the main system, and in which checkpoints can be set so as to guarantee that a failure of the main system is easily detected.

[0009] [Configuration of the invention]

[0010]

[Means for solving the problems] To achieve the above object, the standby redundant system of the present invention is a system in which the execution units are multiplexed as running execution units (the main system) and standby execution units (the standby system). Each main system performs its work while exchanging messages with other main systems, whereas the standby system only receives the same messages as the main system without performing the work, and is switched to main system to continue the work only when the main system fails. Each execution unit has internal state transmitting means for transmitting an internal state, internal state receiving means for receiving the internal state, and monitoring means for monitoring the state of a received message buffer that accumulates received messages. In addition, the execution unit of the standby system has: requesting means for requesting the main system to transmit its internal state when the monitoring means determines that the messages accumulated in the received message buffer have reached a predetermined amount; updating means for updating its own internal state and clearing the received message buffer when the internal state is received within a fixed time; and takeover means for, when the internal state is not received within the fixed time, determining that the main system has failed, switching itself to main system, and taking over the work by performing rollback processing based on its own internal state and the contents of the received message buffer.

[0011]

[Operation] In the standby redundant system of the present invention, when the messages accumulated in the standby system's received message buffer reach a predetermined amount, the standby system treats this as the timing for a checkpoint and requests the main system to transmit its internal state. If the internal state is received within a fixed time, the standby system updates its own internal state and clears the received message buffer. If the internal state is not received within the fixed time, the standby system determines that the main system has failed, switches itself to main system, and takes over the work by performing rollback processing based on its own internal state and the contents of the received message buffer.
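
The checkpoint cycle described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `StandbyUnit`, the `threshold` parameter, and the convention that a missing reply is passed in as `None` are all assumptions made for the example.

```python
from typing import Any, List, Optional


class StandbyUnit:
    """Hypothetical standby execution unit (all names are illustrative)."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold       # the "predetermined amount" of messages
        self.buffer: List[Any] = []      # received message buffer
        self.state: Any = None           # last internal state sent by the main system
        self.is_main = False             # becomes True after takeover

    def receive(self, message: Any) -> bool:
        """Buffer a message; return True when a checkpoint should be requested."""
        self.buffer.append(message)
        return len(self.buffer) >= self.threshold

    def on_checkpoint_result(self, reply: Optional[Any]) -> None:
        """Handle the outcome of an internal-state request.

        `reply` is the main system's internal state, or None if nothing
        arrived within the fixed waiting time (an assumed convention).
        """
        if reply is not None:
            self.state = reply           # update own internal state
            self.buffer.clear()          # clear the received message buffer
        else:
            self.is_main = True          # main system judged to have failed
```

After a timeout the buffer is deliberately left intact, because the takeover step needs the buffered messages for rollback processing.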

[0012] Furthermore, when few messages are exchanged between main systems, the standby system may be unable to tell whether the main system is operating normally. The following two methods are therefore also conceivable; each combines the above method with a way of confirming that the main system is operating normally.

[0013] The first rests on the idea that once a main system becomes abnormal, no execution unit will go on sending messages to it. In addition to the above method, a checkpoint is also taken when the standby system receives no message at all from the other main systems within a predetermined time.

[0014] The second rests on the idea that a main system is operating normally as long as it is sending messages to other execution units. In addition to the above method, the standby system also receives the messages sent by its corresponding main system, and a checkpoint is taken when no message from that main system is received within a predetermined time.

[0015] The present invention enables efficient internal state transfer between the main system and the standby system in accordance with changes in the internal state of the main system, and also makes failures of the main system easy to detect.

[0016]

[Embodiments] Embodiments of the present invention are described in detail below with reference to the drawings.

[0017] FIG. 1 shows the configuration of a standby redundant system according to one embodiment of the present invention. In this example, the execution unit is a program module. In the figure, 1a, 1b, and 1c are processors that execute program modules. Message exchange devices 2a, 2b, and 2c, which control communication between program modules, are connected to processors 1a, 1b, and 1c respectively, and the message exchange devices 2a, 2b, and 2c are connected in a line. Although not shown in the figure, each of the message exchange devices 2a, 2b, and 2c has a received message buffer that accumulates received messages.

[0018] As shown for example in FIG. 2, multiple program modules, which are the execution units, are registered on the processors 1a, 1b, and 1c in multiplexed form. For example, program module A is registered on processors 1a, 1b, and 1c (the copies are denoted A1, A2, and A3). Only program module A1, registered on processor 1a, is the main system that runs in normal operation. Program modules A2 and A3, registered on processors 1b and 1c, normally only receive the same messages as the main system without performing any actual processing, and stand by to become the main system if program module A1 fails. When two or more standby systems exist in this way, one of them becomes the main system when the main system fails.

[0019] Program module B is registered on processors 1a and 1b (the copies are denoted B1 and B2). Program module B1 on processor 1b is the main system, and program module B2 on processor 1a is the standby system. Similarly, program module C is registered on processors 1a, 1b, and 1c (the copies are denoted C1, C2, and C3). Program module C1 on processor 1c is the main system, and program modules C3 and C2 on processors 1a and 1b are the standby systems.
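
The placement described for FIG. 2 can be written down as a small lookup table. The sketch below is only an illustration: the names `PLACEMENT`, `main_copy`, `standby_copies`, and `copies_on` are hypothetical, and the assignment of A2 to processor 1b and A3 to processor 1c is assumed from the order in which the text lists them.

```python
# Module -> {copy name: processor}, following the description of FIG. 2.
# By the text's naming, the copy with suffix 1 is the main system.
PLACEMENT = {
    "A": {"A1": "1a", "A2": "1b", "A3": "1c"},  # A2/A3 order is an assumption
    "B": {"B1": "1b", "B2": "1a"},
    "C": {"C1": "1c", "C2": "1b", "C3": "1a"},
}


def main_copy(module: str) -> str:
    """Return the copy that acts as the main system in normal operation."""
    return f"{module}1"


def standby_copies(module: str) -> list:
    """Return the copies that only receive messages and wait."""
    return [c for c in PLACEMENT[module] if c != main_copy(module)]


def copies_on(processor: str) -> list:
    """List every module copy registered on the given processor."""
    return sorted(
        copy_name
        for copies in PLACEMENT.values()
        for copy_name, proc in copies.items()
        if proc == processor
    )
```

Under these assumptions, `copies_on("1a")` gives one main system (A1) and two standby systems (B2, C3), which shows how each processor carries a mix of active and waiting modules.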

[0020] FIG. 3 shows an example of the configuration of each program module. In the figure, 3a is the main processing section, which performs the processing specific to the individual program module; 3b is the internal state transmitting section, which transmits the internal state; 3c is the internal state receiving section, which receives the internal state; and 3d is the received message buffer monitoring section, which monitors the state of the received message buffer.

[0021] Next, the operation of taking a checkpoint between the main system and the standby system in the standby redundant system configured as above is described with reference to FIGS. 4 and 5. FIG. 4 shows the flow of operation of the standby system, and FIG. 5 shows that of the main system.

[0022] During normal processing among the main systems, a standby system (for example, B2) always receives, just as its main system (for example, B1) does, the messages from the other main systems (for example, A1 and C1) and stores them in its received message buffer. Meanwhile, the received message buffer monitoring section 3d of standby system B2 constantly monitors the state of its own received message buffer.

[0023] When the messages accumulated in the received message buffer reach a certain amount (step 401), or when no message has been received from the other main systems A1 and C1 for a certain time (step 402), standby system B2 determines that it is time to take a checkpoint and requests main system B1 to transmit its internal state (step 403).
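
The decision made in steps 401 and 402 amounts to a simple predicate. The function below is a hedged sketch: `checkpoint_due` and its parameters are hypothetical names, times are assumed to be seconds on a common clock, and treating "reached" as greater-than-or-equal is an assumption the text does not spell out.

```python
def checkpoint_due(buffered: int, threshold: int,
                   now: float, last_message_at: float,
                   silence_limit: float) -> bool:
    """Return True when the standby should request the main system's state.

    Step 401: the buffer holds at least `threshold` messages.
    Step 402: no message from the other main systems for `silence_limit` seconds.
    """
    buffer_full = buffered >= threshold
    silent_too_long = (now - last_message_at) >= silence_limit
    return buffer_full or silent_too_long
```

When either condition holds, the standby would proceed to step 403 and send the request.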

[0024] On receiving the request, main system B1 activates the internal state receiving section 3c, reads out the internal state of main system B1 (step 501), and transmits that internal state to the internal state receiving section 3c of standby system B2 (step 502).
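
The main-system side of a checkpoint (steps 501 and 502) can be illustrated with a short handler. This is a sketch under stated assumptions: `handle_checkpoint_request` and the `send` callback are hypothetical, and the deep copy is a choice made for this example so that the transmitted snapshot is unaffected by later changes on the main system.

```python
import copy
from typing import Any, Callable


def handle_checkpoint_request(internal_state: Any,
                              send: Callable[[Any], None]) -> None:
    """Main-system side of a checkpoint (steps 501 and 502).

    `send` is a hypothetical callback that delivers data to the standby's
    internal state receiving section.
    """
    snapshot = copy.deepcopy(internal_state)  # step 501: read out the state
    send(snapshot)                            # step 502: transmit it
```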

[0025] Meanwhile, after issuing the checkpoint request to main system B1, standby system B2 waits for a fixed time at its internal state receiving section 3c for the internal state to arrive from main system B1. If the internal state from main system B1 is received within the fixed time (step 404), standby system B2 updates its own internal state and clears the received message buffer (step 405).

[0026] If the internal state cannot be received even after waiting for the fixed time (step 404), standby system B2 judges that main system B1 has failed and switches itself to main system. It then performs rollback processing based on the stored internal state and the contents of the messages in the received message buffer (step 406) and continues the subsequent processing.
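
The text does not detail the rollback processing of step 406. One plausible reading, shown here purely as an assumption, is that the standby starts from the last checkpointed state and re-applies the buffered messages in arrival order. The names `take_over` and `apply` are hypothetical, and `apply` is assumed to be a deterministic state-transition function.

```python
from typing import Any, Callable, Iterable


def take_over(saved_state: Any,
              buffered_messages: Iterable[Any],
              apply: Callable[[Any, Any], Any]) -> Any:
    """Rebuild the failed main system's state on the standby (step 406).

    Starts from the last checkpointed state and re-applies, in arrival
    order, every message buffered since that checkpoint.
    """
    state = saved_state
    for message in buffered_messages:
        state = apply(state, message)
    return state
```

This reading also explains why the buffer is cleared at each successful checkpoint: only messages newer than the saved state need to be replayed.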

[0027] Thus, according to the standby redundant system of this embodiment, checkpoints are taken, in accordance with changes in the internal state of the main system, only at the limit of the range within which received messages can be stored and rollback processing remains possible, so checkpoints can be taken efficiently. At the same time, checkpoints can be set so as to guarantee detection of a main system failure.

[0028] Next, another embodiment of the present invention is described.

[0029] FIG. 6 shows an example of the configuration of a program module treated as an execution unit in this embodiment. In the figure, 6a is the main processing section, which performs the processing specific to the individual program module; 6b is the internal state transmitting section, which transmits the internal state; 6c is the internal state receiving section, which receives the internal state; 6d is the received message buffer monitoring section, which monitors the module's own received message buffer; and 6e is the main system transmission message monitoring section, which monitors the messages sent by the corresponding main system.

[0030] Next, the operation of taking a checkpoint between the main system and the standby system in this embodiment is described with reference to FIG. 7. FIG. 7 shows the flow of operation of the standby system and is the same as FIG. 4 except for step 702.

[0031] The standby system always receives the messages sent by the other main systems to its corresponding main system and stores them in the received message buffer, and it also receives the messages sent out by its corresponding main system. Here, however, the messages sent by the main system are not stored when received; the main system transmission message monitoring section 6e checks only their arrival interval. If no message at all is received from the corresponding main system within a fixed time, the standby system requests that main system to transmit its internal state. The rest is the same as in FIG. 4, so its description is omitted.
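
The behavior of monitoring section 6e, which tracks only arrival times and discards message contents, can be sketched as below. The class name `MainSendMonitor`, its methods, and the use of seconds as the time unit are assumptions for illustration.

```python
class MainSendMonitor:
    """Hypothetical model of the main system transmission message monitor (6e)."""

    def __init__(self, limit: float, start: float = 0.0) -> None:
        self.limit = limit         # allowed silence, in seconds (assumed unit)
        self.last_seen = start     # arrival time of the latest message

    def on_main_message(self, arrived_at: float) -> None:
        """Record the arrival time; the message body is deliberately not kept."""
        self.last_seen = arrived_at

    def main_is_silent(self, now: float) -> bool:
        """True when the main system has sent nothing for `limit` seconds."""
        return (now - self.last_seen) >= self.limit
```

When `main_is_silent` returns True, the standby would issue the internal-state request and continue as in FIG. 4.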

[0032] As yet another embodiment, the two embodiments above can be combined, so that a checkpoint is taken both when the messages accumulated in the received message buffer reach a predetermined amount and when no message at all is received from either the standby system's own main system or the other main systems within a fixed time.

[0033] In the above embodiments the execution unit was described as a program module, but the same standby redundancy control method can also be realized with computers as the execution units, as shown in FIG. 8. In this case, the computers 8a1, 8a2, ..., 8b1, 8b2, ..., 8c1, 8c2, ... serving as execution units contain processors 9a1, 9a2, ..., 9b1, 9b2, ..., 9c1, 9c2, ... and message exchange devices 10a1, 10a2, ..., 10b1, 10b2, ..., 10c1, 10c2, ... respectively. Each of the message exchange devices 10a1 to 10c2 is connected so that it can communicate with all the other message exchange devices. Each of the computers 8a1 to 8c2 can therefore also communicate with any other computer. In the initial state, computers 8a1, 8a2, ... are the main systems, and each has standby computers 8b1, 8b2, ..., 8c1, 8c2, ....

[0034]

[Effects of the invention] As explained above, the standby redundant system of the present invention enables efficient internal state transfer between the main system and the standby system in accordance with changes in the internal state of the main system, and also makes failures of the main system easy to detect.

[Brief explanation of the drawing]

[FIG. 1] A block diagram showing the configuration of a standby redundant system according to one embodiment of the present invention.

[FIG. 2] A diagram showing program modules, which are the execution units, multiplexed and arranged on multiple processors.

[FIG. 3] A block diagram showing the configuration of a program module.

[FIG. 4] A flowchart showing the operation of the standby system in the standby redundant system of FIG. 1.

[FIG. 5] A flowchart showing the operation of the main system in the standby redundant system of FIG. 1.

[FIG. 6] A block diagram showing the configuration of a program module in another embodiment of the present invention.

[FIG. 7] A flowchart showing the operation of the standby system in the standby redundant system of the embodiment of FIG. 6.

[FIG. 8] A block diagram showing an example system configuration in which the execution units are computers.

[Explanation of symbols]

1a, 1b, 1c ... Processors
2a, 2b, 2c ... Message exchange devices
A1, A2, A3, B1, B2, C1, C2, C3 ... Program modules
3a ... Main processing section
3b ... Internal state transmitting section
3c ... Internal state receiving section

Claims

[Claims]

[Claim 1] A standby redundant system in which execution units in the system are multiplexed as running execution units serving as a main system and standby execution units serving as a standby system, the main system performing work while exchanging messages with other main systems, and the standby system only receiving the same messages as the main system without performing work and being switched to main system to continue the work only when the main system fails, characterized in that: each execution unit has internal state transmitting means for transmitting an internal state, internal state receiving means for receiving the internal state, and monitoring means for monitoring the state of a received message buffer that stores received messages; and the execution unit of the standby system has requesting means for requesting the corresponding main system to transmit its internal state when the monitoring means determines that the messages accumulated in the received message buffer have reached a predetermined amount, updating means for updating its own internal state and clearing the received message buffer when the internal state is received within a fixed time, and takeover means for, when the internal state is not received within the fixed time, determining that the main system has failed, switching itself to main system, and taking over the work by performing rollback processing based on its own internal state and the contents of the received message buffer.

[Claim 2] The standby redundant system according to claim 1, wherein the execution unit of the standby system further has second requesting means for requesting the corresponding main system to transmit its internal state when no message is received within a predetermined time.

[Claim 3] The standby redundant system according to claim 1, wherein the execution unit of the standby system further has message receiving means for receiving messages sent by the corresponding main system to other main systems, and third requesting means for requesting the corresponding main system to transmit its internal state when no message is received by the message receiving means within a predetermined time.