JP3255934B2

JP3255934B2 - Basic processing unit and highly reliable computer system

Info

Publication number: JP3255934B2
Application number: JP00751991A
Authority: JP
Inventors: 信康金川; 伸一朗山口; 小林　　芳樹; 宮尾　　健; 学荒岡; 智明中村; 雅行丹治; 茂則金子; 晃二桝井; 三朗飯島
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1991-01-25
Filing date: 1991-01-25
Publication date: 2002-02-12
Anticipated expiration: 2017-02-12
Also published as: JPH04241039A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a highly reliable computer system, and more particularly to a highly reliable computer system whose configuration not only allows operation to continue when a failure occurs but also takes subsequent recovery into account.

[0002]

[Prior Art] With the spread of the information society, systems such as traffic control systems and financial and securities systems have come to occupy the foundation of social life. The computer systems used in them must be designed so that failures do not occur and, even if a failure does occur, must be configured to continue processing while maintaining data consistency.

[0003] To meet these demands, various fault-tolerant computers, or fault-resistant and defect-resistant computer systems, have been proposed. So that data processing can continue even when a failure occurs, they are built from multiple systems or components having the same function. By giving each part redundancy, the failed system or component is detected and disconnected, and data processing continues with the remaining configuration.

[0004] As a specific conventional example, U.S. Patent No. 4,654,857 adopts the so-called pair-and-spare method, in which two processor boards, each consisting of a memory with a self-diagnosis function, a processor, an input/output controller and the like, operate as one set. Every processor board contains two microprocessors whose outputs are compared; a mismatch is regarded as a board failure, which is how failures are detected. In addition, a lockstep scheme is adopted in which the output that a processor board places on the bus is compared and synchronized with that of the other processor board on every bus clock. Even if a failure occurs on one processor board, it is detected within that bus clock, the board is disconnected, and only the output of the normal processor board is used.

[0005] Japanese Patent Application Laid-Open No. 59-160899, like U.S. Patent No. 4,654,857, has two processor boards, each connected to both of the dual system buses and each containing two processors. For synchronization it focuses on the cache memory: the flush operation from the cache memory to the main storage device is performed under OS control, which avoids the performance limit imposed by lockstep operation. When a failure is detected by comparing the two microprocessors on a processor board, processing is re-executed on the alternate processor board from the previous flush point.

[0006] The above systems use a total of four microprocessors, two on one processor board and two on another. Japanese Patent Application Laid-Open No. 1-258057, by contrast, adopts the triple modular redundancy (TMR) technique and outputs the results of three processors to a duplicated system bus through a majority circuit.
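As a rough illustration of the TMR technique mentioned in this paragraph, the following Python sketch votes on three module outputs. The function name `majority_vote` and the use of plain integers as outputs are assumptions made for illustration; they are not taken from the cited publication.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by at least two of three modules, or None.

    Hypothetical helper: a software stand-in for a hardware majority circuit.
    """
    if len(outputs) != 3:
        raise ValueError("TMR voting expects exactly three outputs")
    # most_common(1) yields the single most frequent value and its count.
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= 2 else None
```

A single faulty module is outvoted by the other two; if all three disagree, the sketch returns `None` because no majority exists.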

[0007]

[Problems to Be Solved by the Invention] Leaving aside the question of how many processors are placed on one processor board, each of the above conventional examples is a system that uses three or four processors. When one of the processors fails, that processor is disconnected and the system is reduced to two-processor operation; afterwards, one or two new processors are installed and the system is reconfigured to its original configuration.

[0008] In these systems, the set of processors before the failure and the set of processors after recovery are entirely different. That is, in the first two conventional examples, if the system was originally running with four processors A, B, C, and D, it runs with processors E, F, C, and D after recovery. In the last conventional example, what was originally A, B, and C becomes D, B, and C. For this reason, the conventional examples need special connection and disconnection hardware and a synchronization mechanism with respect to the other processors making up the system. Moreover, processors and processor boards are usually upgraded or revised over time, so in the above conventional examples, which replace a processor or processor board that is only part of the system, thorough advance measures to prevent a mismatch between the old and new processors after recovery are indispensable. In the approach that replaces processor boards, an expensive replacement board must always be kept ready. Furthermore, synchronization between processors is difficult.

[0009] In view of the above, an object of the present invention is to provide a highly reliable computer system whose configuration not only allows operation to continue when a failure occurs but also takes subsequent recovery into account.

[0010] Another object of the present invention is to provide a highly reliable computer system capable of high-speed synchronization.

[0011] Another object of the present invention is to provide a highly reliable computer system in which a hardware board can be replaced without stopping the system.

[0012] Another object of the present invention is to provide a highly reliable computer system in which a hardware board can be replaced without degrading system performance. Other objects of the present invention will become clear from the following description.

[0013]

[Means for Solving the Problems] To achieve the above objects, the present invention mounts a processor system composed of a plurality of processors on a single hardware board and gives the hardware board itself a fault-tolerance function.

[0014] The present invention also clock-synchronizes the plurality of processors on the single hardware board.

[0015] In the present invention, when a failure occurs inside the hardware board, the hardware board itself is replaced with another hardware board.

[0016] In the present invention, when a failure occurs inside the hardware board, the failed processor is disconnected and operation continues on the remaining processors.

[0017]

[Operation] In the present invention, a processor system composed of a plurality of processors is mounted on a single hardware board, the hardware board itself is given a fault-tolerance function, and the processor board itself is replaced. The hardware and software otherwise needed to regroup processors are therefore unnecessary, and the system configuration takes recovery into account.

[0018] In the present invention, because the plurality of processors on a single hardware board are clock-synchronized, the wiring can be short and faster synchronization can be achieved.

[0019] In the present invention, when a failure occurs inside the hardware board, the hardware board itself is instantly switched over to another hardware board, so the system does not stop.

[0020] In the present invention, when a failure occurs inside the hardware board, the failed processor is stopped and operation continues on the remaining processors, so the hardware board can be replaced without degrading system performance.

[0021]

[Embodiments] The present invention is described in detail below. To make it easier to understand, the description in this specification is divided into the following items.

[0022]
I. Schematic overall configuration of the system
II. Configuration of the BPU 2
III. Abnormality detection methods
IV. Configuration change control at the time of abnormality
V. Signal processing when the internal buses are connected
VI. Recovery measures after an abnormality occurs
VII. Alternative and modified examples of each circuit

I. Schematic overall configuration of the system
FIG. 1 shows the schematic overall configuration of the fault-tolerant system of the present invention. This system has two system buses 1-1 and 1-2, and one or more basic processing units (hereinafter simply BPUs) 2-1, 2-2, ..., 2-n are each connected to both system buses 1-1 and 1-2. A main storage device 3-1 is connected to system bus 1-1 and a main storage device 3-2 to system bus 1-2, each individually, while input/output units (hereinafter simply IOUs) 4-1 and 4-2 are each connected to both system buses. The main storage devices 3 and the IOUs 4 are each used in sets of two. FIG. 1 shows an example using one set of each, but the number of sets can be increased as appropriate as the system is expanded. The n BPUs shown normally execute different processing, but since they all have the same configuration, the configuration and operation are described below taking the BPU 2-1 as an example unless otherwise required.

[0023] The BPU 2 has, as its main components, a plurality of microprocessing units 20 (hereinafter simply MPUs; three in the illustrated example), a plurality of MPU output check circuits 23 (three in the illustrated example), three-state buffer circuits 29 and the like, a plurality of cache memories 220 and 221, and a plurality of bus interface circuits 27 (hereinafter simply BIUs). The operation of the circuit of FIG. 1 is outlined as follows. The three MPUs 20 execute the computation, and their outputs are checked by the check circuits 23. The outputs of two MPUs judged normal are output through the respective bus interface circuits 27 to the two system buses 1, or to the two cache memories 220 and 221. If an abnormality is found in one of the MPUs, that MPU is excluded, and the outputs of the remaining two normal MPUs are output through the respective bus interface circuits 27 to the two system buses 1, or to the two cache memories 220 and 221. After an abnormality has been found in some of the three MPUs 20, the three MPUs 20 are switched at an appropriate time to three entirely new MPUs 20, which then execute the computation.

[0024] II. Configuration of the BPU 2
A more detailed configuration of the BPU 2 is shown in FIG. 2. As described later, the illustrated functions of the BPU are preferably mounted on a single printed circuit board. In detail, the BPU (basic processing unit) mounts the following on a single processor board: at least three processor units that execute the same computation; a check circuit that confirms soundness by comparing the output of each processor unit with the outputs of the other processor units; a plurality of interface units that each output a soundness-confirmed output externally and each take in an external input; a plurality of cache memories that store information needed for computation in the processor units; and internal buses provided between them.

There are two kinds of processor unit: output processor units, whose outputs are used to confirm the soundness of other processor units and are also output externally, and a reference processor unit, whose output is used only to confirm the soundness of other processor units and is not output externally. Each interface unit externally outputs the output of its own specific output processor unit (a different one for each interface unit) when the soundness of that output is confirmed; when the soundness of that output is not confirmed, it externally outputs the output of another output processor unit whose soundness has been confirmed. A selection circuit is provided that selects the output of another soundness-confirmed output processor unit in place of the output of an output processor unit whose soundness is not confirmed.

The BPU also has disconnecting means that disconnects an output processor unit whose soundness is not confirmed, and switching means that switches so that the cache memory fed by that processor unit is instead given the output of another soundness-confirmed processor unit. The BPU further has stopping means that stops the output of an abnormal cache memory, and switching means that switches so that the processor unit receiving information from the abnormal cache memory receives information from another, normal cache memory. In addition, the BPU has disconnecting means that disconnects the abnormal interface unit side, and switching means that switches the processor unit connected to the internal bus on the abnormal interface unit side so that it is connected to the internal bus on another, normal interface unit side.
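The fallback behaviour of an interface unit described in this paragraph can be sketched in Python as follows. This is only a minimal model under stated assumptions: the function name `select_interface_output`, its arguments, and the choice to return `None` when no sound output is available are all hypothetical and do not come from the specification.

```python
def select_interface_output(own_ok, own_value, other_ok, other_value):
    """Pick the value one interface unit drives to the outside.

    own_*   : soundness flag and output of the unit's own output processor
    other_* : soundness flag and output of the other output processor
    Returns None when neither output has been confirmed sound (assumption).
    """
    if own_ok:
        return own_value      # normal case: forward the unit's own processor
    if other_ok:
        return other_value    # fallback: forward the other sound processor
    return None               # nothing trustworthy to output
```

In this model the reference processor unit never appears as a source, which mirrors the statement that its output is used only for checking.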

[0025] In FIG. 2, the three MPUs 20-1, 20-2, and 20-3 compute in synchronization with a clock (not shown), and each outputs its result on an address line A and a data line D. Parity generation/check circuits 10 to 15 attach the appropriate parity signals to the addresses on the address lines A and the data on the data lines D of the MPUs 20-1, 20-2, and 20-3, which are then supplied to the MPU output check circuit 23. The MPU output check circuit 23 consists of a first check circuit CHKAB (23-1), which compares the output of MPUA (20-1) (address and data with parity attached) with the output of MPUB (20-2); a second check circuit CHKCA (23-2), which compares the output of MPUA (20-1) with the output of MPUC (20-3); a third check circuit CHKBC (23-3), which compares the output of MPUB (20-2) with the output of MPUC (20-3); and error check circuits 234 and 235, which identify which MPU has failed from the comparison results of the three check circuits CHK. The MPU output check circuit 23 is a so-called majority circuit, and the open/closed states of the three-state buffer circuits 200, 201, 203, 204, and 29 are controlled according to its decision. The relationship between the decision and the states of the three-state buffer circuits is described later; in short, an MPU judged abnormal is not used thereafter, and the outputs of the MPUs judged normal are supplied to the two cache memories 220 and 221 so that the system operates as a duplex system. In the following description, the enabled state of a three-state buffer circuit is simply called the open state and the disabled state the closed state.

[0026] The addresses and data obtained through the three-state buffer circuits 200, 201, 203, and 204 are supplied to the two cache memories 220 and 221, respectively, and at that point the parity check circuit 250 checks the parity attached by the parity generation/check circuits 10 to 15. The MPU outputs are also passed to synchronization circuits 290 and 291, which synchronize the two MPU outputs, and are sent out to the system buses through the bus interface units BIU. At that point the parity check circuits 30 and 31 check the parity attached by the parity generation/check circuits 10 to 15. The configuration described so far concerns mainly write access from the MPUs; on such a write access, checks are performed by the MPU output check circuit 23 and the parity check circuits 30 and 31.
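To make the generate-then-check flow of this paragraph concrete, here is a small Python sketch of parity generation at the source and parity checking at the receiver. The specification does not state the parity scheme, so even parity with one parity bit per word is an assumption, as are the function names and the default 32-bit width.

```python
def even_parity_bit(word, width=32):
    """Return the parity bit that makes the total count of 1 bits even."""
    if word < 0 or word >= (1 << width):
        raise ValueError("word does not fit in the given width")
    return bin(word).count("1") & 1


def parity_ok(word, parity_bit, width=32):
    """Receiver-side check: recompute parity and compare with the received bit."""
    return even_parity_bit(word, width) == parity_bit
```

With a single parity bit, any single-bit corruption between generator and checker changes the recomputed parity and is detected, while an even number of flipped bits goes unnoticed.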

[0027] On a cache read access, by contrast, signals travel along the route from each cache memory 220, 221 through the three-state buffer circuits 202, 205 to the MPUs, and in this case the parity generation/check circuits 10 to 15 check the addresses and data from the cache memories. Reference numerals 26-1 and 26-2 also denote three-state buffer circuits; their open/closed states are controlled according to the results of the address and data checks in the parity generation/check circuits 10 to 15 during cache read access.

[0028] As is clear from the configuration of FIG. 2, the BPU system of the present invention has at least three MPUs, an abnormal-MPU detection circuit based on a majority circuit, duplicated cache memories, and a duplicated output circuit section.

[0029] III. Abnormality detection methods
Inside the BPU of FIG. 2, the MPU output check circuit 23 and a number of parity check circuits are used as abnormality detectors. This section describes these abnormality detection methods.

[0030] <<Abnormality detection by the MPU output circuit>> FIG. 3 shows the MPU output check portion. In FIG. 3, the output of the first check circuit CHKAB is denoted AB, the output of the second check circuit CHKCA is denoted CA, the output of the third check circuit CHKBC is denoted BC, and the outputs of the error check circuit 231 are denoted Ag, Cg, and 29g. The relationship between the outputs of the three check circuits and the resulting open/closed states of the three-state buffer circuits is described below. In this figure, C is a control line not drawn in FIG. 2.

[0031] First, each of the first to third check circuits CHK receives its two sets of inputs (address, data, and control signals). The first check circuit CHKAB outputs AB, the result of comparing the output of MPUA with the output of MPUB; the second check circuit CHKCA outputs CA, the result of comparing the output of MPUA with the output of MPUC; and the third check circuit CHKBC outputs BC, the result of comparing the output of MPUB with the output of MPUC. Each comparison result is a status signal indicating either match or mismatch.
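The three pairwise comparisons can be modelled in a few lines of Python. In this sketch a flag is `True` when the pair disagrees, following the way the specification defines AB, BC, and CA as mismatch events in its judgment equations. The `BusOutput` type and the function name are hypothetical, introduced only for illustration.

```python
from typing import NamedTuple


class BusOutput(NamedTuple):
    """One MPU's output for a bus cycle (hypothetical container)."""
    address: int
    data: int
    control: int


def compare_outputs(a, b, c):
    """Return mismatch flags (AB, BC, CA) for MPUA, MPUB and MPUC outputs."""
    ab = a != b   # role of check circuit CHKAB
    bc = b != c   # role of check circuit CHKBC
    ca = c != a   # role of check circuit CHKCA
    return ab, bc, ca
```

For example, if only MPUC produces a wrong data word, the flags come out as `(False, True, True)`: the A/B pair still agrees while both pairs involving C disagree.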

[0032] From the outputs AB, BC, and CA of the three check circuits CHK, the error check circuit 231 obtains the outputs Ag, Bg, and Cg, which indicate that MPUA, MPUB, and MPUC respectively are normal, according to equations (1), (2), and (3). In FIGS. 2 and 3 the error check circuit is duplicated.

[0033]
\[ \mathrm{Ag} = \overline{\mathrm{AB}}\cdot\overline{\mathrm{CA}} + \overline{\mathrm{AB}}\cdot\mathrm{BC}\cdot\mathrm{CA} + \mathrm{AB}\cdot\mathrm{BC}\cdot\overline{\mathrm{CA}} \quad (1) \]
\[ \mathrm{Bg} = \overline{\mathrm{AB}}\cdot\overline{\mathrm{BC}} + \overline{\mathrm{AB}}\cdot\mathrm{BC}\cdot\mathrm{CA} + \mathrm{AB}\cdot\overline{\mathrm{BC}}\cdot\mathrm{CA} \quad (2) \]
\[ \mathrm{Cg} = \overline{\mathrm{BC}}\cdot\overline{\mathrm{CA}} + \mathrm{AB}\cdot\overline{\mathrm{BC}}\cdot\mathrm{CA} + \mathrm{AB}\cdot\mathrm{BC}\cdot\overline{\mathrm{CA}} \quad (3) \]
where
AB: event that the outputs of MPUA and MPUB do not match (checked by 23-1)
BC: event that the outputs of MPUB and MPUC do not match (checked by 23-3)
CA: event that the outputs of MPUA and MPUC do not match (checked by 23-2)
\(\cdot\): logical product (AND); \(+\): logical sum (OR); overline: negation (NOT)

The open/closed states of the three-state buffer circuits 200, 201, 203, 204, and 29 are controlled according to the results of equations (1), (2), and (3); this is described in the next section.
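Equations (1) to (3) can be evaluated directly in software, which is a convenient way to check them against the cases discussed in the text. The sketch below is a plain transcription of the three equations; the function name `judge_mpus` and the use of Python booleans for the mismatch events are assumptions for illustration.

```python
def judge_mpus(ab, bc, ca):
    """Evaluate equations (1)-(3).

    ab, bc, ca: True when the corresponding pair of MPU outputs mismatches.
    Returns (Ag, Bg, Cg); True means the MPU is judged normal.
    """
    ag = (not ab and not ca) or (not ab and bc and ca) or (ab and bc and not ca)
    bg = (not ab and not bc) or (not ab and bc and ca) or (ab and not bc and ca)
    cg = (not bc and not ca) or (ab and not bc and ca) or (ab and bc and not ca)
    return bool(ag), bool(bg), bool(cg)
```

With no mismatches all three MPUs are judged normal. When only the A/B pair agrees (`ab=False, bc=True, ca=True`), the result is `(True, True, False)`, so MPUC is singled out. When all three pairs disagree, no MPU is judged normal.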

[0034] Table 1 summarizes the outputs (match or mismatch) of the three check circuits CHKAB, CHKBC, and CHKCA, the resulting MPU judgments Ag, Bg, and Cg, and the resulting open/closed states of the three-state buffer circuits. In the judgment columns of Table 1, 1 means the MPU is normal and 0 means it is abnormal or unknown.

[0035] Table 2 describes some of the cases assumed to cause the match and mismatch check circuit outputs of Table 1. Because the present invention is concerned mainly with how the circuit configuration inside the BPU is changed so that operation continues when an abnormality occurs, and identifying the cause of the abnormality is not its purpose, a detailed explanation is omitted here.

[0036]

[Table 1]

[0037]

[Table 2]

[0038] As described with reference to FIGS. 2 and 3 and Tables 1 and 2, in the present invention the MPU output check circuit 23 judges whether each MPU is normal or abnormal by the above logic.

[0039] Next, the abnormality detection method based on the parity check circuits, which are used in various parts of the BPU as a further means of abnormality detection, is described. Because parity check circuits themselves are well known and any type can be used, a detailed description of the circuits is omitted; what is described here is how the abnormal location is identified when a parity error is detected.

[0040] As shown in FIG. 2, on a write access the parity generation/check circuits 10 to 15 attach the appropriate parity signals, the information is sent out on the address lines A and data lines D, and any abnormality is detected by the parity check circuits 250, 30, and 31. On a read access, abnormalities in the information are detected by the parity generation/check circuits 10 to 15 and the parity check circuits 250, 30, and 31. These parity checks are basically performed separately for address and data.

For addresses, when a parity error is detected in address information, the abnormal location is the bus master that is sending the address signal. The device acting as bus master (MPU, cache memory, or BIU) can be identified by monitoring the bus grant signal from the bus arbiter (not shown) that grants the right to use the internal bus of FIG. 2. For data, when a parity error in data information is detected on a write access, the abnormal location is the bus master that is sending the data signal; the bus master is again identified by monitoring the bus grant signal of the bus arbiter. Finally, when a parity error in data information is detected on a read access, the abnormal location is the source of the data signal, which can be identified by decoding the address that accompanies the data to find the device it points to.
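The localization rule described above can be sketched as a small Python function. It assumes that the current bus master has already been derived from the bus grant signal and that the data source has already been derived by decoding the address; the function name `faulty_units` and the string labels are hypothetical.

```python
def faulty_units(ape, dpe, write, bus_master, data_source):
    """Return the set of units blamed for a parity error.

    ape         : address parity error detected
    dpe         : data parity error detected
    write       : True if the bus master is driving data, False if it reads
    bus_master  : unit currently granted the internal bus
    data_source : unit addressed as the source of read data
    """
    blamed = set()
    if ape:
        blamed.add(bus_master)        # the bus master always drives the address
    if dpe:
        # On a write the bus master drives the data; on a read the addressed
        # source does.
        blamed.add(bus_master if write else data_source)
    return blamed
```

For instance, `faulty_units(False, True, False, "MPU", "CACHE")` blames the cache, because on a read the data comes from the addressed device rather than from the bus master.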

[0041] This approach to identifying the abnormal location is expressed as logical equations as follows.

[0042] <<Abnormality detection by parity check>>
\[ \mathrm{PTYGEN/NG} = \mathrm{APE}\cdot\mathrm{MPU/MST} + \mathrm{DPE}\,(\mathrm{WT}\cdot\mathrm{MPU/MST} + \mathrm{RD}\cdot\mathrm{MPU/SND}) \quad (4) \]
\[ \mathrm{Cach/NG} = \mathrm{APE}\cdot\mathrm{Cach/MST} + \mathrm{DPE}\,(\mathrm{WT}\cdot\mathrm{Cach/MST} + \mathrm{RD}\cdot\mathrm{Cach/SND}) \quad (5) \]
\[ \mathrm{BIU/NG} = \mathrm{APE}\cdot\mathrm{BIU/MST} + \mathrm{DPE}\,(\mathrm{WT}\cdot\mathrm{BIU/MST} + \mathrm{RD}\cdot\mathrm{BIU/SND}) \quad (6) \]
\[ \mathrm{SYSBUS/NG} = \mathrm{BIU/NG} \quad (7) \]
where, in equations (4) to (7),
PTYGEN: parity generation/check circuits 10 to 15
Cach: cache memory
/NG: parity abnormality
/MST: bus master
/SND: data output source
APE: address parity abnormality
DPE: data parity abnormality
WT: the bus master outputs data
RD: the bus master inputs data
\(\cdot\): logical product (AND); \(+\): logical sum (OR)

IV. Configuration change control at the time of abnormality
Abnormalities inside the BPU are of two kinds: those detected by the MPU output check circuit on a write access from an MPU, and those found by the parity check circuits on a write access or a cache read access.

[0043] [Configuration change when the MPU output check circuit detects an abnormality]
The open/closed states of the three-state buffer circuits are controlled as shown in Table 1: circuits 200 and 201 according to the output Ag of the error check circuit 231 of the MPU output check circuit 23, circuits 203 and 204 according to Cg, and circuit 29 according to 29g. In Table 1, the MPU judgment Ag = 1 basically corresponds to 200 and 201 open and Ag = 0 to 200 and 201 closed, while Cg = 1 basically corresponds to 203 and 204 open and Cg = 0 to 203 and 204 closed. Bg and 29g, however, do not correspond in this way. Under 29g, the three-state buffer circuits 29 are closed when Ag = 1 and Cg = 1; when only one of Ag and Cg is 1, only the three-state buffer circuit 29 pointing toward the three-state buffer circuits whose control signal is 0 is opened. Each case of Table 1 is described in more detail below with reference to the system configurations of FIG. 4.
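The mapping from the judgments Ag and Cg to buffer states described in this paragraph can be written out as a short Python sketch. Table 1 itself is not reproduced in this text, so the sketch relies only on the prose; the dictionary keys are hypothetical labels, and two points are assumptions: that the buffer 29 opened when Ag = 0 and Cg = 1 points toward cache memory 220 (inferred by symmetry), and that every combination with Ag = Cg = 0 is treated as having no usable configuration.

```python
def buffer_states(ag, cg):
    """Map judgments Ag and Cg to three-state buffer states (True = open).

    Returns None when neither output-capable MPU is judged normal (assumption:
    no reconfiguration is possible in that situation).
    """
    if not ag and not cg:
        return None
    states = {
        "200_201": ag,            # buffers on the MPUA side follow Ag
        "203_204": cg,            # buffers on the MPUC side follow Cg
        "29_toward_220": False,   # cross-connection toward cache memory 220
        "29_toward_221": False,   # cross-connection toward cache memory 221
    }
    if ag and not cg:
        states["29_toward_221"] = True   # feed the closed MPUC side from MPUA
    elif cg and not ag:
        states["29_toward_220"] = True   # feed the closed MPUA side from MPUC
    return states
```

When both judgments are 1, both cross-connections stay closed and the two sides run independently; when exactly one is 1, the healthy side also drives the other side's cache memory.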

【００４４】ケース１：全てのＭＰＵ出力が一致し、全
ＭＰＵ正常である。３ステートバッファ回路２００，２
０１，２０３，２０４が開状態、２９が閉状態とされ、
図４（ａ）のようにＭＰＵＡとキャッシュメモリ２２０
による系統と、ＭＰＵＣとキャッシュメモリ２２１によ
る系統とが独立して二重化運用される。Case 1: All MPU outputs match and all MPUs are normal. Three-state buffer circuits 200, 201, 203, and 204 are open and 29 is closed, so that, as shown in FIG. 4(a), the system formed by MPUA and cache memory 220 and the system formed by MPUC and cache memory 221 operate independently as a duplex pair.

【００４５】ケース２：チェック回路ＣＨＫＣＡのみが
不一致出力を与えており、ＭＰＵＢのみが正常と判断さ
れる。図２に示すようにＭＰＵＢは他のＭＰＵの参照用
として使用され、キャッシュメモリに出力を与えるよう
に構成されていないので構成変更しての運転継続不可能
であり、この場合システムダウンとなる。Case 2: Only check circuit CHKCA gives a mismatch output, and only MPUB is judged normal. As shown in FIG. 2, MPUB serves as a reference for the other MPUs and is not configured to drive a cache memory, so operation cannot be continued by reconfiguration; in this case the system goes down.

【００４６】ケース３：チェック回路ＣＨＫＢＣのみが
不一致出力を与えており、ＭＰＵＡのみが正常と判断さ
れる。この場合には３ステートバッファ回路２００，２
０１が開状態、２０３，２０４が閉状態、２９はキャッ
シュメモリ２２１方向への３ステートバッファ回路のみ
が開状態とされる。ＭＰＵＢとＭＰＵＣは停止され、図
４（ｂ）のようにＭＰＵＡのみによる単独系統による運
転とされる。キャッシュメモリ２２１方向への３ステー
トバッファ回路２９のみが開状態とされるのは、キャッ
シュメモリ記憶内容の同一性保持のためである。Case 3: Only check circuit CHKBC gives a mismatch output, and only MPUA is judged normal. In this case, three-state buffer circuits 200 and 201 are open, 203 and 204 are closed, and only the direction of three-state buffer circuit 29 toward cache memory 221 is open. MPUB and MPUC are stopped, and operation continues as a single system using MPUA alone, as shown in FIG. 4(b). Only the direction of three-state buffer circuit 29 toward cache memory 221 is opened so that the contents of the two cache memories stay identical.

【００４７】ケース４：チェック回路ＣＨＫＡＢのみが
一致出力を与えており、ＭＰＵＡとＭＰＵＢが正常と判
断される。この場合には３ステートバッファ回路２０
０，２０１が開状態、２０３，２０４が閉状態、２９は
キャッシュメモリ２２１方向への３ステートバッファ回
路のみが開状態とされる。この場合にはＭＰＵＣを停止
し、図４（ｃ）のようにＭＰＵＡとＭＰＵＢで二重系を
構成して、ＭＰＵＢによりＭＰＵＡの出力を監視する二
重化運転とされる。キャッシュメモリ２２１方向への３
ステートバッファ回路２９のみが開状態とされるのは、
キャッシュメモリ記憶内容の同一性保持のためである。Case 4: Only check circuit CHKAB gives a match output, and MPUA and MPUB are judged normal. In this case, three-state buffer circuits 200 and 201 are open, 203 and 204 are closed, and only the direction of three-state buffer circuit 29 toward cache memory 221 is open. MPUC is stopped, and MPUA and MPUB form a dual system, as shown in FIG. 4(c), in which MPUB monitors the output of MPUA. Only the direction of three-state buffer circuit 29 toward cache memory 221 is opened so that the contents of the two cache memories stay identical.

【００４８】ケース５：チェック回路ＣＨＫＡＢのみが
不一致出力を与えており、ＭＰＵＡとＭＰＵＢが異常、
ＭＰＵＣのみが正常と判断される。この場合には３ステ
ートバッファ回路２００，２０１が閉状態、２０３，２
０４が開状態、２９はキャッシュメモリ２２０方向への
３ステートバッファ回路のみが開状態とされる。この場
合にはＭＰＵＡとＭＰＵＢを停止し、図４（ｄ）のよう
にＭＰＵＣのみによる単独運転とされる。キャッシュメ
モリ２２０方向への３ステートバッファ回路２９のみが
開状態とされるのは、キャッシュメモリ記憶内容の同一
性保持のためである。Case 5: Only check circuit CHKAB gives a mismatch output; MPUA and MPUB are judged abnormal and only MPUC is judged normal. In this case, three-state buffer circuits 200 and 201 are closed, 203 and 204 are open, and only the direction of three-state buffer circuit 29 toward cache memory 220 is open. MPUA and MPUB are stopped, and operation continues with MPUC alone, as shown in FIG. 4(d). Only the direction of three-state buffer circuit 29 toward cache memory 220 is opened so that the contents of the two cache memories stay identical.

【００４９】ケース６：チェック回路ＣＨＫＢＣのみが
一致出力を与えており、ＭＰＵＣとＭＰＵＢが正常と判
断される。この場合には３ステートバッファ回路２０
０，２０１が閉状態、２０３，２０４が開状態、２９は
キャッシュメモリ２２０方向への３ステートバッファ回
路のみが開状態とされる。この場合には基本的にケース
４と同様に運用される。Case 6: Only check circuit CHKBC gives a match output, and MPUC and MPUB are judged normal. In this case, three-state buffer circuits 200 and 201 are closed, 203 and 204 are open, and only the direction of three-state buffer circuit 29 toward cache memory 220 is open. Operation is basically the same as in Case 4.

【００５０】ケース７：チェック回路ＣＨＫＣＡのみが
一致出力を与えており、ＭＰＵＣとＭＰＵＡが正常と判
断される。この場合には参照用ＭＰＵの異常なので、図
４（ｅ）ケース７のように、ＭＰＵＢのみを切離し、３
ステートバッファ回路は何等の変更もせずにＭＰＵＣと
ＭＰＵＡによる二重化運転を継続する。Case 7: Only check circuit CHKCA gives a match output, and MPUC and MPUA are judged normal. Since the faulty unit is the reference MPU, only MPUB is disconnected, as shown in FIG. 4(e), and duplex operation by MPUC and MPUA continues with no change to the three-state buffer circuits.

【００５１】ケース８：いずれのチェック回路ＣＨＫも
不一致を検出しており、全ＭＰＵ異常であることから以
後の運転継続不可能である。Case 8: Every check circuit CHK detects a mismatch; since all MPUs are abnormal, operation cannot be continued.

【００５２】以上のようにして、３台のＭＰＵとその周
辺回路（例えばパリティ生成／検査照合回路）の正常性
が確認され、適宜構成変更制御が実施されるが、この表
１はあくまでも照合結果の考え得る組合せを述べたにす
ぎず、実際問題としてはケース２から８の７つの異常事
象が同一確率で発生するわけではない。つまり、このう
ち単一故障のケースは４，６，７の３事例、二重故障は
２，３，５の３事例、三重故障は８のケースであり、良
く知られているように運転継続不能となるケース２、８
を含む多重故障の同時発生確率は単一故障に比べて極め
て低い。しかも、実際には単一故障が進展して多重故障
に至ることが殆どであり、従って単一故障の時点で何等
かの回復対策を施すことで事実上運転継続に支障のない
システム構成とすることができる。なお、本発明では仮
に二重故障が発生したとしても多くの場合に支障無く運
転継続可能であり、この意味においては非常に信頼性の
高いシステムであるといえる。As described above, the normality of the three MPUs and their peripheral circuits (for example, the parity generation/check/collation circuits) is confirmed and configuration change control is performed as appropriate. Table 1, however, merely enumerates the possible combinations of comparison results; in practice, the seven abnormal events of Cases 2 to 8 do not occur with equal probability. Cases 4, 6, and 7 are single failures, Cases 2, 3, and 5 are double failures, and Case 8 is a triple failure. As is well known, the probability of multiple failures occurring simultaneously, including Cases 2 and 8 in which operation cannot continue, is far lower than that of a single failure. Moreover, multiple failures almost always develop from a single failure, so taking some recovery measure at the single-failure stage yields a system configuration that, in practice, does not impair continued operation. In the present invention, operation can in many cases continue without trouble even if a double failure occurs, and in this sense the system can be said to be highly reliable.

【００５３】なお、以上の異常事象発生の際に図２には
図示がないが、異常ＭＰＵを停止する信号がＭＰＵ出力
チェック回路２３から発生されてこれを停止し、あるい
は外部出力されて運転員に異常の発生を報知し、以後の
対策の必要性を報知せしめることは当然のこととして行
われる。Although not shown in FIG. 2, when one of the above abnormal events occurs, MPU output check circuit 23 naturally generates a signal that stops the abnormal MPU, or outputs a signal externally to notify the operator that an abnormality has occurred and that follow-up measures are needed.

【００５４】〔パリティチェックによる異常検出時の構成変更〕前記のIII 項で述べたようにして、ライトアクセス時あ
るいはキャッシュリードアクセス時に、キャッシュメモ
リ２２０，２２１，ＢＩＵ２７−１，２７−２の異常個
所が特定できる。次に各異常の時のＢＰＵ内部の構成変
更制御について説明する。なお、表３はキャッシュリ−
ドアクセス時の各部異常の際にキャッシュメモリ２２
０，２２１，ＢＩＵ２７−１，２７−２，３ステートバ
ッファ回路２９，２６−１，２６−２をどのように制御
するのかを一覧表にしたものである。[Configuration Change When a Parity Check Detects an Abnormality] As described in section III above, the faulty location among cache memories 220 and 221 and BIUs 27-1 and 27-2 can be identified at the time of a write access or a cache read access. Configuration change control inside the BPU for each abnormality is described next. Table 3 lists how cache memories 220 and 221, BIUs 27-1 and 27-2, and three-state buffer circuits 29, 26-1, and 26-2 are controlled when each part fails during a cache read access.

【００５５】[0055]

【表３】 [Table 3]

【００５６】図５は各ケースの時の回路構成を図示した
ものであり、以下表３と図５を参照して説明する。図５
（ａ）は正常時の信号の流れを示している。この場合、
３ステートバッファ回路２９，２６−２は閉、２６−１
は開とされており、従ってＢＩＵ２７−１またはキャッ
シュメモリ２２０からの情報がＭＰＵＡ２０−１と、Ｍ
ＰＵＢ２０−２に供給され、ＢＩＵ２７−２またはキャ
ッシュメモリ２２１からの情報がＭＰＵＣ２０−３に供
給される。このように、通常はＢＩＵ２７−１、キャッ
シュメモリ２２０，ＭＰＵＡ２０−１，ＭＰＵＢ２０−
２が一つの組を構成し、ＢＩＵ２７−２，キャッシュメ
モリ２２１，ＭＰＵＣ２０−３が別の一組を構成するよ
うに運用される。FIG. 5 illustrates the circuit configuration in each case; the description below refers to Table 3 and FIG. 5. FIG. 5(a) shows the signal flow in the normal state. Here, three-state buffer circuits 29 and 26-2 are closed and 26-1 is open, so information from BIU 27-1 or cache memory 220 is supplied to MPUA 20-1 and MPUB 20-2, and information from BIU 27-2 or cache memory 221 is supplied to MPUC 20-3. Thus, BIU 27-1, cache memory 220, MPUA 20-1, and MPUB 20-2 normally form one set, and BIU 27-2, cache memory 221, and MPUC 20-3 form another.

【００５７】ケース１：キャッシュメモリ２２０の異常
である。図５（ｂ）のように、キャッシュメモリ２２０
の出力が停止され、３ステートバッファ回路２９はＭＰ
ＵＡ２０−１側への信号のみが通過するように制御さ
れ、３ステートバッファ回路２６−１は閉、２６−２は
開とされる。これにより、全てのＭＰＵはキャッシュメ
モリ２２１からの共通情報を受け取るように構成されて
異常発見後も運転継続される。なお、３ステートバッフ
ァ回路２６−１を閉、２６−２を開のように正常状態か
ら切替る理由は、論理的にはキャッシュメモリ２２０の
異常と特定していても、キャッシュメモリ２２０が接続
された内部バスの異常の可能性も否定できず、念のため
にキャッシュメモリ２２１側に切替るものである。も
し、キャッシュメモリ２２０が接続された内部バスの異
常のときは、３ステートバッファ回路２９が一方向通信
となっているためにＭＰＵＣ側にはその影響が現れな
い。Case 1: Cache memory 220 is abnormal. As shown in FIG. 5(b), the output of cache memory 220 is stopped, three-state buffer circuit 29 is controlled so that only signals toward the MPUA 20-1 side pass, three-state buffer circuit 26-1 is closed, and 26-2 is opened. All MPUs are thereby configured to receive common information from cache memory 221, and operation continues after the abnormality is found. The reason for switching from the normal state by closing three-state buffer circuit 26-1 and opening 26-2 is that, even when the fault has logically been attributed to cache memory 220, a fault in the internal bus to which cache memory 220 is connected cannot be ruled out, so the path is switched to the cache memory 221 side as a precaution. If the internal bus to which cache memory 220 is connected is in fact faulty, the MPUC side is unaffected because three-state buffer circuit 29 passes signals in one direction only.

【００５８】ケース２：キャッシュメモリ２２１の異常
である。図５（ｃ）のように、キャッシュメモリ２２１
の出力が停止され、３ステートバッファ回路２９はＭＰ
ＵＣ２０−３側への信号のみが通過するように制御さ
れ、これにより全てのＭＰＵはキャッシュメモリ２２０
からの共通情報を受取るように構成されて異常発見後も
運転継続される。Case 2: Cache memory 221 is abnormal. As shown in FIG. 5(c), the output of cache memory 221 is stopped and three-state buffer circuit 29 is controlled so that only signals toward the MPUC 20-3 side pass. All MPUs are thereby configured to receive common information from cache memory 220, and operation continues after the abnormality is found.

【００５９】ケース３，４：ＢＩＵ２７−１、２７−２
あるいはその接続されたシステムバス１−１側の異常で
ある。図５（ｄ），（ｅ）のように、ＢＩＵ２７−１又
は２７−２あるいはその接続されたシステムバス１−１
側を停止し、ケース１と同様に運用する。Cases 3 and 4: BIU 27-1 or 27-2, or the system bus 1-1 side connected to it, is abnormal. As shown in FIGS. 5(d) and 5(e), BIU 27-1 or 27-2, or the system bus 1-1 side connected to it, is stopped and operation proceeds as in Case 1.

【００６０】以上のようにして、パリティエラーによる
異常検知されたときは構成変更とともに異常の旨、外部
報知される。As described above, when an abnormality is detected through a parity error, the configuration is changed and the abnormality is reported externally.

【００６１】以上詳細に述べたように、本発明によれば
ＢＰＵの内部に異常が発生したとしても、その回路構成
の一部を切離しあるいは情報の流れを変更することによ
って、正常時と同様に運転継続が可能である。このため
データ処理の途中で異常が発生した場合には、（１）切
りの良い時点または、修理保守時期まで当該ＢＰＵでの
動作を継続させ、（２）切りの良い時点または、修理保
守時期に当該ＢＰＵで実行していた処理を他の正常なＢ
ＰＵに引き継がせれば良い。As described in detail above, according to the present invention, even if an abnormality occurs inside a BPU, operation can continue as in the normal state by disconnecting part of its circuit configuration or changing the flow of information. Therefore, when an abnormality occurs during data processing, it suffices to (1) let the BPU continue operating until a convenient stopping point or until the repair and maintenance period, and (2) at that stopping point or maintenance period, hand over the processing that the BPU was executing to another normal BPU.

【００６２】この結果、異常発生時のチェックポイント
リスタートに備えてのバックアップ動作が不要となり、
処理性能を向上させることができる。As a result, backup operations in preparation for a checkpoint restart after an abnormality become unnecessary, and processing performance can be improved.

【００６３】Ｖ．内部バス接続時の信号処理以上説明したように、各部異常の際に内部バスの切替を
３ステートバッファ２９を用いて行うが、３ステートバ
ッファ２９の開閉操作は、通常の経路でのライトアクセ
スに比べて切替に時間がかかり、しかもバス間で迂回す
るために時間がかかる。この改善策としては、図６のよ
うに異常発生時にのみリトライによりバスサイクルを延
長するのがバスサイクルの遅延を生じず有効である。V. Signal Processing When the Internal Buses Are Connected As described above, the internal buses are switched using three-state buffer 29 when a part fails. However, opening or closing three-state buffer 29 takes longer than a write access over the normal path, and detouring between the buses takes further time. As a remedy, it is effective to extend the bus cycle by a retry only when an abnormality occurs, as shown in FIG. 6, since normal bus cycles are then not delayed.

【００６４】つまり、異常が発見された（ステップＳ
１，Ｓ２）ときには、ステップＳ４においてリトライを
させる信号をアサートし、ステップＳ５において異常出
力の停止（異常ＭＰＵの切離し操作等）、正常出力の迂
回処理を実施した後で、ステップＳ６においてこのバス
サイクルを終了させる信号をアサートして一連の処理を
終了する。なお、正常であるときにはステップＳ３にお
いてこのバスサイクルを終了させる信号をアサートする
のみでよい。ＭＰＵにバスサイクルを終了させたり、リ
トライをさせたりするための信号線はＭＰＵの種類によ
り名称が異なるが、多くのＭＰＵではリトライ信号をＭ
ＰＵに入力することでＭＰＵが自動的に実行する。表４
に代表的なＭＰＵの信号名を示す。That is, when an abnormality is found (steps S1 and S2), a signal requesting a retry is asserted in step S4; the abnormal output is stopped (for example, by disconnecting the abnormal MPU) and the normal output is rerouted in step S5; and a signal that ends the bus cycle is asserted in step S6, completing the sequence. When everything is normal, it suffices to assert the signal that ends the bus cycle in step S3. The signal lines that make an MPU end a bus cycle or retry it are named differently depending on the MPU type, but in many MPUs the retry is executed automatically once a retry signal is input to the MPU. Table 4 shows signal names for typical MPUs.

【００６５】[0065]

【表４】 [Table 4]
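The retry flow of FIG. 6 (steps S1 to S6) can be sketched in a few lines of Python. This is only a behavioural illustration: the function name `bus_cycle`, the `"RETRY"` and `"END"` labels, and the `reroute` callback are assumptions, since the real signal names depend on the MPU type as noted above.

```python
from typing import Callable, List


def bus_cycle(abnormal: bool, reroute: Callable[[], None]) -> List[str]:
    """Return the signals asserted toward the MPU for one bus cycle.

    abnormal: result of the checks in steps S1 and S2.
    reroute:  hypothetical callback for step S5 (stop the abnormal output
              and detour the normal output through three-state buffer 29).
    """
    signals: List[str] = []
    if abnormal:
        signals.append("RETRY")   # S4: request a retry, extending the cycle
        reroute()                 # S5: isolate the fault, set up the detour
        signals.append("END")     # S6: end this bus cycle
    else:
        signals.append("END")     # S3: normal case, end the cycle at once
    return signals
```

The point of the design is visible in the normal branch: no extra step is taken unless an abnormality was detected, so only faulty cycles pay for the slower buffer switching.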

【００６６】図７，図８は図６のリトライ方式をライト
アクセス時に採用したときの信号の流れを示したもので
あり、図７は正常時、図８は異常時を示す。同図におい
て、縦軸は時間の経過を示し、横軸はＭＰＵ出力がキャ
ッシュメモリに至るまでの各部回路を示している。通
常、ＭＰＵからはデータ信号に先立って、アドレス信号
が出力される。図７では、アドレス信号、データ信号が
ともに正常であるためにＭＰＵ出力チェック回路２３、
パリティチェック回路２５０では正常と判断され、ＭＰ
Ｕには終了信号が返され、キャッシュメモリ２２０では
データを格納しバスサイクルが終了する。FIGS. 7 and 8 show the signal flow when the retry scheme of FIG. 6 is applied to a write access; FIG. 7 shows the normal case and FIG. 8 the abnormal case. In these figures, the vertical axis represents elapsed time and the horizontal axis shows the circuits that the MPU output passes through on its way to the cache memory. The MPU normally outputs the address signal before the data signal. In FIG. 7, both the address signal and the data signal are normal, so MPU output check circuit 23 and parity check circuit 250 judge them normal, an end signal is returned to the MPU, cache memory 220 stores the data, and the bus cycle ends.

【００６７】図８では、ＭＰＵＡが異常でアドレス信
号、データ信号がともにＭＰＵ出力チェック回路２３に
より異常と判定され、各ＭＰＵに終了信号とともにリト
ライ信号が返されリトライ動作に入る。リトライ動作時
には３ステートバッファ２００，２０１を閉状態として
ＭＰＵＡから内部バスへの信号伝達を阻止し、３ステー
トバッファ２９を一方向のみ開としてＭＰＵＣの出力信
号をキャッシュメモリ２５０にも供給する。その後、各
ＭＰＵには終了信号が返され、動作が終了する。In FIG. 8, MPUA is abnormal, and MPU output check circuit 23 judges both the address signal and the data signal abnormal; a retry signal is returned to each MPU together with an end signal, and the retry operation begins. During the retry, three-state buffers 200 and 201 are closed to block signals from MPUA to the internal bus, and three-state buffer 29 is opened in one direction only so that the output signal of MPUC is also supplied to cache memory 250. An end signal is then returned to each MPU, and the operation ends.

【００６８】図９，図１０，図１１は図６のリトライ方
式をキャッシュリードアクセス時に採用したときの信号
の流れを示したものであり、図９は正常時、図１０はア
ドレス信号異常時、図１１はデータ信号異常時を夫々示
す。図９では、アドレス信号、データ信号がともに正常
であり異常が見られないために、ＭＰＵには終了信号が
返され、ＭＰＵはキャッシュメモリ２５０からのデータ
を格納してバスサイクルを終了する。図１０では、ＭＰ
ＵＡからのアドレス信号が他と一致せずに異常と判断さ
れ、各ＭＰＵに終了信号とともにリトライ信号が返され
リトライ動作に入る。リトライ動作時には３ステートバ
ッファ２０１を閉状態としてＭＰＵＡから内部バスへの
信号伝達を阻止し、３ステートバッファ２９を一方向の
み開としてＭＰＵＣのアドレス出力信号をキャッシュメ
モリ２２０に供給し、キャッシュメモリ２２０は与えら
れたアドレスに格納されているデータをＭＰＵＡとMPUB
に供給する。その後、各ＭＰＵに終了信号を返して、リ
トライ動作が終了する。FIGS. 9, 10, and 11 show the signal flow when the retry scheme of FIG. 6 is applied to a cache read access: FIG. 9 shows the normal case, FIG. 10 an address signal abnormality, and FIG. 11 a data signal abnormality. In FIG. 9, both the address signal and the data signal are normal and no abnormality is found, so an end signal is returned to the MPU, and the MPU takes in the data from cache memory 250 and ends the bus cycle. In FIG. 10, the address signal from MPUA does not match the others and is judged abnormal; a retry signal is returned to each MPU together with an end signal, and the retry operation begins. During the retry, three-state buffer 201 is closed to block signals from MPUA to the internal bus, and three-state buffer 29 is opened in one direction only to supply the address output of MPUC to cache memory 220; cache memory 220 then supplies the data stored at the given address to MPUA and MPUB. An end signal is then returned to each MPU, and the retry operation ends.

【００６９】図１１では、キャッシュメモリ２２０から
のデータに異常があり、パリティ生成照合検査回路１
０，１２、パリティチェック回路２５０でのパリティチ
ェックにより正常と判断され、各ＭＰＵに終了信号とと
もにリトライ信号が返されリトライ動作に入る。リトラ
イ動作時にはキャッシュメモリ２２０の出力が阻止さ
れ、３ステートバッファ２９を一方向のみ開としてキャ
ッシュメモリ２２１の出力をＭＰＵＡとＭＰＵＢに供給
する。なおこの場合、３ステートバッファ回路２６−１
を閉、２６−２を開のように正常状態から切替え、３ス
テートバッファ回路２７を通じてキャッシュメモリ２２
１の出力をＭＰＵＢに供給することにより、キャッシュ
メモリ２２０からＭＰＵＢへのデータ信号の経路の異常
により誤ったデータがＭＰＵＢへ供給されるのを防ぐこ
とができる。In FIG. 11, the data from cache memory 220 is abnormal; this is detected by the parity checks in parity generation/collation/check circuits 10 and 12 and parity check circuit 250, a retry signal is returned to each MPU together with an end signal, and the retry operation begins. During the retry, the output of cache memory 220 is blocked, and three-state buffer 29 is opened in one direction only to supply the output of cache memory 221 to MPUA and MPUB. In this case, switching from the normal state by closing three-state buffer circuit 26-1 and opening 26-2, and supplying the output of cache memory 221 to MPUB through three-state buffer circuit 27, prevents erroneous data from reaching MPUB because of a fault in the data path from cache memory 220 to MPUB.

【００７０】VI．異常発生後の復旧策このように本発明装置は異常発生後も運転継続できる
が、この構成のまま永続的に運転することは二次的故障
の可能性を考慮すると、早急に初期の状態に復旧させる
べきであり、次に、以上発生したＢＰＵの機能を正常に
復旧させるための復旧策について説明する。その方法
は、図１のＢＰＵを１つのプリント板上に形成してお
き、異常ＢＰＵプリント板を正常ＢＰＵプリント板に交
換することで達成される。VI. Recovery After an Abnormality As described, the apparatus of the present invention can continue operating after an abnormality occurs. Considering the possibility of a secondary failure, however, it should not be run in this configuration indefinitely but should be restored to its initial state promptly. Measures for restoring the function of the BPU in which the abnormality occurred are described next. This is achieved by forming the BPU of FIG. 1 on a single printed board and replacing the abnormal BPU printed board with a normal one.

【００７１】図１２は、計算機盤構成を示しており、そ
の扉を開放するとその内部にプリント板を収納するスロ
ット部が形成され、更に各スロットには図１の主記憶装
置３、ＢＰＵ２、入出力制御装置ＢＩＵ４を構成する各
プリント板が挿入され、挿入された状態で図１１には図
示せぬシステムバスに接続されるようになっている。図
示の例ではスロットＳＬは１２個あり、このうちＳＬ
１，ＳＬ３〜ＳＬ６にプリント板が挿入され、他のＳＬ
２，ＳＬ７〜ＳＬ１２が空スロットとなっている。スロ
ットＳＬに挿入されるプリント板ＰＬは通常知られたも
ので良いが、本発明のものではこのプリント板をスロッ
トＳＬに固定するためのレバー２８２、プリント板が停
止中か否かを表わす表示ランプ２８０を備え、必要に応
じて適宜プリント板の取外し要求ボタン２８１が備えら
れる。以下、ＢＰＵプリント板の交換手順について説明
する。FIG. 12 shows the structure of the computer cabinet. Opening its door reveals slots that hold printed boards; the printed boards constituting main storage device 3, BPU 2, and input/output control unit BIU 4 of FIG. 1 are inserted into the slots and, once inserted, are connected to a system bus (not shown). In the illustrated example there are 12 slots SL; printed boards are inserted in SL1 and SL3 to SL6, and SL2 and SL7 to SL12 are empty. A printed board PL inserted into a slot SL may be of a commonly known type, but in the present invention it has a lever 282 for fixing the board in slot SL and a display lamp 280 indicating whether the board is stopped, and is provided as needed with a printed-board removal request button 281. The procedure for replacing a BPU printed board is described below.

【００７２】《ＢＰＵプリント板が１枚のときの交換》
図１３は、システムバス（説明の都合上一重系で示す）
１にプリント板ＰＬが接続可能なｎ個のスロットＳＬの
うち、ＳＬ１にその内部で異常発生したＢＰＵ，ＳＬ２
に主記憶装置３、ＳＬｎにＩＯＵ４のプリントが夫々挿
入されており、ＳＬ３が空きスロットとなっている例を
示す。ここでは、異常ＢＰＵに代わり機能すべき新ＢＰ
Ｕは未だスロットに挿入されていない。そしてプリント
板上の表示ランプ２８０は稼働中のために消灯してい
る。<< Replacement with One BPU Printed Board >> FIG. 13 shows an example in which, among n slots SL through which printed boards PL can be connected to system bus 1 (shown as a single bus for convenience of explanation), the BPU in which an abnormality has occurred is inserted in SL1, the printed board of main storage device 3 in SL2, and that of IOU 4 in SLn, while SL3 is empty. The new BPU that is to take over from the abnormal BPU has not yet been inserted. Display lamp 280 on the printed board is off because the board is in operation.

【００７３】この状態で、旧ＢＰＵ２Ａの機能を正常な
新ＢＰＵ２Ｂに引き継ぐには、まず、空きスロットを用
意する。図１３の例の場合は、スロットＳＬ３が空きス
ロットとなっているので、次に新ＢＰＵ２Ｂを空きスロ
ットＳＬ３に挿入する。In this state, handing over the functions of old BPU 2A to a normal new BPU 2B starts with preparing an empty slot. In the example of FIG. 13, slot SL3 is already empty, so new BPU 2B is next inserted into SL3.

【００７４】ＢＰＵ２ＡはＢＰＵ２Ｂの挿入を検知し、
そのオペレーティングシステム（以下ＯＳと略す）の処
理により、旧ＢＰＵＡで実行中のタスクを新ＢＰＵ２Ｂ
に移管し、旧ＢＰＵ２Ａのプリント板上の表示ランプ２
８０を点灯する。以降、オンラインの業務は新ＢＰＵ２
Ｂにより実行される。旧ＢＰＵ２Ａから新ＢＰＵ２Ｂへ
の業務移管は瞬時に行われる。その後、旧ＢＰＵプリン
ト板上の表示ランプ280が点灯し、該ＢＰＵが停止状態
であることを確認した上で、旧ＢＰＵ２Ａを取外す。以
上の手順により、旧ＢＰＵ２Ａを抜く前に、オンライン
業務を新BPU2B に移管完了されているため、システムを
停止することなく、またシステム性能を低下させること
なくＢＰＵの交換を実現できる。BPU 2A detects the insertion of BPU 2B and, through processing by its operating system (hereinafter OS), transfers the task running on old BPU 2A to new BPU 2B and lights display lamp 280 on the printed board of old BPU 2A. Online work is thereafter executed by new BPU 2B. The transfer of work from old BPU 2A to new BPU 2B is instantaneous. Old BPU 2A is then removed after confirming that display lamp 280 on its printed board is lit and the BPU is stopped. With this procedure, online work has already been transferred to new BPU 2B before old BPU 2A is pulled out, so the BPU can be replaced without stopping the system and without degrading system performance.

【００７５】図１４は、図１３で示した例についてＢＰ
Ｕ交換手順を人による動作と計算機内部の処理に分けて
処理の内容を示したＢＰＵ交換手順処理フローである。FIG. 14 is a BPU replacement procedure flow for the example of FIG. 13, showing the processing divided into human actions and processing inside the computer.

【００７６】ＢＰＵを交換する場合、まず空きスロット
を用意（Ｓｔ１）する。空きスロットは、既に未使用の
空きスロットがあればそれを用いればよく、また空きス
ロットがない場合も、一時的に取り外し可能なハードウ
ェアボードがあれば、そのボードを抜き、一時的に空き
スロットを作り出し、目的のＢＰＵ交換後に、再び該ボ
ードを戻すことにより空スロットを準備することも可能
である。次に、空きスロットに新ＢＰＵを挿入（Ｓｔ
２）する。そのＢＰＵ挿入を、旧ＢＰＵ２Ａは割込等の
手段で認識（Ｓｔ３）する。すると、旧ＢＰＵ２Ａは現
在実行中のタスクを主記憶装置上に退避（Ｓｔ４）し、
新ＢＰＵ２Ｂが該タスクの処理を続行できるようにす
る。新ＢＰＵ２Ｂはそれを受けて、該タスクを実行（Ｓ
ｔ５）し、オンライン業務を開始する。旧ＢＰＵ２Ａは
自らＢＰＵ上のボード停止ランプを点灯（Ｓｔ６）し、
処理を停止（Ｓｔ７）する。その後、旧ＢＰＵ上のボー
ド停止ランプが点灯しているのを人間が確認(Ｓｔ８)
後、旧ＢＰＵを取り外す(Ｓｔ９)。これで、ＢＰＵ交換
は完了である。To replace a BPU, an empty slot is first prepared (St1). An already unused slot can be used if there is one; if there is none but some hardware board can be removed temporarily, an empty slot can be prepared by pulling that board out to create a temporary empty slot and putting it back after the BPU has been replaced. Next, the new BPU is inserted into the empty slot (St2). Old BPU 2A recognizes the insertion by means such as an interrupt (St3). Old BPU 2A then saves the task it is executing to the main storage device (St4) so that new BPU 2B can continue processing it. New BPU 2B in turn executes the task (St5) and starts online work. Old BPU 2A itself lights the board stop lamp on the BPU (St6) and stops processing (St7). A person then confirms that the board stop lamp on the old BPU is lit (St8) and removes the old BPU (St9). This completes the BPU replacement.
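The machine-side part of this procedure (St3 to St7) and the human check in St8 can be modelled as follows. This is a minimal Python sketch under stated assumptions: the `BPU` class, the `saved_task` key, and the function names are hypothetical, and main storage is reduced to a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BPU:
    name: str
    task: Optional[str] = None   # task currently executing, if any
    lamp_on: bool = False        # display lamp 280: lit once the board has stopped
    running: bool = True


def hand_over(old: BPU, new: BPU, main_storage: Dict[str, Optional[str]]) -> None:
    """St3 to St7: runs once the old BPU recognizes that a new BPU was inserted."""
    main_storage["saved_task"] = old.task        # St4: save the running task
    old.task = None
    new.task = main_storage.pop("saved_task")    # St5: new BPU resumes the task
    old.lamp_on = True                           # St6: light the stop lamp
    old.running = False                          # St7: stop processing


def safe_to_remove(bpu: BPU) -> bool:
    """St8: the operator pulls a board only when its lamp is lit and it has stopped."""
    return bpu.lamp_on and not bpu.running
```

Because the old board lights its lamp only after the task has been handed over, the lamp doubles as the interlock that tells the operator removal is safe.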

【００７７】図１５は、上記実施例における、旧ＢＰＵ
２Ａ上で実行中のタスクを新BPU２Bに引き継ぎする手段
を詳細に説明した図である。システムバスに旧ＢＰＵ２
Ａ，新ＢＰＵ２Ｂ、さらに主記憶装置３の各々プリント
板が装着されている。旧ＢＰＵ２Ａ上では、あるタスク
９２０−１が実行中である。その時に、新BPU２Bが挿入
されたことの連絡が旧ＢＰＵ２Ａに入ったとすると、旧
ＢＰＵ２Ａは、処理を中断し、実行中のタスク９２０−
１を主記憶装置３上に退避する。一方、新ＢＰＵ２Ｂは
主記憶装置３上に退避されたタスク９２０−１に続くタ
スク９２０−２を回復して、中断したポイントからタス
クの処理を続行する。以上の方式を用いて、交換したＢ
ＰＵ間の業務の引き継ぎを行う。FIG. 15 illustrates in detail the means by which the task running on old BPU 2A is handed over to new BPU 2B in the above embodiment. The printed boards of old BPU 2A, new BPU 2B, and main storage device 3 are mounted on the system bus. A task 920-1 is running on old BPU 2A. When old BPU 2A is notified that new BPU 2B has been inserted, it suspends processing and saves the running task 920-1 to main storage device 3. New BPU 2B restores the saved task as task 920-2, the continuation of task 920-1, and resumes processing from the point of interruption. Work is handed over between the exchanged BPUs in this way.

【００７８】以上が、ＢＰＵが１つの場合のＢＰＵの交
換の例である。上記実施例では、ＢＰＵが１つの場合で
も、システムを停止することなくＢＰＵの交換が可能で
ある。The above is an example of BPU replacement when there is one BPU. In the above embodiment, even when there is one BPU, the BPU can be replaced without stopping the system.

【００７９】《ＢＰＵプリント板が複数のときの交換》
次にＢＰＵが複数の場合、あるいは挿入したＢＰＵが正
しく動作しなかった場合の対応について説明する。図１
６の本実施例では、ＢＰＵが複数装着されている。それ
ぞれのＢＰＵは交換されるべきＢＰＵを指定する手段と
して、ボード取外し要求ボタン２８１と、プリント板番
号２８２を具備している。<< Replacement when there are a plurality of BPU printed boards >> Next, the handling of multiple BPUs, and of an inserted BPU that does not operate correctly, is described. In the embodiment of FIG. 16, a plurality of BPUs are mounted. As means for designating the BPU to be replaced, each BPU has a board removal request button 281 and a printed board number 282.

【００８０】システムバス１にプリント板を接続するた
めの、スロットＳＬ１からＳＬ３にはＢＰＵ２Ａ，２
Ｂ，２Ｃがそれぞれ装着されている。スロットＳＬ４に
は主記憶装置が接続されている。スロットＳＬ５は空き
スロットである。また、各ＢＰＵは、ＢＰＵが停止した
ときに点灯する表示ランプ２８０と、取り外すべきＢＰ
Ｕを指定するために用いるプリント板取外し要求ボタン
２８１と、プリント板番号２８２を有する。ここで、プ
リント板番号はＢＰＵ２Ａが１、ＢＰＵ２Ｂが２、ＢＰ
Ｕ２Ｃが３と約束されている。今、新ＢＰＵ２Ｄをスロ
ットＳＬ２に装着されている旧ＢＰＵ２Ｂと交換する場
合には、まず、新ＢＰＵ２Ｄを空きスロットであるスロ
ットＳＬ５に挿入する。それから、スロットＳＬ１〜Ｓ
Ｌ３に装着されているＢＰＵのうち、交換したいスロッ
トＳＬ２のＢＰＵ２Ｂの取外し要求ボタン２８１を押
す。そうすると、旧ＢＰＵ２Ｂは実行中のタスクと自身
のプリント板番号を主記憶装置３上に退避し、新ＢＰＵ
２Ｄが主記憶装置３上に退避されたプリント板番号を取
り込み、退避中タスクを実行する。旧ＢＰＵ２Ｂは、表
示２８０を点灯し自ら停止する。その後、旧ＢＰＵ２Ｂ
のボード停止ランプ２８０が点灯しているのを確認後、
該ＢＰＵ２Ｂを取り外す。BPUs 2A, 2B, and 2C are mounted in slots SL1 to SL3, which connect printed boards to system bus 1. The main storage device is connected to slot SL4, and slot SL5 is empty. Each BPU has a display lamp 280 that lights when the BPU stops, a printed-board removal request button 281 used to designate the BPU to be removed, and a printed board number 282. Here the printed board numbers are assigned as 1 for BPU 2A, 2 for BPU 2B, and 3 for BPU 2C. To replace old BPU 2B in slot SL2 with new BPU 2D, new BPU 2D is first inserted into the empty slot SL5. Then, among the BPUs in slots SL1 to SL3, removal request button 281 of BPU 2B in slot SL2, the one to be replaced, is pressed. Old BPU 2B saves its running task and its own printed board number to main storage device 3; new BPU 2D takes in the saved printed board number and executes the saved task. Old BPU 2B lights display lamp 280 and stops itself. After confirming that board stop lamp 280 of old BPU 2B is lit, BPU 2B is removed.

【００８１】図１７は、図１６で示した例についてのＢ
ＰＵ交換手順を人による動作と計算機内部の処理に分け
て処理の内容を示したＢＰＵ交換手順処理フローであ
る。FIG. 17 is a BPU replacement procedure flow for the example of FIG. 16, showing the processing divided into human actions and processing inside the computer.

【００８２】ＢＰＵ交換する場合、まず空きスロットを
用意（Ｓｔ１）する。空きスロットは、既に未使用の空
きスロットがあればそれを用いればよく、また空きスロ
ットがない場合も、一時的に取り外し可能なハードウェ
アボードがあれば、そのボードを抜き、一時的に空きス
ロットを作り出し、目的のＢＰＵ交換後に、再び該ボー
ドを戻すことにより空スロットを準備することも可能で
ある。To replace a BPU, an empty slot is first prepared (St1). An already unused slot can be used if there is one; if there is none but some hardware board can be removed temporarily, an empty slot can be prepared by pulling that board out to create a temporary empty slot and putting it back after the BPU has been replaced.

【００８３】次に、空きスロットに新ＢＰＵ２Ｄを挿入
（Ｓｔ２）する。その後、取り外したい旧ＢＰＵ２Ｂの
プリント板取り外し要求ボタンを押す（Ｓｔ３）。する
と、旧ＢＰＵ２Ｂは現在実行中のタスクと自プリント板
番号を主記憶装置３上に退避（Ｓｔ４）し、新ＢＰＵ２
Ｄが該タスクの処理を続行できるようにする。新ＢＰＵ
２Ｄはそれを受けて、該タスクを実行（Ｓｔ５）し、オ
ンライン業務を開始する。旧ＢＰＵ２Ｂは自らＢＰＵ上
の表示ランプを点灯（Ｓｔ６）し、処理を停止（Ｓｔ
７）する。その後、旧ＢＰＵ２Ｂ上の表示ランプが点灯
しているのを確認（Ｓｔ８）後、旧ＢＰＵ２Ｂを取り外
す（Ｓｔ９）。これで、ＢＰＵ交換は完了である。Next, new BPU 2D is inserted into the empty slot (St2). The printed-board removal request button of old BPU 2B, the one to be removed, is then pressed (St3). Old BPU 2B saves the task it is executing and its own printed board number to main storage device 3 (St4) so that new BPU 2D can continue processing the task. New BPU 2D in turn executes the task (St5) and starts online work. Old BPU 2B itself lights the display lamp on the BPU (St6) and stops processing (St7). After confirming that the display lamp on old BPU 2B is lit (St8), old BPU 2B is removed (St9). This completes the BPU replacement.
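What distinguishes the multi-BPU procedure is that the printed board number travels with the task. The Python sketch below illustrates that idea only; the `Board` class, the dictionary keys, and `find_board` are hypothetical names introduced here, and main storage is again reduced to a dictionary.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Board:
    label: str
    board_number: Optional[int] = None   # printed board number 282
    task: Optional[str] = None
    stopped: bool = False                # True once display lamp 280 is lit


def request_removal(old: Board, new: Board, main_storage: Dict[str, Any]) -> None:
    """Pressing removal request button 281 on `old` (St3) triggers St4 to St7."""
    main_storage["task"] = old.task                    # St4: save task ...
    main_storage["board_number"] = old.board_number    # ... and board number
    new.board_number = main_storage["board_number"]    # St5: inherit the number
    new.task = main_storage["task"]                    #      and resume the task
    old.task = None
    old.board_number = None
    old.stopped = True                                 # St6, St7: lamp on, stop


def find_board(boards: List[Board], number: int) -> Board:
    """Resolve a board number the way a user program would address a BPU."""
    return next(b for b in boards if b.board_number == number and not b.stopped)
```

After `request_removal`, a lookup of board number 2 resolves to the new board, which is why a user program that names a board number needs no change.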

【００８４】図１８は、上記実施例における、旧ＢＰＵ
上で実行中のタスクとプリント板番号を新ＢＰＵに引継
ぐ手段を詳細に説明した図である。システムバスに旧Ｂ
ＰＵが３台（２Ａ，２Ｂ，２Ｃ）、新ＢＰＵ２Ｄ、さら
に主記憶装置が装着されている。旧ＢＰＵ２Ａ，２Ｂ，
２Ｃ上では、夫々タスク１，２，３が実行中である。ま
た、旧ＢＰＵ２Ａ，２Ｂ，２Ｃのプリント板番号２８２
は夫々１,２,３である。その時に、取り外しＢＰＵを指
定するために、旧ＢＰＵ２Ｂのプリント板取り外し要求
ボタンが押されたとすると、旧ＢＰＵ２Ｂは、処理を中
断し、実行中のタスク２と自プリント板番号２８２を主
記憶装置３上に退避する。一方、新ＢＰＵ２Ｄは主記憶
装置３上に退避されたプリント板番号２８２とタスク２
を回復し、中断ポイントからタスクの処理を続行する。
以上の方式を用いて、交換したＢＰＵ間の業務の引き継
ぎを行う。FIG. 18 illustrates in detail the means by which the task running on the old BPU and its printed board number are handed over to the new BPU in the above embodiment. Three old BPUs (2A, 2B, and 2C), new BPU 2D, and the main storage device are mounted on the system bus. Tasks 1, 2, and 3 are running on old BPUs 2A, 2B, and 2C, respectively, whose printed board numbers 282 are 1, 2, and 3, respectively. When the printed-board removal request button of old BPU 2B is pressed to designate it for removal, old BPU 2B suspends processing and saves the running task 2 and its own printed board number 282 to main storage device 3. New BPU 2D restores the saved printed board number 282 and task 2 from main storage device 3 and resumes the task from the point of interruption. Work is handed over between the exchanged BPUs in this way.

【００８５】本実施例によれば、交換されるべきＢＰＵ
を指定する手段であるプリント板取外し要求ボタンを設
けることにより、ＢＰＵが複数装着されている場合で
も、システムを停止することなく、さらにはシステム性
能を低下させることなくＢＰＵを交換できるという長所
がある。According to this embodiment, providing the printed-board removal request button as means for designating the BPU to be replaced has the advantage that, even when a plurality of BPUs are mounted, a BPU can be replaced without stopping the system and without degrading system performance.

【００８６】In addition, because the printed-board number assigned to the BPU being replaced is handed over between the exchanged BPUs, the BPU can be replaced without changing the user program even when the user program designates the printed-board number on which it runs.

【００８７】《When the Inserted BPU Does Not Operate Correctly》 The scheme above has the drawback that the system is seriously affected if the newly installed BPU should fail to operate normally. The arrangement of FIGS. 19 and 20 has means for checking the operation of the inserted BPU, so that the system is not affected even if the newly inserted BPU fails to operate normally.

【００８８】FIG. 19 shows the state in which the new BPU 2B has been inserted. At this time a task is running on the old BPU 2A. When the new BPU 2B is inserted, it runs the BPU self-diagnosis program 925 to check its own operation. The old BPU 2A is not notified of the board insertion until the diagnosis program has ended normally. If the diagnosis program 925 finds a fault in the new BPU, the new BPU 2B does not notify the old BPU; it lights its own display lamp 280 and stops processing. The old BPU continues processing task 1 as if nothing had happened, without interrupting it when the new BPU is inserted.

【００８９】FIG. 20 is a flow of the BPU exchange procedure in the above embodiment, with the steps divided into operator actions and processing inside the computer. Steps St1, St2, St4 to St8, and St11 to St13 are exactly the same as in FIG. 21, so their description is omitted here and only the processing specific to this embodiment is described.

【００９０】When the new BPU is inserted, it first executes a diagnosis program to check its own operation (St3). If the diagnosis program judges the BPU to be normal, processing proceeds to step St4 as in the previous embodiment. If it judges the BPU to be faulty, the display lamp on the inserted new BPU is lit (St9) and the new BPU stops processing (St10). The operator then confirms that the display lamp on the new BPU is lit (St14) and removes the new BPU again (St15). The BPU exchange has failed in this case, but the online system is unaffected because the old BPU continues processing. Whether the exchange succeeded is judged by which display lamp, that of the new BPU or that of the old BPU, lights after the BPU is inserted.
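The insertion check can be sketched as a small decision routine. This is an illustrative sketch only: the function names and the idea of returning the steps as strings are assumptions made for the example, while the step labels (St3, St4, St9, St10, St14, St15) are taken from the description of FIG. 20.

```python
from typing import Callable, List


def insert_new_bpu(self_diagnosis: Callable[[], bool]) -> List[str]:
    """Return the steps taken after a new BPU is inserted."""
    steps = ["St3: run self-diagnosis on new BPU"]
    if self_diagnosis():
        # Healthy board: continue with the normal exchange procedure.
        steps.append("St4: continue normal exchange procedure")
    else:
        # Faulty board: the old BPU is never notified, so its task is not interrupted.
        steps.append("St9: light display lamp on new BPU")
        steps.append("St10: stop new BPU")
        steps.append("St14: operator confirms lamp on new BPU")
        steps.append("St15: operator removes new BPU")
    return steps


def exchange_succeeded(old_lamp_on: bool, new_lamp_on: bool) -> bool:
    """Success if the old BPU's lamp lit; failure if the new BPU's lamp lit."""
    if old_lamp_on == new_lamp_on:
        raise ValueError("exactly one lamp is expected to be lit")
    return old_lamp_on
```

The key design point is that the faulty branch contains no step that touches the old BPU, which is why a failed insertion leaves the online system undisturbed.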

【００９１】As described above, the scheme of this embodiment makes it possible to eliminate any effect on the online system even when the inserted BPU does not operate normally.

【００９２】《Configuration and Processing Before and After an Abnormality》 FIG. 21 shows, in time series, the processing and configuration of the MPUs in the old BPU 2A and the new BPU 2B described above. During normal operation the three MPUs of BPU 2A are running and their majority-vote result is output. If a failure occurs in MPUC during execution of process B, MPUC is disconnected and operation continues normally in a duplexed configuration of MPUA and MPUB. When, in response to the MPUA abnormality notification, the printed board of the new BPU 2B is inserted into an empty slot, each MPU in the new BPU 2B performs self-diagnosis. At an appropriate time, processing is transferred from the old BPU 2A to the new BPU 2B, and process D is executed according to the majority-vote result of the three MPUs of BPU 2B (MPUD, MPUE, MPUF).

For this handover, the faulty BPU continues to operate until a convenient break point or the time of repair and maintenance, and the processing it was executing is then handed over to another normal BPU. In practice the handover can be performed at whatever point is most desirable for performance from the software's point of view. Task-switch timing is clearly suitable in general, because the BPU can then be switched by exactly the same procedure as a processor switch in a multiprocessor system, and the extra performance overhead of the handover can be reduced to zero. According to the present invention, therefore, no backup operation is needed in preparation for a checkpoint restart when a fault occurs, and processing performance can be improved.
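The degradation from three-way majority voting to a two-MPU comparison can be sketched as below. This is a software model written for illustration under stated assumptions: MPU outputs are modelled as integers, and the function name `vote` and its return convention are hypothetical. In the patent the comparison is done by hardware check circuits.

```python
from collections import Counter
from typing import Dict, Optional, Set, Tuple


def vote(outputs: Dict[str, int], excluded: Set[str]) -> Tuple[Optional[int], Set[str]]:
    """Vote over the outputs of the MPUs that have not been disconnected.

    Returns (agreed value or None, updated set of disconnected MPUs).
    """
    active = {name: value for name, value in outputs.items() if name not in excluded}
    if not active:
        return None, excluded
    value, count = Counter(active.values()).most_common(1)[0]
    if len(active) >= 3 and count * 2 > len(active):
        # Triple mode: a strict majority exists, so any dissenting MPU is cut off.
        dissenters = {name for name, v in active.items() if v != value}
        return value, excluded | dissenters
    if len(active) == 2 and count == 2:
        # Duplex mode: the two remaining MPUs must agree.
        return value, excluded
    # No usable agreement: the mismatch is detected but cannot be masked.
    return None, excluded


# Process B: MPUC produces a wrong value and is disconnected.
value, excluded = vote({"MPUA": 7, "MPUB": 7, "MPUC": 9}, set())
# Later cycles run on MPUA and MPUB only; MPUC's output is ignored.
value2, excluded2 = vote({"MPUA": 5, "MPUB": 5, "MPUC": 0}, excluded)
```

After the first fault the BPU can still detect a second one (the duplex comparison fails) but can no longer mask it, which is why the text hands the work over to a fresh BPU at the next convenient point.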

【００９３】When a fault occurs, the hardware records the fault status in a register. The operating system consults this register at a context switch or during interrupt processing for repair and maintenance. If the processing needs to be handed over, the operating system notifies the takeover-destination BPU, for example by an interrupt, and ends processing on its own BPU. When a failure occurs in some of the elements that make up the BPU 2 (MPU, cache memory, and so on), this scheme stops using the entire BPU 2 after the handover, including the other elements even if they are normal.
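The division of labour between hardware (latch the fault) and operating system (act on it at a task boundary) can be sketched as follows. The class `Bpu`, its fields, and the string used to represent the interrupt are hypothetical and exist only to make the sequence concrete.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bpu:
    name: str
    fault_register: int = 0  # nonzero once hardware has recorded a fault
    in_service: bool = True
    run_queue: List[str] = field(default_factory=list)
    interrupts: List[str] = field(default_factory=list)

    def record_fault(self, bit: int) -> None:
        """Hardware side: latch the fault; the BPU keeps running meanwhile."""
        self.fault_register |= 1 << bit

    def on_context_switch(self, successor: "Bpu") -> None:
        """OS side: consult the register only at a task boundary."""
        if self.fault_register == 0:
            return
        # Hand the work over and notify the takeover-destination BPU.
        successor.run_queue.extend(self.run_queue)
        successor.interrupts.append(f"takeover from {self.name}")
        self.run_queue.clear()
        # The whole BPU is retired, including its still-healthy elements.
        self.in_service = False
```

Checking the register only at a context switch means no state beyond the normal task context has to be saved, which is the source of the "zero extra overhead" claim in the preceding paragraph.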

【００９４】FIG. 22 schematically shows how the configuration at takeover differs between the scheme of the present invention and the known example when MPUA, MPUB, and MPUC, which are made redundant for fault tolerance, suffer a failure. The conventional method replaces only the failed MPUA with a normal MPUD. In the method of the present invention, not only the failed MPUA but also the normal MPUB and MPUC are replaced with new MPUD, MPUE, and MPUF. In this way the combination of MPUs made redundant for fault tolerance, that is, the combination of MPUA, MPUB, and MPUC, can be fixed. If the combination of MPUs is made the unit of exchange, the MPUs within each combination can be coupled by a high-speed clock, and a high-speed fault-tolerant computer can be realized. In addition, the various hardware and software that conventionally accompany regrouping of MPUs are unnecessary.

【００９５】Because the BPU can continue operating after a single failure, this handover need not be performed immediately after the failure occurs. It can be performed at a convenient break point in processing or at the time of repair and maintenance.

【００９６】According to this embodiment, the wiring board of the failed BPU 20-1 can be pulled out and replaced with a normal wiring board while processing continues.

【００９７】VII. Alternatives and Modifications of the Circuits

The present invention has been described above, but the circuits of its individual parts can be modified as appropriate. These alternatives and modifications are described below.

【００９８】《Majority Logic Unit》 FIG. 23 shows how the majority logic circuit section of FIG. 2 is organized and switched, simplified by omitting the other components for ease of understanding. MPUA and MPUC are fixed as output MPUs, and MPUB is used only as a reference for checking the soundness of MPUA and MPUC. When MPUA or MPUC is abnormal, the output of whichever one has been confirmed sound is supplied in common to both cache memories. In this scheme the MPU output is input directly to the cache memory without passing through a majority circuit, so the cache memory access time is shortened by the delay of the majority circuit.
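The selection rule of FIG. 23 can be sketched as below. This is a software model for illustration only: outputs are modelled as integers, the function name `route_to_caches` is hypothetical, and the pairing of MPUA with cache memory 220 and MPUC with cache memory 221 is an assumption made for the example.

```python
from typing import Dict, Optional


def route_to_caches(a: int, b: int, c: int) -> Dict[str, Optional[int]]:
    """Choose which output MPU drives each of the two cache memories.

    a, c: outputs of the output MPUs (MPUA, MPUC).
    b: output of the reference MPU (MPUB), used for comparison only.
    """
    a_ok = a == b or a == c  # MPUA agrees with at least one other MPU
    c_ok = c == b or c == a  # MPUC agrees with at least one other MPU
    if a_ok and c_ok:
        return {"cache220": a, "cache221": c}
    if a_ok:
        return {"cache220": a, "cache221": a}  # MPUC faulty: share MPUA's output
    if c_ok:
        return {"cache220": c, "cache221": c}  # MPUA faulty: share MPUC's output
    return {"cache220": None, "cache221": None}  # no sound output MPU remains
```

Note that MPUB's value is never routed to a cache. It only takes part in the comparisons, so in the normal case the data path from an output MPU to its cache contains no voter.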

【００９９】As described above, the present invention uses majority logic to switch from a triple system to a dual system and continue operation, and various modifications other than this scheme are possible. In FIG. 25, for example, the outputs of the three MPUs are supplied to each of the majority selection circuits 210 and 211, and each circuit selects one output, confirmed to be sound, from among the three MPUs. In this case the data in the cache memory connected to a failed majority selection circuit is destroyed, but operation can continue using the data in the cache memory connected to the normal majority selection circuit.

【０１００】Alternatively, as shown in FIG. 24, the MPU outputs may be input directly to the cache memories without passing through gate circuits, switching circuits, or the like, and the cache memory that receives signals from an MPU that has become abnormal may be stopped and its data not used thereafter. This further shortens the cache memory access time by the delay of the gate circuits and switching circuits. Moreover, switching means for the address bus and data bus, which consist of many signal lines, become unnecessary, so the amount of hardware can be reduced.

【０１０１】FIG. 26 has four MPUs. MPUA and MPUC are fixed as output MPUs, and MPUB and MPUD are used as their respective references. The output of each output MPU is supplied when the outputs of its pair match. When an MPU is abnormal, this can be handled either by switching to the sound side or by stopping the cache memory that receives signals from the abnormal MPU and not using its data thereafter.

【０１０２】《Read Access to Cache Data》 Turning to the cache memories, the outputs (data) of the cache memories 220 and 221 can be judged normal or abnormal by a parity check. As shown in FIG. 27, the output of the cache memory judged normal by the parity check 250 is input to MPUA, MPUB, and MPUC through the switching means 206. When both cache memories are normal, a primary and a secondary cache memory are designated in advance and the output of the primary is selected.
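The parity-based selection between the two cache memories can be sketched as follows. The choice of even parity, the `(data, parity)` tuple representation, and the function names are assumptions made for this example; the text only states that a parity check decides which cache output is normal and that the primary is preferred when both are.

```python
from typing import Optional, Tuple


def even_parity(word: int) -> int:
    """Parity bit that makes the total number of 1 bits even."""
    return bin(word).count("1") % 2


def select_cache_output(
    primary: Tuple[int, int], secondary: Tuple[int, int]
) -> Optional[int]:
    """Each argument is (data word, stored parity bit) read from one cache memory.

    The primary cache is tried first, so it wins when both copies are sound.
    """
    for data, parity in (primary, secondary):
        if even_parity(data) == parity:
            return data
    return None  # both copies failed the parity check
```

For example, if the primary copy of `0b1011` has one bit flipped to `0b1010` while its stored parity bit is still 1, the check fails and the secondary copy is used instead.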

【０１０３】Alternatively, as shown in FIG. 28, the caches to which MPUA and MPUB connect may be fixed to the cache memories 220 and 221, respectively, and the output of the selected cache memory may be input only to MPUB. In this case, even if one of the cache memories fails, two of the three MPUs can still operate normally, and the amount of hardware can be reduced.

【０１０４】[0104]

[Effects of the Invention] In the present invention, a processor system composed of a plurality of processors is mounted on a single hardware board, the hardware board itself is given a fault-tolerance function, and the processor board as a whole is replaced. The hardware and software otherwise needed for regrouping processors are therefore unnecessary, and the system configuration also takes recovery into account.

【０１０５】In the present invention, the plurality of processors on a single hardware board are clock-synchronized, so the wiring can be short and high-speed synchronization can be achieved.

【０１０６】In the present invention, when a failure occurs inside a hardware board, processing is switched instantly from that board to another hardware board, so the system does not stop.

【０１０７】In the present invention, when a failure occurs inside a hardware board, the failed processor is stopped and operation continues on the remaining processors, so the hardware board can be replaced without lowering system performance.

[Brief description of the drawings]

FIG. 1 is a diagram showing the overall system configuration of the present invention.

FIG. 2 is a diagram showing the configuration of the BPU of the present invention.

FIG. 3 is a diagram showing an embodiment of the MPU output check circuit.

FIG. 4 is a diagram showing the configuration of the BPU when an abnormality occurs in a write access.

FIG. 5 is a diagram showing the configuration of the BPU when an abnormality occurs in a read access.

FIG. 6 is a bus cycle control flow diagram.

FIG. 7 is a diagram showing the signal flow in the BPU when the MPUs are normal.

FIG. 8 is a diagram showing the signal flow in the BPU when an MPU is abnormal.

FIG. 9 is a diagram showing the signal flow in the BPU when the MPUs are normal.

FIG. 10 is a diagram showing the signal flow in the BPU when an address signal is abnormal.

FIG. 11 is a diagram showing the signal flow in the BPU when a data signal is abnormal.

FIG. 12 is a diagram showing the configuration of the computer cabinet.

FIG. 13 is a diagram explaining the principle of BPU exchange.

FIG. 14 is a diagram showing the BPU exchange procedure.

FIG. 15 is a diagram showing the handover of processing between the old and new BPUs.

FIG. 16 is a diagram explaining the principle of BPU exchange in a multiprocessor configuration.

FIG. 17 is a diagram showing the BPU exchange procedure in a multiprocessor configuration.

FIG. 18 is a diagram showing the handover of processing between the old and new BPUs in a multiprocessor configuration.

FIG. 19 is a diagram showing the BPU exchange processing when the inserted BPU is faulty.

FIG. 20 is a flow diagram of the BPU exchange processing when the inserted BPU is faulty.

FIG. 21 is a diagram showing the handover of processing when a BPU fails.

FIG. 22 is a diagram showing the handover of processing when a BPU fails.

FIG. 23 is a diagram showing an embodiment of comparison and collation by three MPUs.

FIG. 24 is a diagram showing another embodiment of comparison and collation by three MPUs.

FIG. 25 is a diagram showing another embodiment of the majority-vote scheme.

FIG. 26 is a diagram showing an embodiment of comparison and collation by four MPUs.

FIG. 27 is a diagram showing read access to cache data.

FIG. 28 is a diagram showing another embodiment of read access to cache data.

[Explanation of symbols]

1: system bus; 2: BPU; 10, 11, 12, 13, 14, 15: parity generation/check circuits; 20: MPU; 23: MPU output check circuit; 27: BIU (bus interface unit); 30, 31: parity check circuits; 200 to 205, 26-1, 26-2, 29: three-state buffers; 220, 221: cache memories; 234, 235: error check circuits.

Continued from the front page

(72) Inventor: Yoshiki Kobayashi, c/o Hitachi Research Laboratory, Hitachi, Ltd., 4026 Kuji-cho, Hitachi-shi, Ibaraki
(72) Inventor: Takeshi Miyao, c/o Omika Works, Hitachi, Ltd., 5-2-1 Omika-cho, Hitachi-shi, Ibaraki
(72) Inventor: Manabu Araoka, c/o Omika Works, Hitachi, Ltd., 5-2-1 Omika-cho, Hitachi-shi, Ibaraki
(72) Inventor: Tomoaki Nakamura, c/o Omika Works, Hitachi, Ltd., 5-2-1 Omika-cho, Hitachi-shi, Ibaraki
(72) Inventor: Masayuki Tanji, c/o Omika Works, Hitachi, Ltd., 5-2-1 Omika-cho, Hitachi-shi, Ibaraki
(72) Inventor: Shigenori Kaneko, c/o Omika Works, Hitachi, Ltd., 5-2-1 Omika-cho, Hitachi-shi, Ibaraki
(72) Inventor: Koji Masui, c/o Omika Works, Hitachi, Ltd., 5-2-1 Omika-cho, Hitachi-shi, Ibaraki
(72) Inventor: Saburo Iijima, c/o Hitachi Process Computer Engineering Co., Ltd., 5-2-1 Omika-cho, Hitachi-shi, Ibaraki

(56) References
JP-A-2-202636 (JP, A)
Yasuhiko Kawamoto and 4 others, "V60/V70 Microprocessor and High Reliability System", Transactions of the Information Processing Society of Japan, Information Processing Society of Japan, January 1989, Vol. 30, No. 1, pp. 58-71
Takashi Kojo and 1 other, "General-purpose Microprocessor Chip", Journal of the Institute of Electronics, Information and Communication Engineers, Institute of Electronics, Information and Communication Engineers, November 1990, Vol. 73, No. 11, pp. 1222-1227
Yasuhiko Kawamoto and 4 others, "V60/V70 Microprocessor System and High Reliability System", Transactions of the Information Processing Society of Japan,

(58) Fields searched (Int. Cl.7, DB name): G06F 11/16 - 11/20; G06F 15/16 - 15/177

Claims

(57) [Claims]

1. A basic processing unit in which at least three processor units that execute the same operation, a check circuit that checks soundness by comparing the output of each processor unit with the output of another processor unit, a plurality of interface units each of which outputs a sound output as an external output and takes in external inputs, a plurality of cache memories that store information required for computation in the processor units, and internal buses interconnecting them are mounted on a single processor board, wherein the processor units are of two kinds, namely output processor units, whose outputs are used to check the soundness of other processor units and are also output to the outside, and a reference processor unit, whose output is used only to check the soundness of other processor units and is not output to the outside, and wherein each interface unit outputs to the outside the output of a specific output processor unit, different for each interface unit, when the soundness of that output has been confirmed, and outputs to the outside the output of another output processor unit whose soundness has been confirmed when the soundness of the output of that specific output processor unit has not been confirmed.

2. The basic processing unit according to claim 1, comprising a selection circuit that selects, in place of the output of an output processor unit whose soundness has not been confirmed, the output of another output processor unit whose soundness has been confirmed.

3. The basic processing unit according to claim 1 or 2, comprising: disconnecting means for disconnecting an output processor unit whose soundness has not been confirmed; and switching means for switching so that the cache memory that was supplied with the output of that processor unit is supplied with the output of another output processor unit whose soundness has been confirmed.

4. The basic processing unit according to claim 1, comprising: stopping means for stopping the output of an abnormal cache memory; and switching means for switching so that a processor unit that receives information from the abnormal cache memory receives information from another, normal cache memory.

5. The basic processing unit according to claim 1, comprising: disconnecting means for disconnecting an abnormal interface unit; and switching means for switching so that a processor unit connected to the internal bus on the side of the abnormal interface unit is connected to the internal bus on the side of another, normal interface unit.

6. A highly reliable computer system comprising: a basic processing unit in which at least three processor units that execute the same operation, a check circuit that checks soundness by comparing the output of each processor unit with the output of another processor unit, a plurality of interface units each of which outputs a sound output as an external output and takes in external inputs, a plurality of cache memories that store information required for computation in the processor units, and internal buses interconnecting them are mounted on a single processor board; a plurality of system buses respectively connected to the plurality of interface units; and a main storage device connected to the system buses, wherein the processor units of the basic processing unit are of two kinds, namely output processor units, whose outputs are used to check the soundness of other processor units and are also output to the outside, and a reference processor unit, whose output is used only to check the soundness of other processor units and is not output to the outside, and wherein each interface unit outputs to the outside the output of a specific output processor unit when the soundness of that output has been confirmed, and outputs to the outside the output of another output processor unit whose soundness has been confirmed when the soundness of the output of that specific output processor unit has not been confirmed.

7. The highly reliable computer system according to claim 6, wherein the basic processing unit has a selection circuit that selects, in place of the output of an output processor unit whose soundness has not been confirmed, the output of another output processor unit whose soundness has been confirmed.

8. The highly reliable computer system according to claim 6 or 7, wherein the basic processing unit comprises: disconnecting means for disconnecting an output processor unit whose soundness has not been confirmed; and switching means for switching so that the cache memory that was supplied with the output of that processor unit is supplied with the output of another output processor unit whose soundness has been confirmed.

9. The highly reliable computer system according to any one of claims 6 to 8, wherein the basic processing unit comprises: stopping means for stopping the output of an abnormal cache memory; and switching means for switching so that a processor unit that receives information from the abnormal cache memory receives information from another, normal cache memory.

10. The highly reliable computer system according to claim 6, wherein the basic processing unit comprises: disconnecting means for disconnecting an abnormal interface unit; and switching means for switching so that a processor unit connected to the internal bus on the side of the abnormal interface unit is connected to the internal bus on the side of another, normal interface unit.