JPH05224970A

JPH05224970A - Error detection system

Info

Publication number: JPH05224970A
Application number: JP4025624A
Authority: JP
Inventors: Satoshi Hashimoto; 智橋本
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1992-02-13
Filing date: 1992-02-13
Publication date: 1993-09-03

Abstract

PURPOSE: To improve the efficiency of fault recovery by identifying the position of an error when it is detected. CONSTITUTION: Registers 107-1 to 107-4 hold the results of a bit-by-bit comparison between the output 101 of one data processor and the output 102 of the other in a duplex system made up of data processors 11 and 12. The CPU of each processor 11, 12 reads the information held in those registers, so the bit position of an error can be detected easily. This makes it easy to determine the type of error, the specific faulty part of the hardware, and so on, and simplifies the fault recovery operation.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an error detection system for a duplex system of data processing devices.

[0002]

[Description of the Related Art] An error detection mechanism in a conventional duplex system will be described using FIG. 6 as an example. Here, to distinguish the two systems within the duplex system, one is called system A and the other system B.

[0003] In FIG. 6, comparators 203-1 to 203-4 compare the values of a 4-bit A-system signal 201 and a 4-bit B-system signal 202. Each of the comparators 203-1 to 203-4 generates one of the comparison result signals C0 to C3, which is logic "0" when its two input signals are the same and logic "1" when they differ. The comparison result signals C0 to C3 are combined into a single detection signal 206 by an error detection circuit 205 consisting of an OR circuit.

[0004] More specifically, when the A-system signal 201 (a0, a1, a2, a3) is (0, 1, 1, 0) and the B-system signal 202 (b0, b1, b2, b3) is (0, 1, 1, 0), the comparison result signals C0 to C3 output from the comparators 203-1 to 203-4 are "0", "0", "0", "0", and the error detection signal 206 output through the error detection circuit 205 is logic "0". Logic "0" means that systems A and B are operating identically and are regarded as normal.

[0005] When the A-system signal 201 (a0, a1, a2, a3) is (0, 0, 1, 1) and the B-system signal 202 (b0, b1, b2, b3) is (0, 1, 1, 1), the comparison result signals C0 to C3 output from the comparators 203-1 to 203-4 are "0", "1", "0", "0", and the error detection signal 206 output through the error detection circuit 205 is logic "1". Logic "1" means that the operations of systems A and B differ, that is, an error has been detected.
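
The conventional mechanism of FIG. 6 can be sketched in software as follows. This is only an illustrative model, not circuitry from the publication: the function names `compare_bits` and `detect_error` are assumptions introduced here, and the two input pairs are the ones given in paragraphs [0004] and [0005].

```python
from typing import List, Sequence


def compare_bits(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Model one comparator per bit pair: 0 if the bits match, 1 if they differ."""
    if len(a) != len(b):
        raise ValueError("A-system and B-system signals must have the same width")
    return [x ^ y for x, y in zip(a, b)]


def detect_error(c: Sequence[int]) -> int:
    """Model the OR circuit: 1 if any comparison result is 1, else 0."""
    return int(any(c))


# Matching outputs, as in paragraph [0004].
c_ok = compare_bits((0, 1, 1, 0), (0, 1, 1, 0))
# Outputs that differ in bit 1, as in paragraph [0005].
c_bad = compare_bits((0, 0, 1, 1), (0, 1, 1, 1))

print(c_ok, detect_error(c_ok))
print(c_bad, detect_error(c_bad))
```

Note that only the single value returned by `detect_error` leaves the conventional circuit, so the position of the differing bit is lost at that point.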

[0006] In the error detection mechanism of the conventional duplex system configured in this way, the fact that an error has occurred can be reported to the CPU or to the outside as an error detection signal, but there is not enough information to determine where the error occurred. The error position therefore has to be found by applying several test patterns to the system and verifying the circuit, which is a problem.

[0007]

[Problems to be Solved by the Invention] Conventionally, detecting the position of an error required a separate test using externally supplied test data, with the drawback that fault recovery took time.

[0008] The present invention has been made in view of these points, and its object is to provide an error detection system that can determine the position at which an error occurred at the time the error is detected, so that fault recovery can be carried out easily.

[0009]

[Means for Solving the Problems and Operation] The error detection system according to the present invention is, in a duplex system having first and second data processing devices that execute the same processing in parallel, characterized by comprising: comparison means for comparing, bit by bit, a multi-bit output from the first data processing device with the corresponding multi-bit output of the second data processing device; holding means for holding the comparison result of the comparison means for each bit; and means for detecting the bit position at which an error occurred, based on the comparison results held in the holding means, when the comparison means detects a mismatch between the outputs of the first and second data processing devices.

[0010] In this error detection system, the comparison result for each bit is held by the holding means, so the error bit position can be detected easily by reading the held information. The error bit position is usually a useful guide for determining what kind of malfunction occurred or which part of the hardware has failed. Detecting the error bit position therefore makes the fault recovery operation easier and simplifies the selection of the hardware to be replaced.

[0011]

[Embodiments] Embodiments of the present invention will be described below with reference to the drawings.

[0012] FIG. 1 shows the error detection mechanism of a duplex system according to one embodiment of the present invention. The first and second data processing devices 11 and 12 constitute a duplex system (dual system), and each has an identical CPU, storage device, and various other peripheral circuits. The CPUs of the first and second data processing devices 11 and 12 execute the same processing in parallel by running the same program.

[0013] The A-system signal 101 (a0, a1, a2, a3), which is the output of the first data processing device (A) 11, and the B-system signal 102 (b0, b1, b2, b3), which is the output of the second data processing device (B) 12, are input to four comparators 103-1 to 103-4. That is, each of the comparators 103-1 to 103-4 is connected to the signals of the corresponding bits of the A-system signal 101 and the B-system signal 102 (a0 and b0, a1 and b1, a2 and b2, a3 and b3).

[0014] Each of the comparators 103-1 to 103-4 compares its two input signals and outputs logic "0" if they are the same and logic "1" if they differ, as the comparison result signals C0 to C3. Each of these comparators 103-1 to 103-4 can easily be implemented with an exclusive-OR circuit (EXOR).

[0015] The error detection circuit 105 consists of an OR circuit. It outputs an error detection signal (E) of logic "1" when at least one of the comparison result signals C0 to C3 output from the four comparators 103-1 to 103-4 is logic "1", and an error detection signal (E) of logic "0" when all of the comparison result signals C0 to C3 are logic "0".

[0016] The error detection signal (E) of logic "1" is a signal reporting that an error has been detected, and it is supplied to the interrupt input (INT) of the CPU of each of the data processing devices 11 and 12. When the error detection signal (E) becomes logic "1", an interrupt is raised in the CPU and the error processing program is executed. The error processing program stops service to users, and each CPU diagnoses the system error.

[0017] If the diagnosis finds the device normal, so that the failure was only temporary, the erroneous portion is restored and service to users resumes. If an abnormality is found, the data processing device in which the fault occurred is disconnected from the system, and the system is operated using only the normal data processing device. In that case, recovery processing is performed on the faulty data processing device, for example replacement of the hardware that caused the error.

[0018] The four registers 107-1 to 107-4 store the results of the comparison by the comparators 103-1 to 103-4, that is, the comparison result signals C0 to C3, respectively. Each of the registers 107-1 to 107-4 consists of a D flip-flop and stores the value of the comparison result signal at the rising edge of the clock signal. The value stored in each of the registers 107-1 to 107-4 is erased by a clear signal.
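
As a rough illustration of the register behavior just described, the following sketch models a single one-bit register that captures its input only on a rising clock edge and can be erased by a clear operation. The class name `DRegister` and its methods are assumptions made for this example; the publication specifies only a D flip-flop with a clock and a clear signal.

```python
class DRegister:
    """Hypothetical software model of one D flip-flop with a clear input."""

    def __init__(self) -> None:
        self.q = 0            # stored value (register output e)
        self._prev_clk = 0    # previous clock level, used to find rising edges

    def tick(self, d: int, clk: int) -> int:
        """Present data d and clock level clk; store d only on a 0 -> 1 edge."""
        if self._prev_clk == 0 and clk == 1:
            self.q = d & 1
        self._prev_clk = clk
        return self.q

    def clear(self) -> None:
        """Model the clear signal: erase the stored value."""
        self.q = 0


reg = DRegister()
reg.tick(d=1, clk=0)   # clock low: nothing is stored
low = reg.q
reg.tick(d=1, clk=1)   # rising edge: the comparison result 1 is stored
stored = reg.q
reg.tick(d=0, clk=1)   # clock still high, no new edge: value is kept
kept = reg.q
reg.clear()
cleared = reg.q
print(low, stored, kept, cleared)
```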

[0019] The outputs e0 to e3 of the registers 107-1 to 107-4 appear as part of memory to the CPU of each of the data processing devices 11 and 12; they are read by the error processing program or provided as detailed error detection information to a service processor that performs system diagnosis. Next, the operation will be described using a concrete example.

[0020] First, the A-system signal 101 (a0, a1, a2, a3) is set to (1, 1, 0, 1) and the B-system signal 102 (b0, b1, b2, b3) is set to (1, 1, 0, 1).

[0021] The A-system signal 101 and the B-system signal 102 are compared by the comparators 103-1 to 103-4, which determines the values of the comparison result signals C0 to C3. In this case, every bit pair (a0 and b0, a1 and b1, a2 and b2, a3 and b3) carries the same value, so C0, C1, C2, C3 are "0", "0", "0", "0".

[0022] The error detection circuit 105 takes the logical OR of the comparison result signals C0 to C3. Since the comparison result signals C0 to C3 are all "0", the error detection signal (E) is "0", indicating that no error has been detected.

[0023] The registers 107-1 to 107-4 capture and store the comparison result signals C0 to C3 at the rising edge of the clock signal. Here, the comparison result signals C0 to C3 are all "0", so "0" is stored in each of the four registers 107-1 to 107-4. The register outputs e0, e1, e2, e3 are therefore "0", "0", "0", "0".

[0024] Next, assume that the A-system signal 101 (a0, a1, a2, a3) is set to (1, 1, 1, 0) and the B-system signal 102 (b0, b1, b2, b3) is set to (1, 1, 1, 1).

[0025] In this case, a difference is detected between a3 and b3, so the comparison result signals C0, C1, C2, C3 are "0", "0", "0", "1". The error detection circuit 105 takes the logical OR of the values of the comparison result signals C0, C1, C2, C3. Since the comparison result signal C3 for a3 and b3 is "1", the error detection signal (E) becomes "1"; that is, an error has been detected. To report the occurrence of the error, this error detection signal (E) is passed to the interrupt input (INT) of each CPU of the duplicated data processing devices 11 and 12.

[0026] The registers 107-1 to 107-4 capture and store the comparison result signals C0, C1, C2, C3, respectively, at the rising edge of the clock signal. Here, the comparison result signals C0, C1, C2, C3 are "0", "0", "0", "1", so "0", "0", "0", "1" are stored in the registers 107-1 to 107-4. The outputs e0 to e3 of the registers 107-1 to 107-4 are therefore "0", "0", "0", "1". The error processing in each CPU of the data processing devices 11 and 12 then proceeds as follows.

[0027] That is, as shown in the flowchart of FIG. 2, when an interrupt is caused by the error detection signal (E) of logic "1" (step S11), each CPU of the data processing devices 11 and 12 executes the error processing program and reads the values of the four registers 107-1 to 107-4 to find the bit position of the output in which the error occurred (step S12). By reading the values (e0 to e3) of the four registers 107-1 to 107-4, each CPU learns that the value of e3 is "1". Based on this information, each CPU analyzes the bit position at which the error occurred (step S13) and can thereby determine that an error occurred in the signals a3 and b3.

[0028] In this case, each CPU checks its own device for a fault in the corresponding output (a3 or b3). The CPU in which the fault occurred performs the necessary recovery processing (step S14) and then, if recovery is possible, resets the values stored in the registers 107-1 to 107-4 to (0, 0, 0, 0) with the clear signal and returns to normal operation.
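
The interrupt-driven flow of FIG. 2 (steps S12 to S14) might look roughly like this in software. It is a sketch under stated assumptions: the registers are modeled as a plain Python list, and `handle_error_interrupt`, `faulty_bit_positions`, and the `recover` callback are hypothetical names, since the publication does not describe the error processing program at this level of detail.

```python
from typing import Callable, List


def faulty_bit_positions(e: List[int]) -> List[int]:
    """Step S13: return the indices of the register outputs that hold 1."""
    return [i for i, bit in enumerate(e) if bit == 1]


def handle_error_interrupt(
    registers: List[int],
    recover: Callable[[List[int]], bool],
) -> List[int]:
    """Run when the error detection signal E raises an interrupt (after S11)."""
    e = list(registers)                      # S12: read registers e0 to e3
    positions = faulty_bit_positions(e)      # S13: analyze the error bit positions
    if recover(positions):                   # S14: device-specific recovery (assumed hook)
        registers[:] = [0] * len(registers)  # clear signal: reset to (0, 0, 0, 0)
    return positions


# Register contents from the worked example: only e3 holds 1.
regs = [0, 0, 0, 1]
positions = handle_error_interrupt(regs, recover=lambda pos: True)
print(positions, regs)
```

Here the `recover` callback stands in for whatever diagnosis and repair a real system would perform; it returns `True` when recovery is possible, which is the condition under which the text says the registers are cleared.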

[0029] Thus, in the error detection mechanism of FIG. 1, the comparison result for each bit of the outputs 101 and 102 is held in the registers 107-1 to 107-4, so the error bit position can be detected easily by reading the held information. It therefore becomes easy to determine what kind of malfunction occurred or which part of the hardware has failed, and the fault recovery operation is simplified. Next, a second embodiment of the present invention will be described with reference to FIG. 3.

[0030] As in FIG. 1, the signal pairs (a0 and b0, a1 and b1, a2 and b2, a3 and b3) of the A-system signal 101 (a0, a1, a2, a3) and the B-system signal 102 (b0, b1, b2, b3) are connected to the four comparators 103-1 to 103-4, respectively. Each of the comparators 103-1 to 103-4 compares its two input signals and outputs logic "0" if they are the same and logic "1" if they differ, as the comparison result signals C0 to C3. Each of these comparators 103-1 to 103-4 can easily be implemented with an exclusive-OR circuit (EXOR).

[0031] The registers 108-1 to 108-4 store the comparison result signals C0 to C3 from the comparators 103-1 to 103-4. Here, the current stored state of the registers 108-1 to 108-4 (outputs e0 to e3) is fed back to their inputs, so that a comparison result signal of logic "1" is accumulated and held. In addition, state setting signals S0 to S3 allow "1" to be forcibly set in the registers 108-1 to 108-4. This is done for purposes such as testing the operation of the error processing program.

[0032] As shown in FIG. 4, each of the registers 108-1 to 108-4 consists of a D flip-flop 201 and a 3-input OR circuit 202, and stores the output value of the OR circuit 202 at the rising edge of the clock signal. Therefore, with the state setting signal Sn (n = 0 to 3) at "0", once a comparison result signal Cn (n = 0 to 3) of "1" has been input, the D flip-flop 201 holds "1" even if the comparison result signal Cn input afterwards is "0". The value stored in each of the registers 108-1 to 108-4 is erased by a clear signal supplied to the D flip-flop 201.
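
The accumulating register of FIG. 4 can be modeled as a value that is ORed with its own previous state on every clock. The sketch below is illustrative only: the class name `StickyErrorRegisters` and its method names are assumptions, and each call to `clock` stands for one rising clock edge. The three inputs to the OR are the comparison result Cn, the fed-back output en, and the state setting signal Sn.

```python
from typing import List, Optional, Sequence


class StickyErrorRegisters:
    """Hypothetical model of registers 108-1 to 108-4 (D flip-flop + 3-input OR)."""

    def __init__(self, width: int = 4) -> None:
        self.e = [0] * width  # register outputs e0 to e3

    def clock(self, c: Sequence[int], s: Optional[Sequence[int]] = None) -> List[int]:
        """One rising clock edge: store Cn OR en OR Sn in each register."""
        if s is None:
            s = [0] * len(self.e)  # no forced setting from the CPU
        self.e = [ci | ei | si for ci, ei, si in zip(c, self.e, s)]
        return list(self.e)

    def clear(self) -> None:
        """Clear signal: erase all accumulated error bits."""
        self.e = [0] * len(self.e)


regs = StickyErrorRegisters()
first = regs.clock([0, 0, 0, 1])    # mismatch on bit 3 is latched
second = regs.clock([0, 0, 1, 0])   # bit 3 stays set, bit 2 is added
regs.clear()
forced = regs.clock([0, 0, 0, 0], s=[1, 0, 0, 0])  # S0 forces a 1 for testing
print(first, second, forced)
```

Because a stored 1 can only be removed by `clear`, software may read the registers at any later time without missing a transient mismatch.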

[0033] The outputs e0 to e3 of the registers 108-1 to 108-4 appear as part of memory to the CPU of each of the data processing devices 11 and 12; they are read periodically by an error monitoring program or provided as detailed error detection information to a service processor that performs system diagnosis.

[0034] The error monitoring program checks the duplex system for errors by periodically reading the registers 108-1 to 108-4. If there is an error, it executes the error processing program. The error processing program stops service to users and diagnoses the system error. If the diagnosis finds the device normal, so that the failure was only temporary, the erroneous portion is restored and service to users resumes. If an abnormality is found, the system is restored after the hardware that produced the error has been replaced or isolated. Next, the operation will be described using a concrete example.

[0035] First, the A-system signal 101 (a0, a1, a2, a3) is set to (1, 1, 0, 1) and the B-system signal 102 (b0, b1, b2, b3) is set to (1, 1, 0, 1).

[0036] The A-system signal 101 and the B-system signal 102 are compared by the comparators 103-1 to 103-4, which determines the values of the comparison result signals C0 to C3. In this case, every bit pair (a0 and b0, a1 and b1, a2 and b2, a3 and b3) carries the same value, so C0, C1, C2, C3 are "0", "0", "0", "0".

[0037] The registers 108-1 to 108-4 capture and store the comparison result signals C0 to C3 at the rising edge of the clock signal. Here, the comparison result signals C0 to C3 are all "0", so "0" is stored in each of the four registers 108-1 to 108-4. The register outputs e0, e1, e2, e3 are therefore "0", "0", "0", "0".

[0038] Here, the error monitoring program reads the registers 108-1 to 108-4 and the system state is analyzed on the CPU. Since the registers 108-1 to 108-4 are all "0", the system is recognized as normal and normal operation resumes.

[0039] Next, assume that the A-system signal 101 (a0, a1, a2, a3) is set to (1, 1, 1, 0) and the B-system signal 102 (b0, b1, b2, b3) is set to (1, 1, 1, 1). In this case, a difference is detected between a3 and b3, so the comparison result signals C0, C1, C2, C3 are "0", "0", "0", "1".

[0040] The registers 108-1 to 108-4 capture and store the comparison result signals C0, C1, C2, C3, respectively, at the rising edge of the clock signal. Here, the comparison result signals C0, C1, C2, C3 are "0", "0", "0", "1", so "0", "0", "0", "1" are stored in the registers 108-1 to 108-4. The outputs e0 to e3 of the registers 108-1 to 108-4 are therefore "0", "0", "0", "1".

[0041] Through feedback of its output e, the register 108-4, which stored the error state "1", keeps holding the value "1", indicating that an error occurred, in the following cycle as well.

[0042] Next, suppose that the A-system signal 101 (a0, a1, a2, a3) is set to (0, 0, 1, 1) and the B-system signal 102 (b0, b1, b2, b3) is set to (0, 0, 0, 1). The comparison result signals C0 to C3 are then "0", "0", "1", "0".

[0043] The input to each of the registers 108-1 to 108-4 is the logical OR of the comparison result signals C0 to C3 and the fed-back output signals e0 to e3. Assuming that no state setting is made from the CPU (S0 to S3 = "0"), the value obtained by ORing the comparison result signals C0 to C3 (0, 0, 1, 0) with the fed-back output signals e0 to e3 (0, 0, 0, 1) is selected. The inputs of the registers 108-1 to 108-4 are therefore "0", "0", "1", "1", and these values are stored in the registers 108-1 to 108-4 at the next rising edge of the clock.

[0044] As shown in the flowchart of FIG. 5, each CPU of the data processing devices 11 and 12 uses the error monitoring program to read the registers 108-1 to 108-4 at each checkpoint (steps S21, S22) and checks whether an error has occurred according to whether "1" is held (step S23). If "1" is held in any of the registers 108-1 to 108-4, it judges that an error has occurred and executes the error processing program. If "1" is held in none of the registers 108-1 to 108-4, it judges that no error has occurred and ends the processing. In this example, "1" is held in the registers 108-3 and 108-4, so the error processing program is executed.

[0045] In the error processing, each CPU analyzes the bit positions at which errors occurred, based on the outputs e2 and e3 that held "1" among the registers 108-1 to 108-4 it read (step S24), and thereby determines that errors occurred in a2, b2 and in a3, b3.

[0046] In this case, each CPU checks its own device for a fault in the corresponding outputs (a2 and a3 for the data processing device 11, b2 and b3 for the data processing device 12). The CPU in which the fault occurred performs the necessary recovery processing (step S25) and then, if recovery is possible, resets the values stored in the registers 108-1 to 108-4 to (0, 0, 0, 0) with the clear signal and returns to normal operation.

[0047] With this error detection mechanism, the occurrence of errors is accumulated and stored in the registers 108-1 to 108-4, so the CPU can examine the error information stored in the registers 108-1 to 108-4 at any time. That is, instead of being notified of an error by an interrupt from the error detection mechanism to the CPU as in the first embodiment shown in FIG. 1, the CPU can detect an error through a register read, which is a normal CPU operation.

[0048] Specifically, the processing is divided into blocks, and checkpoints are set between the blocks. When processing reaches a checkpoint, the environment of the block is saved. The registers 108-1 to 108-4 are then examined, and if there is no error, processing proceeds to the next block. If there is an error, the error state held in the registers 108-1 to 108-4 is cleared with the clear signal, processing returns to the previous checkpoint, the environment of that block is restored, and the processing is re-executed, so that a temporary failure can be avoided.
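
One way to picture the checkpoint scheme is the retry loop below. It is a simplified sketch, not the method of the publication: the environment is modeled as a dictionary saved before each block, `run_with_checkpoints`, `read_errors`, and `clear_errors` are hypothetical names, and the `max_retries` limit is an added assumption so that a permanent fault does not loop forever.

```python
import copy
from typing import Callable, Dict, List


def run_with_checkpoints(
    blocks: List[Callable[[Dict], None]],
    state: Dict,
    read_errors: Callable[[], List[int]],
    clear_errors: Callable[[], None],
    max_retries: int = 3,
) -> Dict:
    """Run blocks in order, re-executing a block if error bits were recorded."""
    for block in blocks:
        saved = copy.deepcopy(state)           # save the environment at the checkpoint
        for _ in range(max_retries + 1):
            block(state)
            if not any(read_errors()):         # no error: go on to the next block
                break
            clear_errors()                     # clear the accumulated error state
            state.clear()
            state.update(copy.deepcopy(saved)) # restore the saved environment
        else:
            raise RuntimeError("error persists after retries; not a temporary failure")
    return state


# Simulated error registers and a block that suffers one transient mismatch.
errors = [0, 0, 0, 0]
calls = {"n": 0}


def flaky_block(state: Dict) -> None:
    calls["n"] += 1
    state["total"] = state.get("total", 0) + 10
    if calls["n"] == 1:
        errors[3] = 1                          # transient mismatch on bit 3


def clear_errors() -> None:
    errors[:] = [0, 0, 0, 0]


final = run_with_checkpoints([flaky_block], {"total": 0}, lambda: errors, clear_errors)
print(final, calls["n"])
```

In this run the block executes twice: the first attempt is discarded because bit 3 was recorded, and the second attempt completes from the restored environment.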

[0049] Furthermore, by accumulating and holding the mismatch information from the output comparison in this way, statistics on the error positions can be gathered, for example across multiple checkpoints, and the results can be used as a guide to when hardware should be replaced and to the reliability of the system.
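
Gathering such statistics could be as simple as counting, per bit position, how many checkpoint reads showed an error. The sketch below assumes the register snapshots have already been collected into a list; the function name `tally_error_positions` and the sample data are made up for illustration.

```python
from collections import Counter
from typing import Iterable, Sequence


def tally_error_positions(snapshots: Iterable[Sequence[int]]) -> Counter:
    """Count how often each bit position held 1 across checkpoint snapshots."""
    counts: Counter = Counter()
    for e in snapshots:
        for position, bit in enumerate(e):
            if bit:
                counts[position] += 1
    return counts


# Hypothetical register contents read at four successive checkpoints.
snapshots = [
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
stats = tally_error_positions(snapshots)
print(stats.most_common())
```

A position that dominates the tally, such as bit 3 here, would be a natural first candidate when deciding which hardware to inspect or replace.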

[0050] In the above description, the content of the A-system output 101 and the B-system output 102 was not specified. For example, the A-system output 101 and the B-system output 102 can be divided by function (data, address, and so on), and an error detection mechanism can be provided for each division to perform error detection.
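
A per-function arrangement might be modeled by running the bit comparison separately for each named group of output bits. This is only a sketch: the group names "data" and "address", the 4-bit widths, and the function `detect_by_group` are assumptions chosen for the example.

```python
from typing import Dict, List, Sequence


def detect_by_group(
    a_groups: Dict[str, Sequence[int]],
    b_groups: Dict[str, Sequence[int]],
) -> Dict[str, List[int]]:
    """Compare each functional group bit by bit; report mismatching positions."""
    report: Dict[str, List[int]] = {}
    for name, a_bits in a_groups.items():
        b_bits = b_groups[name]
        if len(a_bits) != len(b_bits):
            raise ValueError(f"group {name!r} has different widths in A and B")
        report[name] = [i for i, (x, y) in enumerate(zip(a_bits, b_bits)) if x != y]
    return report


# Hypothetical outputs of the two systems, split into data and address bits.
a_out = {"data": [1, 0, 1, 1], "address": [0, 1, 1, 0]}
b_out = {"data": [1, 0, 1, 1], "address": [0, 1, 0, 0]}
report = detect_by_group(a_out, b_out)
print(report)
```

Reporting by group narrows a fault to the data path or the address path before the individual bit is even considered.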

[0051] Also, here the CPUs of the data processing devices 11 and 12 themselves read the registers by executing the error processing program, but the registers can also be read using a service processor provided in the duplex system.

[0052]

[Effects of the Invention] As described in detail above, according to the present invention, detailed error position information can easily be obtained for error detection in a duplex system.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of an error detection mechanism according to a first embodiment of the present invention.

FIG. 2 is a flowchart explaining the operation of the first embodiment.

FIG. 3 is a block diagram showing the configuration of an error detection mechanism according to a second embodiment of the present invention.

FIG. 4 is a circuit diagram showing an example of a specific configuration of a register used in the second embodiment.

FIG. 5 is a flowchart explaining the operation of the second embodiment.

FIG. 6 is a block diagram showing a conventional error detection mechanism.

[Explanation of symbols]

101 ... A-system signal, 102 ... B-system signal, 103-1 to 103-4 ... comparators, 105 ... error detection circuit, 107-1 to 107-4, 108-1 to 108-4 ... registers.

Claims

[Claims]

1. An error detection system for a duplex system having first and second data processing devices that execute the same processing in parallel, comprising: comparison means for comparing, bit by bit, an output consisting of a plurality of bits from the first data processing device with the corresponding output consisting of a plurality of bits of the second data processing device; holding means for holding the comparison result of the comparison means for each bit; and means for detecting the bit position at which an error has occurred, based on the comparison results held in the holding means, when the comparison means detects a mismatch between the outputs of the first data processing device and the second data processing device.

2. An error detection system for a duplex system in which a first data processing device and a second data processing device having the same circuits are duplicated, the same program is executed in the first data processing device and the second data processing device, the output of the first data processing device and the output of the second data processing device are compared, and an error is detected from a mismatch, the error detection system comprising: holding means for holding, for each bit, the result of comparing the output of the first data processing device consisting of a plurality of bits with the output of the second data processing device consisting of a plurality of bits; means for interrupting the first and second data processing devices and notifying each of them that an error has occurred when a mismatch occurs between the outputs of the first and second data processing devices; and means, activated in response to the notification, for detecting the bit position at which the error has occurred based on the comparison result information held in the holding means.

3. An error detection system for a duplex system in which a first data processing device and a second data processing device having the same circuits are duplicated, the same program is executed in the first data processing device and the second data processing device, the output of the first data processing device and the output of the second data processing device are compared, and an error is detected from a mismatch, the error detection system comprising: holding means for accumulating and holding, for each bit, mismatch information between the plurality of bits of the output of the first data processing device and the plurality of bits of the output of the second data processing device; and means for periodically reading the mismatch information accumulated and held in the holding means and detecting, based on the mismatch information, whether an error has occurred and the bit position at which it occurred.