JPH0797327B2 - Failure detection method - Google Patents

Failure detection method

Info

Publication number
JPH0797327B2
Authority
JP
Japan
Prior art keywords
standby
main
normality
normality confirmation
confirmation function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP63177419A
Other languages
Japanese (ja)
Other versions
JPH0227442A (en)
Inventor
恵子 赤川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1988-07-15
Filing date
1988-07-15
Publication date
1995-10-18
Application filed by NEC Corp
Priority to JP63177419A
Publication of JPH0227442A
Publication of JPH0797327B2
Anticipated expiration
Expired - Lifetime

Landscapes

  • Hardware Redundancy (AREA)

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to a failure detection method, and more particularly to a method by which the main device of an information processing apparatus detects a failure of the standby backup device.

[Prior Art]

In conventional failure detection methods, for example in the system monitoring device disclosed in Japanese Patent Laid-Open No. 61-169036, an abnormality detection unit is provided in each of the active and standby computers, and switching between the systems is controlled by the abnormal-state signals that these units produce. In a device that connects a plurality of processors via a signal bus to exchange data, for example in the processor state monitoring system for a distributed-control exchange disclosed in Japanese Patent Laid-Open No. 62-243039, a monitoring device is provided and this monitoring device judges whether each processor is abnormal.

[Problems to be Solved by the Invention]

Each of the conventional failure detection methods described above relies on a dedicated monitoring function, such as an abnormality detection unit or a monitoring device. Consequently, when the abnormality detection unit or the monitoring device itself fails, an abnormality in the monitored device can no longer be reported to the outside.

A further problem arises when these conventional methods are applied to an information processing apparatus consisting of a main device and a standby backup device, with the monitoring function that corresponds to the abnormality detection unit or monitoring device implemented in software, so that a normality confirmation program for checking the health of its own system runs on the standby backup device. If the normality confirmation program running on the standby backup device itself runs away, the fault may be overlooked or go undetected.
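The silent-failure problem described above can be illustrated with a small Python sketch. It is not taken from the patent: the class `SelfReportingStandby` and its fields `hardware_ok` and `checker_alive` are hypothetical names, and the model assumes that alarms reach the outside only through the unit's own check routine.

```python
class SelfReportingStandby:
    """Hypothetical standby unit that reports faults only via its own checker."""

    def __init__(self, hardware_ok=True, checker_alive=True):
        self.hardware_ok = hardware_ok      # True if the unit itself is healthy
        self.checker_alive = checker_alive  # False if the check routine hung or ran away

    def poll_alarms(self):
        # A dead or runaway checker never gets as far as raising an alarm.
        if not self.checker_alive:
            return []
        return [] if self.hardware_ok else ["standby fault"]


healthy = SelfReportingStandby(hardware_ok=True, checker_alive=True)
faulty_reported = SelfReportingStandby(hardware_ok=False, checker_alive=True)
faulty_silent = SelfReportingStandby(hardware_ok=False, checker_alive=False)

print(healthy.poll_alarms())          # []
print(faulty_reported.poll_alarms())  # ['standby fault']
print(faulty_silent.poll_alarms())    # []
```

From the outside, the healthy unit and the faulty unit with a dead checker look identical, because both produce no alarm.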

The object of the present invention is to remedy the above drawbacks and to make it possible to detect a failure of the standby backup device even when the failure exists but is not reported.

[Means for Solving the Problems]

The failure detection method of the present invention detects a failure of the standby backup device of an information processing apparatus consisting of a main device and a standby backup device, using a normality confirmation function provided in the standby backup device. At the same time as the main device enters online operation, the normality confirmation function for checking the health of its own system is started on the standby backup device. To learn whether the normality confirmation function is operating normally, the main device sends a stop signal to the standby backup device. On receiving the stop signal, the standby backup device stops the normality confirmation function, writes the operating state of that function to the main storage of both the main device and the standby backup device, and, when the write completes, sends a write end signal to the main device. On receiving the write end signal, the main device reads the operating state of the standby backup device's normality confirmation function from its own main storage and determines whether that function has been operating normally.

[Embodiment]

FIG. 1 is a device configuration diagram of an embodiment of the present invention.

The main device 1 asks the standby backup device 2 whether the normality confirmation function is operating normally. The standby backup device 2 reports back to the main device 1.

FIG. 2 is a flowchart showing the flow of processing in one embodiment of the present invention. First, at the same time as the main device 1 enters online operation, a normality confirmation program for checking the health of its own system is started on the standby backup device 2. Next, so that the main device 1 can learn whether the normality confirmation program is running normally on the standby backup device 2, the main device 1 sends a stop signal to the standby backup device 2 in step 11, and in step 21 the normality confirmation program running in the standby backup device 2 stops. Once the program has stopped, in step 22 the standby backup device 2 writes the running state of the normality confirmation program to the main storage of both the main device 1 and the standby backup device 2. When the write completes, the standby backup device 2 sends a signal to the main device 1 in step 23 and resumes the normality confirmation program in step 24. On receiving the signal, the main device 1 reads from its main storage, in step 12, the previously written running state of the normality confirmation program on the standby backup device 2, and in step 13 it judges whether that running state is normal. If it is normal, normal termination processing is performed in step 14; if it is abnormal, abnormal termination processing is performed in step 15.
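The exchange in FIG. 2 can be sketched in Python as follows. This is only an illustrative model under stated assumptions, not the patented implementation: the class and method names, the `RunState` encoding of the running state, and the dictionary used as main storage are all hypothetical. The source also does not say what the main device does when no write end signal arrives, so the sketch assumes that case is treated as abnormal.

```python
from enum import Enum


class RunState(Enum):
    """Hypothetical encoding of the check program's running state."""
    NORMAL = "normal"
    RUNAWAY = "runaway"


class MainDevice:
    def __init__(self):
        self.main_storage = {}   # main storage of the main device
        self.write_end = False   # set when the write end signal arrives

    def on_write_end(self):
        self.write_end = True

    def check_standby(self, standby):
        self.write_end = False
        self.main_storage.pop("standby_check_state", None)
        standby.on_stop_signal(self)                 # step 11: send stop signal
        if not self.write_end:
            # Assumption: a missing write end signal is treated as abnormal.
            return "abnormal end"
        state = self.main_storage.get("standby_check_state")  # step 12: read
        if state is RunState.NORMAL:                 # step 13: judge
            return "normal end"                      # step 14
        return "abnormal end"                        # step 15


class StandbyDevice:
    def __init__(self, check_state=RunState.NORMAL, responsive=True):
        self.main_storage = {}          # main storage of the standby device
        self.check_state = check_state  # running state of the check program
        self.responsive = responsive    # False models a unit that never replies
        self.check_running = True       # started when the main device goes online

    def on_stop_signal(self, main):
        if not self.responsive:
            return                                   # hung unit: no reply at all
        self.check_running = False                   # step 21: stop the program
        self.main_storage["check_state"] = self.check_state          # step 22
        main.main_storage["standby_check_state"] = self.check_state  # step 22
        main.on_write_end()                          # step 23: write end signal
        self.check_running = True                    # step 24: resume the program


main = MainDevice()
print(main.check_standby(StandbyDevice()))                                # normal end
print(main.check_standby(StandbyDevice(check_state=RunState.RUNAWAY)))    # abnormal end
print(main.check_standby(StandbyDevice(responsive=False)))                # abnormal end
```

The point of the design is that the judgement in step 13 is made by the main device from its own main storage, so it does not depend on the standby backup device deciding to raise an alarm.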

[Effect of the Invention]

As described above, in the present invention the main device queries the standby backup device, the standby backup device reports back to the main device, and the main device checks again the running state of the normality confirmation program on the standby backup device. As a result, a failure of the standby backup device can be reliably detected even when it exists but is not reported, for example because of a fault in the monitoring or reporting function, which conventional methods do not handle, or because the normality confirmation program has run away.

[Brief Description of the Drawings]

FIG. 1 is a device configuration diagram of an embodiment of the present invention, and FIG. 2 is a flowchart showing the flow of processing. 1: main device; 2: standby backup device.

Claims (1)

[Claims]

1. A failure detection method for detecting a failure of a standby backup device of an information processing apparatus consisting of a main device and the standby backup device, by means of a normality confirmation function provided in the standby backup device, characterized in that: at the same time as the main device enters online operation, the normality confirmation function for checking the health of its own system is started on the standby backup device; the main device sends a stop signal to the standby backup device in order to learn whether the normality confirmation function is operating normally; the standby backup device, on receiving the stop signal, stops the operation of the normality confirmation function, writes the operating state of the normality confirmation function to the main storage of the main device and of the standby backup device, and on completion of this write sends a write end signal to the main device; and the main device, on receiving the write end signal, reads the operating state of the normality confirmation function of the standby backup device written in its own main storage and determines whether the normality confirmation function of the standby backup device has been operating normally.
JP63177419A 1988-07-15 1988-07-15 Failure detection method Expired - Lifetime JPH0797327B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63177419A JPH0797327B2 (en) 1988-07-15 1988-07-15 Failure detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP63177419A JPH0797327B2 (en) 1988-07-15 1988-07-15 Failure detection method

Publications (2)

Publication Number Publication Date
JPH0227442A (en) 1990-01-30
JPH0797327B2 (en) 1995-10-18

Family

ID=16030600

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63177419A Expired - Lifetime JPH0797327B2 (en) 1988-07-15 1988-07-15 Failure detection method

Country Status (1)

Country Link
JP (1) JPH0797327B2 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS553018A (en) * 1978-06-20 1980-01-10 Toshiba Corp Computer device
JPS55116150A (en) * 1979-02-28 1980-09-06 Nec Corp Fault detection system for processor
JPS5672359A (en) * 1979-11-17 1981-06-16 Fujitsu Ltd Supervising system for spare unit
JPS61169036A (en) * 1985-01-22 1986-07-30 Nec Corp System supervisory device
JPS62243039A (en) * 1986-04-15 1987-10-23 Nec Eng Ltd Processor state monitoring system for decentralized control type switchboard

Also Published As

Publication number Publication date
JPH0227442A (en) 1990-01-30

Similar Documents

Publication Publication Date Title
JP3481737B2 (en) Dump collection device and dump collection method
JP2687927B2 (en) External bus failure detection method
JPH10312327A (en) Mirroring monitor system
JPH0814797B2 (en) Checking method in redundant processing equipment
JPH0797327B2 (en) Failure detection method
JP3313667B2 (en) Failure detection method and method for redundant system
JPS6146543A (en) Fault processing system of transfer device
JP3161444B2 (en) Fault logging system, method, and storage medium storing program
JP2785992B2 (en) Server program management processing method
JPH0534877B2 (en)
JP2500217Y2 (en) I / O card abnormality detection system
JPS6155748A (en) Electronic computer system
JP3008851B2 (en) Inter-system monitoring method for multi-computer systems
JPH06290066A (en) Duplex device
JP2744113B2 (en) Computer system
JPH08305675A (en) Multi-processor system and its operation management method
JPS59119451A (en) Diagnosing system of electronic computer system
JPH03253945A (en) Abnormality recovery processing function confirming system for data processing system
JPH0434184B2 (en)
JPH0395634A (en) Restart control system for computer system
JPS5827538B2 (en) Mutual monitoring method
JPH07230432A (en) Calculating device
JPS60254338A (en) Abnormality detecting system of multiprocessor
JPS62263554A (en) Shared memory duplex system
JPH0716218B2 (en) Electronic exchange duplex system