JP3208885B2 - Fault monitoring system

Info

Publication number: JP3208885B2
Application number: JP00285893A
Authority: JP
Inventors: 康介新井
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1993-01-11
Filing date: 1993-01-11
Publication date: 2001-09-17
Anticipated expiration: 2016-09-17
Also published as: JPH06208485A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a network system in which a plurality of data processing devices are connected to one another via a network, and more particularly to a fault monitoring system that monitors faults in the data processing devices.

[0002]

[Prior Art] Conventionally, the method disclosed in Japanese Patent Application Laid-Open No. 3-142540 is known as a method for displaying and storing fault information on faults that occur in the computers of a network system. In that publication, when a computer other than the monitoring computer in the network system fails, the fault is reported to the monitoring computer and the fault information is displayed on the console of the monitoring computer, while the fault information itself is stored in the external storage device of the computer in which the fault occurred.

[0003] By distributing the storage of the fault information of all computers in this way, the external storage capacity required per computer was kept down.

[0004]

[Problems to Be Solved by the Invention] However, in the system disclosed in the above publication, the fault information is stored in a distributed manner, so maintenance operations such as consulting the fault information again or backing it up are troublesome. In addition, when the network grows in scale, faults can be monitored by only one computer at one location, so if the communication medium near the monitoring computer (that is, the interface connecting the network and the monitoring computer) fails, faults can no longer be monitored.

[0005] An object of the present invention is to provide a fault monitoring system that can store fault information group by group and monitor faults across the entire system.

[0006]

[Means for Solving the Problems] To achieve the above object, the first invention is a fault monitoring system in which a plurality of data processing devices are connected to one another via a communication line and faults in the plurality of data processing devices are monitored, characterized in that: the plurality of data processing devices are divided into groups, and in each group every data processing device other than a preset specific data processing device transfers fault information on the faults it has detected to the specific data processing device of the group to which it belongs, without storing that information itself; the specific data processing device of each group stores the fault information on the faults it has detected itself together with the fault information transferred from the other data processing devices in the group, and transfers the stored fault information to a preset predetermined data processing device that is one of the specific data processing devices and has a function of monitoring the entire system; and the predetermined data processing device performs only output processing on fault information from specific data processing devices belonging to other groups, without storing it. The second invention is characterized in that, in the first invention, at least one other data processing device among the plurality of data processing devices functions as the preset predetermined data processing device in addition to the predetermined data processing device, and in each group a data processing device other than the specific data processing device backs up the fault information stored by that specific data processing device. The third invention is characterized in that, in the first or third invention, each data processing device comprises its own device body, peripheral devices connected to the device body, and fault detection means that monitors the device body and the peripheral devices for faults and detects abnormalities.

[0007]

[Operation] In the present invention, only the specific data processing device of each group stores the fault information on faults detected by the data processing devices in the group, including itself, and it transfers the stored fault information to a preset predetermined data processing device that has a function of monitoring the entire system (one of the specific data processing devices of the groups). The predetermined data processing device stores the fault information on faults detected by the data processing devices in its own group and also performs output processing (display processing) on it. For fault information from specific data processing devices belonging to other groups (that is, fault information on all the data processing devices belonging to other groups), it performs only output processing (display processing) without storing the information itself. According to the present invention, fault information is therefore stored centrally in each group, so the storage resources of the entire system can be used effectively, and because only preset data processing devices display the fault information, the fault information of the entire system can be managed in one place.

Further, in the present invention, at least one other data processing device has the system-wide monitoring function in addition to the predetermined data processing device, and in each group at least one other data processing device backs up the fault information stored by the specific data processing device. Therefore, even if one predetermined data processing device goes down and can no longer perform system-wide monitoring, another data processing device that has the system-wide monitoring function can monitor faults in the entire system. Moreover, even if a fault occurs in the means that stores the fault information in a specific data processing device (the external storage means), the fault information can be restored from the backup copy.

[0008]

[Embodiments] Embodiments of the present invention are described below with reference to the accompanying drawings.

[0009] The first embodiment is described with reference to FIGS. 1 to 6.

[0010] FIG. 1 is a configuration diagram showing a first embodiment of the fault monitoring system according to the present invention. In the figure, a plurality of computers G1-S1, G1-S2, ..., G1-Sn, G2-S1, G2-S2, ..., G2-Sn, Gn-S1, Gn-S2, ..., Gn-Sn are connected to one another via a network 10. The computers are provided with external storage devices G1-D1, G1-D2, ..., G1-Dn, G2-D1, G2-D2, ..., G2-Dn, Gn-D1, Gn-D2, ..., Gn-Dn and consoles G1-C1, G1-C2, ..., G1-Cn, G2-C1, G2-C2, ..., G2-Cn, Gn-C1, Gn-C2, ..., Gn-Cn, respectively.

[0011] Here, the n computers G1-S1, G1-S2, ..., G1-Sn form group 1. In group 1, the computer G1-S1 has a system monitoring function that oversees the entire system and a group monitoring function that monitors information within group 1. Hereinafter, this computer is referred to as the system monitoring & group monitoring computer G1-S1.

[0012] The n computers G2-S1, G2-S2, ..., G2-Sn form group 2. In group 2, the computer G2-S1 has a group monitoring function that monitors information within group 2. Hereinafter, this computer is referred to as the group monitoring computer G2-S1.

[0013] Likewise, the n computers Gn-S1, Gn-S2, ..., Gn-Sn form group n. In group n, the computer Gn-S1 has a group monitoring function that monitors information within group n. Hereinafter, this computer is referred to as the group monitoring computer Gn-S1.

[0014] Computers other than the system monitoring & group monitoring computer and the group monitoring computers are referred to here as general computers.

[0015] FIG. 2 is a functional block diagram of the computer G1-S1. The computer G1-S1 includes a fault module 240 that has a fault information handler 210, a fault processing information setting unit 220, and a fault detection unit 230.

[0016] The fault processing information setting unit 220 supplies the fault processing information set by the user to the fault information handler 210. The fault detection unit 230 monitors the states of the computer G1-S1, the console G1-C1, and the external storage device G1-D1 and, when it finds an abnormality, notifies the fault information handler 210. Based on the fault processing information, the fault information handler 210 outputs the fault information to a local output device such as the console G1-C1 or the external storage device G1-D1, transfers it to another computer, or does both. When the fault detection unit 230 reports that an output device is abnormal, the fault information handler 210 transfers the fault information to another computer in accordance with the fault processing information.

[0017] The other computers have the same configuration.

[0018] FIG. 3 shows an example of the format of the fault processing information.

[0019] As shown in FIG. 3, the fault processing information takes the form of a table made up of fields 310 to 360: fault information type, display processing, transfer destination on display abnormality, storage processing, transfer destination on storage abnormality, and transfer processing.

[0020] The fault processing information has an entry 371 that describes how to process fault information generated within the computer itself; an entry 372 that describes how to process fault information that another computer has asked it to display (including requests made because the console is abnormal); an entry 373 that describes how to process fault information that another computer has asked it to store (including requests made because the external storage device is abnormal); and an entry 374 that describes how to process fault information that another computer has asked it to both display and store.

[0021] Field 310 holds the type of fault information: fault information generated within the computer itself (hereinafter, local fault information), fault information that another computer has asked to have displayed (hereinafter, remote (display) fault information), fault information that another computer has asked to have stored (hereinafter, remote (storage) fault information), or fault information that another computer has asked to have both displayed and stored (hereinafter, remote (display & storage) fault information). This is not the kind of fault itself but rather the kind of processing requested, such as display or storage.
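The four types held in field 310 can be illustrated with a minimal Python sketch. The names `InfoType` and `classify` are hypothetical and do not appear in the patent; the sketch only assumes that a fault report is either generated locally or arrives from another computer with a request for display, storage, or both.

```python
from enum import Enum


class InfoType(Enum):
    """Fault information types of field 310 (illustrative names only)."""
    LOCAL = "local"                                      # handled by entry 371
    REMOTE_DISPLAY = "remote (display)"                  # handled by entry 372
    REMOTE_STORE = "remote (storage)"                    # handled by entry 373
    REMOTE_DISPLAY_STORE = "remote (display & storage)"  # handled by entry 374


def classify(from_self: bool, wants_display: bool = False,
             wants_store: bool = False) -> InfoType:
    """Pick the type from where the report came from and what was requested."""
    if from_self:
        return InfoType.LOCAL
    if wants_display and wants_store:
        return InfoType.REMOTE_DISPLAY_STORE
    if wants_display:
        return InfoType.REMOTE_DISPLAY
    if wants_store:
        return InfoType.REMOTE_STORE
    raise ValueError("a remote request must ask for display, storage, or both")
```

As the paragraph above notes, the type describes the processing being requested, not the nature of the fault, which is why the function looks only at the origin and the request.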

[0022] The display processing field 320 holds information indicating whether or not the fault information is to be displayed.

[0023] The field 330, transfer destination on display abnormality, holds transfer destination information, for example an address, indicating where the fault information is to be transferred when the console of the computer itself is abnormal.

[0024] The storage processing field 340 holds information indicating whether or not the fault information is to be stored.

[0025] The field 350, transfer destination on storage abnormality, holds transfer destination information, for example an address, indicating where the fault information is to be transferred when the external storage device of the computer itself is abnormal.

[0026] The transfer processing field 360 holds information indicating whether or not the fault information is to be transferred and, if it is, transfer destination information indicating where it is to be transferred.
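One way to picture a row of this table and how a handler might act on it is the following Python sketch. It is only an illustration under stated assumptions: the names `Rule` and `plan`, the action labels, and the use of computer names as addresses are hypothetical, and the patent does not specify how the handler is implemented.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """One table row; the comments give the corresponding field of FIG. 3."""
    info_type: str                    # field 310: type of fault information
    display: bool                     # field 320: display or not
    display_fallback: Optional[str]   # field 330: destination if the console is abnormal
    store: bool                       # field 340: store or not
    store_fallback: Optional[str]     # field 350: destination if the storage is abnormal
    transfer_to: Tuple[str, ...] = () # field 360: empty tuple means "do not transfer"


def plan(rule: Rule, console_ok: bool = True,
         storage_ok: bool = True) -> List[Tuple[str, str]]:
    """Return the actions a handler would take for one fault report."""
    actions: List[Tuple[str, str]] = []
    if rule.display:
        if console_ok:
            actions.append(("display", "local console"))
        elif rule.display_fallback:
            # Console abnormal: hand the report to the fallback destination.
            actions.append(("transfer", rule.display_fallback))
    if rule.store:
        if storage_ok:
            actions.append(("store", "local storage"))
        elif rule.store_fallback:
            # External storage abnormal: hand the report to the fallback destination.
            actions.append(("transfer", rule.store_fallback))
    actions.extend(("transfer", dest) for dest in rule.transfer_to)
    return actions
```

For example, `plan(Rule("local", True, "G1-S1", True, None, ("G2-S1",)), console_ok=False)` yields a transfer to the display fallback, a local store, and the regular transfer, in that order.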

[0027] FIG. 4 shows an example of the fault processing information of a general computer, for example the computer G2-S2 of group 2; FIG. 5 shows an example for a group monitoring computer, for example the group monitoring computer G2-S1; and FIG. 6 shows an example for the system monitoring & group monitoring computer G1-S1.

[0028] FIG. 4 indicates that the computer G2-S2 performs neither display processing nor storage processing for any fault information, whether local fault information of faults that occur within the computer G2-S2 or remote (display), remote (storage), or remote (display & storage) fault information from other computers, and that it transfers the local fault information of faults that occur within the computer G2-S2 to the group monitoring computer G2-S1 as data to be displayed and stored.

[0029] FIG. 5 indicates the following for the group monitoring computer G2-S1. Local fault information of faults that occur within the group monitoring computer G2-S1, and remote (storage) and remote (display & storage) fault information from other computers, are not displayed but are stored, and are transferred to the system monitoring & group monitoring computer G1-S1 to request their display. Remote (display) fault information from other computers is neither displayed nor stored, and is transferred to the system monitoring & group monitoring computer G1-S1 to request its display.

[0030] FIG. 6 indicates that the system monitoring & group monitoring computer G1-S1 performs display processing and storage processing for all fault information, namely local fault information of faults that occur within the system monitoring & group monitoring computer G1-S1 and remote (display), remote (storage), and remote (display & storage) fault information from other computers, and that it does not transfer any of this fault information.

[0031] A concrete example is described next. When a fault occurs within the general computer G2-S2, it transfers the fault information to the group monitoring computer G2-S1 to request its display and storage, in accordance with entry 371 of the fault processing information shown in FIG. 4. The general computer G2-S2 itself neither stores nor displays this fault information. On receiving the remote (display & storage) fault information, the group monitoring computer G2-S1 stores it in the external storage device G2-D1 in accordance with entry 374 of the fault processing information shown in FIG. 5, and transfers it to the system monitoring & group monitoring computer G1-S1 to request its display. For faults that occur within itself, the group monitoring computer G2-S1 of course only stores the fault information, in accordance with entry 371 of the fault processing information shown in FIG. 5, and transfers it to the system monitoring & group monitoring computer G1-S1 to request its display. Because the system monitoring & group monitoring computer G1-S1 also serves as a group monitoring computer, it displays fault information transferred from the other computers in its group on the console G1-C1 and stores it in the external storage device G1-D1, in accordance with entry 374 of the fault processing information shown in FIG. 6. For remote (display) fault information from the group monitoring computers of other groups, it performs display only.
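The flow in this example can be traced with a small Python sketch. The `RULES` table and the `trace` function are hypothetical illustrations, not part of the patent; the table encodes only the entries used in the example, and the G1-S1 row for display requests follows the display-only behaviour stated at the end of the worked example.

```python
from collections import deque

# (computer, request kind) -> (display?, store?, [(destination, kind requested)])
RULES = {
    ("G2-S2", "local"): (False, False, [("G2-S1", "display&store")]),   # FIG. 4, entry 371
    ("G2-S1", "display&store"): (False, True, [("G1-S1", "display")]),  # FIG. 5, entry 374
    ("G2-S1", "local"): (False, True, [("G1-S1", "display")]),          # FIG. 5, entry 371
    ("G1-S1", "display"): (True, False, []),                            # display only (assumed row)
}


def trace(origin):
    """Follow one fault report from its origin; return (displayed_at, stored_at)."""
    displayed, stored = [], []
    queue = deque([(origin, "local")])
    while queue:
        node, kind = queue.popleft()
        display, store, forwards = RULES[(node, kind)]
        if display:
            displayed.append(node)
        if store:
            stored.append(node)
        queue.extend(forwards)
    return displayed, stored
```

Under these assumptions, a fault in G2-S2 and a fault in G2-S1 both end up stored at G2-S1 and displayed at G1-S1, which is the group-level storage and system-level display that the first embodiment aims for.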

[0032] The other general computers and the group monitoring computer Gn-S1 operate in the same way.

[0033] As described above, according to the first embodiment, fault information is stored group by group and displayed on the system monitoring computer that manages the system. Maintenance operations such as consulting the fault information again or backing it up therefore need only be carried out group by group, which makes them easy to perform.

[0034] Next, a second embodiment is described with reference to FIGS. 7 to 12.

[0035] FIG. 7 is a configuration diagram showing a second embodiment of the fault monitoring system according to the present invention. It differs from the configuration of the first embodiment shown in FIG. 1 in that the computer G1-S2 is replaced by a computer G1-S2a, the computer G2-S2 by a computer G2-S2a, the computer G2-Sn by a computer G2-Sna, and the computer Gn-S2 by a computer Gn-S2a.

[0036] The computers G1-S2a, G2-Sna, and Gn-S2a are computers for backing up the fault information stored by the group monitoring computer of their respective groups (hereinafter, backup computers).

[0037] The computer G2-S2a is a computer that has a system monitoring function overseeing the entire system (hereinafter, the system monitoring computer).

[0038] That is, in the second embodiment, two computers monitor the system, fault information is stored on the representative computer of each group, and within each group the stored information is backed up automatically.

[0039] FIG. 8 shows an example of the fault processing information of a general computer, for example the computer G1-Sn; FIG. 9 shows an example for the group monitoring computer G2-S1; FIG. 10 shows an example for the system monitoring computer G2-S2a; FIG. 11 shows an example for the system monitoring & group monitoring computer G1-S1; and FIG. 12 shows an example for a backup computer, for example the backup computer G2-Sna. This fault processing information is set in the format shown in FIG. 3.

[0040] FIG. 8 indicates that the computer G1-Sn performs neither display processing nor storage processing for any fault information, whether local fault information of faults that occur within the computer G1-Sn or remote (display), remote (storage), or remote (display & storage) fault information from other computers, and that it transfers the local fault information of faults that occur within the computer G1-Sn to the system monitoring & group monitoring computer G1-S1 to request its display and storage.

[0041] FIG. 9 indicates the following for the group monitoring computer G2-S1. Local fault information of faults that occur within the group monitoring computer G2-S1 is not displayed but is stored; it is also transferred to the computers G1-S1 and G2-S2a to request its display and to the computer G2-Sna to request its storage. Remote (display) fault information from other computers is neither displayed nor stored, and is transferred to the computers G1-S1 and G2-S2a to request its display. Remote (storage) fault information from other computers is not displayed but is stored, and is transferred to the computer G2-Sna to request a backup. Remote (display & storage) fault information from other computers is not displayed but is stored, and is transferred to the computers G1-S1 and G2-S2a to request its display and to the computer G2-Sna to request a backup.

[0042] FIG. 10 indicates the following for the system monitoring computer G2-S2a. Local fault information of faults that occur within the system monitoring computer G2-S2a is neither displayed nor stored, and is transferred to the computer G2-S1. Remote (display) and remote (display & storage) fault information from other computers is displayed, but is neither stored nor transferred. Remote (storage) fault information from other computers is not displayed, stored, or transferred.

[0043] FIG. 11 indicates the following for the system monitoring & group monitoring computer G1-S1. Local fault information of faults that occur in the computer G1-S1 and remote (display & storage) fault information from other computers are both displayed and stored, and are transferred to the computer G1-S2a to request a backup. Remote (display) fault information from other computers is displayed, but is neither stored nor transferred. Remote (storage) fault information from other computers is not displayed but is stored, and is transferred to the computer G1-S2a to request a backup.

[0044] FIG. 12 indicates the following for the backup computer G2-Sna. Local fault information of faults that occur in the computer G2-Sna is neither displayed nor stored, and is transferred to the computer G2-S2a to request its display and storage. Remote (display) fault information from other computers is not displayed, stored, or transferred. Remote (storage) and remote (display & storage) fault information from other computers is not displayed or transferred, but is backed up.

[0045] A concrete example is described next.

[0046] When a fault occurs within the general computer G1-Sn, the computer G1-Sn transfers the fault information to the group monitoring computer of its group, namely the system monitoring & group monitoring computer G1-S1, in accordance with entry 371 of the fault processing information shown in FIG. 8. The system monitoring & group monitoring computer G1-S1 displays and stores the fault information from the computer G1-Sn in accordance with entry 374 of the fault processing information shown in FIG. 11, and transfers it to the backup computer G1-S2a. The backup computer G1-S2a stores the transferred fault information in the external storage device G1-D2a.

[0047] When a fault occurs within the group monitoring computer G2-S1, the group monitoring computer G2-S1 transfers the fault information to the system monitoring & group monitoring computer G1-S1 and the system monitoring computer G2-S2a to request its display, and to the backup computer G2-Sna to back it up, in accordance with entry 371 of the fault processing information shown in FIG. 9. On receiving the fault information from the group monitoring computer G2-S1, the system monitoring computer G2-S2a displays it but neither stores nor transfers it, in accordance with entry 372 of the fault processing information shown in FIG. 10. On receiving the fault information from the group monitoring computer G2-S1, the backup computer G2-Sna stores it in the external storage device G2-Dna, in accordance with entry 373 of the fault processing information shown in FIG. 12. The system monitoring & group monitoring computer G1-S1, for its part, displays the fault information but neither stores nor transfers it, in accordance with entry 372 of the fault processing information shown in FIG. 11.

[0048] When a failure occurs in the system monitoring computer G2-S2a, the failure information is input to the group monitoring computer G2-S1 according to the contents of entry 371 of the failure processing information of FIG. 10. Then, according to the contents of entry 374 shown in FIG. 9, the failure information is displayed on the respective consoles of the system monitoring & group monitoring computer G1-S1 and the system monitoring computer G2-S2a, and is stored in the external storage device G2-Dna of the backup computer G2-Sna. If the failure in the system monitoring computer G2-S2a is an abnormality of the console G2-C2a, the failure information is naturally not displayed there. However, the same failure information is displayed on the console G1-C1 of the system monitoring & group monitoring computer G1-S1. Therefore, even when the computer or the output device of the system monitoring computer G2-S2a has failed, the other system monitoring computer, the system monitoring & group monitoring computer G1-S1, can monitor the entire system.
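
The redundancy argument above can be illustrated with a small sketch. It assumes a hypothetical `Console` class with a `working` flag and a helper `display_on_all`; neither name comes from this description, and the sample message is invented for the example.

```python
class Console:
    """Hypothetical console that may itself be out of order."""

    def __init__(self, name, working=True):
        self.name = name
        self.working = working
        self.lines = []

    def show(self, info):
        """Display info if the console works; report whether it was shown."""
        if not self.working:
            return False
        self.lines.append(info)
        return True


def display_on_all(consoles, info):
    """Send info to every system monitoring console; return those that showed it."""
    return [c.name for c in consoles if c.show(info)]


# The console of G2-S2a is the failed component; G1-C1 is still healthy.
g1_c1 = Console("G1-C1")
g2_c2a = Console("G2-C2a", working=False)

shown_on = display_on_all([g1_c1, g2_c2a], "console abnormality on G2-S2a")
```

Here `shown_on` contains only `"G1-C1"`: the report about the broken console still reaches an operator because a second system monitoring console receives the same failure information.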

[0049] Further, when a failure occurs in the backup computer G2-Sna, the failure information is input to the group monitoring computer G2-S1 according to the contents of entry 371 of the failure processing information shown in FIG. 12. According to the contents of entry 374 of the failure processing information shown in FIG. 9, the failure information input to the group monitoring computer G2-S1 is displayed on the respective consoles of the system monitoring & group monitoring computer G1-S1 and the system monitoring computer G2-S2a, and is stored in the external storage device G2-Dna of the backup computer G2-Sna.

[0050] Similarly, when a failure occurs in another general computer, another group monitoring computer, or another system monitoring computer, the failure information is processed based on the failure processing information set by the user.

[0051] That is, when a failure occurs, each computer transfers the failure information to the group monitoring computer of its group. The group monitoring computer stores the received failure information and further transfers the stored information to the two system monitoring computers and to the backup computer in the group.

[0052] A system monitoring computer may be provided in each group, or a plurality of system monitoring computers (for example, two) may be provided in one group.

[0053] As described above, the second embodiment provides, in addition to the effects of the first embodiment, a plurality of system monitoring computers, so that even if one system monitoring computer goes down, another system monitoring computer can monitor the system. In addition, because the failure information is backed up, it can be restored even if the failure information stored in the group monitoring computer is destroyed.
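
The backup and restore behavior can be sketched as follows. This is a minimal sketch under stated assumptions: the `Store` class and the functions `record_with_backup` and `restore_from_backup` are hypothetical names, and treating a restore as a full copy of the backup is a simplification rather than a procedure given in this description.

```python
class Store:
    """Hypothetical stand-in for an external storage device."""

    def __init__(self):
        self.records = []


def record_with_backup(primary, backup, info):
    """Store info on the group monitoring side and mirror it to the backup."""
    primary.records.append(info)
    backup.records.append(info)


def restore_from_backup(primary, backup):
    """Rebuild the primary store from the backup copy."""
    primary.records = list(backup.records)


group_store = Store()    # storage of the group monitoring computer
backup_store = Store()   # storage of the backup computer

record_with_backup(group_store, backup_store, "failure A")
record_with_backup(group_store, backup_store, "failure B")

group_store.records.clear()          # simulate destruction of the stored data
restore_from_backup(group_store, backup_store)
```

After the restore, `group_store.records` again holds both failure records, in the order they were originally stored.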

【００５４】[0054]

[Effects of the Invention] As described above, according to the present invention, only a specific data processing device in each group stores the failure information on failures detected by the data processing devices in that group, including itself, and transfers the stored failure information to a preset predetermined data processing device that has a function of monitoring the entire system. The predetermined data processing device stores the failure information on failures detected by the data processing devices in its own group and also executes output processing (for example, display processing) on it. For failure information from the specific data processing devices of other groups, it executes only the output processing (for example, display processing) without storing the information in itself. Because failure information is stored centrally for each group, the storage medium resources of the entire system can be used effectively, and because only the preset data processing device displays the failure information, the failure information of the entire system can be managed centrally. In other words, failures can be monitored on a group basis and also across the entire system.
Further, according to the present invention, in addition to the predetermined data processing device having the function of monitoring the entire system, at least one other data processing device also has that function, and the failure information stored by the specific data processing device of each group is backed up by at least one other data processing device. Therefore, even if one predetermined data processing device goes down and can no longer monitor the entire system, another data processing device having the system-wide monitoring function can monitor failures of the entire system. Moreover, even if a failure occurs in the means that stores the failure information in a specific data processing device, that failure information can be quickly restored from the backup failure information.

[Brief description of the drawings]

FIG. 1 is a configuration diagram showing a first embodiment of a fault monitoring system according to the present invention.

FIG. 2 is a diagram showing a functional configuration of each computer.

FIG. 3 is a diagram showing an example of a format of failure processing information.

FIG. 4 is a diagram showing an example of failure processing information of a general computer in the first embodiment.

FIG. 5 is a diagram showing an example of failure processing information of a group monitoring computer in the first embodiment.

FIG. 6 is a diagram showing an example of failure processing information of a system monitoring & group monitoring computer in the first embodiment.

FIG. 7 is a configuration diagram showing a second embodiment of the fault monitoring system according to the present invention.

FIG. 8 is a diagram showing an example of failure processing information of a general computer in the second embodiment.

FIG. 9 is a diagram showing an example of failure processing information of a group monitoring computer in the second embodiment.

FIG. 10 is a diagram showing an example of failure processing information of a system monitoring computer in the second embodiment.

FIG. 11 is a diagram showing an example of failure processing information of a system monitoring & group monitoring computer in the second embodiment.

FIG. 12 is a diagram showing an example of failure processing information of a backup computer in the second embodiment.

[Explanation of symbols]

G1-S1, G1-S2, G1-Sn, G1-S2a: computers in group 1
G2-S1, G2-S2, Gn-Sn, G2-S2a: computers in group 2
Gn-S1, Gn-S2, Gn-Sn, G2-Sna, Gn-S2a: computers in group n
G1-D1, G1-D2, G1-Dn, G1-D2a: external storage devices in group 1
G1-C1, G1-C2, G1-Cn, G1-C2a: consoles in group 1
G2-D1, G2-D2, Gn-Dn, G2-D2a: external storage devices in group 2
G2-C1, G2-C2, Gn-Cn, G2-C2a: consoles in group 2
Gn-D1, Gn-D2, Gn-Dn, G2-Dna, Gn-D2a: external storage devices in group n
Gn-C1, Gn-C2, Gn-Cn, G2-Cna, Gn-C2a: consoles in group n
10: network
210: failure information handler
220: failure processing information setting unit
230: failure detection unit
240: fault module in the computer

Claims

(57) [Claims]

1. A fault monitoring system in which a plurality of data processing devices are connected to one another via a communication line and the plurality of data processing devices are monitored for faults, wherein: the plurality of data processing devices are divided into groups; among the data processing devices of each group, each data processing device other than a preset specific data processing device does not store in itself the failure information on a failure it has detected, but transfers that failure information to the specific data processing device of the group to which it belongs; the specific data processing device of each group stores the failure information on failures detected by itself and the failure information transferred from the other data processing devices in the group, and transfers the stored failure information to a preset predetermined data processing device, among the specific data processing devices, that has a function of monitoring the entire system; and the predetermined data processing device executes only output processing on failure information from a specific data processing device belonging to another group, without storing that failure information in itself.

2. The fault monitoring system according to claim 1, wherein at least one other data processing device among the plurality of data processing devices, in addition to the predetermined data processing device, functions as the preset predetermined data processing device, and a data processing device other than the specific data processing device of each group backs up the failure information stored by that specific data processing device.

3. The fault monitoring system according to claim 1 or 2, wherein each of the data processing devices comprises a device main unit, a peripheral device connected to the device main unit, and fault detecting means that monitors the device main unit and the peripheral device for faults and detects an abnormality.