JPH06266581A

JPH06266581A - Remote maintenance fault monitoring system

Info

Publication number: JPH06266581A
Application number: JP5050795A
Authority: JP
Inventors: Tatsuji Nadenaka; 達司撫中
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1993-03-11
Filing date: 1993-03-11
Publication date: 1994-09-22

Abstract

PURPOSE: To enable a faster follow-up response by reporting the various faults that occur in a remote information system to a maintenance center more quickly and more accurately. CONSTITUTION: When a fault occurs, it is reported from a peripheral device to a central processing unit (CPU) 3. If the fault is serious enough to require a system restart, the CPU 3 prepares data (H/W information) and requests processing from the operating system (OS). The OS issues a request to a communication device 4, which dials automatically using a first line 7. If the first line cannot be used, a usable line is selected from the installed public lines (11-12) and the data are reported to a remote maintenance device 8 over the selected line.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a remote maintenance system for information processing systems, and more particularly to a system for automatically reporting critical failures that require urgent attention.

[0002]

[Prior Art] Systems comprising a CPU, a plurality of peripheral devices connected to the CPU, and a remote maintenance device connected to these devices via a public line are already known; one example is the system disclosed in Japanese Patent Laid-Open No. 63-300328. In this conventional system, failure notification was intended only for failures that the CPU itself is able to report, that is, minor failures such as the failure of a peripheral device.

[0003]

[Problems to Be Solved by the Invention] In the maintenance system described above, only failures that occur while the notifying CPU is still able to process them (for example, minor failures such as the failure of a peripheral device) are reported. Serious failures that lead to a system crash could be reported only after automatic recovery.

[0004] As a means of reporting critical failures, a processing device separate from the CPU (for example, a personal computer) has been added, so remote maintenance required this additional equipment. Moreover, these notifications are triggered by the CPU performing automatic recovery and therefore presuppose that the CPU itself is functioning normally. If the CPU falls into a hang-up state, no notification is issued even though the system as a whole is abnormal.

[0005] In addition, there is usually only one line available for the notification, which undermines the reliability of a notification method that must deliver urgent reports.

[0006] Furthermore, after the system has recovered automatically from a critical failure, the analysis data is normally retrieved by having someone travel to the system's site once the report has reached the remote maintenance device, copy the data to a medium such as tape, and carry it back. This procedure cannot cope with cases where rapid analysis is needed to prevent the next failure.

[0007] The present invention was made to solve the problems described above. Its object is to provide a remote maintenance fault monitoring system that reports the various failures occurring in a remote information system to a maintenance center more quickly and more accurately, enabling a faster subsequent response.

[0008]

[Means for Solving the Problems] The remote maintenance fault monitoring system according to the present invention applies to a system comprising a CPU, a plurality of peripheral devices connected to the CPU, and a remote maintenance device connected to these devices via a public line. The procedure for controlling the communication device is prepared in advance as firmware inside that device. When a failure occurs that makes continued operation of the system impossible (a machine check, destruction of the system disk, and so on), the machine check processing unit inside the operating system first initializes the communication device and then loads the firmware into main memory. Control is then transferred to the firmware, which transmits the failure information. In this way, the occurrence of the critical failure is reported to the remote maintenance device over the public line before the system is automatically restored.

[0009] The system also has a function for detecting a hang-up state of the system (a state in which normal processing cannot be executed, for example a loop inside the operating system). That is, the computer system uses timer interrupts to confirm that it has not hung up and, if a hang-up occurs, brings itself down and restarts. When a hang-up occurs, the system is restarted automatically, and once the restart is complete the failure is reported to the remote maintenance device over the public line.

[0010] When the failure is reported, if the first line is unavailable, another usable line is selected automatically and the notification is sent over it.

[0011] Separately from the minimum necessary information sent when a failure occurs, the analysis data (secondary information) collected on the magnetic disk device at the time of automatic recovery is transferred automatically to the remote maintenance device over the public line.

[0012]

[Operation] In the present invention, in a system comprising a CPU, a plurality of peripheral devices connected to the CPU, and a remote maintenance device connected to these devices via a public line, the occurrence of a critical failure is reported to the remote maintenance device over the public line before the system is automatically restored. An urgent system abnormality can therefore be communicated to the remote maintenance device before the system goes down.

[0013] The system also has a function for detecting its own hang-up state and reports the failure to the remote maintenance device after the system restart is complete. A system hang-up can therefore be reported to the remote maintenance device in the same way as a critical failure.

[0014] When the failure is reported, if the first line is unavailable, another usable line is selected automatically and used for the notification, so the failure report reaches the remote maintenance device even when a line is faulty.

[0015] Separately from the minimum necessary information sent when a failure occurs, the analysis data (secondary information) collected on the magnetic disk device at the time of automatic recovery is transferred automatically to the remote maintenance device over the public line. The minimum necessary information therefore reaches the remote maintenance device quickly, while the secondary information is sent selectively, so the system and the lines are used without waste.

[0016]

[Examples]

Example 1. An embodiment of the present invention will now be described with reference to the drawings. FIG. 1 is a block diagram of the remote maintenance system in this embodiment. The system to be maintained comprises a plurality of peripheral devices (1-2), a central processing unit 3, a main storage device 14, a communication device 4 having an automatic alternate dial function 7, a magnetic disk device 5, and a remote function device 6 that restarts the system based on information from timer monitoring. A remote maintenance device 8, connected to this system via public lines, is installed at a maintenance center 9.

[0017] Three public lines are provided: a first line 10, a second line 11, and a third line 12. The automatic alternate dial 7 of the communication device 4 normally dials automatically on the first line 10. If the first line cannot be used, it dials on the second line 11. If the second line cannot be used either, it dials on the third line 12.
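The dialing order described for the automatic alternate dial can be sketched as a simple ordered fallback. This is only an illustration under stated assumptions: `dial_with_fallback` and the `try_dial` callback are hypothetical names, and the patent does not say how a line is judged unusable.

```python
from typing import Callable, Optional, Sequence

# Reference numerals of the three public lines in FIG. 1, in dialing order.
LINES_IN_PRIORITY_ORDER = (10, 11, 12)


def dial_with_fallback(
    try_dial: Callable[[int], bool],
    lines: Sequence[int] = LINES_IN_PRIORITY_ORDER,
) -> Optional[int]:
    """Dial each line in order and return the first one that connects.

    `try_dial` is a hypothetical callback that returns True when the
    given line could be used. Returns None if every line fails.
    """
    for line in lines:
        if try_dial(line):
            return line
    return None


# Usage: pretend the first line is out of service.
out_of_service = {10}
connected = dial_with_fallback(lambda line: line not in out_of_service)
print(connected)  # 11
```

Returning `None` when no line connects is a choice made for this sketch; the text does not describe what happens if all three lines are unusable.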

[0018] In this embodiment, the central processing unit 3 and the communication device 4 are assumed to operate normally. It is likewise assumed that the portions of the main storage device 14 needed to run the parts of the operating system required for failure reporting (for example, the interrupt handler and machine check processing unit described later) and to run the control firmware 13 read from the communication device 4 operate normally.

[0019] The peripheral devices (1-2) handle the input and output of the system. When a failure occurs, the peripheral device reports it to the central processing unit 3. If the failure is serious enough to require a system restart, the central processing unit 3 prepares data (H/W information) and requests processing from the operating system.

[0020] The H/W information prepared here includes, for example when peripheral device 1 is a magnetic disk device, the failure location, the cause of the failure, and information on the request that triggered the failure. This H/W information is collected, and the operating system is asked to handle the failure.

[0021] The operating system issues a request to the communication device 4, which dials automatically on the first line 10. If the first line is unavailable, a usable line is selected from the other installed lines (11-12), and the data is reported to the remote maintenance device 8 over that public line.
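As a rough illustration of the hand-off from the central processing unit to the operating system, the sketch below models the H/W information as a small record and forwards it only when the fault forces a restart. The record layout, the field names, and `handle_peripheral_fault` are assumptions made for this example; the text names the kinds of information collected but not their format, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HwInfo:
    """H/W information items named in the text; the layout is assumed."""
    device_id: int            # reference numeral of the failing peripheral
    failure_location: str
    failure_cause: str
    triggering_request: str   # request that caused the failure


def handle_peripheral_fault(
    info: HwInfo,
    requires_restart: bool,
    request_os_processing: Callable[[HwInfo], None],
) -> bool:
    """Forward the fault to the OS only if it forces a system restart.

    Returns True when the OS was asked to handle the fault. Handling of
    minor faults is outside the scope of this sketch.
    """
    if not requires_restart:
        return False
    request_os_processing(info)
    return True


# Usage with made-up values; the list stands in for the OS request queue.
os_queue = []
example = HwInfo(1, "head assembly", "read error", "read request")
forwarded = handle_peripheral_fault(example, True, os_queue.append)
print(forwarded, len(os_queue))  # True 1
```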

[0022] Example 2. FIG. 2 shows the processing flow of the communication device 4 when a serious failure occurs, using the handling of a machine check as an example. When a machine check occurs (20), an interrupt is raised (21), and the central processing unit 3 starts interrupt processing in the operating system (22). Next, the machine check processing unit (23) inside the operating system initializes the communication device 4 (24).

[0023] Next, the control firmware 13 prepared in advance in the communication device 4 is loaded into the main memory 14 (25). The loaded firmware attempts to report the data prepared by the hardware over the first line 10 (26). If this results in an error (27), a usable line is selected from the other installed lines 11-12 (28), and the data is reported to the remote maintenance device 8 over that public line (29).
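Steps 24 to 29 of FIG. 2 can be traced with a short sketch. Everything here is hypothetical scaffolding: `machine_check_report`, the `init_device` and `load_firmware` callbacks, and the trace strings are illustrative names, and selecting a usable line is modeled simply as trying each remaining line in turn.

```python
from typing import Callable, List, Sequence

SendFn = Callable[[int, bytes], bool]  # (line, data) -> True on success


def machine_check_report(
    init_device: Callable[[], None],
    load_firmware: Callable[[], SendFn],
    hw_data: bytes,
    first_line: int = 10,
    other_lines: Sequence[int] = (11, 12),
) -> List[str]:
    """Run steps 24-29 of FIG. 2 and return a trace of what happened."""
    trace = []
    init_device()                          # step 24: initialize device 4
    trace.append("24:init")
    send = load_firmware()                 # step 25: load firmware 13
    trace.append("25:load")
    if send(first_line, hw_data):          # step 26: try the first line
        trace.append(f"26:sent on {first_line}")
        return trace
    trace.append("27:error")               # step 27: first line failed
    for line in other_lines:               # step 28: look for a usable line
        if send(line, hw_data):            # step 29: report over that line
            trace.append(f"29:sent on {line}")
            return trace
    trace.append("unreported")             # not covered by the text
    return trace


def fake_firmware() -> SendFn:
    # Stand-in for the loaded firmware: only line 12 works here.
    return lambda line, data: line == 12


trace = machine_check_report(lambda: None, fake_firmware, b"hw-info")
print(trace)  # ['24:init', '25:load', '27:error', '29:sent on 12']
```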

[0024] FIG. 3 illustrates the method for detecting a hang-up state. In the figure, 30 is a time recording area provided in the remote function device 6; 31 is a time check that inspects the value of the time recording area 30 at regular intervals; 32 is an update request issued by the central processing unit 3 at regular intervals to update the time recording area 30; and 33 is an interrupt by which the remote function device 6 informs the central processing unit 3 that an abnormality has occurred.

[0025] Here the central processing unit 3 is assumed to use predetermined control software to issue the update request 32 periodically, updating the value of the time recording area 30. The remote function device 6 is provided with control software 15, which is assumed to be able to release the hang-up state and restart the system when the central processing unit 3 has hung up.

[0026] The remote function device 6, which is connected to the central processing unit 3 and has a time monitoring function, monitors updates to the time held in a predetermined area within the device (the time recording area 30) (31). If the control software has not updated this area (32) for a certain period or longer, the device raises an abnormality interrupt to the central processing unit 3 (34).
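The time monitoring described here is essentially a watchdog timer. A minimal sketch follows, assuming times are plain numbers of seconds passed in explicitly; the class and function names and the timeout value are not from the patent.

```python
class TimeRecordingArea:
    """Area 30: holds the time of the last update request (32)."""

    def __init__(self, now: float) -> None:
        self.last_update = now

    def update(self, now: float) -> None:
        """Called periodically by the CPU's control software while healthy."""
        self.last_update = now


def time_check(area: TimeRecordingArea, now: float, timeout: float) -> bool:
    """Time check 31: True means the abnormality interrupt should be raised.

    The area counts as stale once `timeout` seconds or more have passed
    since the last update.
    """
    return (now - area.last_update) >= timeout


# Usage: the CPU updates the area at t=5 and then stops (hang-up).
area = TimeRecordingArea(now=0.0)
area.update(now=5.0)
print(time_check(area, now=9.0, timeout=10.0))   # False
print(time_check(area, now=15.0, timeout=10.0))  # True
```

Passing `now` in explicitly keeps the sketch deterministic; a real monitor would read a hardware clock instead.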

[0027] To handle this interrupt, the central processing unit 3 requests processing from the operating system (35). The operating system starts the control software 15 of the remote function device 6 (36) and issues a system restart request to the remote function device 6 (37). Once the system has been restored, the operating system requests the control software 13 of the communication device 4 to transmit the data. The notification procedure from this point on is the same as from step 26 of FIG. 2.

[0028] Example 3. Next, FIG. 4 is used to explain how the analysis data (secondary information) collected on the magnetic disk device 5 when a serious failure occurs is transferred to the remote maintenance device. After recovery from a serious failure, or after the system has been restored automatically by the remote function device with its time monitoring mechanism (40), the operating system determines, at the time the system is restored, whether there is data to be transmitted (41). Before the system is brought down, information such as the failure location and failure cause is saved to disk. When the failed system is restarted, the operating system reads from the disk the data saved before the shutdown and determines whether any of it should be transmitted. The operating system performs this check at every system startup, including normal ones; at a normal startup, however, no data from a system crash exists, so no transmission request is issued to the communication device 4. If there is data to be transmitted (42), the operating system requests the control software 13 of the communication device 4 to transmit the data (44).
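The startup check in step 41 amounts to looking for data left behind by the previous shutdown. The sketch below assumes, purely for illustration, that the saved information is a small JSON file; the file name, the field names, and both helper functions are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def save_before_shutdown(path: Path, location: str, cause: str) -> None:
    """Save failure details to disk before the system is brought down."""
    path.write_text(json.dumps({"location": location, "cause": cause}))


def data_to_transmit(path: Path) -> Optional[dict]:
    """Step 41: return the saved crash data, or None on a normal startup."""
    if not path.exists():
        return None
    return json.loads(path.read_text())


with tempfile.TemporaryDirectory() as tmp:
    crash_file = Path(tmp) / "crash.json"
    normal_boot = data_to_transmit(crash_file)      # nothing saved yet
    save_before_shutdown(crash_file, "system disk", "read error")
    reboot_after_crash = data_to_transmit(crash_file)

print(normal_boot)         # None
print(reboot_after_crash)  # {'location': 'system disk', 'cause': 'read error'}
```

Running the same check on every startup, as the text describes, means a normal boot needs no special case: the absence of saved data is itself the signal that nothing has to be sent.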

[0029] The control software 13 of the communication device 4 first sends the minimum necessary data (45). The minimum necessary data here means, for example, the mere fact that the system went down. Additional information such as the time of the crash and its cause may be attached and sent at the same time. When this minimum necessary data has been transmitted to the remote maintenance device 8, the operator at the maintenance center 9 sets a predefined control flag, which is sent back to the communication device 4. Examining this control flag shows whether the operator at the maintenance center needs more detailed information. The central processing unit 3 checks the value of the control flag obtained via the communication device 4 (46) and determines whether detailed data has been requested. If secondary information has been requested (47), the data on the magnetic disk device is transferred (48). Secondary information here is, for example, a memory image from immediately before the crash, stored before the system was brought down. If no secondary information has been requested, processing proceeds to normal operation (43).
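The two-phase exchange in steps 45 to 48 can be sketched as follows. The flag value `REQUEST_DETAILS`, the function name, and the `send` and `receive_flag` callbacks are assumptions; the text says only that a predefined control flag is returned by the maintenance center.

```python
from typing import Callable, List

REQUEST_DETAILS = 1  # hypothetical flag value meaning "send secondary info"
NO_DETAILS = 0       # hypothetical flag value meaning "nothing more needed"


def report_after_recovery(
    minimal: bytes,
    secondary: bytes,
    send: Callable[[bytes], None],
    receive_flag: Callable[[], int],
) -> bool:
    """Send the minimal report, then the secondary data only on request.

    Returns True if the secondary information was transferred.
    """
    send(minimal)                    # step 45: minimum necessary data
    flag = receive_flag()            # step 46: check the control flag
    if flag == REQUEST_DETAILS:      # step 47: details requested
        send(secondary)              # step 48: transfer the disk data
        return True
    return False                     # step 43: back to normal processing


# Usage: the operator asks for the detailed data.
sent: List[bytes] = []
transferred = report_after_recovery(
    b"system down", b"memory image", sent.append, lambda: REQUEST_DETAILS
)
print(transferred, sent)  # True [b'system down', b'memory image']
```

Sending the small report first and the large one only on request is what keeps line usage low when the operator does not need the memory image.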

[0030] Example 4. Example 1 above described the case where a machine check occurs because of a hardware failure, and Example 2 described the detection of a hang-up state. Other failures that make continued operation of the system impossible include, for example, a program check in the operating system, or a failure of the system disk that prevents it from operating.

[0031] Example 5. The examples above described the case where the remote maintenance device and the communication device are connected by public lines, but a private line may be used instead of a public line. A permanently connected line may also be used, as may a communication means such as a local area network.

[0032]

[Effects of the Invention] As described above, according to the present invention, when a serious failure that leads to a system crash occurs, the failure information can be sent automatically to the remote maintenance device over a communication line before automatic recovery is performed, so urgent system abnormality reports are delivered faster and more accurately. The subsequent response can therefore also be carried out in a shorter time. In addition, because the system detects its own abnormalities, it can report them to the remote site, making more reliable remote maintenance possible.

[Brief description of drawings]

[FIG. 1] A system block diagram showing an embodiment of the apparatus for the remote maintenance fault monitoring system of the present invention.

[FIG. 2] A processing flow chart showing how the communication device operates when a serious failure occurs.

[FIG. 3] A processing flow chart explaining the method for detecting a hang-up state.

[FIG. 4] A processing flow chart explaining how the analysis data (secondary information) collected on the magnetic disk device is transferred to the remote maintenance device when a serious failure occurs.

[Explanation of symbols]

1 Peripheral device
2 Peripheral device
3 Central processing unit
4 Communication device
5 Magnetic disk device
6 Remote function device
7 Automatic alternate dial
8 Remote maintenance device
9 Maintenance center
10, 11, 12 Public lines
13 Control firmware
14 Main storage device
15 Control software

Claims

[Claims]

1. A remote maintenance fault monitoring system comprising a central processing unit, peripheral devices connected to the central processing unit, and a remote maintenance device connected to these devices via a line, characterized by having means for notifying the remote maintenance device of a failure via the line, before the system is automatically restored, when a failure that makes continued operation of the system impossible occurs in the central processing unit and the peripheral devices connected to the central processing unit.

2. A remote maintenance fault monitoring system comprising a central processing unit, peripheral devices connected to the central processing unit, and a remote maintenance device connected to these devices via a line, characterized by comprising means for detecting a hang-up state of the system in the central processing unit and the peripheral devices connected to the central processing unit, and means for automatically restarting the system when a hang-up occurs and, after the restart is complete, notifying the remote maintenance device of the failure via the line.

3. A remote maintenance fault monitoring system provided with means for notifying a remote maintenance device via a line, characterized in that the means for notifying the remote maintenance device has a plurality of lines provided in advance and has means for selecting another usable line and performing the notification when a line is unavailable.

4. A remote maintenance fault monitoring system comprising a central processing unit, peripheral devices connected to the central processing unit, and a remote maintenance device connected to these devices via a line, characterized by comprising: means for collecting data relating to an abnormality in a storage device before the system is restored, when a system abnormality is detected in the central processing unit and the peripheral devices connected to the central processing unit; means for notifying the remote maintenance device of failure information when the system is restored; and means for transferring to the remote maintenance device, based on a request from the remote maintenance device, the data collected in the storage device at the time of automatic restoration.