JP2000324105A

JP2000324105A - Collection system for fault information

Info

Publication number: JP2000324105A
Application number: JP11132964A
Authority: JP
Inventors: Katsuyuki Morishita; 克之森下
Original assignee: NEC Engineering Ltd
Current assignee: NEC Engineering Ltd
Priority date: 1999-05-13
Filing date: 1999-05-13
Publication date: 2000-11-24

Abstract

PROBLEM TO BE SOLVED: To collect the fault information centering around the data of higher priorities and also to collect the representative data of lower priorities when plural faults simultaneously occur by locking the fault information which cannot be updated if the fault occurs while another fault is processed. SOLUTION: An input/output processor 5 notifies a central processor 3 of a fault occurring at a channel part 6a and also locks a fault information saving area 2 to prepare to transfer the fault information data. Then a fault information control part recognizes that the transfer of the fault information data is prepared. Meanwhile, if the fault information control part recognizes a fault of a channel part 6b, the fault of the part 6b is informed to the processor 5. When the processor 5 notifies the processor 3 of the fault of the channel 6b, the fault information control part acquires a lock flag and inhibits writing to a fault information data part while the lock flag is acquired.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a fault information collection method, and in particular to an information processing apparatus that handles fault information.

[0002]

[Prior Art] Conventionally, when multiple pieces of fault information occur, this type of fault information collection method performs exclusive control of the fault information save area in the main storage of the information processing apparatus by providing a lock byte, which indicates that the save area holds valid information that has not yet been collected. As a result, only the fault information that acquired the lock byte of the fault information save area is collected, regardless of its priority, and fault information that could not acquire the lock byte is discarded. In addition, the same fault information data is collected regardless of the cause of the fault.

[0003] On the other hand, as disclosed for example in the "Line Fault Notification Method" of Japanese Unexamined Patent Publication No. H7-23041, a technique is also used in which an information processing apparatus collects only the high-priority fault information.

[0004] FIG. 5 is a block diagram of an example of a conventional fault information collection method. It comprises a main storage device 1 including a fault information save area 2, a central processing unit 3, a diagnostic processor 4, an input/output processor 5, a fault detection unit 9 including channel units 6 (a to d), peripheral devices 8 (a to d), and a general-purpose processor interface bus 7.

[0005] An OS (operating system, not shown) issues an I/O instruction to the central processing unit 3, which is connected to the main storage device 1 via the general-purpose processor interface bus 7. The central processing unit 3 prepares a channel program for executing the I/O (input/output) instruction in the main storage device 1 and notifies the input/output processor 5, via the general-purpose processor interface bus 7, that the channel program has been prepared.

[0006] After several exchanges between the central processing unit 3 and the input/output processor 5 over the general-purpose processor interface bus 7, the channel program is executed by transferring data between the main storage device 1 and the peripheral devices 8a to 8d under the channel units 6a to 6d. In addition, a diagnostic processor 4 is connected via the general-purpose processor interface bus 7. Its purpose is to diagnose faults of the information processing apparatus, collect the fault information data, release the lock byte on the fault information save area 2 after the data has been collected, and initialize the information processing apparatus.

[0007] In an apparatus configured as described above, when, for example, faults occur in the channel units 6a and 6b, the fault detection unit 9 detects them and notifies the input/output processor 5 of each fault. On recognizing this, the input/output processor 5 notifies the central processing unit 3 that faults have occurred in the channels.

[0008] As a result, the OS recognizes that faults have occurred in the channel units 6a and 6b. The input/output processor 5 acquires the lock byte on the fault information save area 2 and transfers the fault information to the fault information save area 2 in the main storage device 1. At the same time, it notifies the diagnostic processor 4 that a fault has occurred, whereupon the diagnostic processor 4 collects the fault information from the fault information save area 2 and releases the lock byte on the fault information save area 2. Because only one fault information save area 2 exists per apparatus, the lock byte on the fault information save area 2 is acquired, and the area is locked, at the point when the fault of the first channel unit 6a is written.

[0009] The subsequent fault of the channel unit 6b cannot obtain the lock on the fault information save area 2, so the fault information of the channel unit 6b is discarded. Only one fault information save area 2 exists per apparatus because it is impossible to predict how many resources would be needed if multiple faults occurred simultaneously.
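The conventional lock-byte behaviour described above can be sketched as follows. This is only an illustrative model, not code from the patent: the names `SaveArea`, `report_fault`, and `collect` are hypothetical, and the single save area is reduced to one lock byte plus one data slot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SaveArea:
    """Hypothetical model of the single fault information save area 2."""
    lock_byte: bool = False      # set while uncollected fault data is present
    data: Optional[str] = None   # the one fault record the area can hold


def report_fault(area: SaveArea, fault: str) -> bool:
    """Try to store a fault; return False if it had to be discarded."""
    if area.lock_byte:
        return False             # lock byte already held: the fault is dropped
    area.lock_byte = True
    area.data = fault
    return True


def collect(area: SaveArea) -> Optional[str]:
    """Diagnostic processor side: read the record, then release the lock byte."""
    data, area.data = area.data, None
    area.lock_byte = False
    return data


area = SaveArea()
stored_6a = report_fault(area, "channel 6a fault")   # acquires the lock byte
stored_6b = report_fault(area, "channel 6b fault")   # arrives too late
collected = collect(area)
```

In this sketch the second fault is lost no matter how important it is, which is the weakness the invention addresses.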

[0010] Next, FIG. 6 is a detailed configuration diagram of the portion of the apparatus of FIG. 5 that includes the fault detection unit 9. The fault detection unit 9 is connected to the channel units 6 (a to d) and has a fault factor analysis unit 34 and a control unit 32. Fault information from each channel unit 6 (a to d) is input after being grouped into three fault types A to C (33a to 33c). A priority determination means (not shown) is provided in which the priority order 33a > 33b > 33c is set among the fault types A to C. When fault information arises from more than one of the three groups, the fault factor analysis unit 34 notifies the control unit 32 of the single piece of fault information selected by the priority determination means. Because processing is then performed on the fault notification from the one prioritized group, more effective fault information analysis becomes possible.
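As a rough illustration of the priority determination among fault types A to C, the selection could look like the sketch below. The `PRIORITY` table and the `select_fault` function are hypothetical names introduced here; the patent only states the ordering 33a > 33b > 33c.

```python
# Hypothetical ranking: a smaller number means a higher priority,
# mirroring the stated order 33a (type A) > 33b (type B) > 33c (type C).
PRIORITY = {"A": 0, "B": 1, "C": 2}


def select_fault(pending):
    """Return the highest-priority (fault_type, channel) pair, or None if empty."""
    if not pending:
        return None
    return min(pending, key=lambda fault: PRIORITY[fault[0]])


# Three groups report at once; only the type A fault is forwarded.
chosen = select_fault([("C", "6a"), ("A", "6c"), ("B", "6b")])
```

Everything other than the chosen fault is dropped in this scheme, so no information about the lower-priority faults survives.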

[0011]

[Problems to Be Solved by the Invention] A problem with the prior art described above is that, when multiple faults occur, only the fault information of the one that occurred first can be collected, so important fault information may not be collected.

[0012] Furthermore, with the techniques of FIGS. 5 and 6, when multiple faults are prioritized and fault information is collected for the higher-priority fault, no fault information at all can be collected for the lower-priority fault. In addition, because fixed data is collected as fault information even when the fault factors differ, some factors yield unnecessary data while others lack detailed data.

[0013] An object of the present invention is to make effective use of a limited fault information save area when multiple faults occur simultaneously, by collecting fault information centered on the data of the high-priority fault while also collecting representative data for the low-priority fault, and by flexibly changing the data to be collected according to the cause of the fault.

[0014]

[Means for Solving the Problems] To solve the above problems, the fault information collection method according to the present invention adopts the following characteristic configurations.

[0015] (1) A fault information collection method having a fault detection unit that detects a fault and a fault processing unit that handles the fault that has occurred, wherein, when another fault occurs while the fault processing unit is executing fault processing, a priority order is determined and the fault information is placed in a locked state in which it cannot be updated.

[0016] (2) A fault information collection method that collects fault information and includes a plurality of channel units each connected to a peripheral device and a fault detection unit that detects the occurrence of a fault in the channel units, the method having a fault information collection unit that is connected to the fault detection unit and includes a fault information buffer that stores the fault information.

[0017] (3) The fault information collection method of (2) above, wherein the fault information buffer includes a fault information data section that takes a locked state and a lock-released state in order to protect the stored data.

[0018] (4) The fault information collection method of (3) above, wherein the fault information data section has a common section into which common data is written regardless of the fault factor, and a flexible section into which the data best suited to the factor is written.

[0019] (5) The fault information collection method of (2), (3), or (4) above, wherein the fault information collection unit has a fault information control unit that compares the priorities of successively occurring fault information and notifies the fault information buffer.

[0020]

[Embodiments of the Invention] A preferred embodiment of the fault information collection method according to the present invention is described in detail below with reference to the attached FIGS. 1 to 4. For convenience, elements similar to those of the conventional fault information collection method of FIGS. 5 and 6 are given the same reference numerals.

[0021] First, FIG. 1 is a block diagram of a preferred embodiment of the fault information collection method according to the present invention. It comprises a main storage device 1 including a fault information save area 2, a central processing unit 3, a diagnostic processor 4, an input/output processor 5, channel units 6 (a to d), a general-purpose processor interface bus 7, peripheral devices 8 (a to d), a fault detection unit 9, and a fault information collection unit 10.

[0022] The OS (not shown) issues an I/O instruction to the central processing unit 3, which is connected to the main storage device 1 via the general-purpose processor interface bus 7. A channel program for executing the I/O instruction is prepared on the central processing unit 3, and the input/output processor 5 is notified, via the general-purpose processor interface bus 7, that the channel program has been prepared.

[0023] After several exchanges of communication between the central processing unit 3 and the input/output processor 5 over the general-purpose processor interface bus 7, the channel program is executed by transferring data between the main storage device 1 and the peripheral devices 8a to 8d under the channel units 6a to 6d. In addition, a diagnostic processor 4 is connected via the general-purpose processor interface bus 7. Its purpose is to diagnose faults of the information processing apparatus, collect the fault information data, release the lock byte on the fault information save area 2 after the data has been collected, and initialize the information processing apparatus.

[0024] A fault detection unit 9, which detects that a fault has occurred in any of the channel units 6a to 6d, is connected to the corresponding channel units 6. A fault information collection unit 10, to which the fault detection unit 9 reports the occurrence of a fault, is connected between the input/output processor 5 and the fault detection unit 9.

[0025] Next, FIG. 2 is a detailed block diagram of the fault information collection unit 10 of FIG. 1. The fault information collection unit 10 consists of a fault information control unit 12 connected to the fault detection unit 9, and a fault information buffer 11 that includes a right-holder register 13, a lock flag 14, and a fault information data section 15.

[0026] Next, the operation of the fault information collection method according to the present invention is described with reference to FIGS. 1 and 2. In the method of FIG. 1, assume that immediately after a fault of low-priority factor A occurs in, for example, the channel unit 6a, a fault of high-priority factor B occurs in, for example, the channel unit 6b. The fault detection unit 9 first recognizes the fault of the channel unit 6a, where the fault occurred first, and notifies the fault information collection unit 10 of the fault.

[0027] The fault information control unit 12 in the fault information collection unit 10 detects this and writes, into the right-holder register 13 in the fault information buffer 11, that the fault is factor A from the channel unit 6a. In doing so, the fault information control unit 12 checks the lock flag 14 and the right-holder register 13, confirming that no lock is held and that the fault information data section 15 holds no valid fault information.

[0028] FIG. 3 is a detailed configuration diagram of the fault information data section 15 in FIG. 2. As shown in FIG. 3, the fault information data section 15 consists of a common section 16 and a flexible section 17. Common data is written to the common section 16 regardless of the factor, whereas the data written to the flexible section 17 varies with the factor. At this point, therefore, the data best suited to factor A is collected and written to the flexible section 17.
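One way to picture the fault information buffer 11 and its data section is the small model below. It is a sketch under stated assumptions: the class names, the `FLEXIBLE_FIELDS` table, and the field names such as `status_word` are all hypothetical, since the patent does not specify which data each factor collects.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class FaultDataSection:
    """Fault information data section 15."""
    common: list = field(default_factory=list)    # common section 16
    flexible: dict = field(default_factory=dict)  # flexible section 17


@dataclass
class FaultInfoBuffer:
    """Fault information buffer 11."""
    right_holder: Optional[Tuple[str, str]] = None  # register 13: (channel, factor)
    lock_flag: bool = False                         # lock flag 14
    data: FaultDataSection = field(default_factory=FaultDataSection)


# Hypothetical table: which raw fields are worth keeping for each factor.
FLEXIBLE_FIELDS = {
    "A": ("status_word",),
    "B": ("status_word", "register_dump"),
}


def record_fault(buf: FaultInfoBuffer, channel: str, factor: str, raw: dict) -> None:
    """Write the factor-independent entry and the factor-specific detail."""
    buf.right_holder = (channel, factor)
    buf.data.common.append({"channel": channel, "factor": factor})
    buf.data.flexible = {name: raw[name] for name in FLEXIBLE_FIELDS[factor]}


buf = FaultInfoBuffer()
record_fault(buf, "6a", "A", {"status_word": 0x10, "register_dump": [1, 2, 3]})
```

Here a factor A fault keeps only its status word, while the same raw input recorded under factor B would also keep the register dump.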

[0029] The fault information collection unit 10 notifies the input/output processor 5 that a fault has occurred in the channel unit 6a. On recognizing this, the input/output processor 5 notifies the central processing unit 3 of the fault from the channel unit 6a. The OS thereby recognizes, via the central processing unit 3, that a fault has occurred in the channel unit 6a and performs fault handling for the channel unit 6a. Typically, this is either disconnection of the channel unit 6a or recovery processing after resetting the channel unit 6a.

[0030] Meanwhile, the input/output processor 5 prepares to transfer the fault information data to the fault information save area 2 in the main storage device 1. If, however, before the data transfer the fault detection unit 9 recognizes the fault from the channel unit 6b, where the next fault occurred, and notifies the fault information collection unit 10, the fault information control unit 12 in the fault information collection unit 10 detects this and attempts to write, into the right-holder register 13 in the fault information buffer 11, that the fault is factor B from the channel unit 6b.

[0031] At this time, as in the case above, the lock flag 14 and the right-holder register 13 are checked. Here the lock flag 14 has not been acquired, but the data for factor A from the channel unit 6a has already been written to the right-holder register 13, so the fault information control unit 12 compares the priorities of factor A and factor B. Because factor B has higher priority than factor A, the fault information control unit 12 extracts only the representative data from the factor A data in the fault information data section 15 and writes to the common section 16 the fact that the factor A fault from the channel unit 6a occurred, together with that representative data. It then discards the factor A fault information from the channel unit 6a that was in the flexible section 17 and writes the data best suited to factor B to the flexible section 17. It also writes to the right-holder register 13 data indicating that factor B from the channel unit 6b has occurred.
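The replacement of factor A by the higher-priority factor B can be sketched as below. The dictionary layout, the `representative` key, and the `preempt` function are assumptions made for illustration; the patent does not define what the representative data contains.

```python
# Hypothetical ranking for this embodiment: a larger number means a higher
# priority, so factor B outranks factor A as in the scenario described above.
PRIORITY = {"A": 1, "B": 2}


def preempt(buf: dict, channel: str, factor: str, detail: dict) -> bool:
    """Let a higher-priority fault take over the flexible section.

    Returns True if the new fault replaced the one recorded in the
    right-holder register, False if it did not outrank it.
    """
    held_channel, held_factor = buf["right_holder"]
    if PRIORITY[factor] <= PRIORITY[held_factor]:
        return False
    # Keep only the representative data of the displaced fault in the common section.
    buf["common"].append({
        "channel": held_channel,
        "factor": held_factor,
        "representative": buf["flexible"]["representative"],
    })
    # Discard the old detail and store the data suited to the new factor.
    buf["flexible"] = dict(detail)
    buf["right_holder"] = (channel, factor)
    return True


buf = {
    "right_holder": ("6a", "A"),
    "common": [],
    "flexible": {"representative": "6a timeout", "log": ["t0", "t1"]},
}
replaced = preempt(buf, "6b", "B",
                   {"representative": "6b parity error", "registers": [0x1F, 0x02]})
```

After the call, the factor A detail (`log`) is gone, but a one-line record of the 6a fault remains in the common section.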

[0032] Meanwhile, the fault information collection unit 10 notifies the input/output processor 5 that a fault has occurred in the channel unit 6b. On recognizing this, the input/output processor 5 notifies the central processing unit 3 of the fault from the channel unit 6b. The OS thereby recognizes, via the central processing unit 3, that a fault has occurred in the channel unit 6b and performs fault handling for the channel unit 6b.

[0033] Further, after acquiring the lock byte on the fault information save area 2, the input/output processor 5 finishes preparing the transfer of the fault information data and informs the fault information collection unit 10 that it is ready. On receiving this notification, the fault information collection unit 10 acquires the lock of the lock flag 14 in the fault information buffer 11 in order to protect the data in the fault information data section 15. Once this is done, the data in the fault information data section 15 is actually transferred, through the input/output processor 5, to the fault information save area 2 in the main storage device 1. When the data transfer ends, the fault information control unit 12 clears the right-holder register 13 and the lock flag 14 in the fault information buffer 11.

[0034] Meanwhile, the input/output processor 5 notifies the diagnostic processor 4 that valid fault information has been transferred to the fault information save area 2. The diagnostic processor 4 then collects the fault information from the fault information save area 2 and, when collection is complete, releases the lock byte on the fault information save area 2.

[0035] Next, an example of the series of operations described above is explained in detail with reference to FIG. 4. When the fault information control unit 12 recognizes a fault in the channel unit 6a, it checks whether the lock flag 14 is locked (step 1). If it is locked, collection of the fault information is abandoned and processing skips ahead to the fault notification to the input/output processor 5.

[0036] If the lock flag is not locked in step 1, the right-holder register 13 is checked, and if it is not free, the priorities are compared (step 2). If the right-holder register 13 is free, or if the new fault has higher priority than the data written in the right-holder register 13, the fact that the fault is factor A from the channel unit 6a is written to the right-holder register 13, the data is written to the fault information data section 15 in the factor A format, and the representative data of the fault previously recorded in the right-holder register 13 is left in the fault information data section 15 as history (step 3).

[0037] The fault information control unit 12 notifies the input/output processor 5 of the fault from the channel unit 6a. The input/output processor 5 notifies the central processing unit 3 of the fault from the channel unit 6a, takes the lock on the fault information save area 2, and prepares for the transfer of the fault information data. When it is ready for the data transfer, it notifies the fault information control unit 12. If, in the meantime, the fault information control unit 12 recognizes a fault in the channel unit 6b, it executes step 1, step 2, and step 3 in the same way as above. In step 3, if the new fault has lower priority than the data written in the right-holder register 13, only its representative data is written to the fault information data section 15 as history.
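Steps 1 to 3 can be summarised in a short sketch. The `Fault` and `Buffer` classes, the `PRIORITY` table, and the return strings are hypothetical. The text does not say how two faults of equal priority are handled; this sketch assumes the earlier one keeps the register.

```python
from dataclasses import dataclass, field
from typing import Optional

PRIORITY = {"A": 1, "B": 2}  # hypothetical: a larger number is a higher priority


@dataclass
class Fault:
    channel: str
    factor: str
    representative: str   # short summary that can be kept as history
    detail: dict          # factor-specific data for the data section


@dataclass
class Buffer:
    lock_flag: bool = False
    right_holder: Optional[Fault] = None
    history: list = field(default_factory=list)
    detail: dict = field(default_factory=dict)


def on_fault(buf: Buffer, fault: Fault) -> str:
    # Step 1: while the lock flag is held, give up collecting this fault.
    if buf.lock_flag:
        return "skipped"
    # Step 2: check the right-holder register and compare priorities.
    held = buf.right_holder
    if held is None or PRIORITY[fault.factor] > PRIORITY[held.factor]:
        # Step 3: the new fault takes over; the old one survives only as history.
        if held is not None:
            buf.history.append((held.channel, held.factor, held.representative))
        buf.right_holder = fault
        buf.detail = dict(fault.detail)
        return "stored"
    # Lower (or, by assumption, equal) priority: keep representative data only.
    buf.history.append((fault.channel, fault.factor, fault.representative))
    return "history-only"


buf = Buffer()
fault_a = Fault("6a", "A", "a-summary", {"status": 1})
fault_b = Fault("6b", "B", "b-summary", {"status": 2, "trace": [3]})
first = on_fault(buf, fault_a)    # register is free
second = on_fault(buf, fault_b)   # higher priority, displaces factor A
```

In every branch the fault would still be reported to the input/output processor afterwards; that notification is left out of the sketch.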

[0038] The fault information control unit 12 notifies the input/output processor 5 of the fault from the channel unit 6b. The input/output processor 5 notifies the central processing unit 3 of the fault from the channel unit 6b.

[0039] When the fault information control unit 12 is notified that the data transfer is ready, it acquires the lock flag 14. While the lock flag 14 is held, writing to the fault information data section 15 is prohibited. The fault information control unit 12 starts the data transfer from the fault information data section 15 to the fault information save area 2 in the main storage device 1. When the data transfer ends, it releases the lock flag 14 and also clears the right-holder register 13.
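The role of the lock flag during the transfer can be illustrated as follows. This is a single-threaded sketch with hypothetical names (`write_flexible`, `transfer_to_save_area`) and a plain dictionary standing in for the buffer. The mid-transfer state is simulated by setting the flag by hand.

```python
def write_flexible(buf: dict, data: dict) -> bool:
    """Write to the data section unless the lock flag is held."""
    if buf["lock_flag"]:
        return False              # writes are prohibited while locked
    buf["flexible"] = dict(data)
    return True


def transfer_to_save_area(buf: dict, save_area: dict) -> None:
    """Copy the data section to the save area while holding the lock flag."""
    buf["lock_flag"] = True
    try:
        save_area["common"] = list(buf["common"])
        save_area["flexible"] = dict(buf["flexible"])
    finally:
        # After the transfer, release the lock flag and clear the register.
        buf["lock_flag"] = False
        buf["right_holder"] = None


buf = {
    "lock_flag": False,
    "right_holder": ("6b", "B"),
    "common": [{"channel": "6a", "factor": "A"}],
    "flexible": {"detail": "6b data"},
}
save_area = {}

buf["lock_flag"] = True                                   # simulate a transfer in progress
rejected = write_flexible(buf, {"detail": "late fault"})  # refused, data stays intact
transfer_to_save_area(buf, save_area)
accepted = write_flexible(buf, {"detail": "next fault"})  # allowed again after release
```

The lock flag protects the buffer only for the duration of the copy, whereas the lock byte on the save area stays held until the diagnostic processor has collected the data.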

[0040] The input/output processor 5 notifies the diagnostic processor 4 that the fault data has been transferred. The diagnostic processor 4 then collects the fault information from the fault information save area 2 and, when collection is complete, releases the lock byte on the fault information save area 2.

[0041] The aim of these operations is to make effective use of limited data capacity: when multiple faults occur simultaneously, more data is collected for the high-priority fault information, at least the representative data is collected for the low-priority fault information, and the data to be collected is changed flexibly according to the cause of the fault.

[0042] The configuration and operation of a preferred embodiment of the fault information collection method according to the present invention have been described in detail above. However, the present invention should not be limited to this specific example, and those skilled in the art will readily understand that various modifications and changes are possible without departing from the gist of the present invention.

[0043]

[Effects of the Invention] As can be understood from the above description, the fault information collection method of the present invention has the notable practical effect that, even when multiple faults occur simultaneously, more of the high-priority fault information is collected, representative data is also collected for the low-priority fault information, and the data to be collected can be changed flexibly according to the cause of the fault.

[Brief description of the drawings]

FIG. 1 is a block diagram of a preferred embodiment of the fault information collection method according to the present invention.

FIG. 2 is a detailed block diagram of the fault information collection unit in FIG. 1.

FIG. 3 is a configuration diagram of the fault information data section in FIG. 2.

FIG. 4 is an operation time chart of the fault information collection method shown in FIGS. 1 to 3.

FIG. 5 is a block diagram of an example of a conventional fault information collection method.

FIG. 6 is a configuration block diagram of the fault information detection unit in FIG. 5.

[Explanation of symbols]

6a to 6d: Channel units
8a to 8d: Peripheral devices
9: Fault detection unit
10: Fault information collection unit
11: Fault information buffer
12: Fault information control unit
15: Fault information data section
16: Common section
17: Flexible section

Claims

[Claims]

1. A fault information collection method having a fault detection unit that detects a fault and a fault processing unit that handles the fault that has occurred, wherein, when another fault occurs while the fault processing unit is executing fault processing, a priority order is determined and the fault information is placed in a locked state in which it cannot be updated.

2. A fault information collection method that collects fault information and includes a plurality of channel units each connected to a peripheral device and a fault detection unit that detects the occurrence of a fault in the channel units, the method having a fault information collection unit that is connected to the fault detection unit and includes a fault information buffer that stores the fault information.

3. The fault information collection method according to claim 2, wherein the fault information buffer includes a fault information data section that takes a locked state and a lock-released state in order to protect the stored data.

4. The fault information collection method according to claim 3, wherein the fault information data section has a common section into which common data is written regardless of the fault factor, and a flexible section into which the data best suited to the factor is written.

5. The fault information collection method according to claim 2, 3, or 4, wherein the fault information collection unit has a fault information control unit that compares the priorities of successively occurring fault information and notifies the fault information buffer.