JPH07141265A

JPH07141265A - Error monitoring method for magnetic disk device

Info

Publication number: JPH07141265A
Application number: JP5286596A
Authority: JP
Inventors: Takashi Kuramochi; 高志倉持
Original assignee: NEC Information Service Co Ltd
Current assignee: NEC Information Service Co Ltd
Priority date: 1993-11-16
Filing date: 1993-11-16
Publication date: 1995-06-02

Abstract

PURPOSE: To minimize manual intervention in the monitoring of error information by automating both short-term and long-term monitoring of the error information. CONSTITUTION: A short-term comparison unit 7 reads an error reference count from an error reference count parameter file 2 and compares it with the data tallied by a short-term classification and tallying unit 6, and an error accumulation processing unit 8 stores the comparison result of the short-term comparison unit 7 in a disk-system error accumulation file 4. An information selection unit 11 reads selection conditions from an information selection condition storage file 3 and selects the necessary data from the results tallied by a long-term classification and tallying unit 10, and a long-term comparison unit 12 reads an error reference count from the error reference count parameter file 2 and compares it with the data selected by the information selection unit 11. An alarm output unit 13 then operates an output device 14 according to the decision of the short-term comparison unit 7 or the long-term comparison unit 12 to generate an alarm.

Description

Detailed Description of the Invention

[0001]

[Field of industrial application] The present invention relates to an error monitoring method for monitoring the occurrence of errors relating to magnetic disk devices in a computer system.

[0002]

[Prior art] FIG. 4 is a flowchart showing an example of a conventional error monitoring method for a magnetic disk device.

[0003] Conventional error monitoring methods for monitoring the occurrence of errors relating to magnetic disk devices in a computer system generally take one of two forms: an error information report on magnetic disk device errors is output and maintenance personnel (or operators) evaluate it manually, or maintenance personnel (or operators) are notified each time an error occurs.

[0004] Accordingly, maintenance personnel (or operators) who have been informed of the error occurrence status by an error information report or an error notification classify and tally the magnetic disk devices in which errors occurred and the types of errors that occurred. When they recognize from the tally signs of an impending media failure in a magnetic disk device, they take preventive measures against failure for that device.

[0005] FIG. 4 is a flowchart showing an example of such a conventional error monitoring method for a magnetic disk device.

[0006] In FIG. 4, error information storage file 1 stores the error information reported to the operating system by each hardware component of the computer system. The computer system reads the error information from error information storage file 1 (step 41), extracts only the information relating to magnetic disk devices (step 42), edits it (step 43), and outputs an error information report once a day (step 44).

[0007] For example, suppose that a particular "error event A" has occurred three times in a particular magnetic disk device, and that preventive measures against failure are to be taken for a magnetic disk device once "error event A" has occurred five times. From the error information report described above, the judgment is then that "preventive measures against failure are not necessary."

[0008] However, if "error event A" also occurred on or before the previous day, and the count shown in the daily report has not reached the reference value of five, the outcome is that "no preventive measures against failure are taken" even though this magnetic disk device has a high probability of failing because of "error event A."
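The gap described in paragraphs [0007] and [0008] can be illustrated with a small sketch. The counts for earlier days are hypothetical values chosen for illustration; only the current-day count of three and the reference value of five come from the text.

```python
REFERENCE_COUNT = 5  # reference value for "error event A" from the example

# Hypothetical daily counts of "error event A" on one disk device.
# Only the last value (3, the current day) comes from the text.
daily_counts = [2, 1, 3]


def needs_action_daily(counts, reference):
    """Judge from the current day's report only (the conventional method)."""
    return counts[-1] >= reference


def needs_action_cumulative(counts, reference):
    """Judge from the total over the whole review period."""
    return sum(counts) >= reference


print(needs_action_daily(daily_counts, REFERENCE_COUNT))       # False
print(needs_action_cumulative(daily_counts, REFERENCE_COUNT))  # True
```

With these assumed values the daily report alone suggests no action, while the total over the three days has already reached the reference value.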

[0009] Thus, error information from the previous day and earlier is not used in judging whether preventive measures against failure are needed on the current day. To take that earlier error information into account, maintenance personnel (or operators) must review all the error information for a fixed period (for example, one week or one month) and check the history of past error occurrences.

[0010]

[Problems to be solved by the invention] As described above, in the conventional error monitoring method for a magnetic disk device, monitoring the occurrence of error events in a way that includes error information from the previous day and earlier requires maintenance personnel (or operators) to review all the error information for a fixed period and check the history of past error occurrences. This imposes a heavy burden in man-hours. Moreover, because the work is done by visual inspection, it depends on the experience of the maintenance personnel (or operators), and there is a risk that important error information will be overlooked and a magnetic disk device failure will result.

[0011]

[Means for solving the problems] The error monitoring method for a magnetic disk device according to the present invention comprises: extracting only the error information relating to magnetic disk devices from an error information storage file that stores information on errors of the hardware of a computer system in general; classifying and tallying the error information by magnetic disk device and by error type to obtain a short-term tally result; comparing the short-term tally result with an error reference count that is set at job submission as an upper-limit count for each error type; outputting an alarm when the short-term tally result exceeds the error reference count; accumulating and storing the short-term tally result in a disk-system error accumulation file; classifying and tallying all the information stored in the disk-system error accumulation file by magnetic disk device and by error type to obtain a long-term tally result; selecting from the long-term tally result only the information that matches preset information selection conditions to obtain an information selection result; and comparing the information selection result with the error reference count and outputting an alarm when the information selection result exceeds the error reference count. Furthermore, the method is incorporated into an automatic job operation system so that it runs automatically at a fixed cycle, and the disk-system error accumulation file is a generation management file comprising files of multiple generations, a new file being started every fixed period, with the fixed period settable to any length.

[0012]

[Embodiment] An embodiment of the present invention will now be described with reference to the drawings.

[0013] FIG. 1 is a flowchart showing one embodiment of the present invention, FIG. 2 is a block diagram expressing the embodiment of FIG. 1 as functional blocks, and FIG. 3 is a format diagram showing an example of the contents of each file in the embodiment of FIG. 1, in which (a) shows the error information storage file, (b) the error reference parameter file, (c) the disk-system error accumulation file, and (d) the information selection condition file.

[0014] As shown in FIG. 2, this embodiment uses the following: an error information storage file 1, which stores information on errors of the computer system hardware in general; an error reference count parameter file 2, which stores, for each type of magnetic disk device error, an upper-limit count determined from the content and severity of the error and set as the reference count for that error; a disk-system error accumulation file 4, which accumulates and stores the error information of the magnetic disk devices; an information selection condition storage file 3, which stores the conditions for selecting the necessary information from the error information stored in disk-system error accumulation file 4; and an output device 14, which outputs alarms.

The error information input/extraction unit 5 reads error information from error information storage file 1 and extracts the error information relating to magnetic disk devices. The short-term classification and tallying unit 6 classifies the error information extracted by unit 5 by error type for each magnetic disk device and tallies it. The short-term comparison unit 7 reads the error reference counts from error reference count parameter file 2 and compares them with the data tallied by the short-term classification and tallying unit 6. The error accumulation processing unit 8 stores the result compared by the short-term comparison unit 7 in disk-system error accumulation file 4. The accumulated error input unit 9 reads the information stored in disk-system error accumulation file 4. The long-term classification and tallying unit 10 classifies the information read by the accumulated error input unit 9 by error type for each magnetic disk device and tallies it. The information selection unit 11 reads the selection conditions from information selection condition storage file 3 and uses them to select the necessary data from the results tallied by the long-term classification and tallying unit 10. The long-term comparison unit 12 reads the error reference counts from error reference count parameter file 2 and compares them with the data selected by the information selection unit 11. The alarm output unit 13 operates output device 14 according to the determination of the short-term comparison unit 7 or the long-term comparison unit 12 to generate an alarm.

[0015] The operation described above is explained in detail below with reference to FIGS. 1 and 3.

[0016] All error information for the hardware of the computer system, including the magnetic disk devices, is stored in error information storage file 1 by the operating system.

[0017] As shown in FIG. 3(a), error information storage file 1 records, as one set, the error occurrence time 51, the name 52 of the device in which the error occurred, and the content of the error (error content) 53.
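As a rough illustration of the record layout in FIG. 3(a), the three fields can be modeled as a small data class. The comma-separated line format, the field names, and the sample values are assumptions made for this sketch; the source does not specify how the file is encoded.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ErrorRecord:
    occurred_at: datetime  # error occurrence time 51
    device_name: str       # device name 52
    content: str           # error content 53


def parse_line(line: str) -> ErrorRecord:
    """Parse one 'timestamp,device,content' line (hypothetical layout)."""
    stamp, device, content = line.strip().split(",", 2)
    return ErrorRecord(datetime.fromisoformat(stamp), device, content)


record = parse_line("1993-11-16T09:30:00,DISK01,seek error")
print(record.device_name, record.content)
```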

[0018] In step 21 of FIG. 1, error information of the kind described above is read from error information storage file 1, and in step 22 only the error information relating to magnetic disk devices is extracted from it. In step 23, the error information extracted in step 22 is classified by error type, and in step 24 the classified error information is tallied for each magnetic disk device.
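Steps 22 to 24 can be sketched as a filter followed by a per-device, per-type count. The rule for recognizing a disk device (a `DISK` name prefix), the mapping from error content to an error type symbol, and the sample records are all hypothetical; the source does not say how either decision is made.

```python
from collections import Counter

# Hypothetical (occurrence time, device name, error content) records.
records = [
    ("09:00", "DISK01", "seek error"),
    ("09:05", "TAPE01", "read error"),
    ("09:10", "DISK01", "seek error"),
    ("09:20", "DISK02", "read error"),
]

# Hypothetical mapping from error content to an error type symbol.
ERROR_TYPES = {"seek error": "A", "read error": "B"}


def is_disk(device_name):
    """Assumed rule: disk devices have names starting with 'DISK'."""
    return device_name.startswith("DISK")


def short_term_tally(records):
    """Return a Counter keyed by (device name, error type)."""
    tally = Counter()
    for _time, device, content in records:
        if not is_disk(device):                      # step 22: extract
            continue
        error_type = ERROR_TYPES.get(content, "?")   # step 23: classify
        tally[(device, error_type)] += 1             # step 24: tally
    return tally


tally = short_term_tally(records)
print(dict(tally))
```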

[0019] Next, in step 25, the error reference counts set for each error type are read from error reference count parameter file 2. As shown in FIG. 3(b), error reference count parameter file 2 records, as one set, an error type 61 that identifies each error by a symbol, the reference count 62 set for that error, and an error type description 63 that explains the content of the error.

[0020] Next, in step 26, the error count for each magnetic disk device tallied in step 24 is compared with the corresponding reference count in error reference count parameter file 2. If the error count exceeds the reference count, the alarm output unit is notified of the device number of the magnetic disk device, the error count, and the error type, and the alarm output unit causes output device 14 to output a message (step 35).
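The comparison in step 26 and the message in step 35 might look like the following sketch. The reference counts, the tally values, and the message wording are illustrative assumptions; the text states only that the device number, error count, and error type are passed to the alarm output unit when the count exceeds the reference count.

```python
# Hypothetical reference counts 62, keyed by error type 61.
reference_counts = {"A": 5, "B": 2}

# Hypothetical short-term tally keyed by (device, error type).
tally = {("DISK01", "A"): 6, ("DISK01", "B"): 2, ("DISK02", "B"): 3}


def find_alarms(tally, reference_counts):
    """Return one message per (device, type) whose count exceeds its reference."""
    alarms = []
    for (device, error_type), count in sorted(tally.items()):
        limit = reference_counts.get(error_type)
        if limit is not None and count > limit:  # strictly "exceeds"
            alarms.append(
                f"{device}: error type {error_type} occurred {count} times "
                f"(reference count {limit})"
            )
    return alarms


alarms = find_alarms(tally, reference_counts)
for message in alarms:
    print(message)
```

A count equal to the reference count does not raise an alarm here, following the wording "exceeds" in the text.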

[0021] Regardless of the result of the comparison between the error information and the reference count in step 26, all the tally results from step 24 are stored in disk-system error accumulation file 4 (step 27).

[0022] As shown in FIG. 3(c), disk-system error accumulation file 4 records, for every error that has occurred in the magnetic disk devices in the past, the name 71 of the device in which the error occurred, the occurrence time 72 of the error, an error type 73 that identifies the type of error by a symbol, and the content of the error (error content) 74 as one set, in no particular order.
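One way to picture the accumulation in step 27 is an append-only file holding the four fields of FIG. 3(c). The CSV encoding, the field names, and the sample rows below are assumptions for this sketch, not the format used in the source.

```python
import csv
import os
import tempfile

# Field order follows FIG. 3(c): device name 71, occurrence time 72,
# error type 73, error content 74. The names themselves are hypothetical.
FIELDS = ["device_name", "occurred_at", "error_type", "content"]


def append_errors(path, rows):
    """Append rows to the accumulation file without touching earlier data."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerows(rows)


def read_errors(path):
    """Read every accumulated record back as a dict."""
    with open(path, newline="") as f:
        return [dict(zip(FIELDS, row)) for row in csv.reader(f)]


path = os.path.join(tempfile.mkdtemp(), "disk_errors.csv")
append_errors(path, [("DISK01", "1993-11-15T09:00", "A", "seek error")])
append_errors(path, [("DISK02", "1993-11-16T10:00", "B", "read error")])
accumulated = read_errors(path)
print(len(accumulated))
```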

[0023] Next, in step 28 the accumulated error information is read from disk-system error accumulation file 4, in step 29 it is classified by error type, and in step 30 it is tallied.

[0024] The tally result of step 30 includes unnecessary errors that should be excluded from the judgment because corrective action has already been completed. In step 31, therefore, the conditions for selecting only the necessary errors are read from information selection condition storage file 3.

[0025] As shown in FIG. 3(d), information selection condition storage file 3 records, as one set, a device name 81, a selection key date and time 82 indicating the date, hour, and minute at which corrective action was taken on that device, and a comment 83 describing the action taken at that time. The items to be processed next are selected on the basis of this information (step 32).
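The selection in steps 31 and 32 could be sketched as follows. The rule used here, dropping any error that occurred at or before the selection key date and time of its device, is an assumption inferred from the statement that already-resolved errors should be excluded; the source does not spell out the exact matching rule. The sketch also filters individual records before counting, which simplifies the order of steps given in the text, and all sample data is hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical accumulated errors: (device name, occurrence time, error type).
accumulated = [
    ("DISK01", datetime(1993, 11, 1, 9, 0), "A"),
    ("DISK01", datetime(1993, 11, 10, 9, 0), "A"),
    ("DISK02", datetime(1993, 11, 5, 9, 0), "B"),
]

# Hypothetical selection conditions: device name 81 -> selection key
# date and time 82 (when corrective action was taken on that device).
selection_conditions = {"DISK01": datetime(1993, 11, 5, 0, 0)}


def select_relevant(errors, conditions):
    """Keep only errors that occurred after the last corrective action."""
    kept = []
    for device, occurred_at, error_type in errors:
        treated_at = conditions.get(device)
        if treated_at is not None and occurred_at <= treated_at:
            continue  # already dealt with; exclude from the judgment
        kept.append((device, occurred_at, error_type))
    return kept


selected = select_relevant(accumulated, selection_conditions)
long_term_tally = Counter((device, etype) for device, _when, etype in selected)
print(dict(long_term_tally))
```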

[0026] Next, in step 33 the error reference counts are read from error reference count parameter file 2, and in step 34 the error count for each magnetic disk device, as tallied in step 30 and selected in step 32, is compared with the corresponding error reference count. If the error count exceeds the reference count, the alarm output unit is notified of the device number of the magnetic disk device, the error count, and the error type, and the alarm output unit causes output device 14 to output a message (step 35). If the error count does not exceed the reference count, processing ends without further action.
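Putting the walk-through of FIG. 1 together, a single run might be sketched as below. Days are represented as plain integers, and the function names, the `resolved` mapping, and the sample data are hypothetical; the sketch only illustrates how a long-term alarm can fire when no single run exceeds the reference count.

```python
from collections import Counter


def exceeding(tally, reference_counts):
    """Return the (device, type) keys whose count exceeds the reference count."""
    return sorted(
        key for key, count in tally.items()
        if count > reference_counts.get(key[1], float("inf"))
    )


def run_once(new_errors, accumulated, reference_counts, resolved):
    """One monitoring run.

    new_errors / accumulated: lists of (device, day, error type).
    resolved: device -> day of the last corrective action (assumed rule).
    """
    # Short-term check on this run's errors only (steps 23 to 26).
    short = Counter((d, t) for d, _day, t in new_errors)
    short_alarms = exceeding(short, reference_counts)
    # Always accumulate, whatever the comparison result (step 27).
    accumulated.extend(new_errors)
    # Long-term check on the selected accumulated errors (steps 28 to 34).
    selected = [(d, day, t) for d, day, t in accumulated
                if day > resolved.get(d, 0)]
    long = Counter((d, t) for d, _day, t in selected)
    return short_alarms, exceeding(long, reference_counts)


accumulated = []
reference_counts = {"A": 5}
run_once([("DISK01", 1, "A")] * 3, accumulated, reference_counts, {})
short_alarms, long_alarms = run_once(
    [("DISK01", 2, "A")] * 3, accumulated, reference_counts, {}
)
print(short_alarms, long_alarms)
```

In this example each run sees three errors, below the reference count of five, but the second run finds six accumulated errors and raises the long-term alarm.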

[0027] In this way, in addition to the monitoring of error information at fixed intervals, the process of automatically comparing the number of error occurrences over a long period with the reference count can be executed as a single series of jobs, which keeps manual intervention in the monitoring of error information to a minimum. Because monitoring accuracy is also improved, failures of magnetic disk devices can be prevented before they occur.

[0028] The above description covers a single processing run, but by incorporating this operation into an automatic job operation system, the error information monitoring process can be executed automatically at a fixed cycle.

[0029] In addition, because the contents of the error reference count parameter file can be changed when the job is submitted, the alarm level can be set as desired according to the characteristics of the magnetic disk devices and similar factors.

[0030] In addition, the disk-system error accumulation file is renewed every fixed period, and a generation management file is formed from multiple disk-system error accumulation files covering different periods (for example, if one disk-system error accumulation file covers an accumulation period of one month and six months of data are to be examined, the generation management file consists of six disk-system error accumulation files). This makes it possible to set the length of the accumulation period as desired.
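The generation management idea can be sketched with an in-memory structure that starts a new "file" whenever the period changes and keeps only a fixed number of generations. The class name, the use of lists in place of real files, and the sample data are assumptions for illustration; the six-generation, one-month example follows the text.

```python
from collections import deque


class GenerationStore:
    """Keeps the most recent N generations of accumulated errors."""

    def __init__(self, generations):
        # deque(maxlen=N) drops the oldest generation automatically.
        self._files = deque(maxlen=generations)
        self._current_period = None

    def append(self, period, record):
        """Store a record, starting a new generation when the period changes."""
        if period != self._current_period:
            self._files.append([])
            self._current_period = period
        self._files[-1].append(record)

    def all_records(self):
        """Return the records of every retained generation, oldest first."""
        return [r for generation in self._files for r in generation]


store = GenerationStore(generations=6)
for month in range(1, 8):  # seven months of hypothetical data
    store.append(month, ("DISK01", month, "A"))

covered_months = [record[1] for record in store.all_records()]
print(covered_months)  # [2, 3, 4, 5, 6, 7]
```

Changing `generations` or the granularity of `period` changes the length of the accumulation window that the long-term tally sees.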

[0031]

[Effects of the invention] As described above, the error monitoring method for a magnetic disk device according to the present invention automatically performs both short-term monitoring, which monitors the result of tallying the error information generated in the magnetic disk devices independently for each fixed period, and long-term monitoring, which monitors the result of tallying the error information generated over an accumulation period set to any length. This has the effect of keeping manual intervention in the monitoring of error information to a minimum. Because monitoring accuracy is also improved, it has the further effect that failures of magnetic disk devices can be prevented before they occur.

[Brief description of drawings]

FIG. 1 is a flowchart showing one embodiment of the present invention.

FIG. 2 is a block diagram expressing the embodiment of FIG. 1 as functional blocks.

FIG. 3 is a format diagram showing an example of the contents of each file in the embodiment of FIG. 1, in which (a) shows the error information storage file, (b) the error reference parameter file, (c) the disk-system error accumulation file, and (d) the information selection condition file.

FIG. 4 is a flowchart showing an example of a conventional error monitoring method for a magnetic disk device.

[Explanation of symbols]

1: Error information storage file
2: Error reference count parameter file
3: Information selection condition storage file
4: Disk-system error accumulation file
5: Error information input/extraction unit
6: Short-term classification and tallying unit
7: Short-term comparison unit
8: Error accumulation processing unit
9: Accumulated error input unit
10: Long-term classification and tallying unit
11: Information selection unit
12: Long-term comparison unit
13: Alarm output unit
14: Output device
21 to 35, 41 to 43: Steps
51, 72: Error occurrence time
52, 71, 81: Device name
53, 74: Error content
61, 73: Error type
62: Reference count
63: Error type description
82: Selection key date and time
83: Comment

Continuation of the front page: (51) Int. Cl.⁶ G11B 20/18; identification code 574 E; internal reference number 9074-5D

Claims

[Claims]

1. An error monitoring method for a magnetic disk device, comprising: extracting only the error information relating to magnetic disk devices from an error information storage file that stores information on errors of the hardware of a computer system in general; classifying and tallying the error information by magnetic disk device and by error type to obtain a short-term tally result; comparing the short-term tally result with an error reference count that is set at job submission as an upper-limit count for each error type; outputting an alarm when the short-term tally result exceeds the error reference count; accumulating and storing the short-term tally result in a disk-system error accumulation file; classifying and tallying all the information stored in the disk-system error accumulation file by magnetic disk device and by error type to obtain a long-term tally result; selecting from the long-term tally result only the information that matches preset information selection conditions to obtain an information selection result; and comparing the information selection result with the error reference count and outputting an alarm when the information selection result exceeds the error reference count.

2. The error monitoring method for a magnetic disk device according to claim 1, wherein the method is incorporated into an automatic job operation system and run automatically at a fixed cycle.

3. The error monitoring method for a magnetic disk device according to claim 1 or 2, wherein the disk-system error accumulation file is a generation management file comprising files of multiple generations, a new file being started every fixed period, and the fixed period can be set to any length.