JP3570893B2 - Failure determination device

Info

Publication number: JP3570893B2
Application number: JP16349498A
Authority: JP
Inventors: 和宏松本; 修山本; 栄治内藤
Original assignee: Fujitsu Ltd; Nippon Telegraph and Telephone Corp
Current assignee: Fujitsu Ltd; Nippon Telegraph and Telephone Corp
Priority date: 1998-06-11
Filing date: 1998-06-11
Publication date: 2004-09-29
Anticipated expiration: 2018-06-11
Also published as: JPH11355437A

Description

【０００１】
【発明の属する技術分野】
本発明は、装置の各個所から報告される障害発生情報に基づき障害の状態を判定する障害判定装置に関する。
【０００２】
【従来の技術】
ＡＴＭ交換機等の装置の各監視個所から報告される障害発生情報に基づく障害の状態の判定においては、障害の種類、重要度に応じて多種多様な判定処理が必要とされる。例えば或るものについては、障害発生を示す割込の発生頻度を監視し、割込発生の頻度が所定の閾値を超えたとき、その後所定の判定時間の経過後にも障害が継続していれば障害状態と判定されるが、発生頻度の閾値及び判定時間は障害の種類及び重要度に応じて多様に設定する必要があり、また、監視個所ごとに発生頻度の測定のためのタイマ及び判定時間の設定のためのタイマが必要である。
【０００３】
また、物理層における障害とそれによってサポートされるデータリンク層における障害との関係のように、一方が障害状態になれば必ず他方も障害状態となるが逆は必ずしも起こらないという相互に密接な関係を有する一群の監視個所が存在する。これら一群の監視個所については、共通のタイマを用いて処理を簡単化することが考えられる。
【０００４】
【発明が解決しようとする課題】
本発明の第１の目的は、多種多様な判定処理を必要とする多数の監視個所に対する判定処理を、可能な限り少ない数のタイマで統一的に達成しうる障害判定装置を提供することにある。
本発明の第２の目的は、相互に密接な関係のある複数の監視個所について共通のタイマを用いて判定処理を達成しうる障害判定装置を提供することにある。
【０００５】
【課題を解決するための手段】
本発明によれば、障害発生を示す割込の回数をカウントする手段と、障害発生割込の回数が１になったときタイマに第１の値を設定する手段と、障害発生割込の回数が所定値になったときタイマに第２の値を設定する手段と、タイマがタイムアップし、障害発生割込の回数が前記所定値以上であり、障害発生が継続しているとき、障害状態と判定する手段と、タイマがタイムアップし、障害発生割込の回数が前記所定値以上であり、障害発生が継続していないとき、障害復旧と判定する手段と、タイマがタイムアップし、障害発生割込の回数が前記所定値に達しないとき障害復旧と判定する手段と、障害復旧と判定されたとき障害発生割込の回数をゼロにクリアする手段とを具備する障害判定装置が提供される。
【０００６】
本発明によれば、障害が発生したとき発生した障害の種類に応じて障害判定中であることを記憶する手段と、障害が発生したとき発生した障害よりも優先度の高い種類の障害について障害判定中であることが記憶されていないとき、タイマに値を設定する手段と、タイマがタイムアップし、障害判定中であることが記憶されている障害の種類の中で最も優先度の高いものについて障害発生が継続しているとき、障害状態と判定する手段と、タイマがタイムアップし、障害判定中であることが記憶されている障害の種類の中で最も優先度の高いものについて障害発生が継続しておらず、他の障害の種類について障害判定中であるとき、タイマに値を設定する手段とを具備する障害判定装置もまた提供される。
【０００７】
【発明の実施の形態】
図１は本発明に係る障害判定装置のハードウェアの構成を示す。本発明の障害判定装置１０は、バス１２で相互に接続されたＣＰＵ１４，ＲＯＭ１６，ＲＡＭ１８、割込コントローラ２０、及び障害ステータスレジスタ２２を備えている。ＣＰＵ１４はＲＯＭ１６に格納されているプログラムに従って障害判定処理を行なう。ステータスレジスタ２２は、障害判定の対象となるＡＴＭ交換機２４の各監視個所に接続され、各監視個所から報告される障害の状態を記憶する。割込コントローラ２０は監視個所のいずれかにおいて障害が発生したときＣＰＵ１４に割込をかけ、それに応じてＣＰＵ１４が所定の割込処理を完了した後においても障害の状態が継続しているとき、又は、再度障害が発生したとき、ＣＰＵ１４に対して再度割込をかける。ステータスレジスタ２２の内容はＣＰＵ１４から読み出すことができる。
【０００８】
図２は、本発明の第１の実施例に係る障害判定装置のソフトウェアの構成を示す。障害検出処理部３０は、障害発生の割込によって起動され、ＲＡＭ１８上に監視個所ごとに設けられた発生回数記憶領域３２に記憶される発生回数の値を更新し、発生回数が所定値になったら、タイマ領域３４に所定の値を設定する。タイマ処理部３６は、一定時間が経過するごとに発生する割込によって起動され、タイマ領域３４に格納されるタイマ値をダウンカウントし、ゼロになったらタイムアップ処理部３８へタイムアップを通知する。タイムアップ処理部３８は、タイムアップ通知により起動され、発生回数記憶領域３２内の発生回数及びステータスレジスタ２２の内容に基づき障害の状態を判定する。
【０００９】
図３は障害検出部３０における処理の詳細を示すフローチャートである。障害発生を示す割込によって起動されたら、まず、発生個所を特定し（ステップ１０００）、発生個所に対応する記憶領域３２に記憶されている発生回数をインクリメントする（ステップ１００２）。発生回数が所定値ｎに達しているかどうかが判定され（ステップ１００４）、発生回数がｎに達していなければ、次に、発生回数が１であるかが判定される。発生回数が１であれば、タイマ領域３４に発生頻度を判定するためのタイマ値Ｔ_１を設定する。（ステップ１００８）。発生回数がｎに達していれば、それは発生頻度判定のためのタイマがタイムアップする前に割込がｎ回発生したことを意味するので、タイマ領域３４に固定障害判定のためのタイマ値Ｔ_２を設定する（ステップ１０１０）。その後、発生回数がそれ以上更新されることを防ぐため、割込にマスクをかける（ステップ１０１２）。
【００１０】
図４はタイマ処理部３６における処理の詳細を示すフローチャートである。図４において、タイマ処理中の監視個所を示すパラメータｊに１を代入し（ステップ１１００）、ｊ番目のタイマのタイマ値がゼロであるか否かが判定される。（ステップ１１０２）。タイマ値がゼロでないときはそれをデクリメントし（ステップ１１０４）、ゼロになったら（ステップ１１０６）、タイムアップ処理部３８へｊ番目のタイマがタイムアップしたことを通知する（ステップ１１０８）。次に全タイマについて処理が終了したか否かが判定され（ステップ１１１０）、そうでないときはパラメータｊをインクリメントして（ステップ１１１２）、ステップ１１０２の処理へ戻る。
【００１１】
図５はタイムアップ処理部３８における処理の詳細を示すフローチャートである。図５において、まず、タイムアップしたタイマに対応する発生回数がｎに達しているかが判定される（ステップ１２００）。発生回数がｎに達していないとすれば、タイムアップしたのは発生頻度の判定のための判定時間Ｔ_１であり、その間の割込発生回数がｎに満たないのであるから、間欠障害または障害復旧と判定し（ステップ１２０６）、発生回数をゼロにクリアして割込マスクを解除する（ステップ１２０８）。ステップ１２００において発生回数がｎに達していれば、タイムアップしたのは固定障害判定のための時間Ｔ_２であることを意味するから、対応するステータスレジスタ２２の内容を読み取り（ステップ１２０２）、それに基いて障害が継続しているか否かが判定される（ステップ１２０４）。障害継続と判定されれば所定の障害処理を行なう（ステップ１２１０）。
【００１２】
図６は本発明の第１の実施例に係る障害判定装置の動作を説明するタイミングチャートである。図６において、判定時間Ｔ_１を定めるタイマは障害個所ごとに独立に最初の割込からスタートする。また、その間の発生回数の閾値ｎは図６中のｎ_１，ｎ_２のように判定個所ごとにそれぞれ異なる値に設定することができる。障害個所１及び２については判定時間Ｔ_１中に発生回数がそれぞれｎ_１、及びｎ_２に達すると判定時間Ｔ_２を定めるタイマがスタートする。判定時間Ｔ_２についても図６のＴ_２１，Ｔ_２２のように判定個所ごとにそれぞれ異なる値に設定することができる。障害個所３については、発生回数がｎ_３に達する前に判定時間Ｔ_１がタイムアップしたので障害復旧と判定される。
【００１３】
なお、本発明の第１の実施例において、タイマ処理部３６がＴ_１のタイマがタイムアップしたと判定してタイムアップ通知を行なった直後でタイムアップ処理部３８が動作する前に障害検出処理部３０が動作して発生回数をｎ−１からｎへカウントアップすると、タイムアップ処理部３８はＴ_２のタイマがタイムアップしたものとして誤動作してしまう。これを避けるためには、タイマ処理部３６がタイムアップ通知をする直前、すなわち図４のステップ１１０６と１１０８の間に障害発生割込をマスクして発生回数が増えないようにすれば良い。
【００１４】
図７は本発明の第２の実施例に係る障害判定装置のソフトウェアの構成を示す。本発明の第２の実施例では、物理層における障害とそれによってサポートされるデータリンク層における障害との関係のように相互に密接な関係にあるものを１つの障害群としてまとめ、各障害群ごとに１つのタイマを用いて障害状態の判定を行なう。また、第１の実施例におけるｎが１、すなわち、障害発生の割込が入ったら直ちに固定障害の判定を行なうタイプの監視個所をその対象とする。
【００１５】
図７において、ＲＡＭ１８（図１）上には各障害群ごとに１つのタイマ領域４０と、その障害群に属する各監視個所の状態（正常、障害判定中、または障害）を記憶する領域４２が設けられる。障害検出処理部４４は、障害発生の割込によって起動され、対応個所の状態を障害判定中に変更し、同じ障害群に属する監視個所の中でそれよりも優先度の高いものがすべて正常であるときのみタイマ領域４０に値を設定する。タイマ処理部４６は、一定時間が経過するごとに発生する割込によって起動され、タイマ領域４０のタイマ値をダウンカウントし、ゼロになったらタイムアップ処理部４８へタイムアップ通知を行なう。タイムアップ処理部４８は、タイムアップ通知により起動され、ステータスレジスタ２２の内容に基いて障害の状態を判定し、障害復旧と判定される場合にはそれよりも優先度の低い監視個所が障害判定中であるときはタイマを再び起動する。
【００１６】
図８は障害検出処理部４４の処理の詳細を示すフローチャートである。障害発生の割込によって起動されたら、その障害群及び障害種別（監視個所）を特定し（ステップ１３００）、領域４２を参照して正常から障害判定中へと状態を変更する（ステップ１３０２，１３０４）。次に、同じ障害群に属する監視個所の中で、それよりも優先度の高い監視個所の状態がすべて正常と設定されているときのみタイマ領域４０に値を設定してタイマを起動する（ステップ１３０６，１３０８）。すなわち、同じ障害群に属する監視個所の中で優先度の高いものの障害が先に検出されて固定障害判定時間がスタートした後にそれよりも優先度の低い監視個所の障害が検出されたときにタイマ値が再設定されると、優先度の高い監視個所に対する固定障害判定時間が実質的に延長されて障害処理が遅れてしまうので、このことを防止するため、優先度の高い監視個所が障害判定中であるときは障害判定中の設定のみを行ない、タイマ値の設定は行なわない。
【００１７】
図９はタイマ処理部４６の処理の詳細を示すフローチャートである。タイマ処理部４６の動作は既に説明したタイマ処理部３６の動作（図４）と実質的に同一であるので説明を省略する。
図１０はタイムアップ処理部４８の処理の詳細を示すフローチャートである。本発明の第２の実施例では、障害群ごとに１つのタイマが設けられるので、タイムアップ処理部４８ではタイムアップしたタイマに対応する障害群に属する監視個所について障害判定処理が行なわれる。ステップ１５００において、まず、タイムアップしたタイマに対応する障害群に属する監視個所の中で最も優先度の高いものの状態を領域４２から読み取って判定する（ステップ１５０２，１５０４）。それが“障害”でも“障害判定中”でもなければ、すなわち、“正常”であれば次に優先度の高い監視個所について状態を読み取り（ステップ１５０６）、ステップ１５０２に戻る。“障害判定中”であれば、それに対応するステータスレジスタ２２の内容を読み取り（ステップ１５０８）、障害が継続しているか否かを判定する（ステップ１５１０）、障害が継続していれば、状態を“障害判定中”から“障害”へ変更し（ステップ１５１２）、障害処理を行なう（ステップ１５１４）。障害が復旧していれば、状態を“障害判定中”から“正常”へと変更し（ステップ１５１６）、同じ障害群に属する他の監視個所について、“障害判定中”が設定されているか否かを判定する（ステップ１５１８）。他の監視個所に“障害判定中”が設定されていればタイマ値を再度設定する（ステップ１５２０）。
【００１８】
すなわち、本発明の第２の実施例においては、同じ障害群に属する複数の監視個所について１つのタイマが用いられ、同時にまたは相前後して障害が検出されたら、優先度の高いものが優先して固定障害の判定が行なわれ、障害復旧と判定されたときのみ優先度の低いものの処理が行なわれる。
【００１９】
【発明の効果】
以上説明した様に、本発明により障害種別／装置種別の異なる障害、優先順位のある障害の検出を統一でき、さらに障害検出までの管理データ量も減少できることにより、障害処理の処理効率を向上させることができる。
【図面の簡単な説明】
【図１】本発明の一実施例のハードウェアの構成を示すブロック図である。
【図２】本発明の第１の実施例のソフトウェアの構成を示すブロック図である。
【図３】障害検出処理部３０の処理のフローチャートである。
【図４】タイマ処理部３６の処理のフローチャートである。
【図５】タイムアップ処理部３８の処理のフローチャートである。
【図６】本発明の第１の実施例の動作を説明するタイミングチャートである。
【図７】本発明の第２の実施例のソフトウェアの構成を示すブロック図である。
【図８】障害検出処理部４４の処理のフローチャートである。
【図９】タイマ処理部４６の処理のフローチャートである。
【図１０】タイムアップ処理部４８の処理のフローチャートである。

[0001]
[Technical field of the invention]
The present invention relates to a failure determination device that determines a failure state based on failure occurrence information reported from each part of an apparatus.
[0002]
[Prior art]
In determining a failure state based on failure occurrence information reported from each monitoring point of an apparatus such as an ATM switch, a wide variety of determination processes are required according to the type and importance of the failure. For example, for some failures, the frequency of interrupts indicating failure occurrence is monitored; when that frequency exceeds a predetermined threshold and the failure still continues after a predetermined determination time has elapsed, a failure state is determined. However, the frequency threshold and the determination time must be set variously according to the type and importance of the failure, and each monitoring point requires one timer for measuring the occurrence frequency and another for setting the determination time.
[0003]
In addition, there are groups of monitoring points that are closely related to one another, as in the relationship between a failure in the physical layer and a failure in the data link layer that it supports: if one enters a failure state the other always does as well, but the converse does not necessarily hold. For such a group of monitoring points, the processing can conceivably be simplified by using a common timer.
[0004]
[Problems to be solved by the invention]
A first object of the present invention is to provide a failure determination device that can perform, in a unified manner and with as few timers as possible, the determination processing for a large number of monitoring points that require a wide variety of determination processes.
A second object of the present invention is to provide a failure determination device that can perform determination processing using a common timer for a plurality of monitoring points that are closely related to each other.
[0005]
[Means for Solving the Problems]
According to the present invention, there is provided a failure determination device comprising: means for counting the number of interrupts indicating occurrence of a failure; means for setting a first value in a timer when the number of failure occurrence interrupts becomes 1; means for setting a second value in the timer when the number of failure occurrence interrupts reaches a predetermined value; means for determining a failure state when the timer has expired, the number of failure occurrence interrupts is equal to or greater than the predetermined value, and the failure is continuing; means for determining failure recovery when the timer has expired, the number of failure occurrence interrupts is equal to or greater than the predetermined value, and the failure is not continuing; means for determining failure recovery when the timer has expired and the number of failure occurrence interrupts has not reached the predetermined value; and means for clearing the number of failure occurrence interrupts to zero when failure recovery is determined.
[0006]
According to the present invention, there is also provided a failure determination device comprising: means for storing, when a failure occurs, that failure determination is in progress for the type of failure that has occurred; means for setting a value in a timer when a failure occurs and it is not stored that failure determination is in progress for any failure type of higher priority than the failure that has occurred; means for determining a failure state when the timer has expired and the failure is continuing for the highest-priority failure type among those stored as being under failure determination; and means for setting a value in the timer when the timer has expired, the failure is not continuing for the highest-priority failure type among those stored as being under failure determination, and failure determination is in progress for another failure type.
[0007]
[Embodiments of the invention]
FIG. 1 shows the hardware configuration of a failure determination device according to the present invention. The failure determination device 10 of the present invention includes a CPU 14, a ROM 16, a RAM 18, an interrupt controller 20, and a failure status register 22, interconnected by a bus 12. The CPU 14 performs failure determination processing according to a program stored in the ROM 16. The status register 22 is connected to each monitoring point of the ATM switch 24 that is subject to failure determination, and stores the failure state reported from each monitoring point. The interrupt controller 20 interrupts the CPU 14 when a failure occurs at any of the monitoring points, and interrupts the CPU 14 again if the failure state still continues after the CPU 14 has completed the predetermined interrupt processing in response, or if a failure occurs again. The contents of the status register 22 can be read by the CPU 14.
[0008]
FIG. 2 shows the software configuration of the failure determination device according to the first embodiment of the present invention. The failure detection processing unit 30 is activated by a failure occurrence interrupt, updates the occurrence count stored in the occurrence count storage area 32 provided in the RAM 18 for each monitoring point, and, when the occurrence count reaches a predetermined value, sets a predetermined value in the timer area 34. The timer processing unit 36 is activated by an interrupt generated each time a fixed period elapses, counts down the timer value stored in the timer area 34, and notifies the time-up processing unit 38 of a time-up when the value reaches zero. The time-up processing unit 38 is activated by the time-up notification and determines the failure state based on the occurrence count in the occurrence count storage area 32 and the contents of the status register 22.
[0009]
FIG. 3 is a flowchart showing details of the processing in the failure detection processing unit 30. When activated by an interrupt indicating that a failure has occurred, the unit first identifies the point of occurrence (step 1000) and increments the occurrence count stored in the storage area 32 corresponding to that point (step 1002). It then determines whether the occurrence count has reached a predetermined value n (step 1004). If the count has not reached n, it next determines whether the count is 1. If the count is 1, it sets the timer value T₁ for judging the occurrence frequency in the timer area 34 (step 1008). If the count has reached n, this means that n interrupts occurred before the occurrence frequency timer expired, so the timer value T₂ for fixed failure determination is set in the timer area 34 (step 1010). The interrupt is then masked to prevent the occurrence count from being updated any further (step 1012).
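The flow of FIG. 3 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `MonitorPoint` class, its field names, and the use of integer tick counts for T₁ and T₂ are assumptions made for the example, and the hardware interrupt mask is modeled as a plain boolean.

```python
from dataclasses import dataclass


@dataclass
class MonitorPoint:
    """Hypothetical per-point record; the field names are not from the patent."""
    n: int                # threshold on the interrupt count for this point
    t1: int               # ticks for the occurrence-frequency window (T1)
    t2: int               # ticks for the fixed failure determination (T2)
    count: int = 0        # occurrence count (storage area 32)
    timer: int = 0        # timer value (timer area 34); 0 means stopped
    masked: bool = False  # True while the failure interrupt is masked


def on_failure_interrupt(p: MonitorPoint) -> None:
    """Handle one failure interrupt for an already identified point (step 1000)."""
    if p.masked:
        return                # a masked interrupt never reaches the handler
    p.count += 1              # step 1002
    if p.count >= p.n:        # step 1004: threshold reached within T1
        p.timer = p.t2        # step 1010: start the fixed failure window
        p.masked = True       # step 1012: stop further counting
    elif p.count == 1:
        p.timer = p.t1        # step 1008: first interrupt opens the T1 window
```

With `n = 1` the first interrupt goes straight to the T₂ branch, which matches the order of the checks in the flowchart description.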
[0010]
FIG. 4 is a flowchart showing details of the processing in the timer processing unit 36. In FIG. 4, the parameter j, which indicates the monitoring point whose timer is being processed, is set to 1 (step 1100), and it is determined whether the timer value of the j-th timer is zero (step 1102). If the timer value is not zero, it is decremented (step 1104), and if it has become zero (step 1106), the time-up processing unit 38 is notified that the j-th timer has expired (step 1108). Next, it is determined whether all timers have been processed (step 1110). If not, the parameter j is incremented (step 1112) and processing returns to step 1102.
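As a rough sketch of the loop in FIG. 4, the periodic tick can be written as below. The function name `timer_tick`, the list representation of the timer areas, and the `notify` callback standing in for the time-up notification are all assumptions for illustration; indices are 0-based here, whereas the flowchart starts j at 1.

```python
from typing import Callable, List


def timer_tick(timers: List[int], notify: Callable[[int], None]) -> None:
    """Run one periodic tick over every timer (steps 1100 to 1112)."""
    for j in range(len(timers)):   # steps 1100, 1110, 1112
        if timers[j] != 0:         # step 1102: skip stopped timers
            timers[j] -= 1         # step 1104
            if timers[j] == 0:     # step 1106
                notify(j)          # step 1108: report the time-up
```

A timer whose value is already zero is treated as stopped, so it is never decremented and never reported twice.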
[0011]
FIG. 5 is a flowchart showing details of the processing in the time-up processing unit 38. In FIG. 5, it is first determined whether the occurrence count corresponding to the expired timer has reached n (step 1200). If the count has not reached n, the time that has expired is the determination time T₁ for judging the occurrence frequency, and the number of interrupts during that time is less than n; the failure is therefore determined to be an intermittent failure or to have recovered (step 1206), the occurrence count is cleared to zero, and the interrupt mask is released (step 1208). If the count has reached n in step 1200, the time that has expired is the time T₂ for fixed failure determination, so the contents of the corresponding status register 22 are read (step 1202), and on that basis it is determined whether the failure is continuing (step 1204). If the failure is determined to be continuing, predetermined failure processing is performed (step 1210).
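The decision in FIG. 5 might be sketched as follows. The names `MonitorPoint`, `on_time_up`, and `failure_present` (a stand-in for reading status register 22) are hypothetical. Two points are assumptions rather than statements from the description: that the count is cleared and the mask released when T₂ expires with the failure no longer present (the summary of the invention only says the count is cleared on recovery), and that the count and mask are left untouched when a failure state is determined.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MonitorPoint:
    """Hypothetical per-point record; the field names are not from the patent."""
    n: int                # threshold on the interrupt count
    count: int = 0        # occurrence count (storage area 32)
    masked: bool = False  # True while the failure interrupt is masked


def on_time_up(p: MonitorPoint, failure_present: Callable[[], bool]) -> str:
    """Classify an expired timer for one monitoring point."""
    if p.count >= p.n:                         # step 1200: T2 expired
        if failure_present():                  # steps 1202 and 1204
            return "failure"                   # step 1210; state kept (assumption)
        verdict = "recovered"
    else:                                      # T1 expired below the threshold
        verdict = "intermittent_or_recovered"  # step 1206
    p.count = 0                                # step 1208: clear the count
    p.masked = False                           # step 1208: release the mask
    return verdict
```

Note that the routine infers which timer expired purely from the occurrence count, which is why the count must not change between the time-up notification and this processing.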
[0012]
FIG. 6 is a timing chart explaining the operation of the failure determination device according to the first embodiment of the present invention. In FIG. 6, the timer that defines the determination time T₁ starts independently for each failure point from its first interrupt. The threshold n on the number of occurrences during that time can be set to a different value for each determination point, such as n₁ and n₂ in FIG. 6. For failure points 1 and 2, the timer that defines the determination time T₂ starts when the number of occurrences reaches n₁ and n₂, respectively, within the determination time T₁. The determination time T₂ can likewise be set to a different value for each determination point, such as T₂₁ and T₂₂ in FIG. 6. For failure point 3, the determination time T₁ expired before the number of occurrences reached n₃, so failure recovery is determined.
[0013]
In the first embodiment of the present invention, suppose the failure detection processing unit 30 runs immediately after the timer processing unit 36 has determined that the T₁ timer has expired and issued the time-up notification, but before the time-up processing unit 38 runs, and counts the occurrence count up from n-1 to n. The time-up processing unit 38 will then malfunction by treating the expiry as that of the T₂ timer. To avoid this, the failure occurrence interrupt may be masked just before the timer processing unit 36 issues the time-up notification, that is, between steps 1106 and 1108 in FIG. 4, so that the occurrence count cannot increase.
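The interleaving described above can be replayed deterministically in a few lines. This is only an illustrative model: the function `replay` and its parameters are hypothetical, and real interrupt timing is replaced by a fixed order of events.

```python
def replay(mask_before_notify: bool, n: int = 3) -> str:
    """Replay the race and return which timer the time-up processing believes expired."""
    count = n - 1      # T1 is about to expire with n-1 interrupts counted
    masked = False

    # Timer processing: the T1 timer reaches zero (step 1106).
    if mask_before_notify:
        masked = True  # proposed fix, placed between steps 1106 and 1108

    # The time-up notification is issued (step 1108), but one more failure
    # interrupt arrives before the time-up processing gets to run.
    if not masked:
        count += 1     # failure detection counts n-1 -> n

    # Time-up processing (step 1200) infers the expired timer from the count.
    return "T2" if count >= n else "T1"
```

Without the mask the late interrupt makes the expiry look like a T₂ expiry; with the mask the count stays at n-1 and the expiry is correctly read as T₁.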
[0014]
FIG. 7 shows the software configuration of the failure determination device according to the second embodiment of the present invention. In the second embodiment, failures that are closely related to each other, such as a failure in the physical layer and a failure in the data link layer that it supports, are gathered into one failure group, and the failure state is determined using one timer per failure group. The second embodiment targets monitoring points of the type for which n in the first embodiment is 1, that is, points for which fixed failure determination begins as soon as a failure occurrence interrupt arrives.
[0015]
In FIG. 7, the RAM 18 (FIG. 1) holds, for each failure group, one timer area 40 and an area 42 that stores the state (normal, under failure determination, or failure) of each monitoring point belonging to that group. The failure detection processing unit 44 is activated by a failure occurrence interrupt, changes the state of the corresponding point to under failure determination, and sets a value in the timer area 40 only when all monitoring points of higher priority in the same failure group are normal. The timer processing unit 46 is activated by an interrupt generated each time a fixed period elapses, counts down the timer value in the timer area 40, and notifies the time-up processing unit 48 of a time-up when the value reaches zero. The time-up processing unit 48 is activated by the time-up notification and determines the failure state based on the contents of the status register 22; if failure recovery is determined and a monitoring point of lower priority is under failure determination, it starts the timer again.
[0016]
FIG. 8 is a flowchart showing details of the processing of the failure detection processing unit 44. When activated by a failure occurrence interrupt, the unit identifies the failure group and the failure type (monitoring point) (step 1300) and, referring to the area 42, changes the state from normal to under failure determination (steps 1302 and 1304). Next, it sets a value in the timer area 40 and starts the timer only when the states of all higher-priority monitoring points in the same failure group are set to normal (steps 1306 and 1308). The reason is as follows. Suppose a failure at a higher-priority monitoring point in the group is detected first and the fixed failure determination time starts, and a failure at a lower-priority monitoring point is detected afterwards. If the timer value were set again at that moment, the fixed failure determination time for the higher-priority point would effectively be extended and failure processing would be delayed. To prevent this, while a higher-priority monitoring point is under failure determination, only the under-determination state is recorded and the timer value is not set.
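A minimal sketch of the priority check in FIG. 8 is given below. The `State` enum, the `FailureGroup` class, and the convention that index 0 is the highest priority are assumptions made for the example. Returning early when the point is not in the normal state is also an assumption; the description only covers the transition from normal to under failure determination.

```python
from enum import Enum
from typing import List


class State(Enum):
    NORMAL = 0
    JUDGING = 1   # "under failure determination"
    FAILED = 2


class FailureGroup:
    """Hypothetical record for one failure group sharing a single timer."""

    def __init__(self, size: int, timer_value: int) -> None:
        # Area 42: one state per monitoring point, index 0 = highest priority.
        self.states: List[State] = [State.NORMAL] * size
        self.timer_value = timer_value  # determination time in ticks
        self.timer = 0                  # timer area 40; 0 means stopped


def on_group_failure_interrupt(g: FailureGroup, idx: int) -> None:
    """Handle a failure interrupt for point idx of group g (step 1300 done by caller)."""
    if g.states[idx] is not State.NORMAL:   # step 1302 (early exit is an assumption)
        return
    g.states[idx] = State.JUDGING           # step 1304
    # Step 1306: start the timer only if every higher-priority point is normal.
    if all(s is State.NORMAL for s in g.states[:idx]):
        g.timer = g.timer_value             # step 1308
```

When a higher-priority point is already being judged, the lower-priority point is only marked, so the running determination time is not extended.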
[0017]
FIG. 9 is a flowchart showing details of the processing of the timer processing unit 46. The operation of the timer processing unit 46 is substantially the same as the operation (FIG. 4) of the timer processing unit 36 already described, and thus the description is omitted.
FIG. 10 is a flowchart showing details of the processing of the time-up processing unit 48. In the second embodiment of the present invention, one timer is provided for each failure group, so the time-up processing unit 48 performs failure determination processing on the monitoring points belonging to the failure group that corresponds to the expired timer. In step 1500, the state of the highest-priority monitoring point in that failure group is first read from the area 42 and examined (steps 1502 and 1504). If it is neither "failure" nor "under failure determination", that is, if it is "normal", the state of the monitoring point with the next highest priority is read (step 1506) and processing returns to step 1502. If it is "under failure determination", the contents of the corresponding status register 22 are read (step 1508) and it is determined whether the failure is continuing (step 1510). If the failure is continuing, the state is changed from "under failure determination" to "failure" (step 1512) and failure processing is performed (step 1514). If the failure has recovered, the state is changed from "under failure determination" to "normal" (step 1516), and it is determined whether "under failure determination" is set for any other monitoring point belonging to the same failure group (step 1518). If "under failure determination" is set for another monitoring point, the timer value is set again (step 1520).
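The walk through the failure group in FIG. 10 could look roughly like this. The `State` enum, the function `on_group_time_up`, the `failure_present` callback standing in for status register 22, and the convention of returning the new timer value are all assumptions for illustration. Stopping when the first non-normal point is already in the "failure" state is likewise an assumption, since the description does not spell out that branch.

```python
from enum import Enum
from typing import Callable, List


class State(Enum):
    NORMAL = 0
    JUDGING = 1   # "under failure determination"
    FAILED = 2


def on_group_time_up(
    states: List[State],
    failure_present: Callable[[int], bool],
    timer_value: int,
) -> int:
    """Process a time-up for one group; return the new timer value (0 = stopped).

    states is ordered from highest to lowest priority and is updated in place.
    """
    for idx, state in enumerate(states):       # steps 1502 to 1506
        if state is State.NORMAL:
            continue                           # try the next priority
        if state is State.JUDGING:
            if failure_present(idx):           # steps 1508 and 1510
                states[idx] = State.FAILED     # step 1512 (failure processing omitted)
                return 0
            states[idx] = State.NORMAL         # step 1516
            if State.JUDGING in states:        # step 1518
                return timer_value             # step 1520: restart the timer
        return 0                               # FAILED, or nothing left to judge
    return 0
```

Each expiry therefore settles at most one monitoring point, and a lower-priority point gets its own full determination time only after the higher-priority one has been found to have recovered.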
[0018]
That is, in the second embodiment of the present invention, one timer is used for a plurality of monitoring points belonging to the same failure group. If failures are detected simultaneously or in close succession, fixed failure determination is performed first for the one with the higher priority, and the one with the lower priority is processed only when the higher-priority failure is determined to have recovered.
[0019]
[Effects of the invention]
As described above, the present invention makes it possible to unify the detection of failures of different failure types and device types and of failures that have priorities, and also to reduce the amount of management data needed up to failure detection, thereby improving the efficiency of failure processing.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a hardware configuration according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating a software configuration according to the first embodiment of this invention.
FIG. 3 is a flowchart of the processing of the failure detection processing unit 30.
FIG. 4 is a flowchart of the processing of the timer processing unit 36.
FIG. 5 is a flowchart of the processing of the time-up processing unit 38.
FIG. 6 is a timing chart illustrating the operation of the first exemplary embodiment of the present invention.
FIG. 7 is a block diagram illustrating a software configuration according to a second embodiment of the present invention.
FIG. 8 is a flowchart of the processing of the failure detection processing unit 44.
FIG. 9 is a flowchart of the processing of the timer processing unit 46.
FIG. 10 is a flowchart of the processing of the time-up processing unit 48.

Claims

A failure determination device comprising:
means for counting the number of interrupts indicating occurrence of a failure;
means for setting a first value in a timer when the number of failure occurrence interrupts becomes 1;
means for setting a second value in the timer when the number of failure occurrence interrupts reaches a predetermined value;
means for determining a failure state when the timer has expired, the number of failure occurrence interrupts is equal to or greater than the predetermined value, and the failure is continuing;
means for determining failure recovery when the timer has expired, the number of failure occurrence interrupts is equal to or greater than the predetermined value, and the failure is not continuing;
means for determining failure recovery when the timer has expired and the number of failure occurrence interrupts has not reached the predetermined value; and
means for clearing the number of failure occurrence interrupts to zero when failure recovery is determined.