JP4313823B2 - Fault response system and fault response method

Info

Publication number: JP4313823B2
Application number: JP2007046200A
Authority: JP
Inventors: 孝一服部
Original assignee: 株式会社日立情報システムズ
Priority date: 2007-02-26
Filing date: 2007-02-26
Publication date: 2009-08-12
Anticipated expiration: 2027-02-26
Also published as: JP2008210148A

Description

The present invention relates to a failure handling system and a failure handling method for responding to a failure that has occurred in a monitored server.

Conventionally, when a server went down due to a failure, the person in charge of failure recovery analyzed the cause of the failure and worked through recovery procedures for that failure by trial and error.

A failure information notification program (Patent Document 1) has been proposed that, when a failure occurs in a server, schedules the expected response work and notifies each person in charge of the failure details and of the estimated time at which that person's work will start.

The failure information notification program described in Patent Document 1 is characterized by executing: a first extraction step that refers to a recovery information table storing persons in charge and countermeasures and extracts the person in charge and the countermeasure for an event that has occurred in the monitored server; a first scheduling step that generates a countermeasure schedule based on the countermeasure extracted in the first extraction step; and a first notification step that notifies the person in charge of the countermeasure schedule.

According to Patent Document 1, because the appropriate person is called at the appropriate time, this solves the problem of a two-stage call, in which an inappropriate person is called first and that person then calls the appropriate person, which takes time and leads to incomplete communication of the failure information. It is also said to solve the problem of persons in charge arriving too early and wasting time until their turn to work comes.
Patent Document 1: JP 2004-280171 A

However, the schedule generated by the failure information notification program described in Patent Document 1 is not generated so as to meet a deadline, so recovery might not be completed by the deadline even if the recovery work proceeds as scheduled.

In addition, the failure information notification program selects and orders response procedures without considering the recovery rate per unit time. Working according to the schedule therefore does not necessarily give a high recovery rate, and the time to recovery is not necessarily short.

The present invention has been made in view of these circumstances and aims to provide a failure handling system and method that generate a schedule that meets the deadline, has a high recovery rate, and allows recovery in a short time.

The problems addressed by the present invention can be solved by the following inventions.
That is, the failure handling system of the present invention is a system that, when a failure occurs in a monitored server, selects response procedures for recovering from the failure and generates a schedule indicating the order in which the response procedures are to be tried. Its main feature is that it comprises: a failure information database in which information about failures is stored; a recovery procedure information database in which a response procedure, the time required for the response procedure, and information on whether the response procedure achieved recovery are stored in association with a failure event, which is the type of failure that has occurred in the monitored server; a case accumulation database in which failure events that occurred in the past, the response procedures performed to recover from them, the time those response procedures took, and whether they achieved recovery are stored; failure information receiving means for receiving information about a failure of the monitored server transmitted from monitoring means built into the monitored server; deadline input receiving means for receiving input of a deadline, which is the time limit by which the failure must be recovered; error information analysis means for determining the failure event that has occurred in the monitored server by searching the failure information database based on the information transmitted from the monitoring means; schedule analysis/generation means for searching the recovery procedure information database based on the failure event, reading the response procedures, and generating a schedule in which the response procedures are scheduled so that failure recovery is completed by the deadline input via the deadline input receiving means; and response history monitoring means for updating the contents of the case accumulation database and the contents of the recovery procedure information database each time a failure occurs in the monitored server.

As a result, the schedule is generated so as to meet the deadline, so that performing the recovery work according to the schedule allows the monitored server to be recovered from the failure by the deadline.

In addition, the contents of the recovery procedure information database are updated each time a failure occurs, so the required time and recovery rate of each response procedure are also updated, and a more accurate schedule can be generated the next time one is needed.

Further, in the failure handling system of the present invention, the main feature is that the schedule analysis/generation means calculates a recovery rate per unit time from the recovery rate stored in the recovery procedure information database, which is the probability of recovery for each response procedure, and the time required for the response procedure having that recovery rate; selects response procedures in descending order of recovery rate per unit time; continues the selection, adding the required time of each selected response procedure to the date and time at which the response procedures are scheduled to start, until just before the resulting date and time reaches the deadline; and generates the schedule so that the selected response procedures are performed in descending order of recovery rate per unit time.

As a result, the schedule is generated so that response procedures with a high recovery rate per unit time are performed first, so recovery work can be carried out efficiently and recovery can be achieved with high probability in a short time.

Furthermore, in the failure handling system of the present invention, the main feature is that the schedule analysis/generation means regenerates the schedule at least once after first generating it and before the monitored server recovers from the failure, and that each regeneration is based on the date and time at the moment of regeneration, the deadline, and the contents of the recovery procedure information database.

As a result, whether a response procedure finishes earlier or later than scheduled, a new schedule is generated to match, so the accuracy of the schedule can always be kept high.

As described above, the failure handling system of the present invention generates the schedule so as to meet the deadline, so that working according to the schedule allows recovery of the monitoring server to be completed by the deadline.

In addition, according to the failure handling system of the present invention, the contents of the recovery procedure information database are updated each time a failure occurs, so the information on the required time and recovery rate of each response procedure is also updated, and a more accurate schedule can be generated the next time one is needed.

Furthermore, the schedule is generated so that response procedures with a high recovery rate per unit time are performed first, so recovery work can be carried out efficiently and recovery can be achieved with high probability in a short time.

Moreover, after the schedule is generated, it is regenerated at least once based on the latest information at that time, so the accuracy of the schedule can always be kept high.

Hereinafter, an embodiment of the failure handling system of the present invention will be described in detail with reference to the accompanying drawings.
<Configuration>
FIG. 1 is a configuration diagram of a failure handling system according to an embodiment of the present invention. As shown in FIG. 1, the failure handling system comprises a monitored server 10, a client 20, a network 30, and a monitoring server 40. The monitored server 10, the client 20, and the monitoring server 40 are connected to each other via the network 30.

The monitored server 10 has a monitoring program 11. The monitoring program 11 monitors the monitored server 10 and, when a failure occurs in the monitored server 10, detects the failure and transmits information about it to the monitoring server 40 via the network 30. The monitoring program 11 does not necessarily have to be a program and may be a device that performs the same function. The number of monitored servers is not limited to one, and there may be several.
Likewise, the number of clients 20 is not limited to one, and there may be several.

The monitoring server 40 comprises error information analysis means 41, schedule analysis/generation means 42, response history management means 43, a failure information database 44, a recovery procedure information database 45, and a case accumulation database 46.

The error information analysis means 41 analyzes the error information transmitted from the monitoring program 11 and determines which failure it corresponds to. Specifically, it searches the failure information database 44, in which failure events are stored together with the error numbers and alert numbers associated with them, and determines the failure event that corresponds to the error information.

The schedule analysis/generation means 42 reads from the recovery procedure information database 45 all recovery methods for the failure event determined by the error information analysis means 41. The recovery procedure information database 45 stores, in association with each failure event, information such as the response procedures for recovering from the failure, the time required for each procedure, and whether the procedure achieved recovery.

Next, if a deadline, which is the time limit by which the failure must be recovered, has been input from the client 20 to the monitoring server 40 via the network 30, the recovery methods read from the recovery procedure information database 45 are combined so as to meet the deadline, and a schedule showing the recovery procedure is generated. The deadline can also be input directly to the monitoring server 40.

If no deadline has been input, recovery methods are selected from those read from the recovery procedure information database 45 in descending order of recovery probability per unit time, and scheduling is performed to generate a schedule. The scheduling method is not limited to ordering by recovery probability per unit time, and various methods can be applied. For example, recovery methods can be combined so that the overall recovery probability is highest, or scheduling can prioritize recovery speed over recovery probability.

Each time a failure occurs in the monitored server 10, the response history management means 43 stores failure case information in the case accumulation database 46, based on the information sent from the monitoring program 11 and the information that the person in charge of failure recovery has input to the monitoring server directly or via the network. The failure case information includes the date and time of the failure, the server on which the failure occurred, the failure event, the recovery method used to recover from the failure, the recovery procedure, and whether recovery was achieved.

The response history management means 43 also updates the contents of the recovery procedure information database 45 based on the contents of the case accumulation database. This increases the number of response procedures available for recovery and improves the accuracy of the required time recorded for each response procedure.

FIG. 2 shows an example of the information stored in the failure information database 44. As shown in FIG. 2, the error numbers and alert numbers issued by a failed server when a failure occurs are stored in association with failure events. This allows the error information analysis means 41 to look up and determine the failure event that corresponds to the error numbers and alert numbers contained in the failure information received from the monitoring program 11.
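The lookup described for FIG. 2 can be illustrated with a minimal Python sketch. The dictionary contents, the code format, and the function name `identify_failure_event` are hypothetical and chosen only for illustration; the source does not specify how the database is implemented or what happens when no entry matches.

```python
from typing import Optional

# Hypothetical stand-in for the failure information database 44 (cf. FIG. 2):
# error or alert numbers mapped to failure events. All values are invented.
FAILURE_INFO_DB = {
    "E1001": "disk failure",
    "E2002": "network interface down",
    "A3003": "database process stopped",
}


def identify_failure_event(codes: list[str]) -> Optional[str]:
    """Return the failure event for the first known error or alert number."""
    for code in codes:
        event = FAILURE_INFO_DB.get(code)
        if event is not None:
            return event
    # No matching entry; returning None here is an assumption of this sketch.
    return None


if __name__ == "__main__":
    print(identify_failure_event(["X0000", "A3003"]))
```

A plain mapping is used because the described operation is an exact-match search from a number to an event; a real system would presumably query a database table instead.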

FIG. 3 shows an example of the information stored in the recovery procedure information database 45. As shown in FIG. 3, a response procedure ID, the content describing the failure recovery method indicated by that ID, the time required to carry out that content, and the recovery rate, which is the probability that the system recovers when that content is carried out, are stored in association with a failure event ID. Also stored in association with the failure event ID are the response procedure ID, the finer-grained work steps that make up the response procedure, and a step ID, which is a number assigned to each of those work steps.

This allows the schedule analysis/generation means 42 to search for and select the failure recovery methods that correspond to a failure event, and to generate a schedule that takes into account the time required for recovery and the recovery rate.

FIG. 4 shows an example of the information stored in the case accumulation database 46. As shown in FIG. 4, the date and time at which the failure occurred, the failed server, the failure event, the response procedure ID of the response taken for the failure, the time actually required to carry out the procedure indicated by that response procedure ID, and a recovery flag indicating whether recovery succeeded are stored in association with one another.

Based on the contents stored in the case accumulation database 46, the response history management means 43 can update the contents of the recovery procedure information database 45. As a result, data accumulates in the case accumulation database 46 and the recovery procedure information database 45 each time a failure occurs, and the accuracy of the schedule generated by the schedule analysis/generation means 42 improves accordingly.

FIG. 5 shows an example of a schedule generated by the schedule analysis/generation means 42. As shown in FIG. 5, the planned execution of each countermeasure procedure is drawn as a Gantt chart, alongside the countermeasure procedures (the failure recovery methods and the order in which they are to be performed), the planned start and planned end times for each failure recovery method listed, and a Status indicating whether work on each countermeasure procedure has begun.

The primary deadline shown in FIG. 5 is the deadline input from the client, and the schedule analysis/generation means 42 selects and orders procedures from the recovery procedure information database 45 so that recovery from the failure is completed by the primary deadline. The deadline shown in FIG. 5 represents the expected recovery date and time when restoration from backup is performed because the procedures carried out up to the primary deadline did not achieve recovery.

<Operation>
Next, the operation of the failure handling system of this embodiment will be described with reference to FIG. 1.
When a failure occurs in the monitored server 10, the monitoring program 11 detects it and transmits failure information, such as the error numbers and alert numbers issued by the failed server, to the monitoring server 40 via the network 30.

On receiving the failure information, the error information analysis means 41 searches the failure information database 44 and, based on the received failure information, determines the failure event that corresponds to the failure that occurred in the monitored server. Specifically, it searches for and selects the corresponding failure event based on the error numbers and alert numbers contained in the failure information.

The failure event determined in this way is passed to the schedule analysis/generation means 42. The schedule analysis/generation means 42 searches the recovery procedure information database 45 and selects the response procedure IDs that correspond to the failure event it was given. If a deadline has been input from the client 20, the schedule analysis/generation means 42 sets this deadline as the primary deadline, selects the response procedure IDs that can be executed by the primary deadline, and determines the order in which the selected response procedure IDs are to be executed.

Various methods are conceivable for selecting the response procedure IDs and determining their execution order. For example, using the required time and recovery rate stored for each response procedure ID in the recovery procedure information database 45, as shown in FIG. 3, the recovery rate per unit time is obtained by dividing the recovery rate by the required time. Response procedure IDs are then selected in descending order of recovery rate per unit time. The required time of each selected response procedure ID is added to the scheduled start date and time of the recovery work, and selection continues until just before the resulting date and time reaches the primary deadline. The selected response procedure IDs can then be ordered from the highest recovery rate per unit time downward.

This makes it possible to plan the recovery work starting from the recovery methods with the highest recovery rate per unit time, which raises the probability of completing recovery by the primary deadline and makes the resulting plan an efficient one for finishing recovery in a short time.
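The selection method described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the `Procedure` class, the field names, and the sample values are hypothetical; required times are assumed to be positive; "until just before reaching the primary deadline" is interpreted as stopping at the first procedure whose planned end would pass the deadline.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Procedure:
    procedure_id: int
    minutes: float        # required time (assumed > 0)
    recovery_rate: float  # probability of recovery, 0.0 to 1.0


def rate_per_minute(p: Procedure) -> float:
    """Recovery rate divided by required time, as described in the text."""
    return p.recovery_rate / p.minutes


def build_schedule(procs, start: datetime, deadline: datetime):
    """Return (procedure_id, planned_start, planned_end) tuples in execution order."""
    schedule = []
    current = start
    for p in sorted(procs, key=rate_per_minute, reverse=True):
        end = current + timedelta(minutes=p.minutes)
        if end > deadline:
            break  # assumption: stop at the first procedure that does not fit
        schedule.append((p.procedure_id, current, end))
        current = end
    return schedule


if __name__ == "__main__":
    procs = [Procedure(1, 30, 0.3), Procedure(2, 10, 0.2), Procedure(3, 60, 0.3)]
    start = datetime(2007, 2, 26, 10, 0)
    for row in build_schedule(procs, start, start + timedelta(minutes=45)):
        print(row)
```

With these invented values, procedure 2 (0.02 per minute) is planned first and procedure 1 (0.01 per minute) second; procedure 3 would end after the deadline and is left out.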

Next, the schedule analysis/generation means 42 generates a schedule based on the selected and ordered response procedure IDs. As shown in FIG. 5, the schedule can be a Gantt chart that lists the planned start, planned end, and Status for each countermeasure procedure and draws the plan up to the primary deadline and the deadline as lines. This lets the countermeasure procedures and the plan be grasped at a glance, which reduces the burden on the person in charge of recovery and enables efficient recovery work.

The schedule analysis/generation means 42 can also update the schedule each time a countermeasure procedure progresses, or periodically. When the schedule is updated on progress, it is regenerated from that point in time whenever one of the countermeasure procedures has been executed and its result is known, based on that result. For example, if a result is obtained earlier than the planned end time and the server has not yet recovered, more time remains before the primary deadline than planned, so regenerating the schedule may allow further countermeasure procedures to be added.

Conversely, if a result is obtained later than the planned end time and the server has not yet recovered, less time remains before the primary deadline than originally planned, so countermeasure procedures may have to be removed from the schedule.

When the schedule is updated periodically, only the timing of the update differs; what is done is the same as in the update on progress described above.
Because the schedule is updated according to the situation in this way, its accuracy can always be kept high.
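The regeneration described above can be sketched as re-running the selection from the current time with the procedures already tried removed. The function names `plan` and `replan`, the tuple layout, and the values are hypothetical; the selection rule (descending recovery rate per unit time, stopping at the first procedure that would pass the deadline) is one of the methods the text mentions, used here as an assumption.

```python
from datetime import datetime, timedelta


def plan(procedures, now, deadline):
    """procedures: list of (procedure_id, minutes, recovery_rate) tuples.

    Returns the procedure IDs to try, in order, starting from `now`.
    """
    chosen, t = [], now
    ordered = sorted(procedures, key=lambda p: p[2] / p[1], reverse=True)
    for pid, minutes, _rate in ordered:
        if t + timedelta(minutes=minutes) > deadline:
            break  # assumption: stop at the first procedure that does not fit
        chosen.append(pid)
        t += timedelta(minutes=minutes)
    return chosen


def replan(procedures, already_tried, now, deadline):
    """Regenerate the plan from the current time, skipping tried procedures."""
    remaining = [p for p in procedures if p[0] not in already_tried]
    return plan(remaining, now, deadline)


if __name__ == "__main__":
    procs = [(1, 30, 0.3), (2, 10, 0.2), (3, 25, 0.1)]
    start = datetime(2007, 2, 26, 10, 0)
    deadline = datetime(2007, 2, 26, 11, 0)
    print(plan(procs, start, deadline))
    # Procedure 2 finished five minutes early without recovery.
    print(replan(procs, {2}, datetime(2007, 2, 26, 10, 5), deadline))
```

In this invented example the initial plan holds procedures 2 and 1, and finishing procedure 2 early leaves enough time for procedure 3 to be added on regeneration, which corresponds to the early-finish case in the text.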

As shown in FIG. 4, each time a failure occurs, the response history management means 43 stores in the case accumulation database 46 the date and time of the failure, the failed server, the failure event, the response procedure IDs used for recovery, and, for each response procedure ID, the time required and whether recovery was achieved.

The response history management means 43 also updates the recovery procedure information database 45 based on the contents stored in the case accumulation database 46. This increases the data in the recovery procedure information database 45 and so improves the accuracy of schedule generation.
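The source does not state how the required time and recovery rate are recomputed from the accumulated cases. One plausible reading, shown here purely as an assumption, is to take the mean required time and the fraction of successful attempts per failure event and response procedure ID. The record keys and the function name `summarize_cases` are hypothetical.

```python
from collections import defaultdict


def summarize_cases(cases):
    """Aggregate case records into per-procedure statistics.

    cases: iterable of dicts with the hypothetical keys
    "event", "procedure_id", "minutes", and "recovered".
    Returns {(event, procedure_id): {"minutes": mean, "recovery_rate": fraction}}.
    """
    grouped = defaultdict(list)
    for case in cases:
        grouped[(case["event"], case["procedure_id"])].append(case)

    summary = {}
    for key, rows in grouped.items():
        summary[key] = {
            # Assumption: required time is the mean of the observed times.
            "minutes": sum(r["minutes"] for r in rows) / len(rows),
            # Assumption: recovery rate is successes divided by attempts.
            "recovery_rate": sum(1 for r in rows if r["recovered"]) / len(rows),
        }
    return summary


if __name__ == "__main__":
    cases = [
        {"event": "disk failure", "procedure_id": 1, "minutes": 20, "recovered": True},
        {"event": "disk failure", "procedure_id": 1, "minutes": 40, "recovered": False},
    ]
    print(summarize_cases(cases))
```

Other update rules, such as weighting recent cases more heavily, would fit the description equally well.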

Next, a further description will be given based on the flowcharts.
FIG. 6 is a flowchart showing the processing flow in the failure handling system of the present invention.
When a failure occurs in the monitored server 10, the monitoring program 11 detects it and transmits failure information to the monitoring server 40. On receiving the failure information, the error information analysis means 41 analyzes the error information and searches the failure information database 44 to determine which failure event it corresponds to (S1).

Next, the schedule analysis/generation means 42 reads the recovery procedures that correspond to the determined failure event from the recovery procedure information database 45 (S2). The schedule analysis/generation means 42 then checks whether deadline information has been input (S3). If it has, the means generates a primary deadline from the deadline information (S4) and generates a schedule that fits within the primary deadline (S5). If no deadline information has been input, the means proceeds directly to schedule generation (S5).

Schedule generation (S5) will now be described in detail with reference to FIG. 7. FIG. 7 is a flowchart showing the schedule generation flow.

スケジュール分析生成手段４２は、一次デッドラインが生成されているか否かを確認する（Ｓ１１）。一次デッドラインが生成されていない場合は、すべての復旧手順をスケジューリング対象として（Ｓ１３）、復旧手順を単位時間あたりの復旧率の高い順番で実行する予定表を生成する（Ｓ１７）。 The schedule analysis generation means 42 confirms whether or not a primary deadline has been generated (S11). When the primary deadline has not been generated, all the recovery procedures are set as scheduling targets (S13), and a schedule table for executing the recovery procedures in order of high recovery rate per unit time is generated (S17).

一次デッドラインが生成されている場合は、一次デッドラインまでに対応手順ＩＤ９９を除く作業が可能か否か判断を行う（Ｓ１２）。ここで、対応手順ＩＤ９９とは、図３に示す対応手順ＩＤ９９のことであり、テープ等のバックアップファイルからのシステムの復元のことを示す。バックアップファイルからシステムを復元した場合は、システムの状態は、バックアップファイル生成時点に戻るため、バックアップファイル生成後のデータは復元できない。よって、バックアップファイル生成後のデータを犠牲にする前提でのシステム復旧方法であり、これは、最後の手段である。 When the primary deadline has been generated, it is determined whether or not the work excluding the corresponding procedure ID 99 can be performed by the primary deadline (S12). Here, the handling procedure ID 99 is the handling procedure ID 99 shown in FIG. 3 and indicates that the system is restored from a backup file such as a tape. When the system is restored from the backup file, the system state returns to the time when the backup file was generated, so the data after the backup file was created cannot be restored. Therefore, this is a system recovery method on the premise of sacrificing data after the generation of the backup file, which is the last means.

図７にもどって、ステップＳ１２において、一次デッドラインまでに対応手順ＩＤ９９を除く作業が不可能と判断された場合は、対等手順ＩＤ９９のみをスケジューリング対象として（Ｓ１４）、予定表の生成を行う（Ｓ１７）。 Returning to FIG. 7, if it is determined in step S12 that the work except the corresponding procedure ID 99 cannot be performed before the primary deadline, only the peer procedure ID 99 is set as a scheduling target (S14), and a schedule is generated (S14). S17).

If it is determined in step S12 that work other than response procedure ID 99 can be performed by the primary deadline, the recovery rate per unit time of each recovery procedure is calculated (S15), and the combination that maximizes the recovery probability is selected from among the combinations of recovery procedures that can be executed by the primary deadline (S16).

Various methods are conceivable for selecting the combination that maximizes the recovery probability. For example, the combination patterns of recovery procedures that can be executed by the primary deadline are extracted, and the combination whose total recovery rate is largest is selected. In doing so, combinations are examined in order, starting with those that include procedures with a high recovery rate per unit time, and the calculation is cut off as soon as a combination whose recovery rate exceeds 100% is found. In this way, the combination that maximizes the recovery probability can be selected. Various other methods can also be applied.

Next, a schedule is generated that executes the selected recovery procedures in descending order of recovery rate per unit time (S17).
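The schedule generation flow of FIG. 7 (S11 to S17) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation described in the specification: the `Procedure` class, the field names, the use of minutes as the time unit, and the feasibility test used for S12 (at least one procedure other than ID 99 fits in the remaining time) are all hypothetical. The selection in S16 is done here by exhaustive search over combinations, using the total recovery rate as the score as in the example above, and it omits the early cutoff at 100%.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional

LAST_RESORT_ID = 99  # response procedure ID 99: restore from backup


@dataclass(frozen=True)
class Procedure:
    proc_id: int
    minutes: int       # time required (assumed unit: minutes)
    recovery_pct: int  # recovery rate in percent

    @property
    def rate_per_minute(self) -> float:
        # S15: recovery rate per unit time
        return self.recovery_pct / self.minutes


def by_rate(procs):
    # S17: order procedures by descending recovery rate per unit time
    return sorted(procs, key=lambda p: p.rate_per_minute, reverse=True)


def generate_schedule(procedures, minutes_left: Optional[int] = None):
    # S11 -> S13: no primary deadline, so every procedure is a target
    if minutes_left is None:
        return by_rate(procedures)

    # S12 (assumed test): can any procedure other than ID 99 fit in time?
    feasible = [
        p for p in procedures
        if p.proc_id != LAST_RESORT_ID and p.minutes <= minutes_left
    ]
    if not feasible:
        # S14: only the last-resort restore is scheduled
        return [p for p in procedures if p.proc_id == LAST_RESORT_ID]

    # S16: pick the combination with the largest total recovery rate
    # among those whose total time fits before the primary deadline
    best, best_total = (), -1
    for size in range(1, len(feasible) + 1):
        for combo in combinations(feasible, size):
            if sum(p.minutes for p in combo) > minutes_left:
                continue
            total = sum(p.recovery_pct for p in combo)
            if total > best_total:
                best, best_total = combo, total
    return by_rate(best)
```

Exhaustive search grows exponentially with the number of procedures, so it is only reasonable for the small procedure lists assumed here; examining combinations in order of recovery rate per unit time with an early cutoff, as the text suggests, is one way to reduce the work.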

Next, updating of the case accumulation database 46 and the recovery procedure information database 45 is described with reference to FIG. 8. FIG. 8 is a flowchart showing the update flow for the case accumulation database 46 and the recovery procedure information database 45.

The recovery procedures listed in the generated schedule are executed in the listed order (S21). The procedures can be executed by the person in charge of the recovery work, or the monitoring server 40 can be made to execute them.

Next, the response history management means 43 checks whether recovery has been achieved (S22). The information for this check can be obtained by the response history management means 43 directly from the monitoring program 11, or it can be information that the person in charge of the recovery work has entered via the network or directly into the monitoring server.

If recovery has not been achieved, the response history management means 43 stores the time required to execute the procedure, the fact that recovery was not achieved, and so on in the case accumulation database 46 (S23).

Next, the schedule is regenerated from the current time and the primary deadline (S24), and it is checked whether there is a procedure that can be executed by the primary deadline (S25).

If there is an executable procedure, execution is repeated from S21. If there is no executable procedure, restoration from backup data according to response procedure ID 99 is performed (S26), and the time required for the restoration, the fact that recovery was achieved, and so on are stored in the case accumulation database 46. This restoration can likewise be performed by the person in charge of the recovery work or by the monitoring server 40. After that, the required time and recovery rate in the recovery procedure information database 45 are updated (S28).

If it is confirmed in step S22 that recovery has been achieved, the response history management means 43 stores the time required to execute the procedure, the fact that recovery was achieved, and so on in the case accumulation database 46 (S27), and updates the required time and recovery rate in the recovery procedure information database 45 (S28).
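The update flow of FIG. 8 (S21 to S28) can be sketched as a loop. This is an illustration under assumptions rather than the method of the specification: the callables `plan` and `execute`, the tuple layout of the case log, the choice to run one procedure per pass before checking for recovery, and the way S28 recomputes the required time (as a mean) and the recovery rate (as the fraction of successful attempts) are all hypothetical, since the text does not state how those values are calculated.

```python
from collections import defaultdict

LAST_RESORT_ID = 99  # response procedure ID 99: restore from backup


def run_recovery(plan, execute, minutes_left, case_log):
    """Try procedures until recovery, falling back to ID 99.

    plan(minutes_left, tried) -> list of procedure IDs still executable
    execute(proc_id) -> (recovered, minutes_spent)
    """
    tried = set()
    schedule = plan(minutes_left, tried)
    while schedule:                                   # S25: anything left?
        proc_id = schedule[0]
        recovered, spent = execute(proc_id)           # S21
        case_log.append((proc_id, spent, recovered))  # S23 / S27
        if recovered:                                 # S22
            return proc_id
        tried.add(proc_id)
        minutes_left -= spent
        schedule = plan(minutes_left, tried)          # S24: regenerate
    recovered, spent = execute(LAST_RESORT_ID)        # S26
    case_log.append((LAST_RESORT_ID, spent, recovered))
    return LAST_RESORT_ID


def update_recovery_db(case_log):
    """S28 (assumed formulas): mean time and success fraction per procedure."""
    rows = defaultdict(list)
    for proc_id, spent, recovered in case_log:
        rows[proc_id].append((spent, recovered))
    return {
        pid: {
            "minutes": sum(s for s, _ in r) / len(r),
            "recovery_rate": sum(1 for _, ok in r if ok) / len(r),
        }
        for pid, r in rows.items()
    }
```

Passing `plan` and `execute` as callables reflects the statement that a procedure may be carried out either by the person in charge or by the monitoring server 40; the loop itself does not need to know which.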

The present invention can be modified in various ways within the scope of its technical idea. For example, the operations performed by the failure response system of the present invention can also be carried out by having the CPU of a computer system read and execute a program on a storage device; this yields exactly the same operation and effects and solves the problem that the invention addresses.

The failure response system of the present invention is realized by an OS, applications, databases, programs, and the like built on hardware resources including a computer's CPU, memory, storage device, display, and input/output devices. The information processing of generating a schedule that lists the procedures for recovering from a failure and is scheduled to meet the deadline is concretely realized using these hardware resources, so the invention qualifies as a technical idea utilizing the laws of nature. It can be used in any field that uses computer systems, as a system for restoring such a system when it goes down.

FIG. 1 is a configuration diagram of a failure response system according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the information stored in the failure information database.
FIG. 3 is a diagram showing an example of the information stored in the recovery procedure information database.
FIG. 4 is a diagram showing an example of the information stored in the case accumulation database.
FIG. 5 is a diagram showing an example of a schedule generated by the schedule analysis/generation means.
FIG. 6 is a flowchart showing the processing flow in the failure response system of the present invention.
FIG. 7 is a flowchart showing the schedule generation flow.
FIG. 8 is a flowchart showing the update flow for the case accumulation database and the recovery procedure information database.

Explanation of symbols

10 Monitoring target server
11 Monitoring target program
20 Client
30 Network
40 Monitoring server
41 Error information analysis means
42 Schedule analysis/generation means
43 Response history management means
44 Failure information database
45 Recovery procedure information database
46 Case accumulation database

Claims

A failure response system that, when a failure occurs in a monitored server, selects response procedures for recovering from the failure and generates a schedule indicating the order in which the response procedures are to be tried, the system comprising:
a failure information database in which information about failures is stored;
a recovery procedure information database in which, in association with a failure event that is a type of failure occurring in the monitored server, a response procedure for recovering from the failure, the time required for the response procedure, and information on whether the response procedure achieved recovery are stored;
a case accumulation database that stores failure events that occurred in the past, the response procedures performed to recover from those failures, the time required for the response procedures, and whether the response procedures achieved recovery;
failure information receiving means for receiving information on a failure of the monitored server transmitted from monitoring means built into the monitored server;
deadline input receiving means for receiving input of a deadline, which is the time limit by which the failure must be recovered;
error information analysis means for determining the failure event that has occurred in the monitored server by searching the failure information database based on the information transmitted from the monitoring means;
schedule analysis generation means for searching the recovery procedure information database based on the failure event, reading the response procedures, and generating a schedule in which the response procedures are scheduled so that failure recovery is completed by the deadline input via the deadline input receiving means; and
response history monitoring means for updating the contents of the case accumulation database and the contents of the recovery procedure information database each time a failure occurs in the monitored server.

The failure response system according to claim 1, wherein the schedule analysis generation means:
calculates a recovery rate per unit time from the recovery rate, which is stored in the recovery procedure information database and is the probability of recovery for each response procedure, and the time required for the response procedure having that recovery rate;
selects response procedures in descending order of recovery rate per unit time;
performs the selection by adding the time required for each selected response procedure to the date and time at which the response procedure is scheduled to be performed, continuing until the date and time after the addition reaches the deadline; and
generates the schedule so that the selected response procedures are executed in descending order of recovery rate per unit time.

The failure response system, wherein the schedule analysis generation means regenerates the schedule at least once after the schedule is first generated and before the failure of the monitored server is recovered,
and regenerates the schedule based on the date and time of regeneration, the deadline, and the contents of the recovery procedure information database.

A failure handling method in a failure response system comprising:
a failure information database in which information about failures is stored;
a recovery procedure information database in which, in association with a failure event that is a type of failure occurring in a monitored server, a response procedure for recovering from the failure, the time required for the response procedure, and information on whether the response procedure achieved recovery are stored;
a case accumulation database that stores failure events that occurred in the past, the response procedures performed to recover from those failures, the time required for the response procedures, and whether the response procedures achieved recovery;
failure information receiving means for receiving information on a failure of the monitored server transmitted from monitoring means built into the monitored server;
deadline input receiving means for receiving input of a deadline, which is the time limit by which the failure must be recovered;
error information analysis means for determining the failure event that has occurred in the monitored server;
schedule analysis generation means for generating a schedule in which the response procedures are scheduled so that failure recovery is completed by the deadline; and
response history monitoring means for updating the contents of the case accumulation database and the contents of the recovery procedure information database each time a failure occurs in the monitored server,
the method comprising:
a failure information receiving step in which the failure information receiving means receives information on a failure of the monitored server;
a deadline input receiving step in which the deadline input receiving means receives input of a deadline, which is the time limit by which the failure must be recovered;
an error information analysis step in which the error information analysis means searches the failure information database based on the information on the failure received in the failure information receiving step and determines the failure event that has occurred in the monitored server;
a schedule analysis generation step in which the schedule analysis generation means searches the recovery procedure information database based on the failure event, reads the response procedures, and generates a schedule in which the response procedures are scheduled so that failure recovery is completed by the deadline input in the deadline input receiving step; and
a response history monitoring step in which the response history monitoring means updates the contents of the recovery procedure information database and the case accumulation database each time a failure occurs in the monitored server.

The failure handling method in the failure response system according to claim 4, wherein the schedule analysis generation step comprises:
calculating a recovery rate per unit time from the recovery rate, which is stored by the recovery procedure information storing step and is the probability of recovery for each response procedure, and the time required for the response procedure having that recovery rate;
selecting response procedures in descending order of recovery rate per unit time;
adding the time required for each selected response procedure to the date and time at which the response procedure is scheduled to be performed, and ending the selection immediately before the date and time after the addition reaches the deadline; and
generating the schedule so that the selected response procedures are performed in descending order of recovery rate per unit time.

The failure handling method in the failure response system according to claim 4 or 5, wherein the schedule analysis generation step regenerates the schedule at least once after the schedule is first generated and before the failure of the monitored server is recovered,
and includes a step of regenerating the schedule based on the date and time of regeneration, the deadline, and the contents stored by the recovery procedure information storing step.

A failure handling program that, when a failure occurs in a monitored server, causes a computer to select response procedures for recovering from the failure and to generate a schedule indicating the order in which the response procedures are to be tried, based on the following information stored in a storage device of the computer: information on failure events associated with information on failures, information on response procedures for recovering from the failure associated with the failure events, information on the time required for the response procedures, and information on whether the response procedures achieved recovery,
the program causing the computer to execute:
a failure information receiving step of receiving information on a failure transmitted from monitoring means built into the monitored server;
a deadline input receiving step of receiving input of a deadline, which is the time limit by which the failure must be recovered;
an error information analysis step of searching, based on the information received in the failure information receiving step, the information on failure events associated with the information on failures stored in the storage device of the computer, and determining the failure event that has occurred in the monitored server;
a schedule analysis generation step of reading, based on the failure event, the information on response procedures for recovering from the failure associated with the failure event stored in the storage device of the computer, and generating a schedule in which the response procedures are scheduled so that failure recovery is completed by the deadline input in the deadline input receiving step; and
a response history monitoring step of storing in the storage device of the computer, each time a failure occurs in the monitored server, information on the failure that occurred, the failure event, the response procedure performed, the time required for the response procedure, and information on whether the response procedure achieved recovery.

The failure handling program according to claim 7, wherein the schedule analysis generation step comprises:
calculating a recovery rate per unit time from the recovery rate, which is stored in the storage device of the computer and is the probability of recovery for each response procedure, and the time required for the response procedure having that recovery rate;
selecting response procedures in descending order of recovery rate per unit time;
adding the time required for each selected response procedure to the date and time at which the response procedure is scheduled to be performed, and ending the selection immediately before the date and time after the addition reaches the deadline; and
generating the schedule so that the selected response procedures are performed in descending order of recovery rate per unit time.

The failure handling program according to claim 7 or 8, wherein the schedule analysis generation step regenerates the schedule at least once after the schedule is first generated and before the failure of the monitored server is recovered,
and includes a step of regenerating the schedule based on the date and time of regeneration, the deadline, and the contents stored in the storage device of the computer.