JP2013164668A - Fault monitoring system, incident tabulation method, and program

Info

Publication number: JP2013164668A
Application number: JP2012026310A
Authority: JP
Inventors: Shinichi Momma; 伸一門間
Original assignee: Hitachi Systems Ltd
Current assignee: Hitachi Systems Ltd
Priority date: 2012-02-09
Filing date: 2012-02-09
Publication date: 2013-08-22

Abstract

PROBLEM TO BE SOLVED: To realize reports by case category and configuration information for time outside the response time, and by incident response classification and responding department, and to use a policy file for tabulation in order to assure service quality efficiently.

SOLUTION: A tabulation system 30 searches incident information 21 on the basis of the condition items entered in a policy file 31, which are the conditions to be output to a tabulation report 32, and configuration information 23 stored in a system management database 22, and calculates the number of incidents and the average handling time matching the search items. It then calculates the proportion of incidents (incident count ratio) and adds to the report the item calculated from one condition entered in the policy file 31 and from the configuration information 23 stored in the system management database 22. These processes are repeated for the number of conditions defined in the policy file 31, and the result is output as the tabulation report.

Description

The present invention relates to a technology for monitoring incidents such as failures, and more particularly to a technology effective for creating reports by incident response category and by responding department.

In recent years, computer systems that perform information processing services have provided various services over the Internet. In incident management work in particular, a failure monitoring device receives a failure message and registers it in an incident management system as incident information, and a service desk handles the registered incident according to a predetermined workflow or the like.

For the service desk's response, the time allowed for notification and for the primary response after an incident occurs is defined as an SLA (Service Level Agreement), and this SLA is used as a quantitative index to guarantee the quality of service to the customer.

As an incident management technology of this kind, one is known in which a storage device stores a processing target time for each business step, a time management unit determines for each business step whether the time has been exceeded, and a notification unit outputs the overrun, so that work time is managed for each business step (see, for example, Patent Document 1).

Patent Document 1: JP 2007-34353 A

However, the present inventor has found that the incident management technology described above has the following problem.

With the above technology, the workflow determined for each monitoring target is considered to make it possible to collect and tabulate notification times, countermeasure times, and the like. Depending on how an incident is handled, however, the recorded time also includes the handling time of departments other than the one that performs the incident management work. This makes it difficult to grasp service quality accurately, and therefore difficult to guarantee the service to the customer.

An object of the present invention is to provide a technology that realizes reports by case category and configuration information for time outside the response time, and by incident response classification and responding department, and that allows service quality to be grasped accurately by using a policy file for tabulation.

The above and other objects and novel features of the present invention will be apparent from the description of this specification and the accompanying drawings.

Of the inventions disclosed in the present application, a representative one is briefly outlined as follows.

To achieve the above object, the present invention uses an incident management system holding incident information, configuration information including department names and source information, and a policy file to realize a mechanism by which service quality can be grasped accurately.

The policy file defines, according to the conditions entered, whether an incident is to be tabulated. The number of incidents for each condition entered in the policy file is output as incident tabulation information in accordance with the configuration information.

The present invention consists of a failure monitoring system that collects and tabulates incidents in monitored devices.

This failure monitoring system has a tabulation section that, based on a policy file defining the incidents to be tabulated and configuration information indicating the tabulation unit, searches incident information in which a response case number has been added to a failure message generated in a monitored device, and calculates and outputs incident tabulation information for each condition defined in the policy file.

The present invention can also be applied to a method by which a system calculates the incident tabulation information for each condition defined in the policy file, and to a program that causes a computer system to function as that system.

The effect obtained by the representative invention disclosed in the present application is briefly described as follows.

The service quality of the service desk can be grasped accurately.

FIG. 1 is an explanatory diagram showing an example of the configuration of a failure monitoring system according to an embodiment of the present invention.
FIG. 2 is an explanatory diagram showing an example of the items in the incident information stored in the incident database of FIG. 1.
FIG. 3 is an explanatory diagram showing an example of the items in the configuration information stored in the system management database of FIG. 1.
FIG. 4 is an explanatory diagram showing an example of the items in the response case database referred to by the case addition system of FIG. 1.
FIG. 5 is an explanatory diagram showing an example of the items of the policy file input to the tabulation system of FIG. 1.
FIG. 6 is a flowchart showing an example of the operation of the tabulation system of FIG. 1.
FIG. 7 is an explanatory diagram showing an example of report creation by the process of step S109 of FIG. 6.
FIG. 8 is an explanatory diagram showing an example of the tabulation report generated by the tabulation system of FIG. 1.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In all the drawings for describing the embodiment, the same members are in principle denoted by the same reference symbols, and repeated description of them is omitted.

<Summary of invention>
The first aspect of the present invention is a failure monitoring system (failure monitoring system 1) that collects and tabulates incidents in devices to be monitored (monitored devices 2).

This failure monitoring system has a tabulation section (tabulation system 30) that, based on a policy file (policy file 31) defining the incidents to be tabulated and configuration information (configuration information 23) indicating the tabulation unit, searches incident information (incident information 21) in which the number of a response case (response case number) has been added to a failure message generated in a monitored device, and calculates and outputs incident tabulation information (tabulation report 32) for each condition defined in the policy file.

The second aspect of the present invention is an incident tabulation method in which a computer system (failure monitoring system 1) collects and tabulates incidents in devices to be monitored (monitored devices 2).

This incident tabulation method has a step of collecting failure messages from the devices via a communication line (inter-center network 4); a step of adding a response case number to each failure message and storing the failure message with the response case number added as incident information (incident information 21); and a step of searching the incident information based on a policy file (policy file 31) defining whether an incident is to be tabulated and configuration information (configuration information 23) indicating the tabulation unit, and outputting incident tabulation information (tabulation report 32) for each condition defined in the policy file.

The third aspect of the present invention is a program that causes a computer system to execute the following steps.

This program has a step of collecting failure messages from devices to be monitored (monitored devices 2) via a communication line (inter-center network 4); a step of adding a response case number to each failure message and storing the failure message with the response case number added as incident information (incident information 21); and a step of searching the incident information based on a policy file (policy file 31) defining the incidents to be tabulated and configuration information (configuration information 23) indicating the tabulation unit, and outputting incident tabulation information (tabulation report 32) for each condition defined in the policy file.

Hereinafter, the embodiment will be described in detail based on the above summary.

<Configuration of failure monitoring system>
FIG. 1 is an explanatory diagram showing an example of the configuration of a failure monitoring system according to an embodiment of the present invention. As shown in FIG. 1, the failure monitoring system 1 consists of a plurality of monitored devices 2, a failure monitoring device 3, a case addition system 6, a response case database 7, an operation terminal 8, an incident management system 10, an incident database 20, a system management database 22, and a tabulation system 30.

The monitored devices 2 and the failure monitoring device 3 are provided in a data center DC. The case addition system 6, the response case database 7, the operation terminal 8, the incident management system 10, the incident database 20, and the tabulation system 30 are provided in a monitoring center KC.

The failure monitoring device 3 and the case addition system 6 are connected to each other via an inter-center network 4, which is a communication line. Each monitored device 2 is a device subject to failure monitoring and consists of, for example, an electronic system such as an information processing system.

The monitored devices 2 are each connected to the failure monitoring device 3. Although FIG. 1 shows an example with three monitored devices 2, the number of monitored devices 2 is not limited to this.

The failure monitoring device 3 transmits messages about failures that have occurred in the monitored devices 2 to the case addition system 6 via the inter-center network 4. The response case database 7 and the incident management system 10 are each connected to the case addition system 6.

Based on the content of the transmitted failure message and its source information, the case addition system 6 extracts from the response case database 7 the response case to be associated with the failure message, and transmits the failure message together with the response case No., which is the number of the extracted response case, to the incident management system 10.
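The extraction performed by the case addition system 6 can be illustrated with a minimal sketch. The specification does not state how a failure message is matched to a response case, so the matching rule used here (same source and a keyword contained in the message text) is an assumption, as are the names `ResponseCase`, `find_case_no`, and the sample data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ResponseCase:
    case_no: int     # "response case No." from the response case database
    procedure: str   # "response procedure"
    source: str      # hypothetical: source (device) the case applies to
    keyword: str     # hypothetical: text expected in the failure message


def find_case_no(message: str, source: str,
                 cases: List[ResponseCase]) -> Optional[int]:
    """Return the number of the first case matching source and message text."""
    for case in cases:
        if case.source == source and case.keyword in message:
            return case.case_no
    return None  # no matching response case found


# Hypothetical contents of the response case database.
CASES = [
    ResponseCase(1, "Notify the customer by phone", "db-server-01", "disk full"),
    ResponseCase(2, "Restart the service", "web-server-01", "process down"),
]

# The failure message and the extracted number are passed on together.
message = "ERROR: disk full on /var"
record = {"message": message,
          "case_no": find_case_no(message, "db-server-01", CASES)}
print(record)
```

In this sketch an unmatched message simply yields no number; how the actual system treats such messages is not described in the specification.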

The incident management system 10 has a failure information input unit 11 and a workflow control unit 12. The incident database 20 and the system management database 22 are each connected to the failure information input unit 11 and the workflow control unit 12.

The failure message and the response case No. (number) transmitted from the case addition system 6 are input to the failure information input unit 11. The failure information input unit 11 registers the input failure message in the incident database 20 as incident information 21.

At the monitoring center KC, service desk personnel use the operation terminal 8 to carry out, through the workflow control unit 12 of the incident management system 10, the business step processing of the workflow for the incident information registered by the failure information input unit 11.

The tabulation system 30 searches the incident information 21 in the incident database 20 using the conditions defined in the policy file 31 and the configuration information 23 stored in the system management database 22, and generates a tabulation report 32.

<Example configuration of incident information>
FIG. 2 is an explanatory diagram showing an example of the items in the incident information 21 stored in the incident database 20 of FIG. 1.

As illustrated, the incident information 21 consists of "incident ID", "occurrence date and time", "customer name", "response case No.", and the like.

"Incident ID" is an identification code that uniquely identifies a registered incident. "Occurrence date and time" indicates the date and time of occurrence of the failure detected by the failure monitoring device 3. "Customer name" is the name of the customer associated with the monitored device 2 in which the failure originated. "Response case No." is a number, added by the case addition system 6 of FIG. 1, for looking up the response procedure (category) for each incident.

<Configuration example of configuration information>
FIG. 3 is an explanatory diagram showing an example of the items in the configuration information 23 stored in the system management database 22 of FIG. 1.

As illustrated, the items of the configuration information 23 consist of "center name", "customer name", "system name", "subsystem name", and "contact". These are used when creating reports by incident response case category and by responding department.
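As an illustration only, the items of the incident information 21 and the configuration information 23 can be modeled as two record types. The field names below are English renderings of the items shown in FIG. 2 and FIG. 3. Linking the two through the shared "customer name" item is an assumption of this sketch, since the specification does not state which item is used as the join key, and `center_for` is a hypothetical helper.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Incident:
    incident_id: str       # "incident ID"
    occurred_at: datetime  # "occurrence date and time"
    customer_name: str     # "customer name"
    case_no: int           # "response case No."


@dataclass(frozen=True)
class ConfigItem:
    center_name: str       # "center name"
    customer_name: str     # "customer name"
    system_name: str       # "system name"
    subsystem_name: str    # "subsystem name"
    contact: str           # "contact"


def center_for(incident: Incident,
               config: List[ConfigItem]) -> Optional[str]:
    """Return the center name of the first row with the same customer name.

    Assumption: incidents and configuration rows are joined on customer name.
    """
    for item in config:
        if item.customer_name == incident.customer_name:
            return item.center_name
    return None
```

With a lookup of this kind, each incident can be assigned to a tabulation unit such as the center name before counts are calculated.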

<Example of response cases stored in the response case database>
FIG. 4 is an explanatory diagram showing an example of the items of a response case in the response case database 7 referred to by the case addition system 6 of FIG. 1.

The items of a response case in the response case database 7 consist of "response case No.", "response procedure", and the like. The response case No. is a number that uniquely identifies a response case, and the response procedure is the procedure to be referred to when an incident occurs.

<Configuration example of policy file>
FIG. 5 is an explanatory diagram showing an example of the items of the policy file 31 input to the tabulation system 30 of FIG. 1. The policy file defines the extraction conditions for the incidents to be tabulated.

<Operation example of tabulation system>
Next, the operation of the tabulation system 30 will be described.

The various functions of the processing performed by the tabulation system 30 are realized, for example, by a CPU (Central Processing Unit), not shown, of the tabulation system 30 executing software in program form stored in a program storage memory (not shown) or the like provided in the tabulation system 30.

FIG. 6 is a flowchart showing an example of the operation of the tabulation system 30 of FIG. 1.

First, the conditions to be output to the tabulation report 32 are entered in the policy file 31 by means of the tabulation system 30 (step S101). Thereafter, the processes of steps S103 to S109 are repeated for the number of conditions defined in the policy file 31 (steps S102, S110).

The incident information 21 in the incident database 20 is searched using the condition items entered in the policy file 31 and the configuration information 23 stored in the system management database 22 (steps S103, S104), and the number of incidents matching the search items and the average response time are calculated (step S106).

After the number of incidents and the average response time have been calculated, the proportion of incidents (incident count ratio) is calculated (step S107). The processes of steps S106 and S107 are repeated for the number of incidents (steps S105, S108).

After the processes of steps S106 and S107 have been repeated for the number of incidents, the item calculated from one condition entered in the policy file 31 and from the configuration information 23 stored in the system management database 22 is added to the report (step S109). As described above, the processes of steps S103 to S109 are repeated for the number of conditions defined in the policy file 31.
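The loop of FIG. 6 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation described in the specification: the policy is modeled as a mapping from an item label to a set of response case numbers, each incident carries a hypothetical `response_minutes` field, the tabulation unit is looked up from the customer name, and the incident count ratio is taken relative to all incidents of the same tabulation unit. The step numbers in the comments indicate only a rough correspondence.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Incident:
    customer_name: str
    case_no: int
    response_minutes: float  # hypothetical field used for the average


def tabulate(policy: Dict[str, Set[int]],
             incidents: List[Incident],
             customer_to_unit: Dict[str, str]) -> List[dict]:
    """Build report rows: one per (tabulation unit, policy item) with matches."""
    # Denominator for the ratio: all incidents per tabulation unit (assumption).
    totals: Dict[str, int] = defaultdict(int)
    for inc in incidents:
        unit = customer_to_unit.get(inc.customer_name)
        if unit is not None:
            totals[unit] += 1

    report: List[dict] = []
    for item, case_nos in policy.items():           # loop S102 to S110
        matched: Dict[str, List[float]] = defaultdict(list)
        for inc in incidents:                       # search, S103 to S108
            unit = customer_to_unit.get(inc.customer_name)
            if unit is not None and inc.case_no in case_nos:
                matched[unit].append(inc.response_minutes)
        for unit in sorted(matched):                # add to report, S109
            times = matched[unit]
            report.append({
                "unit": unit,
                "item": item,
                "count": len(times),                        # cf. S106
                "avg_minutes": sum(times) / len(times),     # cf. S106
                "ratio": len(times) / totals[unit],         # cf. S107
            })
    return report
```

A call such as `tabulate({"Handled by service desk": {1}}, incidents, {"Customer A": "Center 1"})` would return one row per center in which at least one matching incident exists; combinations with no matching incident are omitted in this sketch.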

<Example of item addition in step S109>
FIG. 7 is an explanatory diagram showing an example of report creation by the process of step S109 of FIG. 6.

The upper part of FIG. 7 shows the conditions entered in the policy file 31, and the lower part of FIG. 7 shows an example of the created report. The conditions entered in the policy file 31 in the upper part of FIG. 7 become the "items" shown in the lower part of FIG. 7, and the "tabulation unit" in the lower part of FIG. 7 is the center name contained in the configuration information 23. The incident count ratio and the average response time are then extracted for each item of each tabulation unit to create the report.

<Example of tabulation report>
FIG. 8 is an explanatory diagram showing an example of the tabulation report 32 generated by the tabulation system 30.

The tabulation report 32 is generated automatically from the policy file 31 and the configuration information 23. The X axis of the tabulation report 32 in FIG. 8 shows the contents defined in the tabulation template, and the Y axis shows the ratio for each incident response classification, extracted according to the conditions entered in the policy file 31 and the conditions defined in the configuration information 23.

Thus, according to the present embodiment, the policy file 31 and the configuration information 23 are associated with each other and combined with the incident information 21 in order to calculate the ratio for each incident. This reveals the ratio of response cases for each department that performs the incident management work or for each source, and makes it possible to grasp accurately the service quality delivered by the service desk itself.

The invention made by the present inventor has been described above specifically based on the embodiment. Needless to say, the present invention is not limited to the above embodiment, and various modifications can be made without departing from its gist.

The present invention is suitable for technology for improving the quality of failure monitoring operation services in outsourcing services.

DESCRIPTION OF SYMBOLS
1 Failure monitoring system
2 Monitored device
3 Failure monitoring device
4 Inter-center network
6 Case addition system
7 Response case database
8 Operation terminal
10 Incident management system
11 Failure information input unit
12 Workflow control unit
20 Incident database
21 Incident information
22 System management database
23 Configuration information
30 Tabulation system
31 Policy file
32 Tabulation report
DC Data center
KC Monitoring center

Claims

A failure monitoring system comprising a tabulation section that, based on a policy file defining the incidents to be tabulated and configuration information indicating a tabulation unit, searches incident information in which the number of a response case has been added to a failure message generated in a monitored device, and calculates and outputs incident tabulation information for each condition defined in the policy file.

The failure monitoring system according to claim 1, further comprising:
a response case database storing response cases, each consisting of a response procedure for an incident associated with a failure message;
a case addition system that searches the response case database, extracts the response case associated with the failure message, and outputs the number of the extracted response case together with the failure message;
an incident management unit that registers the failure message and the response case number output from the case addition system; and
an incident information database that stores the information registered by the incident management unit as the incident information,
wherein the tabulation section searches the incident information database and acquires the incident information.

The failure monitoring system according to claim 1 or 2,
wherein the incident tabulation information output by the tabulation section comprises, for each tabulation unit indicated in the configuration information, the number of incidents matching the items to be tabulated that are defined in the policy file, the average response time, and the ratio of the number of incidents.

An incident tabulation method in which a computer system collects and tabulates incidents in devices to be monitored, the method comprising the steps of:
collecting failure messages from the devices via a communication line;
adding a response case number to each failure message, and storing the failure message with the response case number added as incident information; and
searching the incident information based on a policy file defining the incidents to be tabulated and configuration information indicating a tabulation unit, and outputting incident tabulation information for each condition defined in the policy file.

The incident tabulation method according to claim 4,
wherein the step of storing the failure message as incident information includes the steps of:
searching a response case database storing response cases, each consisting of a response procedure for an incident associated with a failure message, and extracting the response case associated with the failure message transmitted via the communication line; and
storing the number of the extracted response case and the failure message in an incident information database as the incident information.

The incident tabulation method according to claim 4 or 5,
wherein the step of outputting the incident tabulation information
calculates and outputs, for each tabulation unit indicated in the configuration information, the number of incidents matching the items to be tabulated that are defined in the policy file, the average response time, and the ratio of the number of incidents.

A program that causes a computer system to execute the steps of:
collecting failure messages from devices to be monitored via a communication line;
adding a response case number to each failure message, and storing the failure message with the response case number added as incident information; and
searching the incident information based on a policy file defining the incidents to be tabulated and configuration information indicating a tabulation unit, and outputting incident tabulation information for each condition defined in the policy file.

The program according to claim 7,
wherein the step of storing the failure message as incident information includes the steps of:
searching a response case database storing response cases, each consisting of a response procedure for an incident associated with a failure message, and extracting the response case associated with the failure message transmitted via the communication line; and
storing the number of the extracted response case and the failure message in an incident information database as the incident information.

The program according to claim 7 or 8,
wherein the step of outputting the incident tabulation information
calculates and outputs, for each tabulation unit indicated in the configuration information, the number of incidents matching the items to be tabulated that are defined in the policy file, the average response time, and the ratio of the number of incidents.