JP5469011B2

JP5469011B2 - Incident management system, failure impact range visualization method

Info

Publication number: JP5469011B2
Application number: JP2010176461A
Authority: JP
Inventors: 浩吉田
Original assignee: Nomura Research Institute Ltd
Current assignee: Nomura Research Institute Ltd
Priority date: 2010-08-05
Filing date: 2010-08-05
Publication date: 2014-04-09
Anticipated expiration: 2030-08-05
Also published as: JP2012038028A

Description

The present invention relates to technology such as an incident management system for IT service operation management, and in particular to technology for grasping and visualizing the configuration, failure status, and the like of a target system.

Standards relating to incident management and the like in IT service operation management include ITIL (Information Technology Infrastructure Library) Version 3. An incident management system records and manages incidents, such as failures occurring in the managed information processing system (the target system), as incident information, and links them to countermeasures (responses) and escalation.

With the development of cloud environments, technologies such as virtual servers and parallel distributed processing are applied to target systems. In addition, the components (configuration items) of a target system, such as servers and databases, are designed and implemented in multiplexed or similar configurations, based on service levels and the like and in consideration of fault tolerance, performance, and the like.

Examples of prior art include Japanese Patent Application Laid-Open No. 2007-257244 (Patent Document 1), which concerns a failure impact range identification system and the like, and Japanese Patent Application Laid-Open No. 2009-181537 (Patent Document 2), which concerns an incident management system and the like.

Patent Document 1: JP 2007-257244 A
Patent Document 2: JP 2009-181537 A

A problem with conventional incident management systems (and related systems such as configuration management systems) is that it is difficult for the person in charge (the incident manager) to immediately grasp the impact range and impact destinations (such as higher-level services), as well as the urgency, impact level, and the like, when a failure (incident) occurs in the target system. It is therefore difficult to escalate and implement countermeasures quickly in line with priorities based on that understanding. The problem is especially pronounced when the components of the target system adopt multiplexed or similar configurations for fault tolerance, because the influence relationships among the components (configuration items) are complex.

Technology that visualizes on a screen the status and configuration of the target system (incident status and operational status), such as the failure impact range, is considered effective against this problem. In the prior art, however, technology for clearly visualizing on a screen the status and configuration, such as the failure impact range, of a target system designed and implemented with cloud environments and fault tolerance in mind has not been sufficiently studied or realized.

The main object of the present invention is to provide, in relation to the above incident management system and the like, technology that visualizes on a screen information such as the status and configuration (for example, the failure impact range) and the priority of incidents and countermeasures in a target system configured with cloud environments and fault tolerance in mind, so that the person in charge can grasp the situation immediately and clearly, and prompt escalation and implementation of countermeasures can be realized.

To achieve the above object, a representative embodiment of the present invention is an incident management system or the like characterized by having the configuration described below.

This incident management system manages incidents, including failures of the target system, as incident information in a first database. It cooperates with a configuration management system that manages the configuration of the target system as configuration information in a second database, with a service portal system that provides information screens to the terminal of the person in charge, and with a failure monitoring system that monitors incidents, including failures, of the target system. The incident management system has a first function that uses the configuration information and the incident information to create a screen visualizing the incident status, including the configuration of the target system, the failure impact range, and the affected services, and provides the screen to the terminal of the person in charge; and a second function that sets, in the configuration information as a configuration management model, a configuration including components designed in consideration of fault tolerance in the target system.

In the configuration management model, each component, including those designed in consideration of fault tolerance, is set as a first configuration item; the fault tolerance of the first configuration items is set as a second configuration item; and the dependencies among configuration items, including the first and second configuration items, are set as links. The screen produced by the first function displays the incident status, including the configuration management model of the target system, the failure impact range, and the affected services, as a structure in which the configuration items are connected by links.
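The model described above (first configuration items, fault tolerance as a second configuration item, and dependency links) can be pictured as a small graph structure. The following is a minimal sketch only; the class names, field names, category labels, and the "active-standby" scheme label are illustrative assumptions and do not come from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigItem:
    """First configuration item: a component such as a server or database."""
    ci_id: str
    category: str  # e.g. "server", "service" (labels are assumed)


@dataclass
class FaultToleranceItem:
    """Second configuration item (FTCI): the fault tolerance of a group of CIs."""
    ci_id: str
    members: list[str]  # ids of the redundant CIs it covers
    scheme: str         # hypothetical label, e.g. "active-standby"


@dataclass
class ConfigModel:
    """Configuration management model: items plus dependency links."""
    items: dict[str, object] = field(default_factory=dict)
    # Each link is (dependent, depends_on).
    links: set[tuple[str, str]] = field(default_factory=set)

    def add(self, item) -> None:
        self.items[item.ci_id] = item

    def link(self, dependent: str, depends_on: str) -> None:
        # Both ends must already be registered as configuration items.
        if dependent not in self.items or depends_on not in self.items:
            raise KeyError("both ends of a link must be registered items")
        self.links.add((dependent, depends_on))

    def dependents_of(self, ci_id: str) -> set[str]:
        """Items that directly depend on the given item."""
        return {d for d, u in self.links if u == ci_id}


# Two redundant application servers behind one FTCI, used by one service.
model = ConfigModel()
model.add(ConfigItem("web-service", "service"))
model.add(ConfigItem("ap1", "server"))
model.add(ConfigItem("ap2", "server"))
model.add(FaultToleranceItem("ft-ap", members=["ap1", "ap2"], scheme="active-standby"))
model.link("ft-ap", "ap1")
model.link("ft-ap", "ap2")
model.link("web-service", "ft-ap")
```

Treating the FTCI as an ordinary node between the redundant servers and the service is what lets a screen draw fault tolerance with the same item-and-link notation as every other component.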

According to a representative embodiment of the present invention, in relation to an incident management system and the like, information such as the status and configuration (for example, the failure impact range) and the priority of incidents and countermeasures in a target system configured with cloud environments and fault tolerance in mind is visualized on a screen, so that the person in charge can grasp the situation immediately and clearly, and prompt escalation and implementation of countermeasures can be realized.

FIG. 1 is a diagram showing an example of the overall configuration of a computer system including an incident management system according to an embodiment of the present invention.
FIG. 2 is a diagram showing a configuration example of each part of the incident management system of the embodiment.
FIG. 3 is a diagram showing an example of a screen that visualizes the configuration management model of the target system and its status during normal operation.
FIG. 4 is a diagram showing an example of a screen that visualizes the configuration management model of the target system and its status during a failure.
FIG. 5, parts (a) to (f), are diagrams showing examples of fault tolerance information for fault tolerance configuration items (FTCIs) of the target system.
FIG. 6 is a diagram showing an example of configuration information (configuration item information) and the like.
FIG. 7 is a diagram showing an example of a method for determining the status of a configuration item.
FIG. 8 is a diagram showing an example of information on dependencies between configuration items.
FIG. 9 is a diagram showing an example of (part of) a configuration management model.
FIG. 10 is a diagram showing a priority calculation method and the like.
FIG. 11 is a diagram showing an example of incident information.
FIG. 12 is a diagram showing an example of an incident screen.
FIG. 13 is a diagram showing an example of a screen in a prior art example.

Embodiments of the present invention (an incident management system and a failure impact range visualization method) are described in detail below with reference to the drawings. In all drawings used to describe the embodiments, identical parts are in principle given the same reference numerals, and repeated description of them is omitted.

[Overview]
The outline and features of this embodiment are as follows (see FIGS. 1, 3, 4, and others). As its main features, the incident management system 10 has a failure impact range visualization function 101, and the configuration management system 20 has a fault tolerance configuration item (FTCI) setting function 102 (FIG. 1). The failure impact range visualization function 101 visualizes the configuration and failure status (failure impact range and the like) of the target system 1 on a screen. The FTCI setting function 102 sets, in the configuration information (configuration management model) of the target system 1, the "fault tolerance" of a component (configuration item: CI) such as a server, that is, a design and implementation arrangement that takes fault tolerance and the like into account, as a kind of configuration item (fault tolerance configuration item: FTCI).

A configuration item (CI) is a component, such as a server, that makes up the target system 1 in the configuration management model (configuration information) and on the screens (FIG. 3 and others), and is an element to be displayed on the screen. A CI is displayed using a specific icon or similar representation according to its category and the like. Dependencies (links) between CIs are also visualized, for example as lines. A configuration management model made up of CIs and links is set.

"Fault tolerance" follows the established term in this technical field. It corresponds to a design and implementation arrangement in the target system 1 that takes fault tolerance, performance, service level, and the like into account, and covers various known techniques such as redundant configurations (physical or virtual multiplexing, clustering, and so on). "Fault tolerance" is one piece of design information for the target system 1 (its CIs), and the person in charge 3 or the like can set it as an FTCI using the FTCI setting function 102.

The screens (FIGS. 3, 4, and others) visualize information in which the analysis results of this system (failure status and the like) are mapped onto the configuration management model including FTCIs. This allows the person in charge 3 to grasp the failure impact range, escalation destinations, and the like clearly and immediately.

[System configuration]
FIG. 1 shows an example of the overall configuration of a computer system including the incident management system 10. The incident management system 10 cooperates with the configuration management system 20, the service portal system 30, the failure monitoring system 40, and others. Cooperation among processes such as operation management and monitoring, incident management, and configuration management is systematized. The target system 1 is the information processing system (production system) subject to incident management operations. The person in charge 3 denotes a user of the service portal system 30 and that user's terminal. As shown in FIG. 1, the systems are connected so that they can communicate with one another. The systems (10, 20, 30, 40, and so on) may be integrated into a single system or divided as appropriate.

○ Target system 1: The target system 1 comprises components such as network devices (switches and the like), servers, storage, databases, middleware, and applications, and realizes predetermined services (service processing). Each component holds or outputs, for example, log information and failure messages.

○ Incident management system 10: The incident management system 10 is implemented on a server system or the like. As its basic function, it manages (registers, searches, and so on) incident information, including failure information, in an incident management database (DB) 51. Incident information includes failure information, operation information (initial diagnosis execution results), analysis results (priority and so on), and the like (described later; see FIG. 11 and others). Incident information may also include, or be associated with, countermeasure information, person-in-charge information, and the like. The incident screen G2 is built from the incident information (b2) in the DB 51.

In addition, for the configuration (configuration management model) of the target system 1 managed by the configuration management system 20, the incident management system 10 grasps the configuration and operational status and the status of incidents such as failures through initial diagnosis and analysis processing, based on the monitoring of the target system 1 by the failure monitoring system 40. The results are reflected in the failure configuration information b3 and the incident information b2.

Through the incident screen G2, the person in charge 3 or the like can register and search for countermeasure information (countermeasure procedures, explanations, and so on) and related information concerning incidents such as failures in the target system 1.

FIG. 2 shows a detailed configuration example of each part of the incident management system 10. The incident management system 10 comprises a failure information acquisition unit 11, a configuration information acquisition unit 12, an initial diagnosis unit 13, an analysis unit 14 (containing a failure impact range CI extraction unit 15, an FTCI status grasping unit 16, and a priority calculation unit 17), an information registration unit 18, and others. Each unit is realized by a software program or the like.

The analysis unit 14 is a processing unit that systematizes part of incident analysis, namely the analysis of the effects of failures (incidents) in the target system 1 including FTCIs. The analysis unit 14 includes a function for grasping the failure impact range and the like of a detected failure (incident), and a function for calculating priorities and the like used to rank multiple failures (incidents). Details of each unit are described in the flow below.
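One way to picture the impact range analysis is as a traversal that follows dependency links upward from the failed CIs and stops at an FTCI that still has enough healthy members. The sketch below is hypothetical: the link data, the function name `impact_range`, and the "minimum healthy members" rule are assumptions for illustration, not the status determination method of the specification (which is the subject of FIG. 7).

```python
from collections import deque

# (dependent, depends_on) links; all names are illustrative only.
LINKS = [
    ("ft-ap", "ap1"), ("ft-ap", "ap2"),
    ("web-service", "ft-ap"),
    ("web-service", "db1"),
]
# FTCI id -> minimum number of healthy members it needs (assumed rule).
FTCI_MIN_HEALTHY = {"ft-ap": 1}


def impact_range(failed: set[str]) -> set[str]:
    """Return the CIs affected by the failed CIs, including the failed ones."""
    members: dict[str, list[str]] = {}
    dependents: dict[str, list[str]] = {}
    for dep, up in LINKS:
        dependents.setdefault(up, []).append(dep)
        members.setdefault(dep, []).append(up)

    affected = set(failed)
    queue = deque(failed)
    while queue:
        current = queue.popleft()
        for dep in dependents.get(current, []):
            if dep in affected:
                continue
            if dep in FTCI_MIN_HEALTHY:
                healthy = [m for m in members[dep] if m not in affected]
                if len(healthy) >= FTCI_MIN_HEALTHY[dep]:
                    continue  # redundancy absorbs the failure at this FTCI
            affected.add(dep)
            queue.append(dep)
    return affected
```

With this data, a failure of `ap1` alone stays contained because `ap2` is still healthy, while a failure of both servers, or of the non-redundant `db1`, reaches `web-service`.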

○ Configuration management system 20: The configuration management system 20 is implemented on a server system or the like. As its basic function, it manages (registers, searches, and so on) in a configuration management database (DB) 52 the configuration information acquired or collected from the target system 1, the failure monitoring system 40, and others, as well as configuration information set by the person in charge 3 or the like. Configuration information includes CI information (including FTCI information). Using the configuration information, the configuration of the target system 1, including its fault tolerance, is modeled as a configuration management model. Configuration information may also include, or be associated with, person-in-charge information. The configuration information screen G1 is built from the configuration information (b1) in the DB 52.

Through the configuration information screen G1, the person in charge 3 or the like can set, as configuration information (configuration management model information) for the target system 1, the configuration items (CIs), the dependencies (links) between CIs, the relationships with higher-level services, and so on. In particular, the person in charge 3 or the like can use the FTCI setting function 102 to set FTCI information (described later; FIG. 5), such as the relationships within a redundant configuration of CIs, in the configuration management model.

The configuration management system 20 can also, for example, associate and manage a designated person in charge 3 (including escalation destinations) with each component (CI) of the target system 1, its category, and the like. Person-in-charge information includes, for example, the notification address, organizational affiliation, name, and the components (CIs) the person is responsible for. Person-in-charge information is associated with configuration information and incident information as needed.

The configuration information need not be set only by manual operation by the person in charge 3 or the like; part of it can be registered by automatic processing. For example, the configuration management system 20 acquires and collects configuration information from the target system 1 and reflects it in the DB 52 (a2). Configuration information may likewise be acquired in cooperation with the failure monitoring system 40 (a3 in FIG. 1). Configuration information can also be collected automatically, for example by executing known discovery commands among the components of the target system 1.

○ Service portal system 30: The service portal system 30 is implemented on a server system or the like and has the screen providing unit 31 of FIG. 2 (for example, a Web server). Using the information from each system (10, 20), namely the configuration information b1, the incident information b2, the failure configuration information b3, and so on, it builds the various screens (G1, G2, and so on) viewed by the person in charge 3 or the like as Web pages or similar and provides them to the terminal of the person in charge 3. The information from which the screens are built is supplied by the incident management system 10 (information registration unit 18).

In this embodiment, the screens are the configuration information screen G1 and the incident screen G2. In particular, based on the configuration information b1 and the failure configuration information b3, the configuration information screen G1 not only can display conventional configuration information on the target system 1 (without FTCIs and the like), but also visualizes the arrangement of CIs, including FTCIs, and links (the configuration management model), together with the failure status mapped onto it (failure location, failure impact range, affected services, person-in-charge information, and so on) (described later; FIGS. 3, 4, and others). Based on the incident information b2, the incident screen G2 displays incident information including the priority, target resolution time, and person-in-charge information (described later; FIGS. 11, 12, and others). The failure configuration information b3 is information in which the failure status resulting from the analysis by the analysis unit 14 (including the failure impact range) and related information are mapped onto the configuration management model (configuration information b1).

The failure configuration information b3 need not be provided from the incident management system 10 to the service portal system 30; it may instead be provided from the configuration management system 20. In that case, the configuration management model including the failure configuration information b3 is managed within the configuration management system 20 (DB 52). A separate screen for the failure configuration information b3 may also be provided in addition to the configuration information screen G1.

As for how screens are provided to the terminal of the person in charge 3 (the GUI), the displayed content is updated as the configuration and status of the target system 1 change. For example, the configuration and failure status of the target system 1 are always displayed in a window of a given Web page (screen G1 or the like), and the displayed content is updated as the configuration or failure status changes. Alternatively, the screen may be displayed only when needed in response to a user operation, or displayed automatically by an alert accompanying failure detection (S1). The display may transition between screen G1 and screen G2, or their contents may be integrated. The configuration and status at each point in time may also be saved as a history (snapshots) so that the information for a specified point in time can be displayed.
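The option of saving the configuration and status at each point in time and displaying the information for a specified time can be sketched as a time-ordered snapshot store. This is an illustration under assumptions: the class `SnapshotHistory`, its methods, and the numeric timestamps are hypothetical, and the specification does not say how the history is stored.

```python
import bisect


class SnapshotHistory:
    """Keeps (timestamp, state) pairs in time order."""

    def __init__(self) -> None:
        self._times: list[float] = []
        self._states: list[dict] = []

    def save(self, timestamp: float, state: dict) -> None:
        # Insert at the position that keeps the timestamps sorted.
        idx = bisect.bisect_right(self._times, timestamp)
        self._times.insert(idx, timestamp)
        self._states.insert(idx, dict(state))  # copy so later edits do not leak in

    def at(self, timestamp: float):
        """Return the latest state saved at or before timestamp, or None."""
        idx = bisect.bisect_right(self._times, timestamp)
        return self._states[idx - 1] if idx else None


history = SnapshotHistory()
history.save(10.0, {"ap1": "normal"})
history.save(20.0, {"ap1": "failed"})
```

A screen asked for the state at time 15.0 would then show the snapshot taken at 10.0, because that is the most recent one at or before the requested time.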

○ Failure monitoring system 40: The failure monitoring system 40 can be built with various known technologies. For the target system 1 it provides processing functions such as server and storage monitoring with configuration information collection, network monitoring with configuration information collection, and failure log analysis based on monitoring. The processing functions of the failure monitoring system 40 may instead be provided in another system (10, 20). The failure monitoring system 40 collects logs and failure messages (a1) from the components of the target system 1 and notifies the incident management system 10 of the resulting failure information, for example failure detection and failure location information obtained by failure log analysis (S1). The failure monitoring system 40 may also collect configuration information from the components of the target system 1 and provide it to the configuration management system 20 (a3).

○ Person in charge 3: The person in charge 3 denotes a person who uses the service portal system 30 and that person's terminal, and includes escalation destinations. From a terminal equipped with a Web browser or the like, the person in charge 3 can access the service portal system 30 and view various screens (Web pages and so on), including the configuration information screen G1 and the incident screen G2. The terminal of the person in charge 3 either obtains and displays a screen in response to a request to the screen providing unit 31, or automatically obtains display update data and updates the displayed content.

Among the persons in charge 3, U is the person in charge of initial diagnosis, and A, B, and C are persons in charge at the various escalation destinations. A denotes a functional escalation destination, such as a developer or a maintenance operator associated with a component of the target system 1. B and C denote hierarchical escalation destinations, that is, people in a hierarchical relationship such as superior and subordinate within an organization. Several kinds of escalation destination may be defined and managed. For example, a first type (B) of hierarchical escalation destination may be a person in charge on the management side (this system), and a second type (C) a person in charge on the customer side (the target system 1). E1 denotes escalation (notification and so on) from U to A. E2 denotes escalation (notification and so on) from U to B or C.
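The two escalation paths (E1 to a functional destination A, E2 to hierarchical destinations B or C) can be illustrated with a small lookup. Everything concrete below is assumed for the sake of the example: the contact data, the addresses, and in particular the rule that hierarchical escalation is triggered when a service is affected, which the specification does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    name: str
    address: str
    kind: str  # "functional" (A), "management" (B) or "customer" (C)


# Hypothetical person-in-charge data keyed by CI id.
FUNCTIONAL = {"db1": Contact("A-db", "a-db@example.com", "functional")}
HIERARCHICAL = [
    Contact("B-manager", "b@example.com", "management"),
    Contact("C-customer", "c@example.com", "customer"),
]


def escalation_targets(failed_ci: str, service_affected: bool) -> list[Contact]:
    """Pick escalation destinations for one failed CI (assumed rule)."""
    targets: list[Contact] = []
    if failed_ci in FUNCTIONAL:
        targets.append(FUNCTIONAL[failed_ci])  # E1: U -> A
    if service_affected:
        targets.extend(HIERARCHICAL)           # E2: U -> B or C
    return targets
```

Keeping the functional contacts keyed by CI mirrors the description that such persons are associated with components of the target system.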

[Management operation flow]
The incident management operation flow in the incident management system 10 and the failure impact range visualization method of this embodiment is outlined below. This management operation flow conforms to ITIL Version 3.

(0) Setting of the configuration management model including FTCIs (also, acquisition of configuration information (b1) and so on)
(1-1) Detection of a failure (incident) (also, identification and recording of the incident and so on)
(1-2) Initial diagnosis (also, failure log analysis and so on)
(2-1) Grasping the failure status by analysis: grasping the failure impact range, impact destinations, FTCI status, and so on
(2-2) Prioritizing incidents by analysis (deciding the countermeasure policy): deciding the priority, target resolution time, escalation destinations, and so on
(3-1) Creating and registering information that reflects the analysis results: mapping the failure status and so on onto the configuration management model (failure configuration information b3), and the corresponding incident information (b2)
(3-2) Providing screens: building the screens (configuration information screen G1, incident screen G2) that visualize the above information (b3, b2) and providing them to the person in charge 3
(4-1) Primary response: the person in charge 3 (the initial diagnosis person U or the like) grasps the configuration, failure status, countermeasure policy, and so on from the screens (G1, G2) and carries out the various escalations (E1, E2) as needed
(4-2) Secondary response: the person in charge 3 at the escalation destination (A, B, C) grasps the configuration, failure status, countermeasure policy, and so on from the screens (G1, G2) and implements countermeasures as needed

In (4-2) above, for example, the person in charge at the functional escalation destination (A) checks the failure message on screen G2, investigates the failure influence range shown on screen G1, and implements a countermeasure against the failure, such as correcting a server program. When the failure is thereby recovered (the problem is solved), the incident is closed. The incident information (b2) is registered (updated) as needed, and the incident status is updated accordingly.

[Processing flow]
With reference to FIGS. 1 and 2, the following describes the main processing flow (steps S0 to S9) of the present system when a failure occurs in the target system 1. This processing flow is based on the management operation flow above.

(S0: Configuration setting) As a preparation and precondition, the configuration (configuration management model) of the target system 1 is set in the configuration management system 20 (DB 52). For example, the person in charge 3 or the like uses the configuration information screen G1 to set each component as a CI and to set the dependencies (links) between CIs, and uses the FTCI setting function 102 to set the fault tolerance of components (CIs) as FTCIs. The configuration management model is thereby set as configuration information.

(S1: Failure detection) The incident management system 10 (failure information acquisition unit 11) detects a failure of the target system 1 by using the failure monitoring system 40. The failure detection triggers the processing from S2 onward. For example, failure information (a failure message or the like) is output from the target system 1 to the failure monitoring system 40 (a1 in FIG. 2), and then from the failure monitoring system 40 to the failure information acquisition unit 11. The failure information acquisition unit 11 may register the received failure information in the DB 51 as incident information. It may also extract the CI of the failure location from the failure message, for example by failure log analysis.

(S2: Configuration information acquisition) The incident management system 10 (configuration information acquisition unit 12) acquires the configuration information (configuration management model information) of the target system 1 from the configuration management system 20 (DB 52) at a predetermined timing, such as daily. Alternatively, the configuration information may be acquired when the configuration is changed. The acquired configuration information is used in the following processing.

(S3: Initial diagnosis execution) Based on S1, the incident management system 10 (initial diagnosis unit 13) executes the initial diagnosis (initial diagnosis script) on the target system 1. The diagnosis target is, for example, all or part of the target system 1, including the failure detection location of S1. An initial diagnosis script is a script program provided for each category (server, database, etc.) of the part to be diagnosed.

(S4: Initial diagnosis result acquisition) The incident management system 10 (initial diagnosis unit 13) acquires, from the target system 1, the execution result (operation information) of the initial diagnosis script of S3. The operating status of the target system 1 is thereby grasped, including the failure location (CI). The result information of S4 may be reflected in the corresponding incident information in the DB 51.

Next, based on the information obtained up to S4, the analysis unit 14 performs analysis processing on the failure (incident) in S5 to S7 below. S5 identifies the CIs in the failure influence range, S6 determines the status of each FTCI, and S7 determines the priority (P), the target resolution time (T), and so on.

(S5: Failure influence range CI extraction) The failure influence range CI extraction unit 15 uses the information of S4, S2, and so on to extract the CIs (including FTCIs) in the failure influence range related to the failure location (described later; see FIG. 4, etc.).
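
As a rough illustration of S5, the extraction can be viewed as an upward walk over the dependency links, starting at the failure location. The sketch below is only one possible reading, not the method of the embodiment: the `links` representation (pairs of upper and lower CI identifiers), the function name `influence_range`, and the sample identifiers are all assumptions made for illustration. The walk collects every CI that could be affected without judging how severely; that judgment belongs to S6.

```python
from collections import deque

def influence_range(links, failed_ci):
    """Collect the failed CI and every CI reachable upward from it.

    links: iterable of (upper_ci, lower_ci) pairs (assumed representation).
    """
    # Index each lower CI to the CIs that depend on it.
    uppers = {}
    for upper, lower in links:
        uppers.setdefault(lower, []).append(upper)

    seen = {failed_ci}
    queue = deque([failed_ci])
    while queue:
        ci = queue.popleft()
        for upper in uppers.get(ci, []):
            if upper not in seen:
                seen.add(upper)
                queue.append(upper)
    return seen

# Hypothetical fragment of a configuration management model.
links = [
    ("service-601", "ftci-401"),
    ("ftci-401", "db-1"),
    ("ftci-401", "db-2"),
    ("db-1", "dbserver-1"),
    ("dbserver-1", "hypervisor-1"),
    ("hypervisor-1", "ps-501"),
    ("db-2", "dbserver-2"),
]
affected = influence_range(links, "ps-501")
```

In this fragment the walk from `ps-501` reaches the service through one side of the cluster only; `db-2` and `dbserver-2` stay outside the range because nothing below them has failed.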

(S6: FTCI status determination) The FTCI status grasping unit 16 uses the information of S4, S5, S2, and so on to determine the situation of each FTCI (its fault tolerance situation) as a status (described later; see FIGS. 4 and 7, etc.).

Based on the processing of S5 and S6, the analysis unit 14 also grasps the status, the number (N), and so on of the upper-level services (service CIs) affected by the failure, according to the statuses of the lower-level FTCIs and the like.

(S7: Priority calculation) The priority calculation unit 17 uses the information of S4 to S6, S2, and so on to calculate the urgency (α) and the impact level (β), comprehensively considering the current service level of the target system 1 and the situation of each CI including FTCIs (for example, the CI statuses "normal", "degenerate", "degraded", and "stopped"). It then uses α and β to calculate the priority (P) of the countermeasure for the failure (incident), and determines the target resolution time (T) corresponding to P, whether to escalate, and so on (described later; see FIG. 10, etc.).

(S8: Information registration) The incident management system 10 (information registration unit 18) uses the information obtained up to S7 (failure location, failure influence range, CI and link statuses, number of affected services (N), priority (P), target resolution time (T), etc.) to create or update the failure configuration information b3 and the incident information b2, and registers them in the DB 51, DB 52, and so on. The analysis results of S5 to S7 (failure status, etc.) are mapped onto the configuration management model (configuration information b1) of S2 to form the failure configuration information b3.

The service portal system 30 requests and acquires the above information (b3, b2) from the incident management system 10 (information registration unit 18) or the like as necessary. Alternatively, in a form in which the service portal system 30 itself creates the failure configuration information b3, the information registration unit 18 may transmit the above information (b3, b2) to the service portal system 30 for registration. The screens (G1, G2) can thereby be constructed and provided.

(S9: Screen provision) The service portal system 30 (screen providing unit 31) uses the failure configuration information b3 to construct the configuration information screen G1, which visualizes the configuration and failure status, and provides it to the person in charge 3. It likewise uses the incident information b2 to construct the incident screen G2 and provides it to the person in charge 3. By referring to the configuration information screen G1, a person in charge 3 such as the person in charge of initial diagnosis U can grasp the configuration of the target system 1, the failure status (including the failure influence range), and related information (including person-in-charge information). By referring to the incident screen G2, the person in charge 3 can grasp detailed incident information.

[Processing example (a)]
A detailed processing example covering the initial diagnosis (S3, S4) through the FTCI status determination (S6) is as follows. It updates the status of each CI and link according to the situation of the target system 1, using the data shown in FIGS. 6 to 8 and elsewhere (described later).

(1) Initial diagnosis script execution: In S3, for each component (CI) to be diagnosed based on the failure detection (S1), the initial diagnosis unit 13 executes the script associated with the category of that CI, passing the initial diagnosis script parameters of FIG. 6 as arguments. In S4, the execution result (operation information) of S3 is stored in the configuration information of FIG. 6.

(2) CI status registration: For each CI related to the failure location (here, CIs other than FTCIs), the failure influence range CI extraction unit 15 registers, as the status of that CI, the status determined from the result of (1) according to the status determination method of FIGS. 6 and 7.

(3) Dependency status registration: For each dependency (link) between CIs related to the failure location, the failure influence range CI extraction unit 15 registers the status of the CI with the larger layer number (FIG. 6), that is, the lower-level CI, as the status of that dependency (link) (the "dependency status") (FIGS. 8 and 9).

(4) FTCI status registration: For each FTCI related to the failure location or the failure influence range, the FTCI status grasping unit 16 registers, as the status of that FTCI, the status determined according to the status determination method of FIGS. 6 and 7.
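
Step (3) above can be sketched as follows. This is a minimal illustration under stated assumptions: the `CI` class, its field names, and the sample identifiers and status values are hypothetical, and the behavior for two CIs with the same layer number is not specified in the text (here the second argument is used).

```python
from dataclasses import dataclass

@dataclass
class CI:
    ci_id: str
    layer: float  # smaller = upper layer, larger = lower layer (FIG. 6)
    status: int   # 0 = normal; a larger value means a more severe failure

def dependency_status(a: CI, b: CI) -> int:
    """Status of the link between a and b: the status of the lower-level CI."""
    lower = a if a.layer > b.layer else b
    return lower.status

# Hypothetical pair: an FTCI (layer 1.5) above a database CI (layer 2).
cluster = CI("ftci-401", layer=1.5, status=3)
database = CI("db-1", layer=2, status=1)
link_status = dependency_status(cluster, database)
```

The link takes the status of `db-1` regardless of the order in which the two CIs are passed, because only the layer numbers decide which end is the lower one.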

[Processing example (b)]
A detailed processing example covering the information registration (S8) through the screen provision (S9) is as follows; it uses the data shown in FIGS. 6 to 8 and elsewhere (described later). The incident management system 10 (information registration unit 18, etc.) creates the information (b1 to b3, etc.) needed to provide and update the screens (G1, G2, etc.), registers it in the DB 51, DB 52, and so on, and provides it. It does so at any time, for example whenever the configuration or situation changes. The following is, for example, processing by the information registration unit 18.

(1) The display state of a CI icon (including an FTCI icon) on screen G1 (failure configuration information b3) is changed according to changes in the status (FIG. 6) of that CI (FTCI). For example, when the status of a CI changes from "normal" to "abnormal" or "stopped", the display color of its icon is changed from blue to red.

(2) The display state of a dependency (link) between CIs on screen G1 (failure configuration information b3) is changed according to changes in the status (FIG. 8) of that dependency (link). The change in (2) is made together with the change in (1). For example, the status of the dependency is determined and changed according to the status of the lower-level CI.

(3) Using the information on the changes in (1) and (2), the failure configuration information b3 for screen G1, the incident information b2 for screen G2, and so on are created or updated and registered in the DB 51, DB 52, and so on. The service portal system 30 (screen providing unit 31) can thereby use this information to provide screen G1 and other screens, as in the examples of FIGS. 3 and 4.

In addition to the above, the various other information displayed on the screens is created, registered, and provided in the same way. Examples are the information for displaying the person-in-charge icons in FIG. 3, the CI statuses, the failure location, the failure influence range, and the affected services.

[Screen (1)]
FIGS. 3, 4, and others show display screen examples, which also show a configuration example of the target system 1. FIG. 3 shows a first example (target system 1 operating normally) of the screen that visualizes the failure configuration information b3 (configuration information screen G1). Based on the failure configuration information b3 and the like, this screen displays a structure in which multiple CIs (CI icons) are connected by links (lines) representing dependencies.

Upper-level CIs with smaller layer numbers (FIG. 6), such as "Service" and "Cluster", appear toward the top of the screen, and lower-level CIs with larger layer numbers, such as "Physical Server" and "L2Switch", appear toward the bottom. Logical and virtual units (services, virtual servers, applications, etc.) are assigned to upper layers, and physical units (server devices, network devices, etc.) to lower layers. In FIG. 3, the CIs are, in order from the bottom, for example "Terminal", "L3Switch", "L2Switch", "Physical Server", "Hypervisor" (server virtualization software), "DB Server" (virtualized server), "DataBase", and "Cluster" (fault tolerance). The cloud icons represent the upper-level services (service CIs) provided by the target system 1.

Each CI and link is displayed with a representation (color, icon, text, size, etc.) that varies with its name, category, status, and so on. In FIG. 3, for example, CIs and links whose status is "normal" are drawn with solid lines or in blue.

An FTCI is displayed with a specific icon (an octagon in the example of FIG. 3) so that it can be distinguished from an ordinary CI (non-FTCI). Reference numerals 401 to 414 and the like denote FTCIs; the others are ordinary CIs (non-FTCIs). The fault tolerance information attached to an FTCI is described later (FIG. 5).

How each CI and link is displayed can be entered and configured in the present system.

Related information associated with a CI is displayed for its CI icon as appropriate. For example, a mouse operation by the person in charge 3 (hovering over or clicking the CI icon) displays the related information of that CI in a pop-up or similar form. For example, the incident information associated with the CI is displayed, or a link to the incident screen G2 is provided. For an FTCI, the associated fault tolerance information is displayed.

As another example, information on the persons in charge 3 associated with the CI is displayed. In FIG. 3, person-in-charge icons are displayed at the upper right (A: for example, orange) and upper left (B, C: for example, green) of the CI icon, and operating a person-in-charge icon displays the information on that person in charge 3. For each FTCI icon and service CI icon, icons are displayed for the types of person in charge 3 that exist for it.

[Screen (2)]
FIG. 4 shows, under the same premise as FIG. 3, a second example (a failure has occurred in the target system 1) of the screen that visualizes the failure configuration information b3 (configuration information screen G1). It reflects the failure status (failure influence range, etc.) on the configuration management model of FIG. 3. CI names and the like are omitted. In FIG. 4, each link is drawn with a line type corresponding to its status: solid for "normal", dashed for "degenerate", dash-dot for "degraded", and dotted for "stopped".

The failure location, failure influence range, affected services, and so on are indicated by specific icons, enclosing boxes, and similar representations. For example, the more serious the failure (the larger the status value), the more conspicuous the representation.

Reference numerals 501 to 505 and the like denote failure locations (CIs, links) based on S1 and so on. The CIs in the failure influence range (S5) are displayed, for example, enclosed in boxes whose type and color depend on the status. In FIG. 4, the CIs in the failure influence range whose status is "stopped" are each highlighted with a solid box; the "degraded" and "degenerate" ranges may be displayed likewise. When color is used, CIs and links are displayed, for example, in blue for "normal", purple for "degenerate", yellow for "degraded", and red for "stopped". Information such as the status of each CI may also be displayed in a pop-up or the like.
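
The status-dependent representation described for FIG. 4 amounts to a lookup from status value to display attributes. The sketch below shows one way to hold that mapping; the names `STATUS_STYLE` and `style_for` and the dictionary layout are assumptions for illustration, while the colors and line types are the examples given in the text.

```python
# Status value -> (status name, color, link line type), per the FIG. 4 examples.
STATUS_STYLE = {
    0: ("normal", "blue", "solid"),
    1: ("degenerate", "purple", "dashed"),
    2: ("degraded", "yellow", "dash-dot"),
    3: ("stopped", "red", "dotted"),
}

def style_for(status_value):
    """Return the display attributes for a CI or link with this status."""
    name, color, line = STATUS_STYLE[status_value]
    return {"status": name, "color": color, "line": line}

stopped_style = style_for(3)
```

Because the text states that the display method can be configured in the system, a table of this kind would be a setting rather than a fixed constant.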

Reference numeral 500 denotes one example (only a part) of a failure influence range: the range from the failure location 501 (a physical server) up to FTCIs 401 and 402, which are affected as the failure propagates upward. A range encompassing multiple CIs and links may be displayed in this way. The services above the FTCIs (two in this example, 601 and 602) are also affected by the failure. In FIG. 4, the services whose status is "stopped" (601, 602) are each enclosed in a box as affected services. Service status information such as the number of affected services (N) may also be displayed in a pop-up or the like.

In the example of FIG. 4, FTCIs 401, 402, and the like have the status "stopped (3)". FTCI 403 is "degenerate (1)" and FTCI 404 is "degraded (2)". FTCIs 405 to 408 are "normal (0)" and FTCI 409 is "degenerate (1)". FTCIs 411, 412, 413, and 414 are "normal (0)". The two upper-level services 601 and 602 have the status "stopped (3)", so the number of affected services (N), counting those that are "stopped", is 2.

On screen G1, the person in charge 3 can readily grasp the failure influence range and so on from the differences in color, the boxes, the specific icons, and the displayed information. Because the various information can be referenced in association with components (CIs), the person in charge 3 can grasp the situation more easily and quickly. For example, when the person in charge of initial diagnosis U touches a CI icon within the failure influence range, or its person-in-charge icon, the corresponding incident information and the information on the persons in charge 3 at the available escalation destinations (A, B, C) can be viewed, and the corresponding escalation operation (notification) can be started from there.

[Screen (3)]
FIG. 13 shows an example of a screen that visualizes a target system configuration in a typical prior art example. Because the prior art example does not display the dependencies (links) between CIs, the destinations affected by a failure location cannot be grasped. Because it also has no FTCIs, the degree of impact on upper layers when a redundant part fails cannot be grasped either. The present embodiment, by contrast, has both dependencies (links) between CIs and FTCIs, so the destinations affected by a failure location and the degree of impact on upper layers such as services can be grasped. The prior art examples of Patent Documents 1 and 2 likewise have no function for displaying FTCIs or the like.

[FTCI information]
FIG. 5 shows examples of the fault tolerance information of the FTCIs in the configuration of FIG. 3. For each FTCI, design information on fault tolerance ("fault tolerance information") is entered and set as attribute information. The fault tolerance information is design information that differs according to the configuration of the target system 1 and can be set by the person in charge 3 or the like (FTCI setting function 102). Relationships with related CIs (upper and lower) are also set in the form of dependencies (links). If persons in charge 3 (A, B, C, etc.) exist for an FTCI, their information is associated with it.

The FTCIs 401, 402, and the like shown in FIG. 5(a) (icon display name: "Cluster") are duplex (clustered) configurations for load balancing and similar purposes with respect to DB access (access from a Service to a DB). For this duplex configuration, information such as the following is set: "degenerate" (service tolerated) when only one system has failed (single-system failure), and "stopped" (service stopped) when both systems have failed (dual-system failure).

The FTCIs 403, 404, and the like ("Cluster") shown in FIG. 5(b) are triplex (clustered) configurations for load balancing and similar purposes with respect to Middleware (MW) access (access from a Service to Middleware). For this triplex configuration, information such as the following is set: "degenerate" (service tolerated) for a single failure, "degraded" (service degraded) for a double failure, and "stopped" (service stopped) for a triple failure.

In the FTCIs 405 to 409 and the like ("Cluster") shown in FIG. 5(c), the connection between L2Switch and Physical Server (PS) is duplexed. For this duplex configuration, information such as the following is set: "degenerate" for a single-system failure, and "depends on the upper FTCI" for a dual-system failure (for example, the status is determined according to the status of the upper FTCI).

In the FTCIs 411, 412, and the like ("Cluster") shown in FIG. 5(d), the connection between L2Switch and Storage is duplexed. For this duplex configuration, information such as the following is set: "degenerate" for a single-system failure, and "stopped" (entire service stopped) for a dual-system failure.

In the FTCI 413 ("Cluster") shown in FIG. 5(e), the L2Switch is duplexed. For this duplex configuration, information such as the following is set: "degenerate" for a single-system failure, and "stopped" (entire service stopped) for a dual-system failure.

In the FTCI 414 ("Cluster") shown in FIG. 5(f), the L3Switch is duplexed. For this duplex configuration, information such as the following is set: "degenerate" for a single-system failure, and "stopped" (entire service stopped) for a dual-system failure.
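
The fault tolerance information of FIG. 5(a) to (f) can be pictured as a small table per redundancy pattern that maps the number of failed members to a status. The sketch below is an illustration only: the pattern names, the `ftci_status` function, and the treatment of zero failures as "normal" are assumptions (the last one is consistent with the "normal (0)" FTCIs described for FIG. 4).

```python
# Marker for FIG. 5(c): a dual-system failure defers to the upper FTCI.
DEPENDS_ON_UPPER = "depends on upper FTCI"

# Number of failed members -> status name, per (hypothetical) pattern name.
FAULT_TOLERANCE = {
    "duplex": {0: "normal", 1: "degenerate", 2: "stopped"},  # FIG. 5(a), (d)-(f)
    "triplex": {0: "normal", 1: "degenerate", 2: "degraded", 3: "stopped"},  # FIG. 5(b)
    "duplex_link": {0: "normal", 1: "degenerate", 2: DEPENDS_ON_UPPER},  # FIG. 5(c)
}

def ftci_status(pattern, failed_members, upper_status=None):
    """Look up the FTCI status for a pattern and a count of failed members."""
    status = FAULT_TOLERANCE[pattern][failed_members]
    if status == DEPENDS_ON_UPPER:
        if upper_status is None:
            raise ValueError("the upper FTCI status is required for this pattern")
        return upper_status
    return status

triplex_double_failure = ftci_status("triplex", 2)
```

Keeping the table as data mirrors the statement that the fault tolerance information is design information set per FTCI through the FTCI setting function 102.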

[CI information]
FIG. 6 shows an example data structure (table) of the configuration information (CI information). Its items include CI_ID, category name, layer number, initial diagnosis script parameters, functional escalation (A), hierarchical escalation #1 (B), hierarchical escalation #2 (C), status determination method, status, and initial diagnosis script execution result.

CI_ID is the identifier of the CI. The category name indicates the category (type) of the CI; besides DB server, DB, middleware, service, and so on, there is the category "fault tolerance" (FTCI). FTCIs may be further managed by FTCI type (for example, the FTCIs of FIG. 5).

The layer number indicates the layer to which the CI belongs and expresses the upper/lower relationships between CIs. A smaller layer value means an upper layer and a larger value a lower layer. The configuration management model is created and displayed according to the layers, and CI statuses and the like are determined with the layers taken into account. In this example the layers are defined as follows: layer 1 is services; layer 1.5 is FTCIs; layer 2 is DBs, middleware, and the like; layer 3 is DB servers, Web servers, and the like.

The initial diagnosis script parameters are the parameter information passed as arguments in the initial diagnosis processing (S3), for example an IP address and a user name and password. The initial diagnosis script execution result is the information resulting from the initial diagnosis processing (S4). These are also stored in the incident information.

Functional escalation (A) indicates the persons in charge 3 at the functional escalation destination associated with the CI (component); A1 to A3 denote individual persons. Hierarchical escalation (B) indicates, as the first type of hierarchical escalation destination, the persons in charge (for example, supervisors) on the management (present system) side; B1 to B3 denote individual persons. Hierarchical escalation (C) indicates, as the second type of hierarchical escalation destination, the persons in charge (for example, supervisors) on the customer (target system 1) side; C1 denotes an individual person.

The status determination method specifies how the value of the next item, the status, is determined (see FIG. 7 for details). Different settings are possible for each CI and category: for example, method (a) is applied to CIs whose category is DB server, DB, middleware, or the like; method (b) to the FTCI (401) with ID “0126”; method (c) to the FTCI (403) with ID “0130”; and method (d) to service CIs. The status indicates the state of the CI and takes values such as “normal (0)”, “degenerate (1)”, “degraded (2)”, “stop (3)”, and “abnormal (1)”. For an FTCI in particular, the status indicates the fault tolerance state. The number in parentheses identifies the status within each category or method, and is defined so that a larger degree of failure yields a larger status value (status number).

図７は、ステータス決定方法の例を示す。 FIG. 7 shows an example of a status determination method.

（ａ）の方法では、対象ＣＩの初期診断スクリプト実行結果において、正常終了の場合は、ステータスを「正常（０）」とし、異常終了の場合はステータスを「異常（１）」とする。これは単純な２値の定義の例であるが、ＣＩや方法に応じて多値で定義する形にしてもよい。 In the method (a), in the initial diagnostic script execution result of the target CI, the status is set to “normal (0)” in the case of normal termination, and the status is set to “abnormal (1)” in the case of abnormal termination. This is an example of a simple binary definition, but it may be defined in multiple values according to the CI or method.

In method (b), based on the normal operation rate (r) of the lower CIs, the status is “normal (0)” when r is 100%, “degenerate (1)” when r is 50% or more and less than 100%, and “stop (3)” when r is 0%. The rate r can be calculated from the status values of the individual CIs.

In method (c), based on the normal operation rate (r) of the lower CIs, the status is “normal (0)” when r is 100%, “degenerate (1)” when r is 65% or more and less than 100%, “degraded (2)” when r is 1% or more and less than 65%, and “stop (3)” when r is 0%.

（ｄ）の方法では、下位ＣＩのステータス番号が１番大きいステータス（障害度合い等が１番大きいもの）を継承する。例えば、図４の左側のサービスＣＩ（６０１）の場合、一方の下位ＣＩ（４０１）は「停止（３）」、他方の下位ＣＩ（４０３）は「縮退（１）」であるため、大きい方である「停止（３）」の方が継承されて当該サービスＣＩのステータスに設定される。 In the method (d), the status with the highest status number of the lower CI (the one with the highest failure degree or the like) is inherited. For example, in the case of the service CI (601) on the left side of FIG. 4, one lower CI (401) is “stop (3)”, and the other lower CI (403) is “degenerate (1)”. “Stop (3)” is inherited and set to the status of the service CI.
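The four status determination methods (a) to (d) can be sketched as small functions. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the normal operation rate r is assumed to be the share of lower CIs whose status value is 0, and rates that fall in ranges the text leaves undefined (for example 0% < r < 50% in method (b)) are rejected rather than guessed.

```python
# Status values as given in the text; "abnormal" shares the value 1 in the binary method (a).
NORMAL, ABNORMAL, DEGENERATE, DEGRADED, STOP = 0, 1, 1, 2, 3

def method_a(script_ok):
    """(a) Binary status from the initial diagnosis script result."""
    return NORMAL if script_ok else ABNORMAL

def _counts(lower):
    """Return (number of normal lower CIs, total lower CIs). Assumption: normal means status 0."""
    return sum(s == NORMAL for s in lower), len(lower)

def method_b(lower):
    """(b) 100% -> normal, 50% <= r < 100% -> degenerate, 0% -> stop."""
    ok, n = _counts(lower)
    if ok == n:
        return NORMAL
    if ok == 0:
        return STOP
    if 100.0 * ok / n >= 50:
        return DEGENERATE
    raise ValueError("0% < r < 50% is not defined for method (b)")

def method_c(lower):
    """(c) 100% -> normal, >= 65% -> degenerate, >= 1% -> degraded, 0% -> stop."""
    ok, n = _counts(lower)
    if ok == n:
        return NORMAL
    if ok == 0:
        return STOP
    r = 100.0 * ok / n
    if r >= 65:
        return DEGENERATE
    if r >= 1:
        return DEGRADED
    raise ValueError("0% < r < 1% is not defined for method (c)")

def method_d(lower):
    """(d) Inherit the largest (most severe) status value among the lower CIs."""
    return max(lower)
```

For the service CI (601) described above, `method_d([STOP, DEGENERATE])` yields the “stop (3)” status inherited from the lower CI (401).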

[Dependency relationships]
FIG. 8 shows an example data structure (table) for the dependency relationships (links) between CIs. The values in FIG. 8 correspond to the example (partial) configuration management model in FIG. 9. In FIG. 9, the number next to each CI and each link (line) is its ID, and the lines between CIs represent the dependency relationships (links). Dependency relationships (links) are themselves treated as a kind of CI (referred to as dependency relationship CIs).

図８で、項目として、依存関係性（リンク）＿ＩＤ，第１のＣＩ（下位ＣＩ）＿ＩＤ、第２のＣＩ（上位ＣＩ）＿ＩＤ、依存関係性ステータス（＝下位ＣＩステータス）を有する。 In FIG. 8, items include dependency (link) _ID, first CI (lower CI) _ID, second CI (upper CI) _ID, and dependency status (= lower CI status).

図９の例では、あるサービス（0131）の下位に、ＦＴＣＩとして“Cluster”（0126）と“Cluster”（0130）の２つがある。ＦＴＣＩ“Cluster”（0126）は、二重化構成であり、下位に、２つのＤＢ（0124，0125）がある。ＤＢ（0124）の下位にＤＢサーバ（0123）がある。ＦＴＣＩ“Cluster”（0130）は、三重化構成であり、下位に、３つのミドルウェア（0127，0128，0129）がある。各ＣＩ・リンクをステータスに応じた表現で示している。吹き出しはステータスを示す。特にステータスが「異常」のＣＩを点線で示している。 In the example of FIG. 9, there are two FTCIs, “Cluster” (0126) and “Cluster” (0130), below a certain service (0131). FTCI “Cluster” (0126) has a duplex configuration, and there are two DBs (0124, 0125) at the lower level. A DB server (0123) is subordinate to the DB (0124). The FTCI “Cluster” (0130) has a triple configuration, and there are three middlewares (0127, 0128, 0129) at the lower level. Each CI / link is shown in an expression corresponding to the status. A balloon indicates the status. In particular, a CI whose status is “abnormal” is indicated by a dotted line.

As shown in FIG. 9, for example, the dependency relationship (link) with ID “1233” links the first CI (lower CI), which is the DB server with ID “0123”, to the second CI (upper CI), which is the DB with ID “0124”. The status of this dependency relationship (link) is “abnormal” (e.g., red), the same as the status of the lower CI.

As examples of statuses, within “Cluster” (0126), the DB server (0123) on one side and its DB (0124) are “abnormal” (e.g., red), while the DB (0125) on the other side is “normal” (e.g., blue). “Cluster” (0126) therefore has a failure on one side only and is “degenerate” (e.g., purple). Within “Cluster” (0130), the first middleware (0127) is “normal” (e.g., blue), while the second and third middleware (0128, 0129) are “abnormal” (e.g., red). “Cluster” (0130) therefore has a double failure and is “degraded” (e.g., yellow). The service (0131) inherits the status of “Cluster” (0130) and becomes “degraded” (e.g., yellow).
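The FIG. 9 example can be reproduced with a small link table and a bottom-up pass. This is a sketch under stated assumptions: only link ID “1233” appears in the text, so the other link IDs (`x1` to `x7`) are made up; the statuses of CIs 0123 to 0129 are taken as already diagnosed; and `by_rate` is a hypothetical helper that covers methods (b) and (c) through its thresholds.

```python
NORMAL, ABNORMAL, DEGENERATE, DEGRADED, STOP = 0, 1, 1, 2, 3

def by_rate(lower, degenerate_min, degraded_min=None):
    """Status from the share of lower CIs with status 0 (assumed meaning of the rate r)."""
    ok, n = sum(s == NORMAL for s in lower), len(lower)
    if ok == n:
        return NORMAL
    if ok == 0:
        return STOP
    r = 100.0 * ok / n
    if r >= degenerate_min:
        return DEGENERATE
    if degraded_min is not None and r >= degraded_min:
        return DEGRADED
    raise ValueError("rate falls in a range the method does not define")

# Link table in the shape of FIG. 8: (link ID, lower CI ID, upper CI ID).
LINKS = [
    ("1233", "0123", "0124"),
    ("x1", "0124", "0126"), ("x2", "0125", "0126"),
    ("x3", "0127", "0130"), ("x4", "0128", "0130"), ("x5", "0129", "0130"),
    ("x6", "0126", "0131"), ("x7", "0130", "0131"),
]

# Statuses obtained by method (a) for the DB server, DBs, and middleware.
status = {"0123": ABNORMAL, "0124": ABNORMAL, "0125": NORMAL,
          "0127": NORMAL, "0128": ABNORMAL, "0129": ABNORMAL}

# Per-CI status determination method, listed bottom-up (dicts keep insertion order).
RULES = {
    "0126": lambda lower: by_rate(lower, 50),     # method (b)
    "0130": lambda lower: by_rate(lower, 65, 1),  # method (c)
    "0131": max,                                  # method (d)
}

for upper, rule in RULES.items():
    lower = [status[lo] for _, lo, up in LINKS if up == upper]
    status[upper] = rule(lower)

# A link carries the status of its lower CI.
link_status = {link_id: status[lo] for link_id, lo, _ in LINKS}
```

With these inputs the two clusters come out as degenerate (1) and degraded (2), and the service inherits the larger value, which matches the statuses described for FIG. 9.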

[Calculation of priority, etc.]
FIG. 10 shows how the priority calculation unit 17 (S7 above) calculates the priority (P) and related values. FIG. 10(a) shows an example method of calculating the urgency (α). The urgency (α) is determined according to the FTCI statuses and fault tolerance information within the failure influence range (S5 above). The conditions on the FTCI statuses are: when there is “normal (0)” or “degenerate (1)”, α = 1; when there is no “stop (3)” but there is “degraded (2)”, α = 2; when there is “stop (3)”, α = 3.

FIG. 10(b) shows an example method of calculating the impact level (β), which differs depending on the status determination method (FIG. 7). Using the FTCI statuses, fault tolerance information, and the like, the impact level (β) is calculated according to the situation, such as the number of FTCIs in each status and the number of failure-affected services. For example, for method (c) of FIG. 7 (four status values: normal (0), degenerate (1), degraded (2), stop (3)), the condition (formula) on the FTCI statuses is: [number of FTCIs in degenerate (1) (n1)] × coefficient a1 (e.g., 1) + [number of FTCIs in degraded (2) (n2)] × coefficient a2 (e.g., 5) + [number of FTCIs in stop (3) (n3)] × coefficient a3 (e.g., 10). The value of this formula is the impact level (β). The values of the weighting coefficients (a1 to a3) are examples.

Similarly, for method (b) (three status values: normal (0), degenerate (1), stop (3)), the formula becomes β = [number of FTCIs in degenerate (1) (n1)] × coefficient a1 + [number of FTCIs in stop (3) (n3)] × coefficient a3.
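Written compactly, with n1, n2, n3 the numbers of FTCIs in the degenerate (1), degraded (2), and stop (3) states and a1 to a3 the weighting coefficients, the two formulas read as follows. The subscripts (c) and (b) are added here only to tell the two methods apart.

```latex
\beta_{(c)} = a_1 n_1 + a_2 n_2 + a_3 n_3 ,
\qquad
\beta_{(b)} = a_1 n_1 + a_3 n_3
```

With the example coefficients a1 = 1, a2 = 5, a3 = 10 and, say, n1 = 2, n2 = 1, n3 = 2, the three-term form gives β = 2 × 1 + 1 × 5 + 2 × 10 = 27.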

図１０（ｃ）は、優先度（Ｐ）などの算出方法（一例）を示す。上記α，βの値を用いる。条件として、α×β≦９の場合、優先度（Ｐ）＝「低」とする。１０≦α×β≦２９の場合、優先度（Ｐ）＝「中」とする。３０≦α×βの場合、優先度（Ｐ）＝「高」とする。 FIG. 10C shows a calculation method (one example) such as the priority (P). The values of α and β are used. As a condition, when α × β ≦ 9, priority (P) = “low”. When 10 ≦ α × β ≦ 29, the priority (P) = “medium”. When 30 ≦ α × β, the priority (P) = “high”.

また、優先度（Ｐ）に対応して、当該障害（インシデント）への対策における目標解決時間（Ｔ）を求める。本例では、Ｐ＝「低」の場合は１２時間、Ｐ＝「中」の場合は６時間、Ｐ＝「高」の場合は２時間、といったように対応付けている。 In addition, corresponding to the priority (P), a target solution time (T) in the countermeasure for the failure (incident) is obtained. In this example, the correspondence is 12 hours when P = “low”, 6 hours when P = “medium”, and 2 hours when P = “high”.

また、優先度（Ｐ）に対応して、前述の各種のエスカレーション（Ａ〜Ｃ）の有無などを求める。例えば、Ｐ＝「低」の場合、階層的エスカレーション＃１（Ｂ）及び階層的エスカレーション＃２（Ｃ）ともに無しである。Ｐ＝「中」の場合、＃１（Ｂ）（管理側への連絡等）を有りにする。更に、Ｐ＝「高」の場合、＃２（Ｃ）（顧客側への連絡等）も有りにする。 Also, the presence or absence of the above-described various escalations (A to C) is obtained in correspondence with the priority (P). For example, when P = “low”, there is no hierarchical escalation # 1 (B) and hierarchical escalation # 2 (C). In the case of P = “medium”, # 1 (B) (contact to the management side, etc.) is set to “present”. Further, when P = “high”, # 2 (C) (contact to the customer side, etc.) is also set.
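The calculation of α, β, P, T, and the escalation flags can be sketched as follows. The function and table names are hypothetical, the coefficients and thresholds are the example values from the text, and the precedence used for α (stop first, then degraded, otherwise 1) is one reading of the three conditions. Because the example coefficients are integers, α × β is an integer and the three priority bands leave no gap.

```python
# Example weighting coefficients a1..a3, keyed by status value (1, 2, 3).
WEIGHTS = {1: 1, 2: 5, 3: 10}

# Priority -> target solution time T in hours.
TARGET_HOURS = {"low": 12, "medium": 6, "high": 2}

# Priority -> (hierarchical escalation #1 (B), hierarchical escalation #2 (C)).
ESCALATION = {"low": (False, False), "medium": (True, False), "high": (True, True)}

def urgency(ftci_statuses):
    """alpha: 3 if any FTCI is stop (3), 2 if any is degraded (2), otherwise 1."""
    if 3 in ftci_statuses:
        return 3
    if 2 in ftci_statuses:
        return 2
    return 1

def impact_level(ftci_statuses, weights=WEIGHTS):
    """beta: weighted count of FTCIs per status; normal (0) contributes nothing."""
    return sum(weights.get(s, 0) for s in ftci_statuses)

def priority(alpha, beta):
    """P from the product alpha * beta: <= 9 low, 10..29 medium, >= 30 high."""
    product = alpha * beta
    if product >= 30:
        return "high"
    if product >= 10:
        return "medium"
    return "low"

def assess(ftci_statuses):
    """Return (alpha, beta, P, T in hours, escalation B, escalation C)."""
    alpha = urgency(ftci_statuses)
    beta = impact_level(ftci_statuses)
    p = priority(alpha, beta)
    esc_b, esc_c = ESCALATION[p]
    return alpha, beta, p, TARGET_HOURS[p], esc_b, esc_c
```

For example, `assess([3, 3])` gives α = 3, β = 20, and a product of 60, so the priority is “high” with a 2-hour target and both hierarchical escalations enabled.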

In addition, the number of failure-affected services (N) is calculated from the failure influence range CIs extracted in S5 and the FTCI statuses determined in S6. For example, the status of each service CI is determined according to the statuses of the FTCIs below it, and the number of failure-affected services (N) is then counted for each service CI status (“degenerate”, “degraded”, “stop”, etc.).
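Counting the failure-affected services per status is a simple tally. The sketch below assumes the service statuses have already been determined (for example by inheriting the largest status of the FTCIs below each service); the function name and the service ID “603” in the usage note are hypothetical.

```python
from collections import Counter

def count_affected_services(service_status):
    """Return {status value: number of services}, ignoring services that are normal (0)."""
    return Counter(s for s in service_status.values() if s != 0)
```

For instance, `count_affected_services({"601": 3, "602": 3, "603": 0})` reports two services in the stop (3) state and omits the normal one.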

情報登録部１８等は、上記で得た各情報を含めて記述したインシデント情報ｂ２を、ＤＢ５１等に登録する。 The information registration unit 18 or the like registers the incident information b2 described including the information obtained above in the DB 51 or the like.

[Incident information]
FIG. 11 shows an example data structure (table) for the incident information (b2). FIG. 12 shows an example (format) of the incident screen G2 corresponding to FIG. 11. The incident information has items such as incident ID, urgency (α), impact level (β), number of failure-affected services (N), priority (P), target solution time (T), and hierarchical escalation #1 (B) and #2 (C). Each item stores the information obtained by the processing described above. The incident ID and other items found in conventional incident information (status, title, category, component (CI), date and time, description, etc.) are also stored and managed. For the number of failure-affected services (N), a value is stored for each status. The incident screen G2 of FIG. 12 displays information based on the incident information of FIG. 11. On screen G2, the person in charge 3 can view the incident information and enter values. Information on the persons in charge 3 associated with the CI (such as the person in charge 3 for functional escalation (A) and the countermeasure information from that person) may also be managed and displayed.
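One possible in-memory shape for an incident record with the items listed above is shown below. The field names, types, and the sample ID `INC-0001` are assumptions for illustration; FIG. 11 itself is not reproduced here, and the conventional items (title, category, date, and so on) are omitted for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class IncidentInfo:
    incident_id: str
    urgency: int            # alpha
    impact_level: int       # beta
    priority: str           # P: "low", "medium", or "high"
    target_hours: int       # T
    escalation_b: bool      # hierarchical escalation #1 (management side)
    escalation_c: bool      # hierarchical escalation #2 (customer side)
    # N is stored per status: status value -> number of affected services.
    affected_services: dict = field(default_factory=dict)

incident = IncidentInfo("INC-0001", 3, 27, "high", 2, True, True, {3: 2})
```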

[Specific example]
A specific example following the flow described above (S0 to S9) is given below.

（Ｓ０）図３で示されるような構成管理モデルを設定する。 (S0) A configuration management model as shown in FIG. 3 is set.

（Ｓ１）障害検知（障害情報）により、障害箇所のＣＩが例えば図４の５０１（物理サーバ）であるとする。他の障害箇所（５０２等）がある場合も同様の考え方である。 (S1) Assume that the failure location CI is, for example, 501 (physical server) in FIG. 4 by failure detection (failure information). The same concept applies when there is another fault location (502 etc.).

（Ｓ２）障害箇所（５０１）を含む関連するＣＩ情報（全部または一部）を取得する。少なくとも上位・下位でつながるＣＩ及びリンクの情報が取得される。 (S2) Acquire related CI information (all or a part) including the failure location (501). Information on CIs and links connected at least in the upper and lower levels is acquired.

（Ｓ３），（Ｓ４）障害箇所（５０１）を含む対象に対する初期診断実行結果を得る。 (S3), (S4) An initial diagnosis execution result is obtained for the object including the failure location (501).

（Ｓ５）上記結果から、障害箇所（５０１）を含む障害影響範囲のＣＩを抽出する。例えば図４の障害影響範囲５００のＣＩが抽出される。障害箇所などの下位ＣＩから、依存関係性（リンク）でつながる上位ＣＩへ、障害の影響が伝播する。処理例としては、上位ＣＩのステータスが、リンクで接続されるすべての下位ＣＩのステータスの値を用いた前述の計算に応じて決定される。障害影響範囲５００は、上位のＦＴＣＩ（例えば４０１，４０２）までを含めた場合である。 (S5) From the above result, the CI in the fault influence range including the fault location (501) is extracted. For example, the CI in the failure influence range 500 of FIG. 4 is extracted. The influence of the failure propagates from the lower CI such as the failure location to the higher CI connected by the dependency (link). As an example of processing, the status of the upper CI is determined according to the above calculation using the status values of all the lower CIs connected by the link. The failure influence range 500 is a case including up to upper FTCI (eg, 401, 402).
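The upward propagation in S5 amounts to a graph traversal from the failure location along the links. The sketch below is one way to do it, using breadth-first search. The function name is hypothetical, and the intermediate CI IDs `vm-a` and `vm-b` in the usage note are invented, since the full topology of FIG. 4 is not given in this passage. The sketch follows the links all the way up; the traversal would have to stop earlier if the range is meant to end at the FTCIs.

```python
from collections import deque

def failure_influence_range(failed_cis, links):
    """Collect every CI reachable upward from the failed CIs.

    links: iterable of (lower CI ID, upper CI ID) pairs.
    """
    uppers = {}
    for lower, upper in links:
        uppers.setdefault(lower, []).append(upper)

    seen = set(failed_cis)
    queue = deque(failed_cis)
    while queue:
        ci = queue.popleft()
        for upper in uppers.get(ci, []):
            if upper not in seen:
                seen.add(upper)
                queue.append(upper)
    return seen
```

For example, with links `("501", "vm-a")`, `("vm-a", "401")`, `("401", "601")`, and `("502", "401")`, starting from `"501"` yields `{"501", "vm-a", "401", "601"}`; `"502"` is not included because it sits below 401 rather than above the failure location.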

（Ｓ６）上記障害影響範囲に係わるＦＴＣＩ（例えば４０１，４０２)について、障害許容状況を把握する。例えば４０１について、図６，図７の方法（ｂ）を用いてステータスを決定する。まず、４０１の下位の一方の障害箇所（５０１）の障害の影響のみを考えた場合、４０１のステータスは、片系障害なので「縮退（１）」になる。また４０１の下位のもう一方の障害箇所（５０２）の障害の影響を加えて考えた場合、４０１のステータスは、両系障害なので「停止（３）」になる。４０２のＦＴＣＩについても同様に、「停止（３）」になる。 (S6) For the FTCI (eg, 401, 402) related to the failure influence range, the failure allowable status is grasped. For example, for 401, the status is determined using the method (b) of FIGS. First, considering only the influence of a failure at one failure location (501) below 401, the status of 401 is "degenerate (1)" because it is a one-system failure. Further, when the influence of the failure of the other failure portion (502) below 401 is added, the status of 401 is “stop (3)” because of the failure of both systems. Similarly, the FTCI 402 is “stop (3)”.

For the higher-level services (601, 602) in the failure influence range 500, which are the failure influence destinations (failure-affected services), the status (“stop”), their number (N = 2), and the like are likewise obtained from the statuses of the lower FTCIs (401, 402).

（Ｓ７）上記の障害箇所（５０１）及びそれに基づくＦＴＣＩ(４０１，４０２)を含む障害影響範囲５００に係わるインシデントに関して、優先度（Ｐ）を求める。まず、緊急度（α）は、４０１，４０２のステータスが共に「停止（３）」の場合、α＝３となる。 (S7) The priority (P) is obtained for the incident related to the failure influence range 500 including the failure location (501) and the FTCI (401, 402) based on the failure location (501). First, the urgency level (α) is α = 3 when the statuses of 401 and 402 are both “stop (3)”.

次に、上記障害影響範囲５００に係わるインシデントにおけるインパクトレベル（β）は、方法（ｂ）に応じた所定の条件（式）から、例えばβ＝２×１＋１×５＋２×１０＝２７となる。 Next, the impact level (β) in the incident related to the failure influence range 500 is, for example, β = 2 × 1 + 1 × 5 + 2 × 10 = 27 from a predetermined condition (formula) according to the method (b).

次に、上記障害影響範囲５００に係わるインシデントにおける優先度（Ｐ）は、α×β＝３×２７＝８１，３０≦α×βであるから、Ｐ＝「高」となる。あわせて、Ｔ＝２時間、階層的エスカレーション＃１（Ｂ）：有り、階層的エスカレーション＃２（Ｃ）：有り、と求まる。 Next, since the priority (P) in the incident relating to the failure influence range 500 is α × β = 3 × 27 = 81, 30 ≦ α × β, P = “high”. In addition, T = 2 hours, hierarchical escalation # 1 (B): Yes, hierarchical escalation # 2 (C): Yes.

（Ｓ８）上記Ｓ７までの結果を、当該インシデント情報ｂ２に反映・登録し、また、障害構成情報ｂ３（図３の構成管理モデル上に上記障害影響範囲５００を含む状況をマッピングした情報など）を構成し、ＤＢ５１等に登録する。 (S8) The results up to S7 are reflected / registered in the incident information b2, and the failure configuration information b3 (information mapping the situation including the failure influence range 500 on the configuration management model in FIG. 3) is stored. Configure and register in DB51 etc.

（Ｓ９）上記によりサービスポータルシステム３０で担当者３に対し図４のような内容を持つ画面Ｇ１が提供される。 (S9) As described above, the service portal system 30 provides the person in charge 3 with the screen G1 having the contents as shown in FIG.

[Effects, etc.]
As described above, according to the present embodiment, for a target system 1 whose configuration takes the cloud environment and fault tolerance into account, the incident management system 10 and related systems visualize on screens (G1, G2) the situation and configuration, such as the failure influence range, together with information such as the priority of incidents and countermeasures. This lets the person in charge 3 grasp the situation immediately and easily, enabling rapid escalation (information transfer) and implementation of countermeasures.

When a failure is detected (S1), the person in charge 3 can look at the screen (G1) of FIG. 4 and the like and easily grasp, through colors, icons, and so on, the failure location, failure influence range, failure-affected services, FTCI statuses, and the like in a configuration that includes FTCIs. By also looking at the incident information (G2), the person in charge 3 can carry out primary isolation, escalation, and other responses more easily and quickly, based on information such as the failure influence range and the priority (P).

In the present embodiment, in particular, the mechanism that provides FTCIs makes it possible to grasp, in relation to the continuity (service level, etc.) of the services provided by the target system 1, not only the CIs in the failure influence range, such as virtual servers, but also the situation of the services affected as a result (such as the service CIs above the FTCIs), by visualizing the status of each service, the number of failure-affected services (N), and the like.

[Other embodiments]
(1) Although the fault tolerance of the components of the target system 1 was modeled as a CI (FTCI), other non-functional items (design information), such as component performance (performance indicators), may also be modeled as CIs.

（２）障害情報（Ｓ１）に基づく初期診断（Ｓ３，Ｓ４）の際に、対象システム１の全ＣＩに対して診断実行し、その結果から障害箇所ＣＩなどを発見・特定する形だけでなく、一部の特定のＣＩに対して診断を実行する形態としてもよい。例えば、障害情報（Ｓ１）から、障害等が推定される一部の特定のＣＩを特定（絞り込み）し、その特定のＣＩを診断対象とする。 (2) In the initial diagnosis (S3, S4) based on the failure information (S1), the diagnosis is executed on all CIs of the target system 1, and the failure location CI is found and specified from the result. The diagnosis may be performed on some specific CIs. For example, a part of specific CIs for which a fault or the like is estimated is specified (narrowed down) from the fault information (S1), and the specific CI is set as a diagnosis target.

（３）障害情報（Ｓ１）等をもとに、自動的に、ＤＢ５１内の既存インシデント情報（履歴）を検索したり、障害パターン解析などを行い、インシデント情報に関連付けられる又は含まれる対策手順などの対策情報を取得し、あわせて画面（Ｇ１，Ｇ２）で提示してもよい。 (3) Based on the failure information (S1), etc., the existing incident information (history) in the DB 51 is automatically searched, the failure pattern analysis is performed, and the countermeasure procedure related to or included in the incident information, etc. May be obtained and presented on the screen (G1, G2).

以上、本発明者によってなされた発明を実施の形態に基づき具体的に説明したが、本発明は前記実施の形態に限定されるものではなく、その要旨を逸脱しない範囲で種々変更可能であることは言うまでもない。 As mentioned above, the invention made by the present inventor has been specifically described based on the embodiment. However, the present invention is not limited to the embodiment, and various modifications can be made without departing from the scope of the invention. Needless to say.

本発明は、統合運用管理システム、インシデント管理システム、構成管理システム、サービスポータルシステム、障害監視システムなどに利用可能である。 The present invention can be used for an integrated operation management system, an incident management system, a configuration management system, a service portal system, a failure monitoring system, and the like.

DESCRIPTION OF SYMBOLS: 1 ... target system (system in operation), 3 ... person in charge, 10 ... incident management system, 11 ... failure information acquisition unit, 12 ... configuration information acquisition unit, 13 ... initial diagnosis unit, 15 ... failure influence range CI extraction unit, 16 ... FTCI status grasping unit, 17 ... priority calculation unit, 18 ... information registration unit, 20 ... configuration management system, 30 ... service portal system, 31 ... screen providing unit, 40 ... failure monitoring system, 51 ... incident management database (DB), 52 ... configuration management database (DB), 101 ... failure influence range visualization function, 102 ... FTCI setting function.

Claims

An incident management system that manages an incident including a failure of a target system as incident information in a first database,
In cooperation with a configuration management system that manages the configuration of the target system as configuration information in a second database,
In cooperation with the service portal system that provides information screens to the terminal of the person in charge,
In cooperation with a fault monitoring system that monitors incidents including faults in the target system,
This incident management system
A first function for creating a screen for visualizing an incident situation including a configuration of the target system, a failure influence range, and a failure influence destination service using the configuration information and the incident information, and providing the screen to the terminal of the person in charge, and
A second function for setting a configuration including a configuration part designed in consideration of fault tolerance in the target system based on an operation of the person in charge as the configuration management model in the configuration information;
In the configuration management model, each configuration part including a configuration part designed in consideration of the fault tolerance is set as a first configuration item, and fault tolerance for the first configuration item is set as a second configuration item. Set as an item, set the dependency between configuration items including the first and second configuration items as a link,
The screen provided by the first function displays, in a structure in which the configuration items are connected by links, a configuration management model of the target system, the configuration items in the failure influence range including the failure location, and an incident status including the failure influence destination service. An incident management system characterized by the above.

The incident management system according to claim 1,
(S1) a processing unit that detects a failure in the target system and acquires failure information;
(S2) a processing unit for acquiring configuration information of the target system;
(S3) a processing unit that performs an initial diagnosis on a constituent part including a faulty part of the target system;
(S4) a processing unit for acquiring information of an execution result of the initial diagnosis;
(S5) Using the information of (S4) above, a processing unit that extracts the first configuration item and the second configuration item included in the failure influence range due to the failure,
(S6) Using the information of (S4) above, a processing unit that grasps the status of the second configuration item included in the fault influence range due to the fault according to the fault tolerance design information,
(S7) Using the information of (S6) above, a processing unit that calculates the priority regarding the countermeasure against the failure, the target solution time, and further the presence or absence of escalation,
(S8) A processing unit that reflects the results up to (S7) in the incident information and configuration information, and creates information in which the failure status including the failure affected range and the failure affected service is mapped on the configuration management model,
(S9) a processing unit that, using the information created in (S8) above, provides the person in charge with a screen for visualizing the failure status in the target system. An incident management system characterized by comprising the above processing units.

In the incident management system according to claim 2,
In connection with the processing of (S5) and (S6), the status determination method is set as the fault tolerance design information for the second configuration item,
The status has a plurality of status values, including normal, degenerate, degraded, and stop, according to the degree of fault tolerance,
The link between the configuration items connects the upper configuration item and the lower configuration item,
The status of the link is determined according to the status of the subordinate configuration item,
Each of the configuration items belongs to a layer, and the status of the configuration item above the layer is determined according to the status of the configuration item below the layer,
An incident management system characterized in that statuses of upper configuration items are determined by calculation using status values of all lower configuration items connected by the link.

The incident management system according to claim 1,
An incident management system characterized in that, on the screen, each icon indicating the configuration item and each line indicating the link are displayed in a color corresponding to a status.

The incident management system according to claim 1,
The incident management system characterized in that, on the screen, information of a person in charge associated with the configuration item is displayed for each icon indicating the configuration item.

The incident management system according to claim 1,
The screen includes a first screen that displays the configuration information, and a second screen that displays the incident information.
An incident management system, characterized in that an incident status including a configuration management model and a failure influence range of the target system is displayed on the first screen.

The incident management system according to claim 1,
An analysis unit including a processing unit for calculating a priority regarding countermeasures against the failure, a target solution time, and further, the presence or absence of escalation;
In the processing of the analysis unit,
Using the status of the second configuration item, calculate the urgency (α) of countermeasures for the incident,
Using the status of the second configuration item, the impact level (β) of the countermeasure for the incident is calculated,
The priority is calculated using the urgency level (α) and the impact level (β),
In association with the priority, a target solution time is determined,
Corresponding to the priority, determine the presence or absence of escalation,
An incident management system, comprising: calculating information including a status and number of a higher-level service that is a failure affected destination according to the failure affected range.

A method for visualizing a fault influence range in an incident management system for managing an incident including a fault in a target system as incident information in a first database,
The incident management system includes a configuration management system that manages the configuration of the target system as configuration information in a second database, a service portal system that provides an information screen to a terminal of a person in charge, and a failure of the target system In conjunction with a fault monitoring system that monitors incidents, including
The incident management system has a function of setting a configuration including a configuration part designed in consideration of fault tolerance in the target system based on an operation of the person in charge as the configuration management model in the configuration information, and the target A screen for visualizing an incident situation including a system configuration, a failure influence range and a failure influence destination service is created using the configuration information and the incident information, and has a function to provide the terminal of the person in charge.
In the configuration management model, each configuration part including a configuration part designed in consideration of the fault tolerance is set as a first configuration item, and fault tolerance for the first configuration item is set as a second configuration item. Set as an item, set the dependency between configuration items including the first and second configuration items as a link,
In the screen, in the structure in which the configuration items are connected by a link, the configuration management model of the target system, the configuration item of the fault impact range including the fault location and the incident status including the fault impact destination service are displayed,
The incident management system includes:
(S1) processing for detecting a failure in the target system and acquiring failure information;
(S2) processing for acquiring configuration information of the target system;
(S3) a process for executing an initial diagnosis on a constituent part including a faulty part of the target system;
(S4) a process for acquiring information on the execution result of the initial diagnosis;
(S5) Using the information of (S4) above, a process of extracting the first configuration item and the second configuration item that are included in the fault influence range due to the fault,
(S6) Using the information of (S4) above, a process of grasping the status of the second configuration item included in the fault influence range due to the fault according to the fault tolerance design information;
(S7) Using the information of (S6) above, a process for calculating the priority regarding the countermeasure against the failure, the target solution time, and further the presence / absence of escalation,
(S8) processing for reflecting the results up to (S7) above in the incident information and configuration information, and creating information in which the failure status including the failure impact range and the fault affected service is mapped on the configuration management model;
(S9) a process in which the service portal system, using the information created in (S8) above, provides the person in charge with a screen for visualizing the failure status in the target system. A failure influence range visualization method characterized by performing the above processes.