JP2024027816A

JP2024027816A - Monitoring method and system

Info

Publication number: JP2024027816A
Application number: JP2022130943A
Authority: JP
Inventors: 重志大場; 賢作岡; 博文泉; 幸治村井; 千穂神林; 龍二山口
Original assignee: Fujitsu FSAS Inc
Current assignee: Fujitsu FSAS Inc
Priority date: 2022-08-19
Filing date: 2022-08-19
Publication date: 2024-03-01

Abstract

[Problem] To securely perform failure handling for a customer system. [Solution] A monitoring system includes a system and a monitoring server that monitors the system. When the monitoring server detects that a failure has occurred in the system, it registers response information for the failure in a storage unit. A tool of the system acquires the response information registered in the storage unit and executes processing on the system according to the response information. [Selected drawing] FIG. 1

Description

The present invention relates to a monitoring method and a monitoring system.

Conventionally, there are monitoring systems that monitor a customer system and, when some failure occurs in the customer system, respond to that failure.

FIG. 13 is a diagram showing an example of a conventional monitoring system. As shown in FIG. 13, this monitoring system includes a customer system 5, an automation processing unit 6, a monitoring server 7, and an ITSM (IT Service Management) server 8. A firewall 9 for preventing unauthorized access and the like is placed between the automation processing unit 6 on one side and the monitoring server 7 and ITSM server 8 on the other.

The customer system 5 is a system used by a customer and is composed of a plurality of electronic devices. When the automation processing unit 6 receives, through inbound communication, a workaround execution command from the external monitoring server 7, it executes the job corresponding to that workaround on the customer system. Although not shown, the monitoring system may further include other customer systems in addition to the customer system 5.

The monitoring server 7 is a SaaS (Software as a Service) type monitoring server and monitors the customer system 5 and other customer systems (not shown). Here, the monitoring server 7 is described using the customer system 5 as an example.

The monitoring server 7 detects a failure in the customer system 5, for example when it receives a failure notification from the customer system 5, and displays information about the failure on a display screen or the like. After confirming the failure information on the display screen, the operator of the monitoring server 7 selects a workaround corresponding to the failure and transmits an execution command for the selected workaround to the automation processing unit 6.

The ITSM server 8 is a SaaS type ITSM server and stores history information such as the details of failures that have occurred in the customer system 5 and other customer systems (not shown) and the workarounds selected for those failures. The operator of the monitoring server 7 may also refer to the history information stored in the ITSM server 8 when selecting a workaround for a new failure in the customer system 5.

Japanese Patent Application Publication No. 2014-164457
Japanese Patent Application Publication No. 2014-32598

The failure handling mechanism of the conventional monitoring system described above presupposes inbound communication: data transmitted from the monitoring server 7 basically passes through the firewall 9 and reaches the automation processing unit 6. Consequently, a malicious third party could, for example, use the monitoring server 7 to embed a virus in the data sent to the automation processing unit 6 or to perform malicious operations, so the security measures were problematic.

For this reason, failure handling for customer systems needs to be performed securely.

In one aspect, an object of the present invention is to provide a monitoring method and a monitoring system capable of securely performing failure handling for a customer system.

In a first aspect, a monitoring system includes a system and a monitoring server that monitors the system. When the monitoring server detects that a failure has occurred in the system, it registers response information for the failure in a storage unit. A tool of the system acquires the response information registered in the storage unit and executes processing on the system according to the response information.

Failure handling for a customer system can be performed securely.

FIG. 1 is a diagram showing an example of a monitoring system according to this embodiment.
FIG. 2 is a diagram showing an example of the data structure of the failure DB.
FIG. 3 is a functional block diagram showing the configuration of the automated processing device according to this embodiment.
FIG. 4 is a diagram showing an example of the data structure of the processing table.
FIG. 5 is a functional block diagram showing the configuration of the monitoring server.
FIG. 6 is a functional block diagram showing the configuration of the ITSM server.
FIG. 7 is a diagram showing an example of the data structure of the workaround management table.
FIG. 8 is a diagram showing an example of the data structure of the system level management table.
FIG. 9 is a flowchart showing the processing procedure of the automated processing device according to this embodiment.
FIG. 10 is a flowchart showing the processing procedure of the monitoring server and the ITSM server.
FIG. 11 is a diagram showing an example of the hardware configuration of a computer that implements the same functions as the monitoring server of the embodiment.
FIG. 12 is a diagram showing an example of the hardware configuration of a computer that implements the same functions as the ITSM server of the embodiment.
FIG. 13 is a diagram showing an example of a conventional monitoring system.

Embodiments of the monitoring method and monitoring system disclosed in the present application are described in detail below with reference to the drawings. Note that the present invention is not limited by these embodiments.

FIG. 1 is a diagram showing an example of a monitoring system according to this embodiment. As shown in FIG. 1, this monitoring system includes customer systems 10a, 10b, and 10c, automated processing devices 20a, 20b, and 20c, a monitoring server 100, and an ITSM server 200. In this embodiment, the monitoring server 100 and the ITSM server 200 are described as separate servers, but they can also be implemented as a single server.

The customer systems 10a to 10c are connected to the automated processing devices 20a to 20c, respectively. The automated processing devices 20a to 20c are connected to a network 50 via firewalls 30a, 30b, and 30c, respectively, which prevent unauthorized access from outside. The monitoring server 100 and the ITSM server 200 are connected to the network 50.

The customer systems 10a to 10c are systems used by customers and are each composed of a plurality of electronic devices. In the following description, the customer systems 10a to 10c are collectively referred to as "customer system 10" unless they need to be distinguished. When a failure occurs within a customer system 10, that customer system 10 transmits failure information to the monitoring server 100.

For example, the failure information includes a failure code that uniquely identifies the nature of the failure and a system identification number that uniquely identifies the customer system 10. The system identification number of the customer system 10a is "sys1", that of the customer system 10b is "sys2", and that of the customer system 10c is "sys3".

The monitoring server 100 monitors the customer system 10. When the monitoring server 100 receives failure information from the customer system 10, it displays the failure information on a display screen. The monitoring server 100 also transmits an incident issuance request containing the failure information to the ITSM server 200.

Upon receiving the incident issuance request, the ITSM server 200 issues an incident number and registers information about the failure information in the failure DB 241.

FIG. 2 is a diagram showing an example of the data structure of the failure DB. As shown in FIG. 2, the failure DB 241 has failure tables ta1, ta2, and ta3. The failure table ta1 holds information about the failure information of the customer system 10a, the failure table ta2 holds that of the customer system 10b, and the failure table ta3 holds that of the customer system 10c. The failure DB 241 may further include failure tables for other customer systems.

The failure table ta1 is described first. The system identification number "sys1" of the customer system 10a is set in the failure table ta1. The failure table ta1 also contains an incident number, a failure code, a workaround, and a handling flag.

The incident number is a number issued by the ITSM server 200. The failure code is the information, set in the failure information, that uniquely identifies the failure. The workaround indicates the action to take for the failure identified by the failure code. Examples of workarounds include checking the service status of a Windows server and checking long-running jobs. Descriptions of other workarounds are omitted.

The handling flag indicates whether the failure in the customer system has been handled. When the failure has been handled, the handling flag is set to "ON". When the failure has not been handled, the handling flag is set to "OFF".

The failure table ta2 is described next. The system identification number "sys2" of the customer system 10b is set in the failure table ta2. The failure table ta2 also contains an incident number, a failure code, a workaround, and a handling flag.

The failure table ta3 is described next. The system identification number "sys3" of the customer system 10c is set in the failure table ta3. The failure table ta3 also contains an incident number, a failure code, a workaround, and a handling flag.

In the following description, a record identified by an incident number in the failure tables ta1, ta2, and ta3 is referred to as an "incident". For example, the incident identified by incident number "inc_1" corresponds to the record with failure code "error1100", workaround "Check Windows server service status", and handling flag "OFF". Where appropriate, an incident whose handling flag is "OFF" is referred to as an unhandled incident.
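The failure DB layout described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names, field names, and the use of a boolean for the handling flag are assumptions; only the identifiers "sys1" to "sys3", "inc_1", and "error1100" come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """One record of a failure table (field names are illustrative)."""
    incident_number: str
    failure_code: str
    workaround: str
    handled: bool = False  # handling flag: False = "OFF", True = "ON"


@dataclass
class FailureTable:
    """Failure table for one customer system."""
    system_id: str
    incidents: list[Incident] = field(default_factory=list)

    def unhandled(self) -> list[Incident]:
        # Unhandled incidents are those whose handling flag is "OFF".
        return [inc for inc in self.incidents if not inc.handled]


# Failure DB modelled as a mapping from system identification number to table.
failure_db = {
    "sys1": FailureTable("sys1", [
        Incident("inc_1", "error1100", "Check Windows server service status"),
    ]),
    "sys2": FailureTable("sys2"),
    "sys3": FailureTable("sys3"),
}
```

Keeping one table per system identification number mirrors the description, in which each automated processing device only consults the table of its own customer system.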

Returning to the description of FIG. 1. The automated processing devices 20a to 20c access the failure DB 241 of the ITSM server 200 at predetermined intervals through outbound communication and refer to the failure table corresponding to their own customer system 10. For example, the automated processing device 20a refers to the failure table ta1 corresponding to the customer system 10a, the automated processing device 20b refers to the failure table ta2 corresponding to the customer system 10b, and the automated processing device 20c refers to the failure table ta3 corresponding to the customer system 10c.

The automated processing device 20a identifies, among the incidents in the failure table ta1, the unhandled incidents whose handling flag is "OFF", and executes on the customer system 10a the jobs corresponding to the workarounds set in the identified incidents. In the example shown in FIG. 2, the automated processing device 20a executes the job for the workaround "Check Windows server service status" and the job for "Check long-running jobs" on the customer system 10a. The automated processing device 20a then notifies the ITSM server 200 of the processing results for "Check Windows server service status" and "Check long-running jobs".

When the ITSM server 200 receives, as a processing result, information indicating that the workarounds "Check Windows server service status" and "Check long-running jobs" were carried out, it updates the handling flags corresponding to those workarounds in the failure table ta1 from "OFF" to "ON".

On the other hand, when the ITSM server 200 receives, as a processing result, information indicating that the workaround "Check Windows server service status" or "Check long-running jobs" failed, it transmits error information to the monitoring server 100. The error information contains the system identification number, the incident number corresponding to the failed workaround, and the like.
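The two outcomes described above (flag update on success, error notification on failure) could be handled along the following lines. This is a hedged sketch: the dictionary layout, the function name `apply_processing_result`, and the `send_error` callback standing in for transmission to the monitoring server are all assumptions made for illustration.

```python
from typing import Callable


def apply_processing_result(
    failure_table: dict,
    result: dict,
    send_error: Callable[[dict], None],
) -> None:
    """Apply one processing result reported by an automated processing device.

    failure_table: {"system_id": str, "incidents": {incident_number: {"handled": bool}}}
    result:        {"incident_number": str, "success": bool}
    """
    incident = failure_table["incidents"][result["incident_number"]]
    if result["success"]:
        # Workaround carried out: switch the handling flag from OFF to ON.
        incident["handled"] = True
    else:
        # Workaround failed: forward error information to the monitoring server.
        send_error({
            "system_id": failure_table["system_id"],
            "incident_number": result["incident_number"],
        })
```

A failed incident keeps its flag at "OFF" in this sketch; whether it should be retried on the next poll is not stated in the description and is left open here.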

The automated processing device 20b identifies the unhandled incidents among the incidents in the failure table ta2 and executes on the customer system 10b the jobs corresponding to the workarounds set in the identified incidents. The automated processing device 20b notifies the ITSM server 200 of the job processing results. The rest of the description is the same as for the automated processing device 20a.

The automated processing device 20c identifies the unhandled records among the incidents in the failure table ta3 and executes on the customer system 10c the jobs corresponding to the workarounds set in the identified incidents. The automated processing device 20c notifies the ITSM server 200 of the job processing results. The rest of the description is the same as for the automated processing device 20a.

In the following description, the automated processing devices 20a to 20c are collectively referred to as "automated processing device 20" unless they need to be distinguished. The automated processing device 20 is an example of a tool of the customer system 10. The automated processing device 20 may be installed within the customer system 10, or the customer system 10 itself may have the functions of the automated processing device 20.

As described above, in the monitoring system according to this embodiment, when the monitoring server 100 receives failure information from the customer system 10, it sends an incident issuance request to the ITSM server 200, and the ITSM server 200 registers information about the failure information in the failure DB 241. The automated processing device 20 then accesses the failure DB 241 through outbound communication, acquires the workaround, and executes the job corresponding to the workaround on the customer system 10. Because the workaround is acquired from the automated processing device 20 side through outbound communication, failure handling for the customer system 10 can be performed more securely than with inbound communication.

Next, a configuration example of the automated processing device 20 described in FIG. 1 is explained. FIG. 3 is a functional block diagram showing the configuration of the automated processing device according to this embodiment. As shown in FIG. 3, the automated processing device 20a includes a communication unit 21, a storage unit 24, and a control unit 25.

The communication unit 21 sends and receives information to and from the monitoring server 100 and the ITSM server 200 via the network 50. The communication unit 21 also sends and receives information to and from the customer system 10. The communication unit 21 is implemented by a NIC (Network Interface Card) or the like.

The storage unit 24 has a processing table 24a. For example, the storage unit 24 is a storage device such as a memory.

The processing table 24a is a table that sets the job corresponding to each workaround. FIG. 4 is a diagram showing an example of the data structure of the processing table. As shown in FIG. 4, the processing table 24a associates workarounds with jobs. The workarounds are as described above. A job is a single unit in which multiple programs are grouped and executed in sequence. A job also corresponds to a part that defines the execution order of a plurality of code parts.
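One way to picture the processing table is as a mapping from workaround name to an ordered list of code parts. The sketch below is only an illustration under that assumption: the function names, the placeholder code parts (which just print a message), and the boolean success convention are hypothetical and not taken from the patent.

```python
from typing import Callable

# A code part is modelled as a callable that takes the target system id
# and returns True on success (an assumption for this sketch).
CodePart = Callable[[str], bool]


def check_service_status(system_id: str) -> bool:
    # Placeholder for a real service status check.
    print(f"[{system_id}] checking Windows service status")
    return True


def list_long_running_jobs(system_id: str) -> bool:
    # Placeholder for a real long-running job check.
    print(f"[{system_id}] listing long-running jobs")
    return True


# Processing table: workaround name -> job (ordered list of code parts).
PROCESSING_TABLE: dict[str, list[CodePart]] = {
    "Check Windows server service status": [check_service_status],
    "Check long-running jobs": [list_long_running_jobs],
}


def run_job(workaround: str, system_id: str) -> bool:
    """Run the job's code parts in their defined order; stop at the first failure."""
    job = PROCESSING_TABLE[workaround]
    return all(part(system_id) for part in job)
```

Because the table lives on the customer side in this model, the server only ever names a workaround; the code that actually runs is chosen locally.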

Returning to the description of FIG. 3. The control unit 25 includes an acquisition unit 25a and an execution unit 25b. The control unit 25 is, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit).

The acquisition unit 25a accesses the failure table ta1 in the failure DB 241 of the ITSM server 200 at predetermined intervals. When accessing the ITSM server 200, the acquisition unit 25a sends the system identification number of the customer system on which jobs are to be executed. The acquisition unit 25a acquires the workarounds of the unhandled incidents among the incidents in the failure table ta1. The acquisition unit 25a may also acquire the incident numbers. The acquisition unit 25a outputs the acquired workarounds to the execution unit 25b.
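The periodic, device-initiated access described above can be sketched as a simple polling loop. The names `poll_unhandled`, `fetch_incidents`, and `handle` are hypothetical; `fetch_incidents` stands in for whatever outbound request the device makes to the ITSM server, which the description does not specify, and `max_polls` exists only so the sketch terminates.

```python
import time
from typing import Callable, Iterable


def poll_unhandled(
    fetch_incidents: Callable[[str], Iterable[dict]],
    system_id: str,
    handle: Callable[[dict], None],
    interval_seconds: float,
    max_polls: int,
) -> None:
    """Poll the failure table of `system_id` and hand over unhandled incidents."""
    for poll in range(max_polls):
        # Outbound request initiated by the device; the system id is sent along.
        for incident in fetch_incidents(system_id):
            if not incident["handled"]:
                handle(incident)
        if poll < max_polls - 1:
            time.sleep(interval_seconds)
```

The point of the design is that the connection always originates on the customer side of the firewall, so no inbound port has to be opened for the server to deliver workarounds.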

The execution unit 25b compares the workaround received from the acquisition unit 25a with the processing table 24a and identifies the job corresponding to the workaround. The execution unit 25b executes the identified job on the customer system 10a and transmits the processing result to the ITSM server 200. The processing result contains the incident number and information indicating whether the job corresponding to the workaround was executed successfully.

If execution of the job fails, the execution unit 25b may retry the job a predetermined number of times. If the job still does not succeed after the predetermined number of retries, the execution unit 25b sets, in the processing result, information indicating that execution of the job corresponding to the workaround failed, and transmits the processing result to the ITSM server 200.
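The retry behaviour can be illustrated with a short sketch. The function name `execute_with_retry`, the `max_retries` parameter, and the shape of the returned processing result are assumptions; the description only says that a job may be retried a predetermined number of times before a failure is reported.

```python
from typing import Callable


def execute_with_retry(
    job: Callable[[], bool],
    incident_number: str,
    max_retries: int,
) -> dict:
    """Run `job` once, then retry up to `max_retries` times while it fails.

    Returns a processing result holding the incident number and whether
    the job eventually succeeded.
    """
    attempts = 1 + max_retries  # first attempt plus the allowed retries
    for _ in range(attempts):
        if job():
            return {"incident_number": incident_number, "success": True}
    return {"incident_number": incident_number, "success": False}
```

In practice a delay between attempts might be added; the description does not mention one, so none is included here.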

The functional block diagrams of the automated processing devices 20b and 20c correspond to that of the automated processing device 20a shown in FIG. 3, so their description is omitted.

Next, a configuration example of the monitoring server 100 described in FIG. 1 is explained. FIG. 5 is a functional block diagram showing the configuration of the monitoring server. As shown in FIG. 5, the monitoring server 100 includes a communication unit 110, an input unit 120, a display unit 130, a storage unit 140, and a control unit 150.

The communication unit 110 sends and receives information to and from the ITSM server 200, the automated processing device 20, and the customer system 10 via the network 50. The communication unit 110 is implemented by a NIC or the like.

The input unit 120 is an input device for entering various kinds of information into the monitoring server 100. The input unit 120 corresponds to a keyboard, a mouse, a touch panel, or the like.

The display unit 130 is a display device that displays information output from the control unit 150. The display unit 130 corresponds to a liquid crystal display, an organic EL (Electro Luminescence) display, a touch panel, or the like. For example, the display unit 130 displays customers.

The storage unit 140 holds various kinds of information that the control unit 150 uses to execute processing. The storage unit 140 is a storage device such as a memory.

The control unit 150 includes an abnormality detection unit 151, a request unit 152, and a display control unit 153. The control unit 150 is, for example, a CPU or an MPU.

The abnormality detection unit 151 monitors the customer systems 10a to 10c and detects whether a failure has occurred. For example, when the abnormality detection unit 151 receives failure information from a customer system 10, it detects that a failure has occurred in the customer system 10 corresponding to the system identification number set in the failure information. The abnormality detection unit 151 outputs the received failure information to the request unit 152 and the display control unit 153.

The abnormality detection unit 151 may also transmit data to a customer system 10 and detect an abnormality in that customer system 10 when no response is returned. In this case, the abnormality detection unit 151 generates failure information containing a failure code indicating no response and the system identification number of the customer system 10 in which the abnormality was detected, and outputs the generated failure information to the request unit 152 and the display control unit 153.
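The no-response case can be sketched as follows. Everything named here is an assumption for illustration: the failure code value `error_no_response` is invented (the description does not give one), and `probe` stands in for whatever data the monitoring server sends to check for a reply.

```python
from typing import Callable, Iterable

# Hypothetical failure code meaning "no response"; not taken from the patent.
NO_RESPONSE_CODE = "error_no_response"


def detect_no_response(
    system_ids: Iterable[str],
    probe: Callable[[str], bool],
) -> list[dict]:
    """Probe each customer system and build failure information for silent ones.

    `probe(system_id)` returns True when the system answered.
    """
    return [
        {"failure_code": NO_RESPONSE_CODE, "system_id": system_id}
        for system_id in system_ids
        if not probe(system_id)
    ]
```

The generated records have the same two fields as failure information sent by a customer system, so downstream handling would not need to distinguish the two detection paths.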

When the request unit 152 receives failure information from the abnormality detection unit 151, it transmits an incident issuance request containing the failure information to the ITSM server 200.

The display control unit 153 causes the display unit 130 to display various kinds of information. For example, the display control unit 153 causes the display unit 130 to display failure information. When error information is received from the ITSM server 200, the display control unit 153 causes the display unit 130 to display the error information.

Next, a configuration example of the ITSM server 200 described in FIG. 1 is explained. FIG. 6 is a functional block diagram showing the configuration of the ITSM server. As shown in FIG. 6, the ITSM server 200 includes a communication unit 210, an input unit 220, a display unit 230, a storage unit 240, and a control unit 250.

The communication unit 210 sends and receives information to and from the monitoring server 100, the automated processing device 20, and the customer system 10 via the network 50. The communication unit 210 is implemented by a NIC or the like.

The input unit 220 is an input device for entering various kinds of information into the ITSM server 200. The input unit 220 corresponds to a keyboard, a mouse, a touch panel, or the like.

The display unit 230 is a display device that displays information output from the control unit 250. The display unit 230 corresponds to a liquid crystal display, an organic EL display, a touch panel, or the like. For example, the display unit 230 displays customers.

The storage unit 240 holds the failure DB 241, a workaround management table 242, and a system level management table 243. The storage unit 240 is a storage device such as a memory.

The failure DB 241 holds information about failure information. The data structure of the failure DB 241 corresponds to the data structure described with reference to FIG. 2.

The workaround management table 242 defines the workarounds for handling the failures identified by failure codes. FIG. 7 is a diagram showing an example of the data structure of the workaround management table. As shown in FIG. 7, the workaround management table 242 associates failure codes with workarounds. The failure code is information that uniquely identifies a failure. The workaround is the name of the workaround for handling the failure. For example, the workaround (workaround name) corresponding to the failure code "error1000" is "System restart".

The system level management table 243 holds system level information for the customer systems 10. FIG. 8 is a diagram showing an example of the data structure of the system level management table. As shown in FIG. 8, the system level management table 243 associates system identification numbers with system levels. The system identification number is a number that uniquely identifies a customer system 10.

The system level indicates how seriously a failure of the customer system affects society. The higher the system level, the greater the social impact of a failure of the customer system. For example, the system level of the customer system 10 is one of (1), (2), and (3). A customer system at system level (1) is a "system with almost no social impact." A customer system at system level (2) is a "system with limited social impact." A customer system at system level (3) is a "system with an extremely large social impact."

For example, the system level of the customer system 10a identified by the system identification number "sys1" is "system level (1)." Therefore, the customer system 10a is a "system with almost no social impact" even if a failure occurs.
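The three system levels and the table lookup can be sketched as an ordered enumeration. This is an illustrative sketch only: the names `SystemLevel`, `SYSTEM_LEVEL_TABLE`, and `system_level_of` are assumptions, and only the "sys1" row is stated in the text.

```python
from enum import IntEnum
from typing import Dict


class SystemLevel(IntEnum):
    """System levels (1) to (3); a larger value means a larger social impact."""
    ALMOST_NO_SOCIAL_IMPACT = 1
    LIMITED_SOCIAL_IMPACT = 2
    EXTREMELY_LARGE_SOCIAL_IMPACT = 3


# Hypothetical stand-in for the system level management table 243.
# Only the "sys1" row is stated in the text.
SYSTEM_LEVEL_TABLE: Dict[str, SystemLevel] = {
    "sys1": SystemLevel.ALMOST_NO_SOCIAL_IMPACT,
}


def system_level_of(system_id: str) -> SystemLevel:
    """Look up the system level registered for a system identification number."""
    return SYSTEM_LEVEL_TABLE[system_id]
```

An `IntEnum` is used here so that levels can be compared directly, which matches the statement that a larger level means a larger impact.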

The description returns to FIG. 6. The control unit 250 includes a receiving unit 251, a registration unit 252, and an access reception unit 253. The control unit 250 is, for example, a CPU or an MPU.

When receiving an incident issuance request from the monitoring server 100, the receiving unit 251 outputs the failure information set in the incident issuance request to the registration unit 252.

The registration unit 252 registers information on an incident in the failure DB 241 based on the failure information. For example, on acquiring failure information, the registration unit 252 generates a unique incident number. The registration unit 252 compares the failure code set in the failure information with the workaround management table 242 and identifies the workaround corresponding to the failure code.

The registration unit 252 selects the failure table in which to register the incident based on the system identification number set in the failure information. If the system identification number is "sys1", the registration unit 252 selects the failure table ta1. If the system identification number is "sys2", the registration unit 252 selects the failure table ta2. If the system identification number is "sys3", the registration unit 252 selects the failure table ta3.

The registration unit 252 registers the incident (the incident number, the failure code of the failure information, the workaround, and the handling flag <OFF>) in the selected failure table.
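The registration steps described above (generate an incident number, identify the workaround, select the failure table, register the incident with the handling flag OFF) can be sketched as follows. This is a sketch under stated assumptions: the `Incident` class, the `INC000001`-style number format, and the in-memory `failure_db` are hypothetical and are not specified in the text.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Incident:
    incident_number: str
    failure_code: str
    workaround: str
    handled: bool = False  # handling flag; OFF at registration time


# Hypothetical in-memory stand-ins for table 242 and the failure DB 241.
WORKAROUND_TABLE: Dict[str, str] = {"error1000": "system restart"}
FAILURE_TABLE_FOR_SYSTEM: Dict[str, str] = {"sys1": "ta1", "sys2": "ta2", "sys3": "ta3"}
failure_db: Dict[str, List[Incident]] = {"ta1": [], "ta2": [], "ta3": []}

_incident_counter = itertools.count(1)


def register_incident(system_id: str, failure_code: str) -> Incident:
    """Register one incident in the failure table selected by the system ID."""
    # The number format is an assumption; the text only requires uniqueness.
    incident_number = f"INC{next(_incident_counter):06d}"
    workaround = WORKAROUND_TABLE[failure_code]
    table_name = FAILURE_TABLE_FOR_SYSTEM[system_id]
    incident = Incident(incident_number, failure_code, workaround)
    failure_db[table_name].append(incident)
    return incident
```

For example, `register_incident("sys1", "error1000")` appends an unhandled incident with the workaround "system restart" to the failure table ta1.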

The access reception unit 253 accepts access to the failure DB 241 from the automation processing device 20. At this time, the access reception unit 253 permits access to the failure table corresponding to the system identification number notified by the automation processing device 20. For example, the acquisition unit 25a of the automation processing device 20a acquires the workaround for an unhandled incident from the failure table ta1.

The access reception unit 253 also receives the processing result for a workaround from the automation processing device 20. For example, the processing result includes the incident number and information indicating whether the job corresponding to the workaround was executed successfully.

If the processing result includes information indicating that the job was executed successfully, the access reception unit 253 updates the handling flag corresponding to the incident number included in the processing result to "ON".

On the other hand, if the processing result includes information indicating that execution of the job failed, the access reception unit 253 transmits error information to the monitoring server 100. The error information includes the system identification number, the incident number corresponding to the workaround whose handling failed, and the like.
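The two branches for a processing result can be sketched as below. This is only an illustration: the `ProcessingResult` class, the `handling_flags` dictionary, and the list standing in for transmission to the monitoring server 100 are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProcessingResult:
    system_id: str
    incident_number: str
    succeeded: bool


# Hypothetical state: handling flag per incident number (False means OFF).
handling_flags: Dict[str, bool] = {"INC000001": False, "INC000002": False}
# Stand-in for error information transmitted to the monitoring server 100.
errors_sent_to_monitoring_server: List[Dict[str, str]] = []


def accept_result(result: ProcessingResult) -> None:
    """Set the handling flag to ON on success; otherwise send error information."""
    if result.succeeded:
        handling_flags[result.incident_number] = True
        return
    errors_sent_to_monitoring_server.append(
        {"system_id": result.system_id, "incident_number": result.incident_number}
    )
```

In this sketch a failed incident keeps its handling flag OFF, which is consistent with the text updating the flag only on success.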

Next, an example of the processing procedure of the automation processing device 20a shown in FIG. 1 will be described. FIG. 9 is a flowchart showing the processing procedure of the automation processing device according to this embodiment. As shown in FIG. 9, if a certain period of time has not elapsed (step S101, No), the acquisition unit 25a of the automation processing device 20a returns to step S101.

If the certain period of time has elapsed (step S101, Yes), the acquisition unit 25a accesses the failure DB 241 of the ITSM server 200 and determines whether there is an unhandled incident (step S102). If there is no unhandled incident (step S103, No), the acquisition unit 25a proceeds to step S108.

On the other hand, if there is an unhandled incident (step S103, Yes), the acquisition unit 25a acquires the workaround (step S104). The execution unit 25b of the automation processing device 20a selects the job corresponding to the workaround based on the processing table 24a (step S105).

The execution unit 25b executes the job on the customer system 10a (step S106). The execution unit 25b transmits the processing result of the job to the ITSM server 200 (step S107).

If the processing is to be continued (step S108, Yes), the automation processing device 20a returns to step S101. If the processing is not to be continued (step S108, No), the automation processing device 20a ends the processing.
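Steps S101 to S108 amount to a polling loop driven from the automation processing device side. The following is a minimal sketch with hypothetical names (`run_automation_loop` and its callable parameters); it handles at most one incident per cycle as a simplification, and the access to the ITSM server 200 is abstracted behind the callables rather than shown as real network code.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def run_automation_loop(
    fetch_unhandled: Callable[[], Optional[Tuple[str, str]]],
    jobs: Dict[str, Callable[[], bool]],
    report: Callable[[str, bool], None],
    should_continue: Callable[[], bool],
    interval_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll for unhandled incidents and run the job for each workaround."""
    while True:
        sleep(interval_seconds)                      # S101: wait for the interval
        incident = fetch_unhandled()                 # S102: access the failure DB
        if incident is not None:                     # S103: unhandled incident?
            incident_number, workaround = incident   # S104: acquire the workaround
            job = jobs[workaround]                   # S105: select the job
            succeeded = job()                        # S106: run it on the customer system
            report(incident_number, succeeded)       # S107: send the processing result
        if not should_continue():                    # S108: continue or end
            break
```

Passing `sleep` and the other callables as parameters is a choice made for this sketch so that the loop can be exercised without a real timer or server.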

Next, the processing procedures of the monitoring server 100 and the ITSM server 200 shown in FIG. 1 will be described. FIG. 10 is a flowchart showing the processing procedures of the monitoring server and the ITSM server. If the monitoring server 100 does not detect failure information (step S201, No), it returns to step S201.

On the other hand, if the monitoring server 100 detects failure information (step S201, Yes), it transmits an incident issuance request in which the failure information is set to the ITSM server 200 (step S202).

The ITSM server 200 receives the incident issuance request (step S203). The ITSM server 200 generates an incident number (step S204).

The ITSM server 200 identifies the workaround corresponding to the failure information based on the workaround management table 242 (step S205). The ITSM server 200 registers the incident information in the failure DB 241 (step S206).

Next, the effects of the monitoring system according to this embodiment will be described. In the monitoring system, when the monitoring server 100 receives failure information of the customer system 10, it sends an incident issuance request to the ITSM server 200, and the ITSM server 200 registers information relating to the failure information in the failure DB 241. The automation processing device 20 accesses the failure DB 241 through outbound communication, acquires the workaround, and executes the job corresponding to the workaround on the customer system 10. Because the workaround is acquired from the automation processing device 20 side through outbound communication, failure handling for the customer system 10 can be executed more securely than with inbound communication.

The automation processing device 20 executes the job corresponding to the workaround on the customer system 10 and notifies the ITSM server 200 of the execution result. This allows the ITSM server 200 to hold information on whether the failure has been handled.

The processing of the monitoring system described above is an example. Other processing 1 to 3 of the monitoring system will be described below.

First, other processing 1 of the monitoring system will be described. In the above description, the automation processing device 20 acquires the workaround for an unhandled incident from a predetermined failure table among the plurality of failure tables included in the failure DB 241. That is, the automation processing device 20a acquires workarounds from the failure table ta1, the automation processing device 20b acquires workarounds from the failure table ta2, and the automation processing device 20c acquires workarounds from the failure table ta3. However, the processing is not limited to this.

The automation processing device 20 may acquire a workaround from the failure table of another customer system that has the same system level as the customer system on which the device itself executes jobs, and may execute the job corresponding to that workaround.

For example, assume that the system level of the customer system 10a and the system level of the customer system 10b are the same. The system level of each customer system 10 is registered in the system level management table 243 of the ITSM server 200.

When the automation processing device 20a accesses the failure DB 241 of the ITSM server 200, the ITSM server 200 identifies, based on the system level management table 243, the customer system 10b that has the same system level as the customer system 10a. The ITSM server 200 permits access to the failure table ta1 corresponding to the system identification number of the customer system 10a and to the failure table ta2 corresponding to the system identification number of the customer system 10b. The automation processing device 20a acquires the workarounds for unhandled incidents among the incidents included in the failure tables ta1 and ta2, and executes the jobs corresponding to the acquired workarounds on the customer system 10a.

When the customer system 10a and the customer system 10b have the same system level, applying to the customer system 10a the response to a failure that occurred in the customer system 10b may be effective even if no failure has occurred in the customer system 10a. Executing the above processing therefore makes failure handling for the customer systems 10 more efficient. Although the case where the system levels are the same has been described here, the above processing may be executed when, in addition to the system levels being the same, the system level is equal to or higher than a predetermined system level (for example, system level (3) or higher).
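The selection of accessible failure tables in other processing 1 can be sketched as follows. The level values for "sys2" and "sys3" and the names `SYSTEM_LEVELS`, `FAILURE_TABLE_FOR_SYSTEM`, and `accessible_tables` are assumptions chosen so that the customer systems 10a and 10b share a level, as in the example; they are not values given in the text.

```python
from typing import Dict, List

# Assumed example values: sys1 (10a) and sys2 (10b) share system level 1.
SYSTEM_LEVELS: Dict[str, int] = {"sys1": 1, "sys2": 1, "sys3": 3}
FAILURE_TABLE_FOR_SYSTEM: Dict[str, str] = {"sys1": "ta1", "sys2": "ta2", "sys3": "ta3"}


def accessible_tables(system_id: str, minimum_level: int = 1) -> List[str]:
    """Return the failure tables a device may read for the given system.

    Tables of other systems at the same system level are included only when
    that level is at or above minimum_level (the optional extra condition).
    """
    own_level = SYSTEM_LEVELS[system_id]
    if own_level < minimum_level:
        return [FAILURE_TABLE_FOR_SYSTEM[system_id]]
    return [
        FAILURE_TABLE_FOR_SYSTEM[other_id]
        for other_id, level in SYSTEM_LEVELS.items()
        if level == own_level
    ]
```

With these assumed values, `accessible_tables("sys1")` yields the tables ta1 and ta2, while `accessible_tables("sys1", minimum_level=3)` yields only ta1 because level 1 is below the threshold.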

Next, other processing 2 of the monitoring system will be described. The automation processing device 20 executes the job corresponding to the workaround on the customer system 10 and notifies the ITSM server 200 of the execution result. When execution of the job corresponding to the workaround has failed, the ITSM server 200 transmits error information to the monitoring server 100. On confirming the error information, the operator of the monitoring server 100 may manually execute a predetermined job on the corresponding customer system from the monitoring server 100.

Next, other processing 3 of the monitoring system will be described. In the above description, on accepting an incident issuance request from the monitoring server 100, the ITSM server 200 issues an incident number and registers information relating to the failure information in the failure DB 241. When the workaround corresponding to the failure information has a difficulty equal to or higher than a predetermined level, the ITSM server 200 may notify the monitoring server 100 of information on that workaround.

Next, an example of the hardware configuration of a computer that implements the same functions as the monitoring server 100 and the ITSM server 200 described in the above embodiment will be described. FIG. 11 is a diagram showing an example of the hardware configuration of a computer that implements the same functions as the monitoring server of the embodiment.

As shown in FIG. 11, the computer 300 includes a CPU 301 that executes various kinds of arithmetic processing, an input device 302 that accepts data input from a user, and a display 303. The computer 300 also includes a communication device 304 that exchanges data with the customer system 10, the automation processing device 20, the ITSM server 200, and the like via a wired or wireless network, and an interface device 305. The computer 300 further includes a RAM 306 that temporarily stores various kinds of information, and a hard disk device 307. The devices 301 to 307 are connected to a bus 308.

The hard disk device 307 stores an abnormality detection program 307a, a request program 307b, and a display control program 307c. The CPU 301 reads the programs 307a to 307c and loads them into the RAM 306.

The abnormality detection program 307a functions as an abnormality detection process 306a. The request program 307b functions as a request process 306b. The display control program 307c functions as a display control process 306c.

The processing of the abnormality detection process 306a corresponds to the processing of the abnormality detection unit 151. The processing of the request process 306b corresponds to the processing of the request unit 152. The processing of the display control process 306c corresponds to the processing of the display control unit 153.

Note that the programs 307a to 307c do not necessarily have to be stored in the hard disk device 307 from the beginning. For example, the programs may be stored on a "portable physical medium" inserted into the computer 300, such as a flexible disk (FD), a CD-ROM, a DVD, a magneto-optical disk, or an IC card. The computer 300 may then read and execute the programs 307a to 307c.

Next, the description moves on to FIG. 12. FIG. 12 is a diagram showing an example of the hardware configuration of a computer that implements the same functions as the ITSM server of the embodiment.

As shown in FIG. 12, the computer 400 includes a CPU 401 that executes various kinds of arithmetic processing, an input device 402 that accepts data input from a user, and a display 403. The computer 400 also includes a communication device 404 that exchanges data with the customer system 10, the automation processing device 20, the monitoring server 100, and the like via a wired or wireless network, and an interface device 405. The computer 400 further includes a RAM 406 that temporarily stores various kinds of information, and a hard disk device 407. The devices 401 to 407 are connected to a bus 408.

The hard disk device 407 stores a reception program 407a, a registration program 407b, and an access reception program 407c. The CPU 401 reads the programs 407a to 407c and loads them into the RAM 406.

The reception program 407a functions as a reception process 406a. The registration program 407b functions as a registration process 406b. The access reception program 407c functions as an access reception process 406c.

The processing of the reception process 406a corresponds to the processing of the receiving unit 251. The processing of the registration process 406b corresponds to the processing of the registration unit 252. The processing of the access reception process 406c corresponds to the processing of the access reception unit 253.

Note that the programs 407a to 407c do not necessarily have to be stored in the hard disk device 407 from the beginning. For example, the programs may be stored on a "portable physical medium" inserted into the computer 400, such as a flexible disk (FD), a CD-ROM, a DVD, a magneto-optical disk, or an IC card. The computer 400 may then read and execute the programs 407a to 407c.

10a, 10b, 10c Customer system
20a, 20b, 20c Automation processing device
30a, 30b, 30c Firewall
50 Network
100 Monitoring server
200 ITSM server

Claims

1. A monitoring method for a monitoring system including a system and a monitoring server that monitors the system, wherein
when the monitoring server detects that a failure has occurred in the system, the monitoring server registers response information for the failure in a storage unit, and
a tool of the system acquires the response information registered in the storage unit and executes processing on the system according to the response information.

2. The monitoring method according to claim 1, wherein the tool of the system further executes a process of notifying the monitoring server of the result of the processing according to the response information.

3. The monitoring method according to claim 1, wherein
the monitoring system has a plurality of systems, and
the monitoring server monitors the plurality of systems and, when detecting that a failure has occurred in any one of the plurality of systems, registers response information for the failure in the storage unit.

4. The monitoring method according to claim 3, wherein, among the plurality of systems, the tool of a first system acquires the response information registered in the storage unit and, when the acquired response information is failure response information of the tool of a second system and the level of the second system (a level according to social importance) is the same as the level of the first system, executes processing according to the failure response information of the tool of the second system on the first system.

5. A monitoring system comprising a system and a monitoring server that monitors the system, wherein
when the monitoring server detects that a failure has occurred in the system, the monitoring server registers response information for the failure in a storage unit, and
a tool of the system acquires the response information registered in the storage unit and executes processing on the system according to the response information.