JP2014010538A

JP2014010538A - Operation management device, operation management system, and operation management method

Info

Publication number: JP2014010538A
Application number: JP2012145377A
Authority: JP
Inventors: Hitoshi Ueda; 仁上田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2012-06-28
Filing date: 2012-06-28
Publication date: 2014-01-20

Abstract

PROBLEM TO BE SOLVED: To provide an operation management device, an operation management system, and an operation management method capable of notifying an operation manager of a failure notice signal without depending on the know-how of the operation manager.

SOLUTION: An operation management device 2 that monitors the operation state of a monitoring target device 3 includes: performance index collection means 22, which collects performance index information of the monitoring target device 3; correlation calculation means 27, which calculates a correlation coefficient, a value indicating the degree of coincidence between (a) feature part information, which is the performance index information for the period from a first fixed time before a failure occurred in the monitoring target device 3 up to the time of that failure, and (b) the performance index information for the period from a second fixed time before the present time up to the present time; and failure occurrence prediction means 28, which, when the correlation coefficient is equal to or larger than a prescribed value, notifies the operation manager of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device 3.

Description

The present invention relates to an operation management device, an operation management system, and an operation management method.

Conventionally, an operation management device notifies the operation manager of an alarm at the point when it detects that a failure has occurred in a monitoring target device. The operation manager could therefore learn of a failure only after it had occurred in the monitoring target device.
Some operation management devices also notify the operation manager of an alarm when a performance index of the monitoring target device exceeds a predetermined threshold, even if no failure has occurred in the monitoring target device.

Patent Documents 1 and 2 describe methods for predicting a failure before it occurs in a monitoring target device. Specifically, they describe calculating correlation values among multiple pieces of monitoring data obtained when the operation management device monitors the monitoring target device, and notifying the operation manager of an alarm at the point when an abnormal correlation value is calculated. This allows the operation management device to notify the operation manager of an alarm before a failure occurs in the monitoring target device.

Patent Document 1: JP 2005-327261 A
Patent Document 2: International Publication No. 2010/032701

However, the method of setting a threshold in advance requires the operation manager to register an appropriate threshold in the operation management device beforehand. It therefore has the drawback of depending on the experience and know-how of the operation manager.

Moreover, the methods of Patent Documents 1 and 2 cannot predict the type of failure expected to occur in the monitoring target device.
In addition, because an abnormal correlation value is often calculated only immediately before a failure occurs in the monitoring target device, there is the problem that the alarm cannot be issued until immediately before the failure occurs.

The operation management device according to a first aspect of the present invention monitors the operation status of a monitoring target device. The operation management device includes performance index collection means, correlation calculation means, and failure occurrence prediction means. The performance index collection means collects performance index information of the monitoring target device. The correlation calculation means calculates a correlation coefficient, which is a value indicating how closely feature part information matches the performance index information for the period from a second fixed time before the current time up to the current time. Here, the feature part information is the performance index information for the period from a first fixed time before the time at which a failure occurred in the monitoring target device up to the time at which the failure occurred. When the correlation coefficient is equal to or greater than a prescribed value, the failure occurrence prediction means notifies the operation manager of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device.

The operation management system according to a second aspect of the present invention includes an operation management device and an operation manager terminal device. The operation management device monitors the operation status of a monitoring target device. The operation manager terminal device receives notifications about the operation status of the monitoring target device from the operation management device. The operation management device includes performance index collection means, correlation calculation means, and failure occurrence prediction means. The performance index collection means collects performance index information of the monitoring target device. The correlation calculation means calculates a correlation coefficient, which is a value indicating how closely feature part information matches the performance index information for the period from a second fixed time before the current time up to the current time. Here, the feature part information is the performance index information for the period from a first fixed time before the time at which a failure occurred in the monitoring target device up to the time at which the failure occurred. When the correlation coefficient is equal to or greater than a prescribed value, the failure occurrence prediction means notifies the operation manager terminal device of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device.

The operation management method according to a third aspect of the present invention is a method for monitoring the operation status of a monitoring target device. In this method, performance index information of the monitoring target device is collected. A correlation coefficient is then calculated, which is a value indicating how closely feature part information matches the performance index information for the period from a second fixed time before the current time up to the current time. Here, the feature part information is the performance index information for the period from a first fixed time before the time at which a failure occurred in the monitoring target device up to the time at which the failure occurred. When the correlation coefficient is equal to or greater than a prescribed value, the operation manager is notified of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device.

The operation management device can notify the operation manager of an alarm before a failure occurs in the monitoring target device, without depending on the know-how of the operation manager.

FIG. 1 is a block diagram showing an example of the operation management system according to Embodiment 1 of the present invention.
FIG. 2 is a graph showing the relationship between the CPU usage rate of the monitoring target device, which is the performance index information in Embodiment 1, and the elapsed time in the monitoring target device.
FIG. 3 is a graph showing the relationship between the CPU usage rate of the monitoring target device, which is the performance index information in Embodiment 1, and the elapsed time in the monitoring target device.
FIG. 4 is a flowchart describing the alarm collection processing in Embodiment 1.
FIG. 5 is a flowchart describing the performance index collection processing in Embodiment 1.
FIG. 6 is a flowchart describing the feature part extraction processing in Embodiment 1.
FIG. 7 is a flowchart describing the calculation processing and the failure occurrence prediction processing in Embodiment 1.

Embodiments of the present invention are described below with reference to the drawings. The present invention is not limited to the following embodiments.
Embodiment 1
As shown in FIG. 1, the operation management system 100 according to Embodiment 1 of the present invention includes an operation manager terminal device 1, an operation management device 2, a monitoring target device 3, and the like.
The operation manager terminal device 1 and the operation management device 2 are connected by a communication line such as a LAN (Local Area Network) or the Internet. Likewise, the operation management device 2 and the monitoring target device 3 are connected by a communication line such as a LAN or the Internet.

The operation management device 2 monitors the operation status of the monitoring target device 3 and notifies the operation manager terminal device 1 of the monitoring results.
The operation manager terminal device 1 and the operation management device 2 may be implemented as a single device. The operation management system 100 includes one or more monitoring target devices 3.
The operation manager terminal device 1 and the monitoring target device 3 may also be connected by a communication line such as a LAN or the Internet.

The operation manager terminal device 1 has a computer (not shown) equipped with a CPU (Central Processing Unit; not shown) and the like. The various functions of the operation manager terminal device 1 are realized by the CPU executing programs that implement those functions.

Specifically, when the CPU executes the programs that implement the functions of the operation manager terminal device 1, the operation manager terminal device 1 receives, for example, an alarm notification signal sent from the operation management device 2. Based on that alarm notification signal, the operation manager terminal device 1 displays on its display unit an alarm or the like indicating that a failure has occurred in the monitoring target device 3. The alarm notification signal includes at least information on the type of failure that occurred in the monitoring target device 3 and the time at which the failure occurred.

Likewise, when the CPU executes the programs that implement the functions of the operation manager terminal device 1, the operation manager terminal device 1 receives a failure notice signal sent from the operation management device 2. Based on that failure notice signal, the operation manager terminal device 1 displays on its display unit the type of failure predicted to occur in the monitoring target device 3 and related information. The failure notice signal includes at least information on the type of failure predicted to occur in the monitoring target device 3 and the time at which that failure is predicted to occur.

The operation management device 2 has a computer (not shown) equipped with a CPU (not shown) and the like. The various functions of the operation management device 2 are realized by the CPU executing programs that implement those functions. Specifically, by executing these programs, the CPU functions as alarm collection means 21, performance index collection means 22, alarm recording means 23, performance index recording means 24, feature part extraction means 25, feature part recording means 26, correlation calculation means 27, and failure occurrence prediction means 28.

The alarm collection means 21 collects alarm information from the monitoring target device 3 when a failure occurs in that device.
Specifically, the alarm collection means 21 receives the alarm signal that the monitoring target device 3 transmits when a failure occurs in it.
The alarm signal includes at least information on the type of failure that occurred in the monitoring target device 3 and the time at which the failure occurred.
On receiving an alarm signal from the monitoring target device 3, the alarm collection means 21 records alarm information in the alarm recording means 23. The alarm information includes at least information on the type of failure that occurred in the monitoring target device 3 and the time at which the failure occurred.

On receiving an alarm signal from the monitoring target device 3, the alarm collection means 21 also transmits an alarm notification signal to the operation manager terminal device 1. The alarm notification signal includes at least information on the type of failure that occurred in the monitoring target device 3 and the time at which the failure occurred. The alarm signal and the alarm notification signal may contain the same information, or the alarm notification signal may contain only part of the information in the alarm signal.
The alarm collection means 21 performs the above processing continuously, in parallel with the normal processing of the operation management device 2.
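
The alarm collection flow described above (receive an alarm signal, record the alarm information, forward an alarm notification) can be sketched as follows. This is only an illustrative sketch: the names `AlarmInfo`, `AlarmCollector`, `on_alarm_signal`, and `notify_terminal` are hypothetical and do not appear in the specification, and the in-memory list merely stands in for the alarm recording means 23.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List


@dataclass(frozen=True)
class AlarmInfo:
    """Minimum content named in the text: failure type and occurrence time."""
    failure_type: str
    occurred_at: datetime


@dataclass
class AlarmCollector:
    """Hypothetical stand-in for the alarm collection means 21."""
    # Callback standing in for sending to the operation manager terminal device 1.
    notify_terminal: Callable[[AlarmInfo], None]
    # In-memory list standing in for the alarm recording means 23.
    alarm_log: List[AlarmInfo] = field(default_factory=list)

    def on_alarm_signal(self, failure_type: str, occurred_at: datetime) -> AlarmInfo:
        info = AlarmInfo(failure_type, occurred_at)
        self.alarm_log.append(info)   # record the alarm information
        self.notify_terminal(info)    # forward an alarm notification signal
        return info
```

In this sketch the notification carries the same information as the alarm signal, which is one of the two options the text allows.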

The performance index collection means 22 periodically collects performance index information.
Specifically, it periodically transmits a performance index acquisition request signal to the monitoring target device 3. Here, "periodically" means at predetermined fixed intervals, or at intervals determined as needed according to the processing content, processing capacity, and the like of the monitoring target device 3.

The performance index collection means 22 also receives the performance index information transmitted from the monitoring target device 3 and records it in the performance index recording means 24. The performance index information may be, for example, a measure of the processing capability of the monitoring target device 3 such as a benchmark, the processing time the monitoring target device 3 needs for a specific process, or the response time of various processes in the monitoring target device 3. Embodiment 1 uses the CPU usage rate of the monitoring target device 3 as the performance index information. Here, the CPU usage rate is the proportion of a predetermined unit time during which software running on the monitoring target device 3 occupies the CPU of that device.
When recording, the performance index collection means 22 associates the performance index information with information on the time at which the performance it indicates occurred in the monitoring target device 3.
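
A minimal sketch of recording time-stamped CPU usage samples and reading back a time window is shown below. The class `PerformanceLog` and its methods are hypothetical names chosen for illustration; the specification does not say how the performance index recording means 24 stores its data, so the sorted in-memory lists are an assumption.

```python
import bisect
from datetime import datetime, timedelta
from typing import List, Tuple


class PerformanceLog:
    """Hypothetical stand-in for the performance index recording means 24."""

    def __init__(self) -> None:
        self._times: List[datetime] = []
        self._values: List[float] = []

    def record(self, at: datetime, cpu_usage_percent: float) -> None:
        # Keep samples sorted by time so that window queries stay cheap,
        # even if samples arrive out of order.
        i = bisect.bisect_right(self._times, at)
        self._times.insert(i, at)
        self._values.insert(i, cpu_usage_percent)

    def window(self, start: datetime, end: datetime) -> List[Tuple[datetime, float]]:
        # Return all samples whose time satisfies start <= time <= end.
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_right(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))
```

Associating each value with its time, as the text requires, is what later allows a fixed-length window to be cut out relative to a failure time or to the current time.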

The alarm recording means 23 records alarm information. Specifically, the alarm recording means 23 records, for example, the type of failure that occurred in the monitoring target device 3 in association with information on the time at which the failure occurred in the monitoring target device 3.

The performance index recording means 24 records performance index information. Specifically, the performance index recording means 24 records, for example, the performance index information in association with information on the time at which the performance it indicates occurred in the monitoring target device 3. More specifically, it records the CPU usage rate of the monitoring target device 3 at a given time in association with information on that time.

The feature part extraction means 25 periodically performs the following processing. Here, "periodically" means at fixed intervals, for example about once a day.
First, the feature part extraction means 25 acquires the alarm information recorded in the alarm recording means 23. Specifically, from the alarm information recorded in the alarm recording means 23, it acquires the alarm information that was not acquired in the previous round of processing.

Next, the feature part extraction means 25 acquires the performance index information recorded in the performance index recording means 24. Specifically, it acquires the performance index information for the period from a fixed time (the first fixed time) before the time at which the failure occurred in the monitoring target device 3 up to the time at which the failure occurred. The time at which the failure occurred is included in the alarm information acquired from the alarm recording means 23. The fixed time (first fixed time) is, for example, 10 minutes.
The feature part extraction means 25 then records the performance index information acquired from the performance index recording means 24 in the feature part recording means 26 as feature part information. More specifically, it records the feature part information in the feature part recording means 26 in association with the alarm information for the failure that determines the start and end times of the performance index information constituting that feature part information.

The feature part recording means 26 records feature part information and alarm information in association with each other. Specifically, the feature part recording means 26 records, for example, the feature part information acquired by the feature part extraction means 25 in association with the alarm information that determines the start and end times of the performance index information constituting that feature part information.

FIG. 2 is a graph showing the relationship between the CPU usage rate of the monitoring target device 3, which is the performance index information in the present embodiment, and the elapsed time in the monitoring target device 3. It shows this relationship before and after a failure occurs in the monitoring target device 3. The vertical axis indicates the CPU usage rate (%), and the horizontal axis indicates the elapsed time. The failure occurrence time is denoted T1, and the time a fixed time (the first fixed time) before T1 is denoted T2. In FIG. 2, a failure is assumed to occur in the monitoring target device 3 when the CPU usage rate reaches a value close to 100%.

The feature part extraction processing performed by the feature part extraction means 25 is described with reference to FIG. 2.
The feature part extraction means 25 acquires the alarm information recorded in the alarm recording means 23. For example, it acquires from the alarm recording means 23 the alarm information for the failure at time T1 shown in FIG. 2.
Next, the feature part extraction means 25 acquires the performance index information recorded in the performance index recording means 24. Specifically, it acquires the performance index information for the period from a fixed time (the first fixed time) before time T1, at which the failure occurred in the monitoring target device 3, up to time T1. In other words, it acquires the performance index information from time T2 to time T1 shown in FIG. 2. Here, the fixed time (first fixed time) is the interval from time T2 to time T1 in FIG. 2 and is, for example, 10 minutes.
The feature part extraction means 25 then records the performance index information from time T2 to time T1, acquired from the performance index recording means 24, in the feature part recording means 26 as feature part information. More specifically, it records the feature part information in the feature part recording means 26 in association with the alarm information for the failure at time T1.
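
The extraction step for a single alarm can be sketched as below: given the failure time T1, compute T2 = T1 minus the first fixed time and keep the samples between them. The names `FeaturePart` and `extract_feature_part` are hypothetical, and the inclusive handling of both window boundaries is an assumption, since the text does not say whether the end points are included.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Sequence, Tuple

Sample = Tuple[datetime, float]  # (time, CPU usage rate in percent)


@dataclass(frozen=True)
class FeaturePart:
    """Feature part information stored together with its alarm information."""
    failure_type: str
    failure_time: datetime        # T1 in FIG. 2
    samples: Tuple[Sample, ...]   # samples between T2 and T1


def extract_feature_part(
    samples: Sequence[Sample],
    failure_type: str,
    failure_time: datetime,
    first_window: timedelta = timedelta(minutes=10),  # example value from the text
) -> FeaturePart:
    t1 = failure_time
    t2 = t1 - first_window
    # Assumption: both boundaries are inclusive.
    selected = tuple(s for s in samples if t2 <= s[0] <= t1)
    return FeaturePart(failure_type, t1, selected)
```

Running this once per newly acquired alarm, for example once a day, corresponds to the periodic processing described for the feature part extraction means 25.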

The correlation calculation means 27 acquires, from the performance index recording means 24, the performance index information for the period from a fixed time (the second fixed time) before the current time up to the current time. The fixed time (second fixed time) is, for example, 10 minutes. The second fixed time used by the correlation calculation means 27 may be the same length as the first fixed time used by the feature part extraction means 25, or it may be shorter than the first fixed time.
The correlation calculation means 27 also acquires one piece of feature part information from the multiple pieces of feature part information recorded in the feature part recording means 26.
The correlation calculation means 27 then calculates the correlation coefficient between the acquired performance index information and the feature part information. Here, the correlation coefficient is a value indicating how closely the acquired performance index information matches the performance index information constituting the feature part information. The larger the correlation coefficient, the higher the degree of match. The correlation coefficient is calculated using, for example, the least squares method.
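
As a rough illustration of the comparison step, the sketch below computes a Pearson correlation coefficient between the values of a feature part and the values of the recent window. This is a swapped-in technique chosen for illustration: the text only says the coefficient is calculated "using, for example, the least squares method" and gives no formula. The function name `correlation_coefficient` is hypothetical, and the sketch assumes both windows hold the same number of samples; how a shorter second fixed time would be aligned with the feature part is not specified in the text and is left out.

```python
import math
from typing import Sequence


def correlation_coefficient(feature: Sequence[float], recent: Sequence[float]) -> float:
    """Pearson correlation between two equal-length series (assumed metric)."""
    if len(feature) != len(recent) or len(feature) < 2:
        raise ValueError("need two equal-length series with at least two samples")
    n = len(feature)
    mean_f = sum(feature) / n
    mean_r = sum(recent) / n
    cov = sum((f - mean_f) * (r - mean_r) for f, r in zip(feature, recent))
    var_f = sum((f - mean_f) ** 2 for f in feature)
    var_r = sum((r - mean_r) ** 2 for r in recent)
    if var_f == 0 or var_r == 0:
        # A flat series has no defined correlation; this sketch treats it as no match.
        return 0.0
    return cov / math.sqrt(var_f * var_r)
```

For a straight-line least squares fit of one series against the other, the squared Pearson coefficient equals the coefficient of determination, which is one possible reading of the reference to least squares. Note that this metric ignores offset and scale, so a rise from 10% to 20% would match a rise from 80% to 100% equally well; a real implementation might need an additional level check.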

The failure occurrence prediction means 28 determines whether the correlation coefficient calculated by the correlation calculation means 27 is equal to or greater than a prescribed value. If it determines that the correlation coefficient is equal to or greater than the prescribed value, it transmits a failure notice signal to the operation manager terminal device 1. The failure notice signal includes at least information on the type of failure predicted to occur in the monitoring target device 3 and the time at which that failure is predicted to occur.
Here, the prescribed value is chosen freely according to the type of the monitoring target device 3, its processing content, the frequency of failures in that device, and the like. For example, the prescribed value can be the correlation coefficient value at which the performance index information for the second fixed time up to the current time matches the performance index information constituting the feature part information at a rate of 70%.

A correlation coefficient equal to or greater than the prescribed value indicates a high degree of match between the performance index information constituting the feature part information used in the calculation and the performance index information for the period from the second fixed time before the current time up to the current time. In other words, the performance index information for the first fixed time leading up to a past failure in the monitoring target device 3 closely matches the performance index information for the second fixed time leading up to the current time. A correlation coefficient at or above the prescribed value therefore indicates that the monitoring target device 3 is in a state similar to the one in which a failure occurred in the past. For this reason, the failure occurrence prediction means 28 transmits the failure notice signal to the operation manager terminal device 1 when it determines that the correlation coefficient is equal to or greater than the prescribed value.

Further, the failure occurrence prediction unit 28 determines the type of failure predicted to occur in the monitoring target device 3 and the time at which that failure is predicted to occur (predicted failure occurrence time) from the alarm information recorded in the characteristic part recording unit 26 in association with the characteristic part information. In other words, the failure occurrence prediction unit 28 determines the failure type and the predicted failure occurrence time from the alarm information on the failure that defines the start and end times of the performance index information serving as the characteristic part information. The failure occurrence prediction unit 28 then transmits the determined failure type and predicted failure occurrence time to the operation manager terminal device 1 as the failure notice signal.
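The determination of the failure type and the predicted failure occurrence time can be illustrated with a minimal Python sketch. It assumes that the current window matches the leading part of the characteristic part, so that the predicted failure occurrence time is the current time plus the difference between the first and second fixed times. The embodiment does not state this formula, and the names `AlarmInfo`, `FailureNotice`, and `make_failure_notice` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AlarmInfo:
    """Alarm information stored with a characteristic part (hypothetical layout)."""
    failure_type: str
    occurred_at: datetime


@dataclass(frozen=True)
class FailureNotice:
    """Contents of a failure notice signal (hypothetical layout)."""
    failure_type: str
    predicted_at: datetime


def make_failure_notice(alarm: AlarmInfo, now: datetime,
                        first_fixed: timedelta,
                        second_fixed: timedelta) -> FailureNotice:
    # Assumption: the current window matched the leading part of the stored
    # pattern, so the unmatched remainder (first_fixed - second_fixed) is the
    # time left until the failure.
    if second_fixed > first_fixed:
        raise ValueError("second fixed time must not exceed first fixed time")
    return FailureNotice(alarm.failure_type, now + (first_fixed - second_fixed))
```

Under this assumption, a first fixed time of 10 minutes and a second fixed time of 5 minutes give a predicted failure occurrence time 5 minutes after the current time. When the two fixed times are equal, the sketch returns the current time itself.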

Note that the calculation processing by the correlation calculation unit 27 and the failure occurrence prediction processing by the failure occurrence prediction unit 28 run continuously, in parallel with the normal processing of the operation management apparatus 2.

FIG. 3 is a graph showing the relationship between the CPU usage rate of the monitoring target device 3, which is the performance index information in the present embodiment, and the elapsed time in the monitoring target device 3. FIG. 3 shows this relationship before and after a failure occurs in the monitoring target device 3. In FIG. 3, the vertical axis indicates the CPU usage rate (%), and the horizontal axis indicates the elapsed time. The current time is denoted T3, the predicted failure occurrence time is denoted T4, and the time a fixed time (second fixed time) before the current time T3 is denoted T5. In FIG. 3, a failure is assumed to occur in the monitoring target device 3 when the CPU usage rate approaches 100%.

The calculation processing by the correlation calculation unit 27 and the failure occurrence prediction processing by the failure occurrence prediction unit 28 will be described with reference to FIG. 3.
The correlation calculation unit 27 acquires the performance index information from time T5, which is a fixed time (second fixed time) before the current time T3, up to the current time T3. The correlation calculation unit 27 also acquires one piece of characteristic part information from the plurality of pieces of characteristic part information recorded in the characteristic part recording unit 26. The correlation calculation unit 27 then calculates the correlation coefficient between that performance index information and that characteristic part information. In FIG. 3, the performance index information of the monitoring target device 3 is shown by a solid line, and the characteristic part information acquired by the correlation calculation unit 27 is shown by a thick broken line. The performance index information from time T5 to time T3 is the performance index information acquired by the correlation calculation unit 27.

The failure occurrence prediction unit 28 determines whether the correlation coefficient calculated by the correlation calculation unit 27 is equal to or greater than the prescribed value.
In FIG. 3, the solid line from time T5 to time T3 largely coincides with the first half of the performance index information that is the characteristic part information shown by the thick broken line, and the correlation coefficient calculated by the correlation calculation unit 27 is equal to or greater than the prescribed value.
Accordingly, the failure occurrence prediction unit 28 transmits a failure notice signal to the operation manager terminal device 1.
Note that, in the case of FIG. 3, the period over which the correlation coefficient is calculated is the period from time T5 to time T3.

Further, the failure occurrence prediction unit 28 determines the predicted failure occurrence time T4 from the alarm information recorded in the characteristic part recording unit 26 in association with that characteristic part information. The failure occurrence prediction unit 28 also determines, from the same alarm information, the type of failure predicted to occur in the monitoring target device 3. It then transmits the determined failure type and predicted failure occurrence time to the operation manager terminal device 1 as the failure notice signal.

The monitoring target device 3 includes a computer (not shown) equipped with a CPU (not shown) and the like. The CPU executes programs that implement the various functions of the monitoring target device 3, thereby realizing those functions.

Specifically, by the CPU executing these programs, the monitoring target device 3 receives, for example, a performance index acquisition request signal sent from the operation management apparatus 2. On receiving the performance index acquisition request signal, the monitoring target device 3 transmits performance index information to the operation management apparatus 2. More specifically, the monitoring target device 3 associates the performance index information with information on the time at which the performance indicated by that information occurred in the monitoring target device 3, and transmits both to the operation management apparatus 2.

Likewise, by the CPU executing these programs, the monitoring target device 3 transmits an alarm signal to the operation management apparatus 2 when a failure occurs in the monitoring target device 3. The alarm signal includes at least information on the type of failure that occurred in the monitoring target device 3 and the time at which it occurred.

Next, the operation management method in the operation management system 100 according to the first embodiment will be described with reference to the flowcharts shown in FIGS. 4 to 7.

First, the alarm collection processing and related processing in the first embodiment will be described with reference to FIG. 4.
First, the monitoring target device 3 determines whether a failure has occurred in the monitoring target device 3 (step S1).
If the monitoring target device 3 determines in step S1 that no failure has occurred (step S1; No), the processing returns to the start.
If the monitoring target device 3 determines in step S1 that a failure has occurred (step S1; Yes), the monitoring target device 3 transmits an alarm signal to the operation management apparatus 2 (step S2). The alarm signal includes at least information on the type of failure that occurred in the monitoring target device 3 and the time at which it occurred.

Next, the alarm collection unit 21 of the operation management apparatus 2 receives the alarm signal transmitted from the monitoring target device 3 (step S3).

Next, the alarm collection unit 21 records alarm information in the alarm recording unit 23 (step S4). The alarm information includes at least information on the type of failure that occurred in the monitoring target device 3 and the time at which it occurred.
The alarm collection unit 21 also transmits an alarm notification signal to the operation manager terminal device 1 (step S5).

Next, the operation manager terminal device 1 receives the alarm notification signal (step S6). The operation manager terminal device 1 then displays, on its display unit, an alarm or the like indicating that a failure has occurred in the monitoring target device 3 (step S7). The alarm includes at least information on the type of failure that occurred in the monitoring target device 3 and the time at which it occurred.

Note that the monitoring target device 3 performs the above processing (steps S1 and S2) continuously, in parallel with its normal processing. Likewise, the alarm collection unit 21 performs the above processing (steps S3 to S5) continuously, in parallel with the normal processing of the operation management apparatus 2.
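The alarm flow of steps S2 to S7 can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and function names (`AlarmSignal`, `AlarmCollector`, `format_alarm`) are hypothetical, an in-memory list stands in for the alarm recording unit 23, and a callback stands in for the transmission to the operation manager terminal device 1.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List


@dataclass(frozen=True)
class AlarmSignal:
    failure_type: str      # type of failure that occurred (step S2)
    occurred_at: datetime  # time at which the failure occurred


@dataclass
class AlarmCollector:
    """Stands in for the alarm collection unit 21 (hypothetical design)."""
    notify_terminal: Callable[[AlarmSignal], None]
    alarm_log: List[AlarmSignal] = field(default_factory=list)

    def on_alarm(self, signal: AlarmSignal) -> None:
        # Step S3 is the call itself: the alarm signal has been received.
        self.alarm_log.append(signal)  # step S4: record the alarm information
        self.notify_terminal(signal)   # step S5: send the alarm notification


def format_alarm(signal: AlarmSignal) -> str:
    """Step S7: text that the terminal might display (format is an assumption)."""
    return f"[ALARM] {signal.failure_type} at {signal.occurred_at.isoformat()}"
```

Passing the terminal as a callback keeps the recording step and the notification step separate, which mirrors the split between steps S4 and S5.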

Next, the performance index collection processing and related processing in the first embodiment will be described with reference to FIG. 5.
First, the performance index collection unit 22 transmits a performance index acquisition request signal to the monitoring target device 3 (step S101). The performance index collection unit 22 performs step S101 periodically. Here, "periodically" means at predetermined fixed intervals, or at intervals determined as needed according to the processing content, processing capacity, and the like of the monitoring target device 3.

Next, the monitoring target device 3 receives the performance index acquisition request signal from the operation management apparatus 2 (step S102).
Next, the monitoring target device 3 transmits performance index information to the operation management apparatus 2 (step S103).
Here, the performance index information may be a measure of the processing capability of the monitoring target device 3 such as a benchmark, the processing time the monitoring target device 3 takes to perform a specific process, the response time of various processes in the monitoring target device 3, or the like. In the first embodiment, the CPU usage rate of the monitoring target device 3 is used as the performance index information. Here, the CPU usage rate is the proportion of a predetermined unit time during which software running on the monitoring target device 3 occupies the CPU of the monitoring target device 3.
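The definition of the CPU usage rate can be written as a short Python function. The function name and the choice of seconds as the unit are assumptions made for illustration.

```python
def cpu_usage_percent(busy_seconds: float, interval_seconds: float) -> float:
    """Share of the unit time during which software occupied the CPU, in percent."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if not 0 <= busy_seconds <= interval_seconds:
        raise ValueError("busy_seconds must lie within the interval")
    return 100.0 * busy_seconds / interval_seconds
```

For example, a CPU that was occupied for 4.5 seconds of a 5-second unit time has a CPU usage rate of 90%.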

Next, the performance index collection unit 22 receives the performance index information transmitted from the monitoring target device 3 (step S104).
Next, the performance index collection unit 22 records the received performance index information in the performance index recording unit 24 (step S105).

Next, the performance index collection unit 22 determines whether the fixed time has elapsed since step S101 was performed (step S106).
If the performance index collection unit 22 determines in step S106 that the fixed time has not elapsed since step S101 was performed (step S106; No), it repeats step S106.
If the performance index collection unit 22 determines in step S106 that the fixed time has elapsed since step S101 was performed (step S106; Yes), the processing returns to step S101.
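The polling loop of steps S101 to S106 can be sketched in Python as follows. The function name `collect_performance`, the `(timestamp, value)` sample layout, and the bounded `rounds` parameter are assumptions; the embodiment loops indefinitely, and the sketch sleeps for the remaining interval instead of re-evaluating step S106 in a busy loop.

```python
import time
from typing import Callable, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, CPU usage in percent)


def collect_performance(request_sample: Callable[[], Sample],
                        store: List[Sample],
                        interval_s: float,
                        rounds: int,
                        clock: Callable[[], float] = time.monotonic,
                        sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll the monitored device `rounds` times, once per `interval_s` seconds."""
    for _ in range(rounds):
        started = clock()               # time at which step S101 is performed
        store.append(request_sample())  # steps S101 to S105: request, receive, record
        # Step S106: wait until the fixed time since step S101 has elapsed.
        remaining = interval_s - (clock() - started)
        if remaining > 0:
            sleep(remaining)
```

Injecting `clock` and `sleep` lets the loop be exercised without real delays. The list `store` stands in for the performance index recording unit 24.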

Next, the characteristic part extraction processing in the first embodiment will be described with reference to FIG. 6.
First, the characteristic part extraction unit 25 acquires, from the alarm information recorded in the alarm recording unit 23, the alarm information that was not acquired in the previous characteristic part extraction processing (step S201).

Next, the characteristic part extraction unit 25 acquires performance index information recorded in the performance index recording unit 24 (step S202). Specifically, the characteristic part extraction unit 25 acquires the performance index information from a fixed time (first fixed time) before the time at which the failure occurred in the monitoring target device 3 up to the time at which the failure occurred. The time at which the failure occurred is included in the alarm information acquired from the alarm recording unit 23. Here, the fixed time (first fixed time) is a set length of time such as 10 minutes.

Next, the characteristic part extraction unit 25 records the performance index information acquired in step S202 in the characteristic part recording unit 26 as characteristic part information (step S203). More specifically, the characteristic part extraction unit 25 records the alarm information acquired in step S201 and the characteristic part information in the characteristic part recording unit 26 in association with each other.
Note that the characteristic part extraction unit 25 performs steps S201 to S203 periodically. Here, "periodically" means at fixed intervals, such as about once a day.
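Steps S202 and S203 can be sketched in Python as follows. The names `Alarm`, `CharacteristicPart`, and `extract_characteristic_part` are hypothetical, times are plain seconds on a shared clock, and the recorded history is assumed to be a list of `(timestamp, value)` pairs.

```python
from dataclasses import dataclass
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, CPU usage in percent)


@dataclass(frozen=True)
class Alarm:
    failure_type: str
    occurred_at: float  # failure time in seconds, same clock as the samples


@dataclass(frozen=True)
class CharacteristicPart:
    alarm: Alarm         # alarm information stored with the pattern (step S203)
    values: List[float]  # performance index values leading up to the failure


def extract_characteristic_part(alarm: Alarm, history: List[Sample],
                                first_fixed_s: float) -> CharacteristicPart:
    """Step S202: keep samples from (failure time - first fixed time) to failure time."""
    start = alarm.occurred_at - first_fixed_s
    values = [v for t, v in history if start <= t <= alarm.occurred_at]
    return CharacteristicPart(alarm, values)
```

Keeping the alarm inside the returned object reflects the association between alarm information and characteristic part information that step S203 records.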

Next, the calculation processing, the failure occurrence prediction processing, and related processing in the first embodiment will be described with reference to FIG. 7.
First, the correlation calculation unit 27 acquires performance index information from the performance index recording unit 24 (step S301). Specifically, it acquires the performance index information from a fixed time (second fixed time) before the current time up to the current time. Here, the fixed time (second fixed time) is a set length of time such as 10 minutes. The second fixed time used by the correlation calculation unit 27 in step S301 may be the same length as the first fixed time used by the characteristic part extraction unit 25 in step S202 of FIG. 6, or it may be shorter than the first fixed time.

Next, the correlation calculation unit 27 acquires one piece of characteristic part information from the plurality of pieces of characteristic part information recorded in the characteristic part recording unit 26 (step S302).

Next, the correlation calculation unit 27 calculates the correlation coefficient between the performance index information acquired in step S301 and the performance index information that is the characteristic part information acquired in step S302 (step S303). Here, the correlation coefficient is a value indicating how closely the two pieces of performance index information match; the larger the correlation coefficient, the higher the degree of match. The correlation coefficient is calculated using, for example, the least squares method.
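One possible way to compute such a value is the Pearson correlation coefficient, sketched below in Python. This choice is an assumption: the embodiment only names the least squares method as an example and does not fix a formula. The Pearson coefficient is related to least squares in that its square equals the coefficient of determination of a least-squares straight-line fit between the two series.

```python
import math
from typing import Sequence


def correlation_coefficient(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equally long series; 0.0 if either is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat series is treated as "no match"
    return cov / math.sqrt(var_x * var_y)
```

Note that the Pearson coefficient compares the shapes of the two series and ignores offset and scale, so two CPU usage traces that rise in the same way at different absolute levels still score close to 1. A measure that also penalizes level differences may be preferable in some deployments.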

Next, the failure occurrence prediction unit 28 determines whether the correlation coefficient calculated by the correlation calculation unit 27 is equal to or greater than the prescribed value (step S304).

If the failure occurrence prediction unit 28 determines in step S304 that the correlation coefficient is smaller than the prescribed value (step S304; No), the processing proceeds to step S306.
If the failure occurrence prediction unit 28 determines in step S304 that the correlation coefficient is equal to or greater than the prescribed value (step S304; Yes), the failure occurrence prediction unit 28 transmits a failure notice signal to the operation manager terminal device 1 (step S305). The failure notice signal includes at least information on the type of failure predicted to occur in the monitoring target device 3 and the time at which that failure is predicted to occur.

Note that a correlation coefficient equal to or greater than the prescribed value indicates a high degree of match between the performance index information from a fixed time (first fixed time) before a past failure in the monitoring target device 3 up to the time of that failure and the performance index information from a fixed time (second fixed time) before the current time up to the current time. It therefore indicates that the monitoring target device 3 is in a state similar to one in which a failure occurred in the past. For this reason, the failure occurrence prediction unit 28 transmits a failure notice signal to the operation manager terminal device 1 when it determines that the correlation coefficient is equal to or greater than the prescribed value.

Next, the operation manager terminal device 1 receives the failure notice signal (step S306). The operation manager terminal device 1 then displays, on its display unit, at least the type of failure predicted to occur in the monitoring target device 3 and the time at which that failure is predicted to occur (step S307).

Next, the failure occurrence prediction unit 28 determines whether steps S302 to S304 have been performed for all the characteristic part information recorded in the characteristic part recording unit 26 (step S308).

If the failure occurrence prediction unit 28 determines in step S308 that steps S302 to S304 have not been performed for all the characteristic part information recorded in the characteristic part recording unit 26 (step S308; No), the processing returns to step S302.
If the failure occurrence prediction unit 28 determines in step S308 that steps S302 to S304 have been performed for all the characteristic part information recorded in the characteristic part recording unit 26 (step S308; Yes), this processing ends.

Note that the calculation processing by the correlation calculation unit 27 (steps S301 to S303) and the failure occurrence prediction processing by the failure occurrence prediction unit 28 (steps S304, S305, and S308) run continuously, in parallel with the normal processing of the operation management apparatus 2.
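The loop over all recorded characteristic parts (steps S302 to S305 and S308) can be sketched in Python as follows. Several points are assumptions made for illustration: the Pearson coefficient is used as the correlation coefficient, the current window is compared with the leading part of each stored pattern of the same length, stored patterns are keyed by failure type in a dictionary, and the function returns the matching failure types instead of transmitting a failure notice signal. The names `pearson` and `predict_failures` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def predict_failures(current: Sequence[float],
                     patterns: Dict[str, Sequence[float]],
                     prescribed_value: float) -> List[str]:
    """Return the failure types whose stored pattern matches the current window."""
    predicted = []
    for failure_type, pattern in patterns.items():  # S302, repeated via S308
        head = pattern[:len(current)]               # leading part of the pattern
        if len(head) != len(current) or len(head) < 2:
            continue                                # pattern too short to compare
        r = pearson(current, head)                  # S303
        if r >= prescribed_value:                   # S304
            predicted.append(failure_type)          # S305 would send a notice here
    return predicted
```

A dictionary keyed by failure type holds only one pattern per type. A real recording unit would more likely hold a list of patterns, each stored with its own alarm information.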

In the operation management apparatus 2, the operation management system 100, and the operation management method according to the first embodiment of the present invention described above, the correlation calculation unit 27 calculates a correlation coefficient, which is a value indicating how closely the characteristic part information, that is, the performance index information from a fixed time (first fixed time) before the time at which a failure occurred in the monitoring target device 3 up to the time of that failure, matches the performance index information from a fixed time (second fixed time) before the current time up to the current time. When the correlation coefficient is equal to or greater than the prescribed value, the failure occurrence prediction unit 28 notifies the operation manager terminal device 1 of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device 3.
Thus, the correlation coefficient is calculated automatically by the correlation calculation unit 27, and, based on its value, the failure notice signal is sent automatically to the operation manager terminal device 1 by the failure occurrence prediction unit 28. The operation management apparatus 2 can therefore notify the operation manager of a failure notice signal before a failure occurs in the monitoring target device 3, without depending on the know-how of the operation manager.

Note that the first embodiment of the present invention also requires a prescribed value to be set in advance as the criterion for judging the correlation coefficient. However, because the correlation coefficient indicates how closely the characteristic part information matches the performance index information from a fixed time (second fixed time) before the current time up to the current time, it is technically easy to understand. Setting the prescribed value therefore requires no special know-how on the part of the operation manager. Accordingly, with the operation management apparatus 2, the operation management system 100, and the operation management method according to the first embodiment of the present invention, the operation management apparatus 2 can notify the operation manager of an alarm before a failure occurs in the monitoring target device 3, without depending on the know-how of the operation manager.

Further, in the operation management apparatus 2, the operation management system 100, and the operation management method according to the first embodiment of the present invention, when a failure occurs in the monitoring target device 3, the alarm collection unit 21 collects from the monitoring target device 3 alarm information that includes at least information on the type of the failure and the time at which it occurred. The failure occurrence prediction unit 28 then notifies the operation manager terminal device 1, as the failure notice signal, of at least the type of failure predicted to occur in the monitoring target device 3 and the predicted failure occurrence time, based on the alarm information recorded in the characteristic part recording unit 26 in association with the characteristic part information.
Therefore, with the operation management apparatus 2, the operation management system 100, and the operation management method according to the first embodiment of the present invention, the type of failure that will occur in the monitoring target device 3 in the future can also be predicted.

Further, in the operation management apparatus 2, the operation management system 100, and the operation management method according to the first embodiment of the present invention, the performance index collection unit 22 collects performance index information periodically. In addition, in parallel with the normal processing of the operation management apparatus 2, the correlation calculation unit 27 calculates the correlation coefficient and the failure occurrence prediction unit 28 predicts failures that will occur in the monitoring target device 3.
Therefore, with the operation management apparatus 2, the operation management system 100, and the operation management method according to the first embodiment of the present invention, failures that will occur in the monitoring target device 3 in the future can be predicted continuously.

Further, in the operation management apparatus 2, the operation management system 100, and the operation management method according to the first embodiment of the present invention, when a failure occurs in the monitoring target device 3, the operation management apparatus 2 notifies the operation manager terminal device 1, as an alarm notification signal, of at least information on the type of failure that occurred in the monitoring target device 3 and the time at which it occurred.
Therefore, with the operation management apparatus 2, the operation management system 100, and the operation management method according to the first embodiment of the present invention, when a failure occurs in the monitoring target device 3, the operation management apparatus 2 can notify the operation manager terminal device 1 of an alarm notification signal indicating that the failure has occurred.

Note that the present invention is not limited to the above embodiment and can be modified as appropriate without departing from its spirit. For example, the operation management system 100 may include a plurality of operation management apparatuses 2.
(Appendix)
Regarding the above embodiment, the following appendices are further disclosed.
(Appendix 1)
An operation management apparatus that monitors the operating status of a monitoring target device, the operation management apparatus comprising:
performance index collection means for collecting performance index information of the monitoring target device;
correlation calculation means for calculating a correlation coefficient, which is a value indicating the degree to which feature portion information matches the performance index information for the period from a second fixed time before the current time to the current time, the feature portion information being the performance index information for the period from a first fixed time before the time at which a failure occurred in the monitoring target device to the time at which the failure occurred; and
failure occurrence prediction means for notifying an operation manager of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device when the correlation coefficient is equal to or greater than a specified value.
(Appendix 2)
The operation management apparatus according to Appendix 1, further comprising
alarm collection means for collecting, from the monitoring target device when a failure occurs in the monitoring target device, alarm information including at least information on the type of the failure and the time at which the failure occurred,
wherein the failure occurrence prediction means notifies the operation manager, as the failure notice signal, of at least the type of failure predicted to occur in the monitoring target device and a predicted failure occurrence time, based on the alarm information on the failure that determines the start and end times of the performance index information constituting the feature portion information.
(Appendix 3)
The operation management apparatus according to Appendix 1 or 2, wherein
the performance index collection means periodically collects the performance index information, and
in parallel with normal processing in the operation management apparatus, the correlation calculation means calculates the correlation coefficient and the failure occurrence prediction means predicts a failure that will occur in the monitoring target device.
(Appendix 4)
The operation management apparatus according to any one of Appendices 1 to 3, wherein, when a failure occurs in the monitoring target device, the operation management apparatus notifies the operation manager, as an alarm notification signal, of at least information on the type of the failure that occurred in the monitoring target device and the time at which the failure occurred.
(Appendix 5)
An operation management system comprising: an operation management apparatus that monitors the operating status of a monitoring target device; and an operation manager terminal device that receives notifications about the operating status of the monitoring target device from the operation management apparatus,
wherein the operation management apparatus comprises:
performance index collection means for collecting performance index information of the monitoring target device;
correlation calculation means for calculating a correlation coefficient, which is a value indicating the degree to which feature portion information matches the performance index information for the period from a second fixed time before the current time to the current time, the feature portion information being the performance index information for the period from a first fixed time before the time at which a failure occurred in the monitoring target device to the time at which the failure occurred; and
failure occurrence prediction means for notifying the operation manager terminal device of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device when the correlation coefficient is equal to or greater than a specified value.
(Appendix 6)
The operation management system according to Appendix 5, wherein
the operation management apparatus further comprises alarm collection means for collecting, from the monitoring target device when a failure occurs in the monitoring target device, alarm information including at least information on the type of the failure and the time at which the failure occurred, and
the failure occurrence prediction means notifies the operation manager terminal device, as the failure notice signal, of at least the type of failure predicted to occur in the monitoring target device and a predicted failure occurrence time, based on the alarm information on the failure that determines the start and end times of the performance index information constituting the feature portion information.
(Appendix 7)
The operation management system according to Appendix 5 or 6, wherein
the performance index collection means periodically collects the performance index information, and
in parallel with normal processing in the operation management apparatus, the correlation calculation means calculates the correlation coefficient and the failure occurrence prediction means predicts a failure that will occur in the monitoring target device.
(Appendix 8)
The operation management system according to any one of Appendices 5 to 7, wherein, when a failure occurs in the monitoring target device, the operation management apparatus notifies the operation manager terminal device, as an alarm notification signal, of at least information on the type of the failure that occurred in the monitoring target device and the time at which the failure occurred.
(Appendix 9)
An operation management method for monitoring the operating status of a monitoring target device, the method comprising:
collecting performance index information of the monitoring target device;
calculating a correlation coefficient, which is a value indicating the degree to which feature portion information matches the performance index information for the period from a second fixed time before the current time to the current time, the feature portion information being the performance index information for the period from a first fixed time before the time at which a failure occurred in the monitoring target device to the time at which the failure occurred; and
notifying an operation manager of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device when the correlation coefficient is equal to or greater than a specified value.
(Appendix 10)
The operation management method according to Appendix 9, further comprising:
collecting, from the monitoring target device when a failure occurs in the monitoring target device, alarm information including at least information on the type of the failure and the time at which the failure occurred; and
notifying the operation manager, as the failure notice signal, of at least the type of failure predicted to occur in the monitoring target device and a predicted failure occurrence time, based on the alarm information on the failure that determines the start and end times of the performance index information constituting the feature portion information.
(Appendix 11)
The operation management method according to Appendix 9 or 10, further comprising:
periodically collecting the performance index information; and
in parallel with normal processing, calculating the correlation coefficient and predicting a failure that will occur in the monitoring target device.
(Appendix 12)
The operation management method according to any one of Appendices 9 to 11, further comprising notifying the operation manager, as an alarm notification signal, of at least information on the type of the failure that occurred in the monitoring target device and the time at which the failure occurred, when a failure occurs in the monitoring target device.
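As an illustration of the correlation-based prediction recited in Appendices 1 and 2, the following Python sketch compares a recent window of performance samples against a stored feature portion and produces a failure notice when the correlation reaches a threshold. Several points are assumptions made only for this sketch and are not stated in the text: the correlation coefficient is taken to be the Pearson coefficient, the recent window is assumed to be no longer than the feature portion and is slid along it to find the best match, and the remaining number of samples after the best match is used as a rough indication of how soon the failure may follow. All names (`pearson`, `FeaturePortion`, `FailureNotice`, `predict_failure`) are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Optional, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


@dataclass(frozen=True)
class FeaturePortion:
    failure_type: str          # taken from the alarm information for the past failure
    samples: Sequence[float]   # performance samples ending at the failure time


@dataclass(frozen=True)
class FailureNotice:
    failure_type: str
    correlation: float
    samples_until_failure: int  # assumption: samples left in the matched feature portion


def predict_failure(
    feature: FeaturePortion, recent: Sequence[float], threshold: float
) -> Optional[FailureNotice]:
    """Return a notice if some window of the feature portion correlates with
    the recent samples at or above the threshold, otherwise None."""
    n = len(recent)
    best = None  # (correlation, start index) of the best-matching window
    for start in range(len(feature.samples) - n + 1):
        r = pearson(feature.samples[start:start + n], recent)
        if best is None or r > best[0]:
            best = (r, start)
    if best is None or best[0] < threshold:
        return None
    r, start = best
    remaining = len(feature.samples) - (start + n)
    return FailureNotice(feature.failure_type, r, remaining)
```

Multiplying `samples_until_failure` by the collection period and adding it to the current time would give one possible predicted failure occurrence time; this derivation is likewise an assumption of the sketch.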

DESCRIPTION OF SYMBOLS
1 Operation manager terminal device
2 Operation management apparatus
21 Alarm collection means
22 Performance index collection means
23 Alarm recording means
24 Performance index recording means
25 Feature portion extraction means
26 Feature portion recording means
27 Correlation calculation means
28 Failure occurrence prediction means
3 Monitoring target device
100 Operation management system

Claims

An operation management apparatus that monitors the operating status of a monitoring target device, the operation management apparatus comprising:
performance index collection means for collecting performance index information of the monitoring target device;
correlation calculation means for calculating a correlation coefficient, which is a value indicating the degree to which feature portion information matches the performance index information for the period from a second fixed time before the current time to the current time, the feature portion information being the performance index information for the period from a first fixed time before the time at which a failure occurred in the monitoring target device to the time at which the failure occurred; and
failure occurrence prediction means for notifying an operation manager of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device when the correlation coefficient is equal to or greater than a specified value.

The operation management apparatus according to claim 1, further comprising
alarm collection means for collecting, from the monitoring target device when a failure occurs in the monitoring target device, alarm information including at least information on the type of the failure and the time at which the failure occurred,
wherein the failure occurrence prediction means notifies the operation manager, as the failure notice signal, of at least the type of failure predicted to occur in the monitoring target device and a predicted failure occurrence time, based on the alarm information on the failure that determines the start and end times of the performance index information constituting the feature portion information.

The operation management apparatus according to claim 1, wherein
the performance index collection means periodically collects the performance index information, and
in parallel with normal processing in the operation management apparatus, the correlation calculation means calculates the correlation coefficient and the failure occurrence prediction means predicts a failure that will occur in the monitoring target device.

The operation management apparatus according to any one of claims 1 to 3, wherein, when a failure occurs in the monitoring target device, the operation management apparatus notifies the operation manager, as an alarm notification signal, of at least information on the type of the failure that occurred in the monitoring target device and the time at which the failure occurred.

An operation management system comprising: an operation management apparatus that monitors the operating status of a monitoring target device; and an operation manager terminal device that receives notifications about the operating status of the monitoring target device from the operation management apparatus,
wherein the operation management apparatus comprises:
performance index collection means for collecting performance index information of the monitoring target device;
correlation calculation means for calculating a correlation coefficient, which is a value indicating the degree to which feature portion information matches the performance index information for the period from a second fixed time before the current time to the current time, the feature portion information being the performance index information for the period from a first fixed time before the time at which a failure occurred in the monitoring target device to the time at which the failure occurred; and
failure occurrence prediction means for notifying the operation manager terminal device of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device when the correlation coefficient is equal to or greater than a specified value.

The operation management system according to claim 5, wherein
the operation management apparatus further comprises alarm collection means for collecting, from the monitoring target device when a failure occurs in the monitoring target device, alarm information including at least information on the type of the failure and the time at which the failure occurred, and
the failure occurrence prediction means notifies the operation manager terminal device, as the failure notice signal, of at least the type of failure predicted to occur in the monitoring target device and a predicted failure occurrence time, based on the alarm information on the failure that determines the start and end times of the performance index information constituting the feature portion information.

The operation management system according to claim 5 or 6, wherein
the performance index collection means periodically collects the performance index information, and
in parallel with normal processing in the operation management apparatus, the correlation calculation means calculates the correlation coefficient and the failure occurrence prediction means predicts a failure that will occur in the monitoring target device.

The operation management system according to any one of claims 5 to 7, wherein, when a failure occurs in the monitoring target device, the operation management apparatus notifies the operation manager terminal device, as an alarm notification signal, of at least information on the type of the failure that occurred in the monitoring target device and the time at which the failure occurred.

An operation management method for monitoring the operating status of a monitoring target device, the method comprising:
collecting performance index information of the monitoring target device;
calculating a correlation coefficient, which is a value indicating the degree to which feature portion information matches the performance index information for the period from a second fixed time before the current time to the current time, the feature portion information being the performance index information for the period from a first fixed time before the time at which a failure occurred in the monitoring target device to the time at which the failure occurred; and
notifying an operation manager of a failure notice signal indicating that a failure is predicted to occur in the monitoring target device when the correlation coefficient is equal to or greater than a specified value.

The operation management method according to claim 9, further comprising:
collecting, from the monitoring target device when a failure occurs in the monitoring target device, alarm information including at least information on the type of the failure and the time at which the failure occurred; and
notifying the operation manager, as the failure notice signal, of at least the type of failure predicted to occur in the monitoring target device and a predicted failure occurrence time, based on the alarm information on the failure that determines the start and end times of the performance index information constituting the feature portion information.