JP5690690B2

JP5690690B2 - Abnormality detection device, abnormality detection method, and abnormality detection program

Info

Publication number: JP5690690B2
Application number: JP2011204607A
Authority: JP
Inventors: 伸夫福島; 康博多賀
Original assignee: エヌ・ティ・ティ・コムウェア株式会社
Priority date: 2011-09-20
Filing date: 2011-09-20
Publication date: 2015-03-25
Anticipated expiration: 2031-09-20
Also published as: JP2013066113A

Description

The present invention relates to an abnormality detection device, an abnormality detection method, and an abnormality detection program.

Baseline analysis is known as a technique for detecting failures of devices connected to a network. For example, Patent Document 1 describes a baseline analysis method that derives a temporal objective function correlated with anomalies from current real-time performance data obtained from a network, derives a maximum threshold for the temporal variation of the objective function from historical performance data obtained from the network, and compares the objective function derived from the current real-time performance data with the maximum threshold at the point in time that corresponds to the time associated with the current real-time performance data. The method determines that an anomaly exists when the objective function derived from the current real-time performance data remains greater than the maximum threshold for longer than a predetermined time.

Patent Document 1: JP 2001-057555 A

However, conventional failure detection based on baseline analysis has been used to detect silent failures, that is, failures that are not detected as errors. For this reason, the prior art did not use information on whether a silent failure had occurred when performing baseline analysis. The prior art also applied mechanical statistical processing, comparing the measured values of each device's data with the baseline for each time span of a fixed interval. As a result, the time span set for detection did not match the time span in which the failure occurred, and the baseline that serves as the reference for normal values was generated from data that included values measured during abnormal operation, so the reliability of the baseline was uncertain. In other words, there was a problem that an abnormality could go undetected.

The present invention has been made in view of the above points, and provides an abnormality detection device, an abnormality detection method, and an abnormality detection program that can reliably detect an abnormality.

(1) The present invention has been made to solve the above problem. One aspect of the present invention is a baseline generation device comprising a baseline generation unit that extracts, from measured values representing the operating status of a monitored system, the measured values from outside the failure occurrence period of the monitored system, and generates baseline information indicating the normal range of the measured values based on the extracted measured values.

(2) Another aspect of the present invention is an abnormality detection device comprising: a baseline generation unit that extracts, from measured values representing the operating status of a monitored system, the measured values from outside the failure period of the monitored system, and generates baseline information indicating the normal range of the measured values based on the extracted measured values; an operation history storage unit that stores information on operations performed on the monitored system; a failure period extraction unit that determines the failure period based on the information on the operations; and an abnormal value determination unit that determines, based on the baseline information, whether the measured values measured during the failure period are abnormal values.

(3) In another aspect of the present invention, in the above abnormality detection device, the failure period extraction unit determines the failure period based on the information on the operations that corresponds to information input by an operator.

(4) In another aspect of the present invention, in the above abnormality detection device, the monitored system is composed of a plurality of devices, the measured values are measured values for each of a plurality of measurement items, the baseline generation unit generates baseline information for each of the measurement items, and the abnormality detection device comprises an abnormal device detection unit that detects, from among the plurality of devices, the device in which an abnormality has occurred, based on the measurement items of the measured values determined to be abnormal values.
(5) Another aspect of the present invention is a baseline generation method for a baseline generation device, the method having a baseline generation step in which the baseline generation device extracts, from measured values representing the operating status of a monitored system, the measured values from outside the failure occurrence period of the monitored system, and generates baseline information indicating the normal range of the measured values based on the extracted measured values.
(6) Another aspect of the present invention is a baseline generation program that causes a computer of a baseline generation device to execute a baseline generation procedure of extracting, from measured values representing the operating status of a monitored system, the measured values from outside the failure occurrence period of the monitored system, and generating baseline information indicating the normal range of the measured values based on the extracted measured values.

According to the present invention, an abnormality can be detected reliably.

FIG. 1 is a conceptual diagram of the failure detection system according to the present embodiment.
FIG. 2 is a schematic block diagram showing the configuration of the failure detection device according to the present embodiment.
FIG. 3 is a flowchart showing an example of the operation information collection operation according to the present embodiment.
FIG. 4 is a schematic diagram showing an example of operation information according to the present embodiment.
FIG. 5 is a flowchart showing an example of the measurement value collection operation according to the present embodiment.
FIG. 6 is a schematic diagram showing an example of operation information according to the present embodiment.
FIG. 7 is a schematic diagram showing an example of the failure period information table according to the present embodiment.
FIG. 8 is a flowchart showing the baseline generation processing according to the present embodiment.
FIG. 9 is a schematic diagram showing an example of the baseline table according to the present embodiment.
FIG. 10 is a schematic diagram showing an example of the abnormal value/failure location correspondence table according to the present embodiment.
FIG. 11 is a flowchart showing the failure location identification processing according to the present embodiment.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
FIG. 1 is a conceptual diagram of a failure detection system 1 according to the present embodiment. In the illustrated example, the failure detection system 1 includes a monitored network 10, network devices 11a to 11e, an input/output device 12, and a failure detection device 13.
The monitored network 10 is the network in which the failure detection system 1 detects failures. The monitored network 10 is, for example, a home LAN (Local Area Network), or a LAN or WAN (Wide Area Network) provided by a network provider. The monitored network 10 includes network management devices such as a gateway GW, a router RT, a switch SW, and a hub Hub, and network devices 11a to 11e such as general-purpose PCs (each also referred to as a network device 11). The gateway GW, the router RT, the switch SW, and the hub Hub output measurement value information to the monitored network 10. The measurement value information includes measurement value content information (measurement items and measured values), such as UPnP (Universal Plug and Play) information, SNMP (Simple Network Management Protocol) information, and DHCP (Dynamic Host Configuration Protocol) information, together with measurement time information indicating the time at which the measurement was made (also referred to as the measurement time).

The network devices 11a to 11e (each also referred to as a network device 11) are electronic devices connected to the monitored network 10 via a network interface. Each network device 11 outputs operation information and measurement value information to the monitored network 10 via the network interface. Here, operation information is information that the network device 11 outputs as a result of a user operating the network device 11. The operation information includes operation time information indicating the time at which the operation was performed, operation content information indicating the content of the operation, and operation status information indicating whether an error occurred as a result of the operation.

The input/output device 12 consists of an input device such as a keyboard or touch panel and an output device such as a display. The input/output device 12 is connected to the failure detection device 13. A support center operator or a user (hereinafter referred to as the operator) enters failure occurrence information, including the time at which the failure occurred, into the input/output device 12. The input/output device 12 displays the location of the failure, how to deal with the failure, and similar information.

The failure detection device 13 collects operation information and measurement value information from the monitored network 10. The failure detection device 13 records the history of the collected operation information and the history of the collected measurement value information. Based on the failure occurrence information input from the input/output device 12 and the recorded operation information history, the failure detection device 13 extracts the failure period, that is, the time during which the failure occurred. The failure detection device 13 generates a baseline using the measurement value information whose measurement times fall outside the extracted failure period.
This baseline is a time series of counts, taken over predetermined time intervals, of measurements whose measurement items are identical or similar.
The failure detection device 13 compares the previously generated baseline with the communication status at the time of the failure to determine where the communication abnormality lies. The failure detection device 13 causes the input/output device 12 to display the result of the determination.

FIG. 2 is a schematic block diagram showing the configuration of the failure detection device 13 according to the present embodiment.
In the illustrated example, the failure detection device 13 includes an operation information collection unit 101, an input unit 102, an operation history DB (database; operation history recording unit) 103, an operation history extraction unit 104, a failure period extraction unit 105, a failure period DB 106, a measurement value information collection unit 107, a measurement value history DB 108, a baseline generation unit 109, a baseline DB 110, an abnormal value determination unit 111, an abnormal value/failure location correspondence DB 112, and an output unit (abnormal device detection unit) 113.

The operation information collection unit 101 collects operation information from the monitored network 10. Here, the operation information includes operation content information, operation time information, and operation status information. The operation content information is, for example, information such as the following (a) to (e).
(a) Information indicating that a network device 11 has been connected to or disconnected from the monitored network.
(b) Information indicating that a network device 11 on the monitored network has been powered on or off.
(c) Information indicating that a network device 11 on the monitored network has started or stopped a service.
(d) Information indicating that service Y has been executed on a network device 11 on the monitored network.
(e) Information indicating that service Y has been executed on a network device 11 on the monitored network and an error has occurred.
Here, service Y is, for example, a web browser, UPnP, or DHCP.
The operation information collection unit 101 writes the operation information to the operation history DB 103. When the number of operation information entries stored in the operation history DB 103 reaches a predetermined number, the operation information collection unit 101 deletes the entry with the oldest time stamp and records the newly input operation information.

The input unit 102 receives failure occurrence information from the operator via the input/output device 12. Here, failure occurrence information is information indicating a failure that a network user has encountered. For example, it is information that the user conveys to the operator, such as "early June", "Web browser access", and "error". The input unit 102 generates candidates for failure time information, failure content information, and failure status information based on the input failure occurrence information. Here, failure time information indicates the time during which the failure was present (the failure period). Failure content information indicates the operation in which the failure occurred (for example, ftp access or Web access) and the IP address and MAC address of the network device 11 on which the failure occurred. Failure status information indicates the status of the failure, for example, that an error occurred. For example, the input unit 102 causes the input/output device 12 to display the candidates for failure time information, failure content information, and failure status information, and generates the failure time information, failure content information, and failure status information that the operator selects.
The input unit 102 outputs the generated failure time information, failure content information, and failure status information to the operation history extraction unit 104.

The operation history DB 103 stores the operation information input from the operation information collection unit 101 in an operation history information table. The operation history information table is a table made up of the history of operation information input in the past. Details of the operation history information table stored in the operation history DB 103 will be described later with reference to the drawings.

The operation history extraction unit 104 compares the failure time information, failure content information, and failure status information input from the input unit 102 with the operation time information, operation content information, and operation status information recorded in the operation history DB 103, and extracts from the operation history DB 103 the operation information that satisfies all of the following conditions (a) to (c).
(a) The operation time information falls within the time range indicated by the failure time information.
(b) The operation content information matches the failure content information, or matches what is derived by a predetermined rule set in advance.
(c) The operation status information matches the failure status information.
The operation history extraction unit 104 outputs the extracted operation information to the failure period extraction unit 105.

The failure period extraction unit 105 extracts the failure period and the corresponding IP address and MAC address based on the operation information extracted by the operation history extraction unit 104. Here, the failure period is the period from the oldest time to the newest time indicated by the operation time information of the extracted operation information. The failure period extraction unit 105 outputs failure period information indicating the extracted failure period to the failure period DB 106 and the abnormal value determination unit 111.
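As defined here, the failure period is simply the span between the earliest and latest time stamps of the extracted operation information. A minimal sketch follows; the function name `failure_period` and the choice to return `None` for an empty input are assumptions, since the text does not say what happens when nothing is extracted.

```python
from datetime import datetime


def failure_period(timestamps):
    """Return (oldest, newest) among the extracted operation times.

    Returns None when no operation information was extracted (an assumed
    convention, not something the description specifies).
    """
    timestamps = list(timestamps)
    if not timestamps:
        return None
    return min(timestamps), max(timestamps)


# Usage with three extracted operation times given in arbitrary order.
period = failure_period([
    datetime(2011, 6, 3, 17, 10, 5),
    datetime(2011, 6, 3, 17, 3, 40),
    datetime(2011, 6, 3, 17, 4, 20),
])
```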

The failure period DB 106 stores the failure period information input from the failure period extraction unit 105 in a failure period information table. Details of the failure period information table stored in the failure period DB 106 will be described later with reference to the drawings.

The measurement value information collection unit 107 collects measurement value information from the monitored network 10. Here, the measurement value information includes measurement value time information and measurement value content information. The measurement value time information indicates the measurement time at which the measured value was measured at the measurement point. The measurement value content information is, for example, UPnP information generated by the measurement point (for example, the number of packets on the WAN (Wide Area Network) side and the number of packets on the LAN side of a BBR (broadband router)), SNMP information (for example, the number of input/output packets and the number of collision packets for each port of a hub or switch), or DHCP information (for example, the number of Discover commands sent and the number of OFFER responses).
The measurement value information collection unit 107 writes the measurement value information to the measurement value history DB 108. Details of the measurement value information will be described later with reference to the drawings.

The measurement value history DB 108 stores the measurement value information input from the measurement value information collection unit 107 in a measurement value history table. That is, the measurement value history table is a table made up of the history of measurement value information input in the past. Details of the measurement value history table stored in the measurement value history DB 108 will be described later with reference to the drawings.

The baseline generation unit 109 reads the failure period information from the failure period DB 106. The baseline generation unit 109 reads from the measurement value history DB 108 the measurement value information for the period outside the failure period indicated by the failure period information (referred to as the normal operation period). The baseline generation unit 109 extracts each measured value from the measurement value information. For example, for the measured values of each day over the past month that fall within the normal operation period, the baseline generation unit 109 calculates the mean and variance for each predetermined baseline unit time (for example, 10 minutes). The mean and variance of the measured values for each baseline unit time are collectively referred to as the baseline, and this calculation is referred to as baseline creation processing. The baseline generation unit 109 writes baseline information indicating the calculated baseline to the baseline DB 110. In other words, the baseline generation unit 109 generates the baseline from measured values measured during the normal operation period.
The baseline DB 110 records the baseline information generated by the baseline generation unit 109.
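The baseline creation processing can be sketched as follows for a single measurement item. This is an illustration under stated assumptions, not the patented implementation: `build_baseline` is a hypothetical name, samples from different days are pooled by time-of-day slot, and the population variance is used because the text does not say which variance estimator is intended.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pvariance


def build_baseline(samples, failure_periods, unit_minutes=10):
    """Compute {slot: (mean, variance)} from normal-operation samples only.

    samples: iterable of (timestamp, value) pairs for one measurement item.
    failure_periods: iterable of (start, end) pairs to exclude.
    A slot is the index of the unit-time window within the day (assumption).
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Skip anything measured during a failure period.
        if any(start <= timestamp <= end for start, end in failure_periods):
            continue
        slot = (timestamp.hour * 60 + timestamp.minute) // unit_minutes
        buckets[slot].append(value)
    return {slot: (mean(values), pvariance(values)) for slot, values in buckets.items()}


# Usage: three days of samples in the 09:00 slot; the third day is a failure period.
samples = [
    (datetime(2011, 6, 1, 9, 3), 10.0),
    (datetime(2011, 6, 2, 9, 5), 12.0),
    (datetime(2011, 6, 3, 9, 4), 500.0),
]
failures = [(datetime(2011, 6, 3, 9, 0), datetime(2011, 6, 3, 9, 30))]
baseline = build_baseline(samples, failures)
```

Because the 500.0 sample lies inside the failure period, it does not inflate the mean or variance of the 09:00 slot, which is the point of excluding failure periods from the baseline.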

The abnormal value determination unit 111 determines whether the measured value indicated by the measurement value information is an abnormal value. Specifically, the abnormal value determination unit 111 receives the failure period information from the failure period extraction unit 105. The abnormal value determination unit 111 reads from the measurement value history DB 108 the measurement value information corresponding to the failure period indicated by the failure period information. The abnormal value determination unit 111 reads from the baseline DB 110 the baseline information corresponding to the failure period indicated by the failure period information. The abnormal value determination unit 111 determines whether a measured value is abnormal based on the measured value in the failure period and the baseline (for example, the mean and variance) indicated by the baseline information. For example, when a measured value in the failure period satisfies the relationship mean - n × variance ≤ measured value ≤ mean + n × variance (where n is a predetermined positive number), the abnormal value determination unit 111 determines that the measured value is normal. When this relationship is not satisfied, the abnormal value determination unit 111 determines that the measured value is abnormal.
The abnormal value determination unit 111 performs this determination for each measured value using the corresponding baseline, and determines whether each measurement item is normal or abnormal. The abnormal value determination unit 111 outputs the determination result for each measurement item to the output unit 113.
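The range check can be written directly from the inequality in the text. In this minimal sketch the function name `is_abnormal` and the default `n=3.0` are assumptions; the band is built from the variance, as the description states, rather than from the standard deviation.

```python
def is_abnormal(value, mean, variance, n=3.0):
    """Return True when value lies outside [mean - n*variance, mean + n*variance].

    n is the predetermined positive number; 3.0 is only an illustrative default.
    """
    lower = mean - n * variance
    upper = mean + n * variance
    return not (lower <= value <= upper)


# Usage with a baseline of mean 11.0 and variance 1.0 (band 8.0 to 14.0 for n=3).
checks = [is_abnormal(v, 11.0, 1.0) for v in (11.0, 14.0, 15.0, 7.5)]
```

Both bounds are inclusive, so a value exactly on the edge of the band is treated as normal.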

The abnormal value/failure location correspondence DB 112 stores an abnormal value/failure location correspondence table that indicates the relationship among MAC addresses, measurement items, and failure locations. Details of the abnormal value/failure location correspondence table will be described later with reference to the drawings.

The output unit 113 receives the determination result for each measurement item from the abnormal value determination unit 111. When a measured value has been determined to be abnormal, the output unit 113 refers to the abnormal value/failure location correspondence table recorded in the abnormal value/failure location correspondence DB 112, extracts information on the failure location of the device that corresponds to the measurement item determined to be abnormal, and outputs the extracted information to the input/output device 12. When no measured value has been determined to be abnormal, the output unit 113 outputs information to the input/output device 12 indicating that no abnormality was found.
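The lookup performed by the output unit can be sketched as a dictionary keyed by MAC address and measurement item. Everything concrete below is hypothetical: the function name `failure_locations`, the table entries, and the wording of the returned message are illustrative only, and the handling of an abnormal item that has no table entry (it is skipped) is an assumption the text does not address.

```python
def failure_locations(mac_address, judgements, correspondence):
    """Map abnormal measurement items to failure locations.

    judgements: {measurement_item: True if abnormal}.
    correspondence: {(mac_address, measurement_item): failure_location}.
    """
    locations = [
        correspondence[(mac_address, item)]
        for item, abnormal in judgements.items()
        if abnormal and (mac_address, item) in correspondence
    ]
    # When nothing abnormal maps to a location, report that none was found.
    return locations if locations else ["no abnormality found"]


# Hypothetical table contents, for illustration only.
table = {
    ("00:1b:ba:e0:b4:9c", "collision packets"): "hub port",
    ("00:1b:ba:e0:b4:9c", "DHCP OFFER responses"): "DHCP server",
}
result = failure_locations(
    "00:1b:ba:e0:b4:9c",
    {"collision packets": True, "DHCP OFFER responses": False},
    table,
)
```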

FIG. 3 is a flowchart showing an example of the operation information collection operation according to the present embodiment.
(Step S101) The operation information collection unit 101 acquires operation information from the monitored network 10. The process then proceeds to step S102.
(Step S102) The operation information collection unit 101 determines whether the number of operation information entries in the operation history information table recorded in the operation history DB 103 is greater than a predetermined number. If the number is determined to be greater than the predetermined number (Yes), the process proceeds to step S103. If the number is determined not to be greater than the predetermined number (No), the process proceeds to step S104.
(Step S103) The operation information collection unit 101 deletes the oldest operation information from the operation history information table. The process then proceeds to step S104.
(Step S104) The operation information collection unit 101 writes the operation information acquired in step S101 to the operation history DB 103. The process then returns to step S101.
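Steps S102 to S104 describe a bounded history that evicts its oldest entry before writing a new one. The sketch below is illustrative only: `record_operation` is a hypothetical name, and entries are modeled as `(timestamp, description)` tuples held in a plain list rather than in a database.

```python
from datetime import datetime


def record_operation(history, operation, max_entries):
    """Store one (timestamp, description) entry, evicting the oldest if needed."""
    # Step S102: is the table already larger than the predetermined number?
    if len(history) > max_entries:
        # Step S103: delete the entry with the oldest time stamp.
        oldest = min(history, key=lambda entry: entry[0])
        history.remove(oldest)
    # Step S104: write the newly acquired operation information.
    history.append(operation)


# Usage: feed four operations into a history limited by max_entries=2.
history = []
for minute in range(4):
    record_operation(history, (datetime(2011, 6, 3, 17, minute), f"op{minute}"), max_entries=2)
```

Because step S102 uses a strict "greater than" comparison and the write happens afterwards, this sketch settles at `max_entries + 1` stored entries; an implementation that must never exceed the limit would compare with "greater than or equal to" instead.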

In step S102, the determination is based on whether the number of operation information entries is greater than a predetermined number, but it may instead be based on whether the time stamp of an operation information entry is older than a predetermined period.

FIG. 4 is a schematic diagram showing an example of the operation information stored in the operation history DB 103 according to the present embodiment. As illustrated, the operation history table has columns for the time stamp (the time indicated by the operation time information), the IP address, the MAC address, the device name, the operation content indicated by the operation content information, and the error status indicated by the operation status information. The operation history table is two-dimensional tabular data, made up of rows and columns, in which operation information is stored for each time stamp.

For example, the operation information labeled 4a has a time stamp of 2011/6/3 17:03:40, an undetermined IP address, an unknown MAC address, the device name "CenterSW", the operation content "LinkUp", and a normal error status.
The operation information labeled 4b has a time stamp of 2011/6/3 17:04:20, the IP address 192.168.1.30, the MAC address 00:1b:ba:e0:b4:9c, the device name "AsyaTV", the operation content "Acquire address by DHCP", and a normal error status.
The operation information labeled 4c has a time stamp of 2011/6/3 17:10:05, the IP address 192.168.1.30, the MAC address 00:1b:ba:e0:b4:9c, the device name "AsyaTV", the operation content "View video over DLAN", and a normal error status.

FIG. 5 is a flowchart showing an example of the measurement value collection operation according to the present embodiment.
(Step S201) The measurement value information collection unit 107 acquires measurement value information from the monitoring target network 10. The process then proceeds to step S202.
(Step S202) The measurement value information collection unit 107 determines whether the number of measurement value information records in the measurement value information table recorded in the measurement value history DB 108 is greater than a predetermined number. If the number is greater than the predetermined number (Yes), the process proceeds to step S203. If it is not (No), the process proceeds to step S204.
(Step S203) The measurement value information collection unit 107 deletes the oldest measurement value information from the measurement value history DB 108. The process then proceeds to step S204.
(Step S204) The measurement value information collection unit 107 writes the measurement value information acquired in step S201 to the measurement value history DB 108. The process then returns to step S201.

FIG. 6 is a schematic diagram showing an example of the measurement value history table stored in the measurement value history DB 108 according to the present embodiment. As illustrated, the measurement value history table has columns for the time stamp (the measurement value time information), the IP address, the MAC address, the measurement item, and the measurement value. The measurement value history table is two-dimensional tabular data, made up of rows and columns, in which measurement value information is stored for each time stamp.
For example, the measurement value information labeled 6a has a time stamp of 2011/6/3 17:00:01, the IP address 192.168.1.28, the MAC address 00:1b:ba:e0:b4:9c, the measurement item "WAN-side packet transmission", and the measurement value "25".
The measurement value information labeled 6b has a time stamp of 2011/6/3 17:00:10, the IP address 192.168.1.28, the MAC address 00:1b:ba:e0:b4:9c, the measurement item "packet input", and the measurement value "55".
The measurement value information labeled 6c has a time stamp of 2011/6/3 17:02:11, the IP address 192.168.1.42, the MAC address 00:22:15:df:69:83, the measurement item "packet collision", and the measurement value "10".

FIG. 7 is a schematic diagram showing an example of the failure period information table stored in the failure period DB 106 according to the present embodiment. As illustrated, the failure period information table has columns for the failure start time and failure end time indicated by the failure period information, the IP address, and the MAC address. The failure period information table is two-dimensional tabular data, made up of rows and columns, in which the failure end time, IP address, and MAC address are stored for each failure start time.
For example, the failure period information labeled 7a has a failure start time of 2011/6/3 17:02:11, a failure end time of 2011/6/3 17:35:00, the IP address 192.168.1.42, and the MAC address 00:22:15:df:69:83.
The failure period information labeled 7b has a failure start time of 2011/6/3 18:25:37, a failure end time of 2011/6/3 19:14:20, the IP address 192.168.1.28, and the MAC address 00:1b:ba:e0:b4:9c.

FIG. 8 is a schematic diagram showing an example of the baseline table according to the present embodiment. As illustrated, the baseline table is two-dimensional tabular data, made up of rows and columns, that holds the mean and the variance of the measurement values for each baseline unit time. A pair of mean and variance columns is provided for each combination of measurement value type and MAC address of the measurement target. The baseline table records, for example, the mean and variance of those measurement values from each day of the past month that fall within the normal operation period.
The data labeled 9a shows that, for the time slot 00:00:00 to 00:10:00, the device with MAC address 00:1a:ba:e0:b4:9c has a mean packet input count of 750 with a variance of 78 and a mean packet collision count of 12 with a variance of 3, and the device with MAC address 00:22:15:df:69:83 has a mean packet count of 2645 with a variance of 230 and a mean packet collision count of 45 with a variance of 9. In practice, baselines are stored for all measurement values at all measurement points, but they are omitted from the figure to avoid clutter.

FIG. 9 is a flowchart showing the baseline generation processing according to the present embodiment.
(Step S301) The baseline generation unit 109 determines whether a predetermined fixed period has elapsed. The fixed period is, for example, 10 minutes. If the period has elapsed (Yes), the process proceeds to step S302. If it has not (No), the process returns to step S301.
(Step S302) The baseline generation unit 109 reads the failure period information from the failure period DB 106 and extracts the failure period it indicates. The baseline generation unit 109 then calculates the normal operation period from the failure period. The process then proceeds to step S303.
(Step S303) From the measurement value information recorded in the measurement value history DB 108, the baseline generation unit 109 reads the measurement values that fall within the normal operation period calculated in step S302. The process then proceeds to step S304.
(Step S304) For the measurement values read in step S303, the baseline generation unit 109 calculates the mean and the variance of each measurement value for each baseline unit time (the baseline). The baseline generation unit 109 writes the calculated baseline to the baseline DB 110. The process then returns to step S301.
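Steps S302 to S304 can be sketched as below. This is an illustrative sketch under stated assumptions, not the patented implementation: measurements are assumed to be `(timestamp, mac, item, value)` tuples, failure periods are assumed to be `(start, end)` pairs, and the names `in_failure`, `bucket_of`, and `build_baseline` are hypothetical. The text does not say whether population or sample variance is meant; population variance is assumed here.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pvariance

BUCKET_MINUTES = 10  # baseline unit time; 10 minutes matches the FIG. 8 example


def in_failure(ts: datetime, failure_periods) -> bool:
    """True if ts lies inside any (start, end) failure period."""
    return any(start <= ts <= end for start, end in failure_periods)


def bucket_of(ts: datetime) -> tuple:
    """Map a timestamp to the (hour, minute) at which its time-of-day slot starts."""
    minute = (ts.hour * 60 + ts.minute) // BUCKET_MINUTES * BUCKET_MINUTES
    return (minute // 60, minute % 60)


def build_baseline(measurements, failure_periods) -> dict:
    """Return {(slot, mac, item): (mean, variance)} from normal-operation values only."""
    groups = defaultdict(list)
    for ts, mac, item, value in measurements:
        if in_failure(ts, failure_periods):  # S302/S303: skip values from failure periods
            continue
        groups[(bucket_of(ts), mac, item)].append(value)
    # S304: mean and (population) variance per slot, measurement point, and item
    return {key: (mean(vals), pvariance(vals)) for key, vals in groups.items()}
```

Grouping by time-of-day slot means values from different days that fall in the same slot contribute to the same baseline entry, which mirrors the "each day of the past month" example given for FIG. 8.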

FIG. 10 is a schematic diagram showing an example of the abnormal value / failure location correspondence table according to the present embodiment. As illustrated, the abnormal value / failure location correspondence table is two-dimensional tabular data, made up of rows and columns, that holds a failure location entry for each abnormal value.
The data labeled 10a shows that, when the Web server access count at the measurement point with MAC address 00:22:15:df:69:83 shows an abnormal value, the HTTP server with MAC address 11:22:33:44:55:66 is likely to have failed. The data labeled 10b shows that, when the packet count at the measurement point with MAC address 00:1b:ba:e0:b4:9c shows an abnormal value, the router with MAC address 22:33:44:55:66:77 is likely to have failed. The data labeled 10c shows that, when the packet collision count at the measurement point with MAC address 00:02:c1:4a:7d:b6 shows an abnormal value, the DHCP server with MAC address 33:44:55:66:77:88 is likely to have failed.
The abnormal value / failure location correspondence table is created in advance from combinations of measurement items and failures that are highly correlated.

FIG. 11 is a flowchart showing the failure location identification processing according to the present embodiment.
(Step S401) The input unit 102 receives failure occurrence information from the operator. The failure occurrence information is, for example, failure time information ("early June"), failure content information ("Web browser access"), and failure status information ("error"). The process then proceeds to step S402.
(Step S402) The operation history extraction unit 104 extracts from the operation history DB 103 the events that correspond to the failure occurrence information input in step S401. For example, it extracts the operation information whose operation time information falls within "early June", whose operation content information is "Web browser access", and whose operation status information is "error". The operation history extraction unit 104 outputs the extracted operation information to the failure period extraction unit 105. The process then proceeds to step S403.
(Step S403) From the operation information extracted in step S402, the failure period extraction unit 105 extracts the time between the oldest time stamp and the newest time stamp as the failure period. The failure period extraction unit 105 writes failure period information indicating the extracted failure period to the failure period DB 106 and outputs it to the abnormal value determination unit 111. The process then proceeds to step S404.
(Step S404) The abnormal value determination unit 111 reads from the measurement value history DB 108 the measurement value information corresponding to the failure period indicated by the failure period information extracted in step S403. The abnormal value determination unit 111 also reads from the baseline DB 110 the baseline information corresponding to the failure period extracted in step S403. The process then proceeds to step S405.

(Step S405) For each measurement point and each type of measurement value, the abnormal value determination unit 111 compares the measurement value indicated by the measurement value information with the baseline indicated by the baseline information and determines whether the measurement value is abnormal. Specifically, the abnormal value determination unit 111 determines whether the measurement value in the failure period satisfies the relation mean - n × variance ≤ measurement value ≤ mean + n × variance (n is a predetermined positive number). If the relation is satisfied (Yes), the measurement value is not abnormal and the process proceeds to step S407. If the relation is not satisfied (No), the measurement value is abnormal and the process proceeds to step S406.

(Step S406) The output unit 113 extracts from the abnormal value / failure location correspondence table the failure location corresponding to the measurement value determined to be abnormal in step S405. The process then proceeds to step S407.
(Step S407) If a measurement value was determined to be abnormal in step S405, the output unit 113 outputs display information indicating the failure location extracted in step S406 to the input/output device 12. If no measurement value was determined to be abnormal in step S405, the output unit 113 outputs display information indicating that no abnormality was found to the input/output device 12. The input/output device 12 displays the display information received from the output unit 113. The process then ends.
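The flow of FIG. 11 from step S402 onward can be sketched as follows. This is a simplified illustration, not the specification's implementation: the record layouts, the function names, the default `n=3`, and the fault table entry used in any example data are assumptions. For brevity the baseline is keyed only by `(mac, item)` and ignores the per-slot structure of the baseline table.

```python
from datetime import datetime


def extract_failure_period(operations, start, end, content, status):
    """S402-S403: select matching operation records; return (oldest, newest) time stamps."""
    hits = [op["timestamp"] for op in operations
            if start <= op["timestamp"] <= end
            and op["content"] == content
            and op["status"] == status]
    return (min(hits), max(hits)) if hits else None


def is_abnormal(value, mean, variance, n=3):
    """S405: abnormal when the value lies outside mean +/- n x variance."""
    return not (mean - n * variance <= value <= mean + n * variance)


def locate_failures(measurements, baseline, fault_table, period, n=3):
    """S404-S406: check failure-period measurements and look up failure locations."""
    start, end = period
    found = []
    for ts, mac, item, value in measurements:
        # S404: only measurements inside the failure period that have a baseline
        if not (start <= ts <= end) or (mac, item) not in baseline:
            continue
        mean, variance = baseline[(mac, item)]
        # S405 and S406: abnormal value, then table lookup
        if is_abnormal(value, mean, variance, n) and (mac, item) in fault_table:
            found.append(fault_table[(mac, item)])
    return found  # S407: an empty list corresponds to "no abnormality found"
```

Note that the comparison uses the variance directly, as the relation in step S405 is written, rather than the standard deviation.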

As described above, in the present embodiment the failure detection device 13 includes: the operation history DB 103, which records operation history information that associates information on operations of the network device 11 with the times at which those operations were performed; the measurement value history DB 108, which records measurement history information that associates measurement values representing the operating status of the network device 11 with the times at which they were measured; the failure period extraction unit 105, which identifies the failure period of the network device 11 based on the operation history information; the baseline generation unit 109, which extracts the measurement values associated with times outside the failure period based on the failure period and the measurement history information, and generates baseline information from the extracted measurement values; and the abnormal value determination unit 111, which detects an abnormality of the network device 11 by comparing the baseline information generated by the baseline generation unit 109 with the measurement values associated with times within the failure period. The failure detection device 13 thus compares data from the time period in which a failure occurred with a baseline built from data from periods in which no failure occurred. This improves the extraction rate of abnormal values and the ability to estimate the failure location. That is, the failure detection device 13 can detect abnormalities more reliably.

In the present embodiment, the failure period extraction unit 105 also identifies the failure occurrence time based on information from the user and the operation history. Starting from the user's recollection, the accurate operation history output by the network device 11 can be extracted from the operation history DB 103, and the failure occurrence time can be identified from the extracted operation history. Because the abnormal value determination can then be performed on the time in which the failure actually occurred, the accuracy of the abnormal value determination improves.

In the present embodiment, the operation history extraction unit 104 extracts operation information based on the failure occurrence information input by the operator, and the failure period extraction unit 105 extracts the failure period information from the extracted operation information. Extraction of the failure period information is not limited to this, however. The operation history extraction unit 104 may instead extract operation information based on the operation status information recorded in the operation history DB 103, and the failure period extraction unit 105 may extract the failure period information from that operation information.

Each unit and each DB of the failure detection device 13 may be part of another device connected via a network.

In the present embodiment, the measurement values are taken from information contained in UPnP information, SNMP information, DHCP information, and the like, but the number of packets transmitted by the network device 11 per unit time, for example, may also be used as a measurement value.

The failure period may also be extracted, for example, as follows.
(a) A predetermined fixed time from the occurrence of a certain event is taken as the failure period.
(b) The time between the occurrence of a certain event and the occurrence of another event is taken as the failure period.
(c) The time between the occurrence of a certain event and the occurrence of another event, extended by a fixed time before and after, is taken as the failure period.
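The three alternatives listed above can be expressed as small helper functions. This is only a sketch: the function names and the representation of a failure period as a `(start, end)` pair of `datetime` values are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def period_fixed(event_time: datetime, duration: timedelta):
    """(a) A fixed duration starting at the event."""
    return (event_time, event_time + duration)


def period_between(first_event: datetime, second_event: datetime):
    """(b) The span between the occurrence times of two events."""
    return (min(first_event, second_event), max(first_event, second_event))


def period_padded(first_event: datetime, second_event: datetime, margin: timedelta):
    """(c) The span between two events, widened by a margin on both sides."""
    start, end = period_between(first_event, second_event)
    return (start - margin, end + margin)
```

Variant (c) may be useful when the logged events bracket the failure only loosely, since the margin lets measurements taken shortly before the first event or after the last one be examined as well.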

The failure period extraction unit 105 may also take as the failure period any period in which the frequency of a certain event within a fixed time exceeds a predetermined value or falls below a predetermined value.
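A frequency-based extraction of this kind might look like the sketch below. The use of fixed, non-overlapping windows, the separate `low` and `high` thresholds, and the name `frequency_failure_windows` are assumptions; the text only states that a period is treated as a failure period when the event frequency goes above or below a predetermined value.

```python
from datetime import datetime, timedelta


def frequency_failure_windows(event_times, window: timedelta,
                              low: int, high: int,
                              start: datetime, end: datetime):
    """Return the (start, end) windows whose event count is above high or below low."""
    flagged = []
    t = start
    while t < end:
        # Count the events that fall inside the half-open window [t, t + window)
        count = sum(1 for e in event_times if t <= e < t + window)
        if count > high or count < low:
            flagged.append((t, t + window))
        t += window
    return flagged
```

A window with too few events can matter as much as one with too many: for example, a device that normally renews its DHCP lease at regular intervals and then goes silent would be flagged by the `low` threshold.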

In the baseline generation processing, the baseline generation unit 109 generates the baseline by excluding all measurement values of all devices within the failure period, but it may instead exclude only the measurement values of the device determined to have failed in the failure period. It may also exclude only the measurement values that were themselves determined to indicate a failure, for the device determined to have failed in the failure period.

When no time is recorded in the operation information or the measurement value information, the time at which the failure detection device 13 collected that information may be used as the operation time information or the measurement value time information. In that case, the collection time is treated as the time at which the operation was performed or the measurement value was measured.

Part of the failure detection device 13 in the embodiment described above, for example the failure period extraction unit 105, the baseline generation unit 109, and the abnormal value determination unit 111, may be realized by a computer. In that case, a program for realizing these control functions may be recorded on a computer-readable recording medium, and the functions may be realized by having a computer system read and execute the program recorded on the recording medium. The "computer system" here is a computer system built into the failure detection device 13 and includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" may further include media that hold a program dynamically for a short time, such as a communication line used when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold a program for a certain time, such as the volatile memory inside the computer system that acts as the server or client in that case. The program may realize only part of the functions described above, or may realize those functions in combination with a program already recorded in the computer system.
Part or all of the failure detection device 13 in the embodiment described above may also be realized as an integrated circuit such as an LSI (Large Scale Integration). Each functional block of the failure detection device 13 may be implemented as an individual processor, or some or all of the blocks may be integrated into a single processor. The method of circuit integration is not limited to LSI and may use a dedicated circuit or a general-purpose processor. If advances in semiconductor technology produce an integrated circuit technology that replaces LSI, an integrated circuit based on that technology may be used.

An embodiment of the present invention has been described in detail above with reference to the drawings. The specific configuration is not limited to the one described, and various design changes and the like can be made without departing from the gist of the invention.

Description of symbols: 1 ... failure detection system, 10 ... monitoring target network, 11, 11a to 11e ... network device, 12 ... input/output device, 13 ... failure detection device, 101 ... operation information collection unit, 102 ... input unit, 103 ... operation history DB, 104 ... operation history extraction unit, 105 ... failure period extraction unit, 106 ... failure period DB, 107 ... measurement value information collection unit, 108 ... measurement value history DB, 109 ... baseline generation unit, 110 ... baseline DB, 111 ... abnormal value determination unit, 112 ... abnormal value / failure location correspondence DB, 113 ... output unit, GW ... gateway, Hub ... hub, RT ... router, SW ... switch

Claims

An abnormality detection device comprising:
a baseline generation unit that extracts, from measurement values representing the operating status of a monitoring target system, the measurement values outside a failure period of the monitoring target system, and generates baseline information indicating a normal range of the measurement values based on the extracted measurement values;
a failure period extraction unit that determines the failure period based on information on operations on the monitoring target system; and
an abnormal value determination unit that determines, based on the baseline information, whether a measurement value measured in the failure period is an abnormal value.

The abnormality detection device according to claim 1, further comprising an operation history storage unit that stores the information on operations on the monitoring target system.

The abnormality detection device according to claim 2, wherein the failure period extraction unit determines the failure period based on the information on operations that corresponds to information input by an operator.

The abnormality detection device according to claim 2, wherein
the monitoring target system is composed of a plurality of devices,
the measurement values are measurement values for each of a plurality of measurement items,
the baseline generation unit generates baseline information for each of the measurement items, and
the abnormality detection device further comprises an abnormal device detection unit that detects, from among the plurality of devices, a device in which an abnormality has occurred, based on the measurement item of the measurement value determined to be the abnormal value.

An abnormality detection method comprising:
a step of extracting, from measurement values representing the operating status of a monitoring target system, the measurement values outside a failure period of the monitoring target system, and generating baseline information indicating a normal range of the measurement values based on the extracted measurement values;
a step of determining the failure period based on information on operations on the monitoring target system; and
a step of determining, based on the baseline information, whether a measurement value measured in the failure period is an abnormal value.

An abnormality detection program that causes a computer of an abnormality detection device to execute:
a baseline generation step of extracting, from measurement values representing the operating status of a monitoring target system, the measurement values outside a failure period of the monitoring target system, and generating baseline information indicating a normal range of the measurement values based on the extracted measurement values;
a failure period extraction step of determining the failure period based on information on operations on the monitoring target system; and
an abnormal value determination step of determining, based on the baseline information, whether a measurement value measured in the failure period is an abnormal value.