JP6649294B2

JP6649294B2 - State determination device, state determination method, and program

Info

Publication number: JP6649294B2
Application number: JP2017016995A
Authority: JP
Inventors: 松尾　洋一; 洋一松尾; 中野　雄介; 雄介中野; 暁渡邉; 敬志郎渡辺; 石橋　圭介; 圭介石橋; 川原　亮一; 亮一川原
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-02-01
Filing date: 2017-02-01
Publication date: 2020-02-19
Anticipated expiration: 2037-02-01
Also published as: JP2018124829A

Description

The present invention relates to a state determination device, a state determination method, and a program.

When an abnormality occurs during the operation of a computer system, operators have conventionally identified the location and cause of the abnormality from the observed values of various observation information obtained from system equipment, such as CPU usage, error logs, and traffic logs, together with their accumulated knowledge and experience, and have then dealt with the identified location. One method for supporting the operator's judgment is analysis using a causal graph.

A causal-graph-based method for estimating the location and cause of an abnormality quickly identifies them by using a causal graph that describes the influence exerted when an abnormality occurs in some of the system's components (see, for example, Non-Patent Documents 1 to 3). These methods consist mainly of two parts: a pre-construction (learning) phase and an inference phase.

In the learning phase, each system component (e.g., a router or a server) is made a node of the device state/factor layer, and a weighted edge is drawn from each such node to the observation-layer nodes that are affected when an abnormality occurs at that node (e.g., observation information obtainable from system components, such as alerts, packet errors, and CPU usage). Doing this for all nodes, based on the knowledge of system experts and on past cases, completes a causal graph that describes the influence relationships of the entire system.

FIG. 1 is a diagram for explaining a causal graph. In FIG. 1, X = (x_1, x_2, ..., x_n) represents the states of the n devices that are components of the system, and each x_i takes the value 0 (normal state) or 1 (abnormal state). Y = (y_1, y_2, ..., y_m) represents the states of m pieces of observation information. For example, if Y is an observation layer indicating whether a packet error has occurred, each y_j is 0 (no packet error) or 1 (packet error occurred). Edge weights are defined by a conditional probability P. For example, P(y_j = 1 | x_i = 1) is the probability that y_j also becomes 1 when x_i is 1.
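The two-layer structure described for FIG. 1 can be sketched as a small data structure. This is only an illustration: the node names, the weight values, and the `edge_weight` helper are assumptions made for the example and do not appear in the source.

```python
# Hypothetical two-layer causal graph: device states X -> observation states Y.
# weights[(x_i, y_j)] holds the edge weight P(y_j = 1 | x_i = 1).
# All names and numbers below are illustrative assumptions.
device_nodes = ["router1", "server1"]          # stand-ins for x_1, x_2
observation_nodes = ["alert", "packet_error"]  # stand-ins for y_1, y_2

weights = {
    ("router1", "packet_error"): 0.9,
    ("router1", "alert"): 0.6,
    ("server1", "alert"): 0.8,
}


def edge_weight(device, observation):
    """Return P(y_j = 1 | x_i = 1) for the edge, or 0.0 when no edge exists."""
    return weights.get((device, observation), 0.0)
```

Treating a missing edge as weight 0 matches the later statement in this description that an edge with weight 0 is equivalent to a deleted edge.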

In the inference phase, when an abnormality occurs in the system, the location and factor that are the true cause are identified by tracing backward, in the constructed causal graph, the edges that point to the nodes at which abnormalities are observed.
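The backward tracing described above can be illustrated with a deliberately simplified sketch. It only counts how many abnormal observation nodes each device has an edge into; it is not the procedure of Non-Patent Documents 1 to 3, and the function name `trace_back`, the node names, and the weights are assumptions for the example.

```python
def trace_back(weights, abnormal_observations):
    """Rank devices by how many abnormal observation nodes they have an edge into."""
    counts = {}
    for (device, observation), w in weights.items():
        if w > 0 and observation in abnormal_observations:
            counts[device] = counts.get(device, 0) + 1
    # Device explaining the most abnormal observations first; ties broken by name.
    return sorted(counts, key=lambda d: (-counts[d], d))


# Illustrative graph: keys are (device, observation), values are edge weights.
weights = {
    ("router1", "packet_error"): 0.9,
    ("router1", "alert"): 0.6,
    ("server1", "alert"): 0.8,
}
candidates = trace_back(weights, {"alert", "packet_error"})
```

A probabilistic method would weigh the edges rather than count them; the counting here is chosen only to keep the direction of the trace easy to see.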

Non-Patent Document 1: Srikanth Kandula, Dina Katabi, and Jean-philippe Vasseur. Shrink: A tool for failure diagnosis in IP networks. Proceedings of the 2005 ACM SIGCOMM workshop on Mining network data, pages 173-178, 2005.
Non-Patent Document 2: R.R. Kompella, J. Yates, A. Greenberg, and A.C. Snoeren. IP Fault Localization via Risk Modeling. IEEE Transactions on Dependable and Secure Computing, 7(4):1-14, 2010.
Non-Patent Document 3: He Yan, Lee Breslau, Zihui Ge, Dan Massey, Dan Pei, and Jennifer Yates. G-RCA: A Generic Root Cause Analysis Platform for Service Quality Management in Large IP Networks. IEEE/ACM Transactions on Networking, 20(6):1734-1747, 2012.

The prior art proposes methods for constructing a causal graph in the learning phase from expert knowledge, past failure cases, correlations in system observation data, and the like. However, such a graph cannot describe the influence relationships of an abnormality that has never occurred before.

In addition, replacing a machine in the system can change the range that a node influences, so it is difficult to construct, in the learning phase alone, a graph that accurately describes the causal relationships of the system.

Furthermore, the true state of the observation information cannot always be determined easily: a device may fail to issue an alert even though an abnormality has occurred, or an observation-layer node may be wrongly judged to be abnormal because of an error in analyzing the observed values.

Thus, when an accurate causal graph has not been constructed, or when the state of the observation layer has not been determined accurately, the accuracy of estimating the location and cause may decrease.

The present invention has been made in view of the above points, and its object is to improve the accuracy of estimating the location and cause of a system abnormality.

To solve the above problem, a state determination device includes a determination unit and an output unit. The determination unit uses a second causal graph, a set F of functions f, and a set G of functions g. The second causal graph is obtained from a first causal graph, which shows the relationship between a first layer corresponding to the state of each component of a system and a second layer corresponding to the state of the observation information output from each component of the first layer, by adding between the first layer and the second layer a third layer corresponding to the state of second observation information obtained by converting that observation information. Each function f manipulates the weight of an edge between the first layer and the third layer, and each function g performs the conversion. Based on these, the determination unit determines the state of each component that has the maximum likelihood for the state of the observation information collected from the system. The output unit outputs the state of each component determined by the determination unit.

The accuracy of estimating the location and cause of a system abnormality can be improved.

FIG. 1 is a diagram for explaining a causal graph.
FIG. 2 is a diagram illustrating a system configuration example according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating a hardware configuration example of the estimation device 10 according to the embodiment of the present invention.
FIG. 4 is a diagram illustrating a functional configuration example of the estimation device 10 according to the embodiment of the present invention.
FIG. 5 is a diagram illustrating a configuration example of a causal graph according to the embodiment of the present invention.
FIG. 6 is a flowchart for explaining an example of a processing procedure executed by the estimation device 10.
FIG. 7 is a diagram illustrating an example of a text file expressing the relationship between nodes and edges.
FIG. 8 is a diagram illustrating an output example of the state sequence of X.
FIG. 9 is a flowchart for explaining an example of the processing procedure for estimating the state of the device/factor state layer X.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings. In the present embodiment, a causal graph that extends the conventional causal graph is constructed in the learning phase. In the inference phase, the edges of the causal graph and the states of the observation information are corrected. This makes it possible to estimate the location and cause with high accuracy even when the causal graph or the observation information is not very accurate.

FIG. 2 is a diagram illustrating a system configuration example according to the embodiment of the present invention. In FIG. 2, the estimation device 10 is connected to a plurality of devices 20 via a network.

Each device 20 is a component of the operation system that the estimation device 10 observes. The operation system may be any of various systems, such as a network system or a computer system.

The estimation device 10 is one or more computers that, when an abnormality occurs in the operation system, collect observation information from the operation system and estimate the location and cause of the abnormality based on that observation information. In the present embodiment, observation information means information indicating the observation result of an observation item or observation target in the operation system. The observation result may be the observed value itself, or the presence or absence of a certain phenomenon. Such a phenomenon is, for example, the output of a log containing a specific message (for example, a message indicating a packet error).

FIG. 3 is a diagram illustrating a hardware configuration example of the estimation device 10 according to the embodiment of the present invention. The estimation device 10 in FIG. 3 includes a drive device 100, an auxiliary storage device 102, a memory device 103, a CPU 104, an interface device 105, a display device 106, an input device 107, and the like, which are connected to one another by a bus B.

A program that implements the processing in the estimation device 10 is provided on a recording medium 101 such as a CD-ROM. When the recording medium 101 storing the program is set in the drive device 100, the program is installed from the recording medium 101 into the auxiliary storage device 102 via the drive device 100. However, the program does not have to be installed from the recording medium 101, and may instead be downloaded from another computer via a network. The auxiliary storage device 102 stores the installed program as well as necessary files, data, and the like.

When an instruction to start the program is given, the memory device 103 reads the program from the auxiliary storage device 102 and stores it. The CPU 104 implements the functions of the estimation device 10 in accordance with the program stored in the memory device 103. The interface device 105 is used as an interface for connecting to a network. The display device 106 displays a GUI (Graphical User Interface) or the like provided by the program. The input device 107 includes a keyboard, a mouse, and the like, and is used to input various operation instructions.

FIG. 4 is a diagram illustrating a functional configuration example of the estimation device 10 according to the embodiment of the present invention. In FIG. 4, the estimation device 10 includes a UI unit 11, a causal graph construction unit 12, a correction candidate construction unit 13, an abnormality degree calculation unit 14, an inference unit 15, and the like. Each of these units is implemented by processing that one or more programs installed in the estimation device 10 cause the CPU 104 to execute. The estimation device 10 also uses databases such as a causal graph DB 16 and a correction candidate DB 17. Each of these databases can be implemented using, for example, the auxiliary storage device 102 or a storage device that can be connected to the estimation device 10 via a network.

The UI unit 11 outputs information indicating the state (normal or abnormal) of each device 20 as information that supports estimation of the location and cause of an abnormality in the operation system. In addition, when, for example, a new device 20 is added to the operation system, the UI unit 11 accepts from the user an instruction to add a node to the causal graph and to change the causal relationships accordingly, and modifies the causal graph.

The causal graph construction unit 12 extends the conventional causal graph and constructs (generates) a model (causal graph) into which a new layer and functions can be introduced.

FIG. 5 is a diagram illustrating a configuration example of a causal graph according to the embodiment of the present invention. The starting point of the present embodiment is a directed Markov model (a conventional causal graph) that connects the observation-information state layer Y = y_j (j = 1, ..., m) with the device/factor state layer X = x_i (i = 1, ..., n). The observation-information state layer Y of that model is treated as the state layer Y of observation information that includes noise (the observation information actually observed), and a state layer Z = z_j (j = 1, ..., m) of true observation information is added between Y and X.

The state layer Z of true observation information represents the states of the observation information that would be observed if nothing went wrong with the observation. That is, true observation information is the observation information obtained by applying, to the observation information actually observed, a conversion that removes noise. Noise is, for example, an observation error. Observation errors include cases in which observation information is output but cannot be captured for some reason, and cases in which observation information that should be output is not output because of a configuration mistake on the device 20 side. For example, for a failure described as "failure A affects nodes B, C, and D of the observation layer", the state layer Z of true observation information is the state in which nodes B, C, and D are actually affected.

In addition, two sets of functions are intended to be introduced into this causal graph: a set F = I + U', u'_ij ∈ [0, ±1], of functions that change the value of the weight of the edge e_ij from x_i to z_j, and a set G = I + U'', u''_ij ∈ [0, ±1], of functions that add noise to the state z_j of the true observation information and output it as y_j. Here, I denotes an identity map, such as an identity matrix.

The causal graph constructed by the causal graph construction unit 12 is stored in the causal graph DB 16.

The correction candidate construction unit 13 constructs the function sets F and G. Constructing the function sets F and G means generating the substance (expression) of each function belonging to F or G. Specifically, for the causal graph stored in the causal graph DB 16, the correction candidate construction unit 13 defines f_k as a function that acts on the weights of any edges in the causal graph that can be transformed, and defines F as the set of functions covering all transformable patterns. f_k is a function that acts on the k-th subgraph of the complete bipartite graph formed by X and Z, and there are as many values of k as there are subgraphs of that complete bipartite graph. Here, a transformation is an operation that changes the weight of the edge between any nodes x_i of X and z_j of Z. The present embodiment describes, as an example, the case in which an edge weight is set to 0 or 1. Setting the weight of an edge to 0 is equivalent to deleting the edge, and setting the weight of an edge to 1 is equivalent to adding the edge. However, edge weights may also be multi-valued or continuous.

The correction candidate construction unit 13 also defines, for each of all subsets of the state layer Z of true observation information, g_s (s = 1, ..., m) as a function that converts the state values of the elements of that subset, and defines G as the set of the functions g_s. Here, m is the number of all subsets of the state layer Z of true observation information.

The function sets F and G constructed by the correction candidate construction unit 13 are stored in the correction candidate DB 17.

The abnormality degree calculation unit 14 collects observation information from the devices 20 in the operation system by issuing commands to them, and calculates a degree of abnormality for that observation information. The degree of abnormality may be determined as 0 or 1 according to whether a threshold is exceeded, or may be calculated using an outlier calculation method such as the local outlier factor, a general time-series analysis such as an autoregressive model, or the like. The abnormality degree calculation unit 14 further determines, based on the degree of abnormality, the state of each node y_j of the observation layer Y of the noisy observation information. If y_j takes the value normal (0) or abnormal (1), the abnormality degree calculation unit 14 determines, based on the calculated degree of abnormality, whether y_j is in state 0 or state 1. If y_j takes one of the three values normal (0), abnormal (1), or partially abnormal (0.5), the abnormality degree calculation unit 14 determines, based on the calculated degree of abnormality, which of the three states y_j is in. A comparison with a threshold, a t-test, or the like may be used for this determination. If the state of y_j can take a continuous value from 0 to 1, the calculated abnormality value may be normalized to the range 0 to 1 and the normalized value used as the value of y_j.
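The three ways of turning a degree of abnormality into a state of y_j can be sketched as follows. The function names, the threshold parameters, and the choice of min-max scaling for the continuous case are assumptions for illustration; the source says only that a threshold comparison, a t-test, or normalization may be used.

```python
def binary_state(abnormality_degree, threshold):
    """Two-valued y_j: 1 (abnormal) if the degree exceeds the threshold, else 0."""
    return 1 if abnormality_degree > threshold else 0


def ternary_state(abnormality_degree, low, high):
    """Three-valued y_j: 0 (normal), 0.5 (partially abnormal), or 1 (abnormal)."""
    if abnormality_degree >= high:
        return 1.0
    if abnormality_degree >= low:
        return 0.5
    return 0.0


def normalized_states(abnormality_degrees):
    """Continuous y_j in [0, 1], using min-max scaling (an assumed normalization)."""
    lo, hi = min(abnormality_degrees), max(abnormality_degrees)
    if hi == lo:
        return [0.0 for _ in abnormality_degrees]
    return [(a - lo) / (hi - lo) for a in abnormality_degrees]
```

For example, `ternary_state(0.4, low=0.3, high=0.7)` falls between the two assumed thresholds and is classified as partially abnormal.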

The inference unit 15 applies the states of the noisy observation-information state layer y_j, which the abnormality degree calculation unit 14 calculated from the observation information, to the causal graph stored in the causal graph DB 16 and to the function sets F and G stored in the correction candidate DB 17, and estimates the state of x_i that best explains the state of y_j (that is, the state with the highest likelihood for the state of y_j).

The processing procedure executed by the estimation device 10 is described below. FIG. 6 is a flowchart for explaining an example of the processing procedure executed by the estimation device 10.

In step S101, the causal graph construction unit 12 constructs (generates) a causal graph in which the edge weights of a causal graph constructible by existing techniques (for example, Non-Patent Documents 1 to 3) can be changed dynamically for each edge between x_i and y_j. The nodes of the device/factor state layer X and of the noisy observation-information state layer Y of this causal graph, and the initial values of the edge weights, are calculated based on the configuration information of the operation system, past cases of abnormalities, and the like. In this causal graph, the state layer Z of true observation information is newly introduced, and a set G of functions that add noise to the state layer Z of true observation information and map it to Y can be applied. y_j is the observation information obtained by adding noise to z_j. In other words, z_j is the observation information obtained by removing noise from y_j.

In this causal graph, a set F of functions that act on the edge weights between x_i and z_j can also be applied. With F, the causal graph becomes a model that can take into account influences outside the range assumed in advance. For example, because the set of functions F allows edge weights to be changed afterwards, the causal graph can cope with cases in which the configuration information of the operation system is wrong and with failures that have never occurred before.

Having constructed a causal graph that can express various states as described above, the causal graph construction unit 12 stores, in the causal graph DB 16, a text file describing the relationships between the nodes and edges of each layer of the causal graph.

FIG. 7 is a diagram illustrating an example of a text file expressing the relationship between nodes and edges. The left side of FIG. 7 shows an example of a causal graph, and the right side shows the text file corresponding to that causal graph. The text file describes, for each edge, the edge name together with the nodes that the edge connects, and the weight of the edge.

The relationship between an edge name and the nodes that the edge connects is described, for example, as "eASL1: ASpine1, Link1". This description indicates that the edge named eASL1 connects the node ASpine1 of X and the node Link1 of Y.

The weight of an edge is described, for example, as "eASL1: 1, 0". This description gives the values of the conditional probabilities, namely P(y_j = 1 | x_i = 1) = 1 and P(y_j = 0 | x_i = 1) = 0. That is, the first number after the colon (:) in "eASL1: 1, 0" is the conditional probability P(y_j = 1 | x_i = 1), and the second number is the conditional probability P(y_j = 0 | x_i = 1). In FIG. 7, the nodes of the noisy observation-information state layer Y are omitted for convenience.
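A reader for such a text file might look like the sketch below. The exact layout of the file in FIG. 7 is not reproduced in this text, so the sketch assumes one "name: a, b" entry per line and tells the two kinds of entry apart by whether both fields are numeric. The function name `parse_causal_graph` and that disambiguation rule are assumptions.

```python
def parse_causal_graph(text):
    """Split "name: a, b" lines into endpoint entries and weight entries.

    Assumed rule: if every field after the colon parses as a number, the line
    is a weight entry (P(y=1|x=1), P(y=0|x=1)); otherwise it names the two
    nodes that the edge connects.
    """
    endpoints, weights = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, rest = line.partition(":")
        fields = [field.strip() for field in rest.split(",")]
        try:
            weights[name.strip()] = tuple(float(field) for field in fields)
        except ValueError:
            endpoints[name.strip()] = tuple(fields)
    return endpoints, weights


sample = "eASL1: ASpine1, Link1\neASL1: 1, 0\n"
endpoints, weights = parse_causal_graph(sample)
```

With the sample above, `endpoints["eASL1"]` holds the two node names and `weights["eASL1"]` holds the pair of conditional probabilities.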

Next, the correction candidate construction unit 13 constructs (generates) the function set F by the following method (S102). First, the correction candidate construction unit 13 acquires the causal graph data (the text file of FIG. 7) from the causal graph DB 16. Then, for the case in which edge weights are 1 or 0, the correction candidate construction unit 13 defines f as the transformation that swaps 1 and 0, and defines f_k as a function that acts on an arbitrary subset of the edges from X to Z in the causal graph. Here, an arbitrary subset of edges also includes edges with weight 0. An edge with weight 0 is, for example, an edge that is not drawn in the example of FIG. 7, that is, an edge whose conditional probabilities are all 0. For example, in FIG. 7, the edge between ASpine1 and Link2 and the edge between BLeaf1 and Link1 are edges with weight 0. Therefore, there are as many arbitrary subsets of edges as there are subsets of the complete bipartite graph formed by X and Z, and there are equally many functions f_k. f_k is the function that, for the k-th subset, changes edges with weight 1 to weight 0 and edges with weight 0 to weight 1. The correction candidate construction unit 13 defines F as the set of all f_k, and stores the constructed function set F in the correction candidate DB 17. When the weights are multi-valued or continuous, functions appropriate to each case may be defined.
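For 0/1 weights, the construction of F in step S102 can be sketched by enumerating every subset of the edges of the complete bipartite graph between X and Z and representing each f_k by the subset it flips. The names `build_F` and `apply_f`, the representation of f_k as a frozenset, and the two-by-two example graph are assumptions for illustration.

```python
from itertools import chain, combinations, product


def all_subsets(items):
    """Yield every subset of items as a tuple, starting with the empty subset."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def build_F(x_nodes, z_nodes):
    """Represent each f_k by the set of (x, z) edges whose 0/1 weight it flips."""
    complete_edges = list(product(x_nodes, z_nodes))
    return [frozenset(subset) for subset in all_subsets(complete_edges)]


def apply_f(edge_subset, weights):
    """Return new weights with 0 <-> 1 swapped on edge_subset (missing edge = 0)."""
    flipped = dict(weights)
    for edge in edge_subset:
        flipped[edge] = 1 - weights.get(edge, 0)
    return flipped


F = build_F(["ASpine1", "BLeaf1"], ["Link1", "Link2"])
base = {("ASpine1", "Link1"): 1}
added = apply_f(frozenset({("ASpine1", "Link2")}), base)
```

The empty subset plays the role of the identity map I. Because the number of subsets is 2 to the power of the number of possible edges, exhaustive enumeration like this is practical only for very small graphs.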

Next, the correction candidate construction unit 13 constructs (generates) the function set G (S103). Specifically, when the state z_j of the observation information on the state layer Z of true observation information takes one of the two values {0, 1}, with 0 meaning normal and 1 meaning abnormal, the correction candidate construction unit 13 defines g as the function that inverts 0 and 1, and defines g_s (s = 1, ..., m) as the function that inverts the state values (0 or 1) of the elements of an arbitrary subset with s elements of the state layer Z of true observation information. The correction candidate construction unit 13 defines G as the set of all g_s, and stores the constructed function set G in the correction candidate DB 17. When the states of the observation information are multi-valued or continuous, functions appropriate to each case may be defined.
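For two-valued states, step S103 can be sketched in the same style: each g_s is represented by the subset of nodes of Z whose state it inverts. The names `build_G` and `apply_g` and the example node names are assumptions, and the sketch enumerates all subsets (including the empty one, which acts as the identity).

```python
from itertools import chain, combinations


def build_G(z_nodes):
    """Represent each g_s by the subset of Z nodes whose 0/1 state it inverts."""
    z_nodes = list(z_nodes)
    subsets = chain.from_iterable(
        combinations(z_nodes, r) for r in range(len(z_nodes) + 1)
    )
    return [frozenset(subset) for subset in subsets]


def apply_g(flip_subset, state):
    """Invert the 0/1 state of every node in flip_subset; leave the rest as is."""
    return {node: (1 - v if node in flip_subset else v) for node, v in state.items()}


G = build_G(["Link1", "Link2"])
observed = {"Link1": 0, "Link2": 1}
flipped = apply_g(frozenset({"Link1"}), observed)
```

Since inverting a bit twice restores it, each such g_s is its own inverse, so under this representation applying g_s to Y also gives g_s^{-1}(Y).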

Steps S101 to S103 may be executed in advance, independently of whether an abnormality has occurred in the operation system.

When an abnormality occurs in the operation system and the UI unit 11 accepts an analysis execution instruction from the user, step S104 and the subsequent steps are executed.

In step S104, the abnormality degree calculation unit 14 collects observation information from each device 20 in the system by issuing commands to it.

Next, the abnormality degree calculation unit 14 calculates a degree of abnormality for each piece of collected observation information and, based on that degree of abnormality, determines the state y_j of each piece of observation information (that is, the noisy observation-information state layer Y) (S105).

Next, the inference unit 15 estimates the state of the device/factor state layer X based on the causal graph information stored in the causal graph DB 16, the function sets F and G stored in the correction candidate DB 17, and the result of the determination of the noisy observation-information state layer Y by the abnormality degree calculation unit 14 (S106). That is, the state of the device/factor state layer X that has the highest likelihood for the noisy observation-information state layer Y is estimated.

Next, the UI unit 11 outputs the state sequence of X estimated by the inference unit 15 as information indicating the location and factor of the abnormality (S107). For example, a text file such as the one shown in FIG. 8 may be output.

FIG. 8 shows an example of the output of the state sequence of X. In FIG. 8, the state of each node of the device/factor state layer X shown in FIG. 7 is indicated by 0 (normal state) or 1 (abnormal state). In the example of FIG. 8, the node Bspine1 is 1 (abnormal state), so this node is estimated to be the location and factor of the abnormality.

Next, step S106 is described in detail. FIG. 9 is a flowchart illustrating an example of the procedure for estimating the state of the device/factor state layer X.

In step S201, the inference unit 15 sets both F and G to the identity map I and calculates the maximum posterior probability $p = \max_X F(P(X \mid G^{-1}(Y)))$ from the collected Y. This is equivalent to calculating, with an existing method on the existing causal graph, the maximum posterior probability before any operation is applied, $p = \max_X P(X \mid Y)$. The method described in Non-Patent Document 1, for example, may therefore be used for the calculation.

Next, the inference unit 15 modifies the causal graph using F and G and calculates the maximum posterior probability $p' = \max_X F(P(X \mid G^{-1}(Y)))$ (S202). Specifically, for each combination of $f_k \in F$ and $g_s \in G$, the inference unit 15 calculates the maximum posterior probability $p'_{ks} = \max_X f_k(P(X \mid g_s^{-1}(Y)))$ and obtains the state sequence of X that has the highest likelihood for Y. Consequently, if at least one of F and G has more than one element, multiple values of $p'_{ks}$ are calculated, and a state sequence of X is obtained for each $p'_{ks}$.

Next, from the values calculated in step S202, the inference unit 15 extracts each $p'_{ks}$ for which $p'_{ks}/p$ exceeds a threshold $\tau$ ($\tau > 1$), that is, each $p'_{ks}$ that is at least larger than $p$, and outputs the state sequence of X corresponding to each extracted $p'_{ks}$ as a solution candidate (S203). In other words, a state sequence of X is output as a solution candidate when the magnitude of its $p'_{ks}$ relative to $p$ exceeds the threshold $\tau$. Because $\tau > 1$, every solution candidate is a state sequence of X whose maximum posterior probability is larger than the one obtained with the existing causal graph method. A solution that indicates the cause location and factor of the abnormality more accurately than the existing method is thereby extracted.
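Steps S201 to S203 can be sketched on a toy model as follows. The source does not fix a probability model here, so this sketch assumes a noisy-OR likelihood with a small leak term, independent fault priors, and brute-force enumeration over X. It also reads $f_k$ as a function that edits the edge-weight matrix and $g_s$ as a function applied to Y (a flip is its own inverse, so $g_s^{-1} = g_s$). All names and numbers are hypothetical.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Weights = List[List[float]]  # W[i][j]: assumed P(fault i alone makes obs j abnormal)
State = Tuple[int, ...]


def likelihood(W: Weights, x: State, y: State, leak: float = 0.01) -> float:
    """P(Y = y | X = x) under the assumed noisy-OR model."""
    prob = 1.0
    for j, y_j in enumerate(y):
        p_normal = 1.0 - leak
        for i, x_i in enumerate(x):
            if x_i:
                p_normal *= 1.0 - W[i][j]
        prob *= (1.0 - p_normal) if y_j else p_normal
    return prob


def map_estimate(W: Weights, prior: Sequence[float], y: State) -> Tuple[State, float]:
    """Brute-force max_X P(X | y): returns the best state sequence and its posterior."""
    joint = {}
    for x in product((0, 1), repeat=len(prior)):
        p_x = 1.0
        for x_i, pr in zip(x, prior):
            p_x *= pr if x_i else 1.0 - pr
        joint[x] = p_x * likelihood(W, x, y)
    total = sum(joint.values())
    best = max(joint, key=joint.get)
    return best, joint[best] / total


def solution_candidates(W, prior, y, F, G, tau):
    """S201: baseline p. S202: p'_ks per (f_k, g_s). S203: keep p'_ks / p > tau."""
    _, p = map_estimate(W, prior, y)
    out = []
    for k, f in enumerate(F):
        for s, g in enumerate(G):
            x, p_ks = map_estimate(f(W), prior, g(y))
            if p_ks / p > tau:
                out.append((k, s, x, p_ks))
    return out


def add_edge(W: Weights) -> Weights:
    """Hypothetical f_k: add an edge from fault 0 to observation 1."""
    W2 = [row[:] for row in W]
    W2[0][1] = 0.9
    return W2


W = [[0.9, 0.0], [0.0, 0.9]]  # initial graph: each fault drives one observation
prior = [0.1, 0.1]
y = (1, 1)                    # both observations judged abnormal
baseline_x, p = map_estimate(W, prior, y)
result = solution_candidates(W, prior, y, F=[add_edge], G=[lambda v: v], tau=1.05)
```

In this toy case the unmodified graph explains the two abnormal observations with two simultaneous faults, while the graph with the added edge explains them with a single fault at a higher posterior, so the modified graph yields a solution candidate. Brute-force enumeration is exponential in the number of components and is used here only for clarity.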

When determining whether $p'_{ks}/p$ exceeds the threshold $\tau$ ($\tau > 1$), the inference unit 15 applies a penalty by varying the value of $\tau$ according to the degree to which $f_k$ changes the causal graph (for example, the number of edges whose weights were changed, such as the number of edges added or the number of edges deleted), thereby limiting the freedom of the changes made by $f_k$.

Applying a penalty means, for example, varying the threshold $\tau$ according to the degree of change: $\tau$ is made smaller when the degree to which $f_k$ changes the causal graph is relatively small, and larger when that degree is relatively large. As a result, when the change is small, a state sequence can become a solution candidate even if the maximum posterior probability $p'$ rises only slightly above $p$, whereas when the change is large, it becomes a solution candidate only if $p'$ rises substantially above $p$.

The way the threshold $\tau$ is varied is not limited to any particular method. One example is to use $\tau^s$ as the threshold, where s is the number of elements in the subset on which $f_k$ or $g_s$ acts.
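The $\tau^s$ rule mentioned above can be written as a small check. This is a sketch only; the function name and the example values are hypothetical.

```python
def passes_threshold(p: float, p_ks: float, tau: float, s: int) -> bool:
    """Return True if p'_ks / p exceeds tau**s.

    s is the number of elements the modification acts on, so a larger
    modification must improve the posterior by a larger factor.
    """
    if tau <= 1.0:
        raise ValueError("tau must be greater than 1")
    if p <= 0.0:
        raise ValueError("p must be positive")
    return p_ks / p > tau ** s


# Example with tau = 1.1: the same ratio of 1.12 passes for a
# one-element change (threshold 1.1) but not for a two-element
# change (threshold 1.21).
small_change_ok = passes_threshold(p=0.50, p_ks=0.56, tau=1.1, s=1)
large_change_ok = passes_threshold(p=0.50, p_ks=0.56, tau=1.1, s=2)
```

The exponent makes the required improvement grow geometrically with the size of the modification, which discourages large edits to the graph or to the observations unless they are strongly supported.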

FIG. 8 corresponds to the case of a single solution candidate, but there may be multiple solution candidates. In that case, the output may be narrowed down based on the maximum posterior probability $p'_{ks}$. For example, only the N solution candidates (N ≥ 1) with the highest $p'_{ks}$ may be output. The solution candidates may also be sorted by $p'_{ks}$ before being output, and the corresponding $p'_{ks}$ may be output together with each solution candidate.
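A minimal sketch of narrowing and sorting the output is shown below. The candidate representation (a state sequence paired with its $p'_{ks}$) and the function name are assumptions made for illustration.

```python
from typing import List, Tuple

Candidate = Tuple[Tuple[int, ...], float]  # (state sequence of X, p'_ks)


def top_n(candidates: List[Candidate], n: int) -> List[Candidate]:
    """Sort candidates by p'_ks in descending order and keep the first n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:n]


# Example with three hypothetical candidates.
cands = [((1, 0, 0), 0.71), ((0, 1, 0), 0.88), ((0, 0, 1), 0.64)]
best_two = top_n(cands, 2)
```

Returning the probability alongside each state sequence lets the operator judge how much more plausible the first candidate is than the next.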

As described above, according to the present embodiment, the maximum posterior probability is calculated while the addition and removal of edges are taken into account through the transformations in F, so the state of the device/factor state layer X can be estimated correctly even when the causal graph constructed in advance is inaccurate, because the graph is corrected in the process. Similarly, the maximum posterior probability is calculated while added noise is taken into account through the transformations in G, so the state of the state layer X can be estimated correctly even when the state determination of the observation information is inaccurate, because the determination is corrected in the process. As a result, the accuracy of estimating the cause location and factor of a system abnormality can be improved. The user can therefore use the present embodiment to identify the cause location and factor with high accuracy and to take countermeasures.

In the present embodiment, the estimation device 10 is an example of a state determination device. The causal graph of the existing method is an example of a first causal graph. A causal graph such as the one shown in FIG. 5 is an example of a second causal graph. The inference unit 15 is an example of a determination unit. The UI unit 11 is an example of an output unit. The abnormality degree calculation unit 14 is an example of a calculation unit. The device/factor state layer X is an example of a first layer. The state layer Y of the observation information including noise is an example of a second layer. The state layer Z of the true observation information is an example of a third layer. The observation information including noise is an example of first observation information. The true observation information is an example of second observation information. The maximum posterior probability $p$ is an example of a first maximum posterior probability. The maximum posterior probability $p'$ is an example of a second maximum posterior probability.

Although examples of the present invention have been described in detail above, the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

Reference Signs List

10 Estimation device
11 UI unit
12 Causal graph construction unit
13 Correction candidate construction unit
14 Abnormality degree calculation unit
15 Inference unit
16 Causal graph DB
17 Correction candidate DB
20 Device
100 Drive device
101 Recording medium
102 Auxiliary storage device
103 Memory device
104 CPU
105 Interface device
106 Display device
107 Input device
B Bus

Claims

A state determination device comprising:
a determination unit that determines the state of each component of a system that has the maximum likelihood for the state of observation information collected from the system, based on a second causal graph, a set F of functions f, and a set G of functions g, wherein the second causal graph is obtained from a first causal graph, which shows a relationship between a first layer corresponding to the state of each component of the system and a second layer corresponding to the state of observation information output from each component of the first layer in the system, by including between the first layer and the second layer a third layer corresponding to the state of second observation information obtained by converting the observation information output from each component of the first layer, the functions f manipulate the weights of edges between the first layer and the third layer, and the functions g perform the conversion; and
an output unit that outputs the state of each component determined by the determination unit.

The state determination device according to claim 1, wherein
the determination unit calculates, based on the first causal graph, a first maximum posterior probability for the state of the observation information collected from the system, and, for each combination of a function f and a function g, calculates a second maximum posterior probability for that combination based on the second causal graph and determines the state of each component for that combination, and
the output unit outputs the state of each component determined for those second maximum posterior probabilities that are larger than the first maximum posterior probability.

The state determination device according to claim 2, wherein
the output unit outputs the state of each component determined for those second maximum posterior probabilities whose magnitude relative to the first maximum posterior probability exceeds a threshold, and
the threshold is set larger for a combination whose function f manipulates the weights to a greater degree.

The state determination device according to any one of claims 1 to 3, further comprising
a calculation unit that calculates a degree of abnormality for the observation information collected from the system and determines the state of the observation information based on the degree of abnormality.

A state determination method in which a computer executes:
a determination step of determining the state of each component of a system that has the maximum likelihood for the state of observation information collected from the system, based on a second causal graph, a set F of functions f, and a set G of functions g, wherein the second causal graph is obtained from a first causal graph, which shows a relationship between a first layer corresponding to the state of each component of the system and a second layer corresponding to the state of observation information output from each component of the first layer in the system, by including between the first layer and the second layer a third layer corresponding to the state of second observation information obtained by converting the observation information output from each component of the first layer, the functions f manipulate the weights of edges between the first layer and the third layer, and the functions g perform the conversion; and
an output step of outputting the state of each component determined in the determination step.

A program that causes a computer to function as each unit according to claim 1.