JP7335378B1

JP7335378B1 - Message classifier, message classifier method, and program

Info

Publication number: JP7335378B1
Application number: JP2022031838A
Authority: JP
Inventors: 晃範小杉; 啓司寺澤; 桂子青木; 寛悟山本; 泰登石井; 宇蘭金澤; 賢橋本
Original assignee: NTT Comware Corp
Current assignee: NTT Comware Corp
Priority date: 2022-03-02
Filing date: 2022-03-02
Publication date: 2023-08-29
Anticipated expiration: 2042-03-02
Also published as: JP2023127887A

Abstract

[Problem] To reduce the effort of configuration work for classifying log messages. [Solution] One aspect of the present invention is a message classification device including: a collection unit that collects log messages from a monitored system; a vectorization unit that vectorizes each of a plurality of log messages collected by the collection unit; a threshold setting unit that sets a threshold for classifying the log messages vectorized by the vectorization unit; a clustering unit that clusters the plurality of log messages using the threshold set by the threshold setting unit and sets identifiers that identify the clustered log messages; and a classification unit that, when the collection unit acquires a new log message, assigns an identifier set by the clustering unit to the acquired new log message. [Selected drawing] FIG. 2

Description

The present invention relates to a message classification device, a message classification method, and a program.

Techniques for monitoring logs output from various systems and detecting system anomalies have long been known. One example is the maintenance management device described in Patent Literature 1. This maintenance management device includes a collection unit that collects log information, a storage unit that stores log identifiers identifying the log information in association with time information of the log information, and an analysis unit. The analysis unit creates a log sequence by grouping a plurality of log identifiers based on the time information, calculates a sequence time from the difference between the start time and the end time of the log sequence, and forms a sequence group that associates the log sequence with the sequence time. When the sequence group matches neither a pre-registered normal sequence group nor a pre-registered abnormal sequence group, the analysis unit calculates the expected time until an incident occurs, based on the pre-registered incident sign group that best matches the sequence group. According to Patent Literature 1, this allows the maintenance management device to predict the occurrence of incidents.

Patent Literature 1: Japanese Patent No. 6512646

However, the maintenance management device described above monitors log information on the basis of information registered in advance, such as normal sequence groups, abnormal sequence groups, and incident sign groups. Registering such information in advance is sometimes done by hand. For example, classifying log messages requires detecting information contained in them, and because the work of specifying the information to be detected (configuration work) is performed manually, it takes considerable effort.

The present invention has been made in view of the above problem, and an object thereof is to provide a message classification device, a message classification method, and a program that can reduce the effort of configuration work for classifying log messages.

(1) One aspect of the present invention is a message classification device including: a collection unit that collects log messages from a monitored system; a vectorization unit that vectorizes each of a plurality of log messages collected by the collection unit; a threshold setting unit that sets a threshold for classifying the log messages vectorized by the vectorization unit; a clustering unit that clusters the plurality of log messages using the threshold set by the threshold setting unit and sets identifiers that identify the clustered log messages; and a classification unit that, when the collection unit acquires a new log message, assigns an identifier set by the clustering unit to the acquired new log message.

(2) In one aspect of the present invention, in the above message classification device, the vectorization unit may, for each word included in the log message, calculate a higher weight the closer the word appears to the beginning of the log message, calculate a lower weight when the word contains a numerical value, and perform vectorization based on the calculated weights.

(3) In one aspect of the present invention, in the above message classification device, the clustering unit may obtain a cluster set by repeating, a plurality of times, a process of sampling some of the log messages in the set of log messages that are not yet included in any cluster and a process of clustering the sampled log messages, and may set identifiers for the obtained cluster set.

(4) In one aspect of the present invention, in the above message classification device, the threshold setting unit may cause the clustering unit to perform clustering using each of a plurality of threshold candidates, calculate AIC (Akaike information criterion) or BIC (Bayesian information criterion) based on the number of log messages included in the set of log messages and the number of clusters included in each of the cluster sets, and adopt the threshold candidate that minimizes AIC or BIC as the threshold.

(5) One aspect of the present invention is a message classification method including: a step in which an information processing device collects log messages from a monitored system; a step in which the information processing device vectorizes each of the plurality of collected log messages; a step in which the information processing device sets a threshold for classifying the vectorized log messages; a step in which the information processing device clusters the plurality of log messages using the threshold and sets identifiers that identify the clustered log messages; and a step in which the information processing device, when it acquires a new log message, assigns the identifier to the acquired new log message.

(6) One aspect of the present invention is a program that causes a computer to execute: a step of collecting log messages from a monitored system; a step of vectorizing each of the plurality of collected log messages; a step of setting a threshold for classifying the vectorized log messages; a step of clustering the plurality of log messages using the threshold and setting identifiers that identify the clustered log messages; and a step of, when a new log message is acquired, assigning the identifier to the acquired new log message.
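The steps in aspects (1), (2), (5), and (6) can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the weighting formula (`1 / (1 + position)`), the numeric penalty factor (`0.1`), the use of cosine similarity, the greedy single-pass clustering, and the `MSG-0001` identifier format are all hypothetical choices that the text does not specify.

```python
import math
import re
from typing import Dict, List, Optional, Tuple

Vector = Dict[str, float]

def vectorize(message: str) -> Vector:
    """Weight each word: earlier words weigh more, words containing digits weigh less."""
    vec: Vector = {}
    for pos, word in enumerate(message.split()):
        weight = 1.0 / (1 + pos)      # assumed decay with distance from the start
        if re.search(r"\d", word):
            weight *= 0.1             # assumed penalty for words containing a number
        vec[word] = vec.get(word, 0.0) + weight
    return vec

def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(messages: List[str], threshold: float) -> List[Tuple[str, Vector]]:
    """Greedy clustering: a message starts a new cluster unless it is
    at least `threshold`-similar to an existing cluster representative."""
    clusters: List[Tuple[str, Vector]] = []
    for msg in messages:
        vec = vectorize(msg)
        if not any(cosine(vec, rep) >= threshold for _, rep in clusters):
            clusters.append((f"MSG-{len(clusters) + 1:04d}", vec))
    return clusters

def classify(message: str, clusters: List[Tuple[str, Vector]],
             threshold: float) -> Optional[str]:
    """Return the identifier of the most similar cluster, or None if none is close enough."""
    vec = vectorize(message)
    best_id, best_sim = None, threshold
    for ident, rep in clusters:
        sim = cosine(vec, rep)
        if sim >= best_sim:
            best_id, best_sim = ident, sim
    return best_id
```

With a threshold of 0.8, two SSH-style messages that differ only in address and port fall into one cluster, because the numeric words carry little weight, while an unrelated message starts a second cluster.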

According to one aspect of the present invention, unknown anomalies and complex anomalies in a monitored system can be detected.

FIG. 1 is a diagram showing an example of log messages.
FIG. 2 is a block diagram showing an example of the functional configuration of the sequence estimation system 1 of the embodiment.
FIG. 3 is a flowchart showing the overall processing procedure of the sequence estimation system 1 in the embodiment.
FIG. 4 is a diagram for explaining an example of vectorization processing, in which (A) shows an example of word extraction, (B) shows an example of taking the positions of words into account, and (C) shows an example of setting weighting coefficients for words.
FIG. 5 is a diagram showing an example of classification processing.
FIG. 6 is a diagram showing the formulas for calculating AIC and BIC.
FIG. 7 is a flowchart showing an example of processing for creating a configuration.
FIG. 8 is a flowchart showing an example of processing for setting a configuration.
FIG. 9 is a flowchart showing an example of processing for registering log messages.
FIG. 10 is a flowchart showing an example of the procedure of message set estimation processing.
FIG. 11 is a diagram for explaining an example of processing for calculating a correlation coefficient.
FIG. 12 is a diagram for explaining automatic estimation in message set estimation processing.
FIG. 13 is a diagram showing an example of sets of log messages in an inclusion relationship.
FIG. 14 is a diagram showing an example of sets of log messages in a co-occurrence relationship.
FIG. 15 is a flowchart showing the overall model creation processing.
FIG. 16 is a flowchart showing an example of the procedure of learning data collection processing.
FIG. 17 is a flowchart showing an example of learning data collection processing using kernel density estimation.
FIG. 18 is a diagram showing an example of processing for collecting learning data using kernel density estimation.
FIG. 19 is a flowchart showing an example of the procedure for creating a normal Markov model and a priority Markov model.
FIG. 20 is a diagram showing an example of processing for creating a normal Markov model and a priority Markov model.
FIG. 21 is a diagram showing an example of duration values.
FIG. 22 is a diagram showing a single piece of learning data and multiple pieces of learning data.
FIG. 23 is a flowchart showing an example of the procedure for creating a priority model.
FIG. 24 is a flowchart showing an example of the procedure for calculating duration values.
FIG. 25 is a flowchart showing an example of the procedure for clustering duration values.
FIG. 26 is a diagram showing an example of clustering of duration values.
FIG. 27 is a diagram for explaining clustering of duration values that takes abnormal values into account.
FIG. 28 is a diagram for explaining processing for increasing the order of the priority Markov model.
FIG. 29 is a sequence diagram showing an example of sequence estimation processing.
FIG. 30 is a sequence diagram showing an example of the procedure of sequence estimation processing.
FIG. 31 is a flowchart showing an example of the procedure for creating a conflict-adjusted Markov model.
FIG. 32 is a diagram for explaining an example of processing for creating a conflict-adjusted priority Markov model.
FIG. 33 is a flowchart showing an example of the procedure of sequence estimation processing for log messages.
FIG. 34 is a diagram for explaining sequence estimation processing using the priority Markov model and the normal Markov model.
FIG. 35 is a flowchart showing another example of sequence estimation processing.
FIG. 36 is a diagram for explaining processing for determining a sequence.
FIG. 37 is a flowchart showing an example of the procedure of anomaly determination processing.
FIG. 38 is a flowchart showing an example of the content of anomaly determination processing.

A message classification device, a message classification method, a program, a learning system, and a learning method to which the present invention is applied are described below with reference to the drawings.

<Overview of Embodiment>
The message classification device, message classification method, program, learning system, and learning method to which the present invention is applied are realized by the sequence estimation system of the embodiment. The sequence estimation system of the embodiment collects log messages from monitored systems and extracts sequences, each consisting of a plurality of log messages. From the large number of log messages output by one or more monitored systems, the sequence estimation system extracts closely related log messages as a single set and makes the extraction result available for operations such as detecting anomalies and identifying where they occurred. The sequence estimation system also makes explicit any errors in the ordering of log messages within a set so that they, too, can be used in operations. As a result, when an unknown anomaly or a complex anomaly spanning multiple monitored systems occurs, the sequence estimation system can identify the location of the anomaly more accurately and shorten the log tracing time needed to identify it.

FIG. 1 is a diagram showing an example of log messages. Suppose, for example, that log message group 1, log message group 2, and log message group 3 have been collected over several days from arbitrary monitored systems or from components within a system. Suppose also that the estimation model implemented in the sequence estimation system has learned, over the period from December 21 to 28, a "normal sequence" of closely related messages across log message groups 1 to 3. In this normal sequence, the log message "aaaaa" in log message group 1, the log message "bbbbb" in log message group 2, and the log message "ccccc" in log message group 3 occur in that chronological order. If, on December 29, the log message "aaaaa" in log message group 1, the log message "xxxxx" in log message group 2, and the log message "ccccc" in log message group 3 occur in chronological order, the sequence estimation system can detect that this sequence is an "error sequence" that is not normal. In this way, by learning "normal sequences" in advance, for example, the sequence estimation system can detect unknown abnormal sequences.
Such a sequence estimation system is described below.
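The FIG. 1 example can be reduced to a toy check in Python. This is only a sketch of the idea: the drawings list describes Markov-model-based estimation, whereas the code below simply looks up the observed sequence in a set of learned normal sequences. The names `NORMAL_SEQUENCES` and `is_error_sequence` are hypothetical.

```python
# Learned "normal" sequences: each is a tuple of log messages in chronological order.
NORMAL_SEQUENCES = {("aaaaa", "bbbbb", "ccccc")}

def is_error_sequence(observed, normal=NORMAL_SEQUENCES):
    """Flag a sequence as an error sequence when it was never seen as normal."""
    return tuple(observed) not in normal

# December 29 in the FIG. 1 example: "xxxxx" appears where "bbbbb" was expected.
print(is_error_sequence(["aaaaa", "xxxxx", "ccccc"]))  # True
```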

<Configuration of Sequence Estimation System 1>
FIG. 2 is a block diagram showing an example of the functional configuration of the sequence estimation system 1 of the embodiment. The sequence estimation system 1 includes, for example, one or more monitored systems 100, a data processing device 200, an anomaly detection device 300, and a user terminal device 400. The monitored systems 100, the data processing device 200, the anomaly detection device 300, and the user terminal device 400 are connected, for example, to a communication network. Each device connected to the communication network has a communication interface such as a NIC (Network Interface Card) or a wireless communication module (not shown in FIG. 2). The communication network includes, for example, the Internet, a WAN (Wide Area Network), a LAN (Local Area Network), and a cellular network.

The monitored system 100 is an information processing system whose log messages are monitored by the data processing device 200 and the anomaly detection device 300. The monitored system 100 is, for example, a service server device that provides various services, or a network management device that manages the operating states of the many network nodes in a network. A network node is, for example, an OS (Operating System), a VM (Virtual Machine), HW (Hardware), or a DC (Data Center). The monitored system 100 provides log messages to the data processing device 200 in response to predetermined triggers. The monitored system 100 may be a single server device that operates on its own, or a group of server devices that operate in cooperation with one another.

The data processing device 200 is, for example, a computer on which OSS (open source software) for log operations is installed. The OSS consists, for example, of components called Elasticsearch, Logstash, and Kibana. The data processing device 200 includes, for example, a format conversion unit 202 implemented with Logstash, a data processing unit 204 implemented with Elasticsearch, a log data storage unit 206, a detection result storage unit 208, and a visualization unit 210 implemented with Kibana. The data processing device 200 is an example of a collection unit that collects log messages from the monitored system 100.

Functional units such as the format conversion unit 202, the data processing unit 204, and the visualization unit 210 are implemented, for example, by a processor such as a CPU (Central Processing Unit) executing a program stored in a program memory. Some or all of these functional units may instead be implemented by hardware such as an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array), or by software and hardware working together. The program may be stored in advance in a storage device of the data processing device 200 that has a non-transitory storage medium, such as an HDD or flash memory. Alternatively, it may be stored on a removable non-transitory storage medium such as a DVD or CD-ROM and installed on the HDD or flash memory of the data processing device 200 when the storage medium is loaded into a drive device. The log data storage unit 206 and the detection result storage unit 208 are implemented, for example, by a storage device such as an HDD (Hard Disk Drive), flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), ROM (Read Only Memory), or RAM (Random Access Memory).

The format conversion unit 202 adds a message ID to each log message collected from the monitored system 100, based on configuration information supplied by the anomaly detection device 300. The configuration information specifies the rules for assigning message IDs to log messages. A message ID is information referred to when registering a log message; for example, it identifies the type of the log message. The data processing unit 204 stores the log messages to which the format conversion unit 202 has added message IDs in the log data storage unit 206. In response to a request from the anomaly detection device 300, the data processing unit 204 searches the log data storage unit 206 for the requested log messages and provides them to the anomaly detection device 300. The data processing unit 204 stores the sequence estimation results and anomaly determination results provided by the anomaly detection device 300 in the detection result storage unit 208. The visualization unit 210 converts the sequence estimation results and anomaly determination results into visualization data that the user can view, and provides the data to the user terminal device 400.
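How the format conversion unit might apply ID assignment rules can be sketched as follows. The text does not state how the rules are represented, so the regular-expression form, the `IdRule` class, the `add_message_id` function, and the sample IDs `M001` and `M002` are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

@dataclass
class IdRule:
    """One hypothetical ID assignment rule: a pattern and the message ID it yields."""
    message_id: str
    pattern: re.Pattern

# Hypothetical configuration information, as it might be supplied to the format conversion step.
RULES = [
    IdRule("M001", re.compile(r"^Connection closed by \S+ port \d+$")),
    IdRule("M002", re.compile(r"^Disk usage exceeded on \S+$")),
]

def add_message_id(log: dict, rules=RULES) -> dict:
    """Return a copy of the log record with a message_id field (None when no rule matches)."""
    for rule in rules:
        if rule.pattern.search(log["message"]):
            return {**log, "message_id": rule.message_id}
    return {**log, "message_id": None}
```

Returning a copy rather than mutating the input keeps the raw record available for temporary registration before any rules exist.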

The user terminal device 400 is a terminal device such as a personal computer, a smartphone, or a tablet. The user terminal device 400, for example, accepts operations by an administrator of the monitored system 100, acquires information about the state and anomalies of the monitored system 100 from the data processing device 200, and performs display processing and the like.

The anomaly detection device 300 is an information processing device that analyzes log messages acquired from the data processing device 200 and provides information based on the analysis results to the data processing device 200. The anomaly detection device 300 includes, for example, a configuration creation unit 310, a message registration unit 320, a learning unit 330, and an estimation unit 340. Functional units such as the configuration creation unit 310, the message registration unit 320, the learning unit 330, and the estimation unit 340 are implemented, for example, by a processor such as a CPU executing a program stored in a program memory. This embodiment describes an example in which the configuration creation unit 310 and the message registration unit 320 are installed in the anomaly detection device 300; however, their functions may instead be installed in the data processing device 200, or in a separate device other than the data processing device 200.

The configuration creation unit 310 uses, for example, AIOps (Artificial Intelligence for IT Operations). The configuration creation unit 310 creates rules for assigning message IDs to log messages, for example by using the log messages accumulated in the log data storage unit 206, and supplies the resulting configuration information (shown as "ID assignment rules" in the figure) to the format conversion unit 202. The configuration creation unit 310 thereby causes the format conversion unit 202 to add message IDs to log messages. The rules for assigning message IDs correspond to rules for classifying log messages, so the configuration creation unit 310 in effect gives the data processing device 200 the ability to classify log messages.

The message registration unit 320 registers the log messages accumulated in the log data storage unit 206 as information to be used in learning processing and estimation processing. The learning unit 330 includes, for example, a message set estimation unit 332 and a model creation unit 334. The message set estimation unit 332 estimates sets of log messages. The model creation unit 334 creates a model for estimating sequences. The estimation unit 340 includes, for example, a sequence estimation unit 342 and an anomaly determination unit 344. The sequence estimation unit 342 estimates sequences, each containing a series of log messages. A series of log messages is, for example, a plurality of chronologically related log messages. The anomaly determination unit 344 determines whether an anomaly exists based on the result estimated by the sequence estimation unit 342. The anomaly detection device 300 provides the sequence estimation results and anomaly determination results to the data processing device 200 as information based on its analysis.

<Overall Processing of Sequence Estimation System 1>
FIG. 3 is a flowchart showing the overall processing procedure of the sequence estimation system 1 in the embodiment. The sequence estimation system 1 first temporarily registers log messages collected from the monitored system 100 (step S100). Next, the sequence estimation system 1 classifies the log messages by creating configuration information from the temporarily registered log messages (step S110). The sequence estimation system 1 then registers log messages collected from the monitored system 100 (step S200). At this time, the sequence estimation system 1 adds a message ID and a timestamp to each log message before registering it. The timestamp indicates the time at which the log message occurred. Next, the sequence estimation system 1 estimates sets of log messages (step S300). Next, the sequence estimation system 1 creates an estimation model (step S400). The processing from step S200 to step S400 corresponds to the learning phase.

Next, the sequence estimation system 1 estimates sequences (step S500). Sequence estimation may be started when a predetermined condition is satisfied, such as at regular intervals or when a predetermined amount of log messages has accumulated. Next, the sequence estimation system 1 determines whether an anomaly exists (step S600). Steps S500 and S600 belong to the estimation/detection phase. The sequence estimation system 1 may also provide only the sequence extraction results to the data processing device 200 without determining whether the sequences are anomalous. The sequence estimation system 1 may change the sequence estimation timing, the anomaly determination timing, the anomaly level, and the like according to the monitored system 100. The sequence estimation system 1 may also execute the learning phase and the estimation/detection phase in parallel, using log messages supplied continually from the monitored system 100.
Each process from step S100 to step S600 is described in detail below.

[Message classification process]
The message classification process is described below. It includes a vectorization process that vectorizes each of the temporarily registered log messages; a threshold setting process that sets a threshold for classifying the vectorized log messages; an ID setting process that classifies the plurality of log messages using the threshold and sets a message ID (identifier) identifying each classified group of log messages; and a classification process that, when the data processing device 200 acquires a new log message, assigns a message ID to that new log message. The message classification process is executed by the configuration creation unit 310. The configuration creation unit 310 thereby implements functional units such as a vectorization unit, a threshold setting unit, an ID (identifier) setting unit, and a classification unit.

FIG. 4 is a diagram for explaining an example of the vectorization process, and FIG. 4(A) shows an example of the word extraction step within it. The configuration creation unit 310, for example, retrieves a log message from the log data storage unit 206 and extracts the words it contains. The configuration creation unit 310 extracts as words, for example, the character strings delimited by spaces or similar delimiter characters. For example, given the log message "今日は、10day." ("Today is 10 day."), the configuration creation unit 310 extracts three words: "今日は" ("today"), "10", and "day.".

As a comparative example, there is a weighted process called n-shingles (n-gram). For the log message "今日は、", n-shingles splits the message into element 1 "今日は", element 2 "日は、", and element 3 "は、ｆ", and assigns a weight of 1.0 to element 1, 0.5 to element 2, and 0.3 to element 3. n-shingles has several disadvantages: it is difficult to integrate with the applications used in the data processing device 200; it is difficult to distinguish the variable parts of a log message from the fixed parts; it is difficult to vectorize in a way that takes word positions into account; and the number of words tends to grow. The process described with reference to FIG. 4 avoids these disadvantages.

FIG. 4(B) shows an example of the step of the vectorization process that takes word positions into account. The configuration creation unit 310 creates the vector taking into account the position in the log message at which each extracted word appears. The same word may appear more than once in a log message. However, the words in a log message are often arranged according to a preset template, so in multiple log messages that follow the same template, the same word is likely to appear at the same position. The configuration creation unit 310 therefore creates the vector from each word together with the position at which it appears, converting the position information into one-dimensional information. For example, the configuration creation unit 310 converts the three words "今日は", "10", and "day." into the three items "1_今日は", "2_10", and "3_day.". Treating word position as a separate axis would add a dimension and yield a two-dimensional vector (a matrix), which is mathematically harder to handle. A two-dimensional vector could be compressed to fewer dimensions, but to avoid losing information, the configuration creation unit 310 of the embodiment instead merges each word's position into the word itself to obtain one-dimensional information.
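The position-merging step can be sketched in Python as follows. This is a minimal illustration rather than the embodiment's implementation: the function names `extract_words` and `merge_positions` are hypothetical, whitespace is assumed to be the only delimiter, and the sample message is written with spaces so that it splits into the three words of the example.

```python
def extract_words(message: str) -> list[str]:
    # Assumption: whitespace is the only delimiter between words.
    return message.split()


def merge_positions(words: list[str]) -> list[str]:
    # Prefix each word with its 1-based position, e.g. "10" at position 2 -> "2_10".
    return [f"{position}_{word}" for position, word in enumerate(words, start=1)]


tokens = merge_positions(extract_words("今日は 10 day."))
print(tokens)
```

Because the position is folded into the token itself, two messages that share a word at different positions produce different tokens, which is the property the paragraph above relies on.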

FIG. 4(C) shows an example of the step of the vectorization process that sets weighting coefficients for words. The configuration creation unit 310 may introduce a weighting coefficient for each word when vectorizing. The weight may be 1/n, where n is the position at which the word appears; that is, the weight may decrease in inverse proportion to position from the beginning of the message toward the end. For example, "今日は" is given a weight of 1.0, "10" a weight of 0.5, and "day." a weight of 0.3. Words closer to the beginning of a log message tend to be more important, and this weighting reflects that tendency. The weight of a word containing digits may be reduced; for example, the weight of the word "10" may be reduced to 0.005. Date fields and similar parts of a log message are often expressed in digits, and words containing digits tend to be less important, so this reduction reflects that tendency as well. The weight may be derived using a function that includes both a parameter indicating the word's position and a parameter indicating whether the word contains digits. By performing the vectorization process shown in FIG. 4, the configuration creation unit 310 can convert the log message "今日は、10day." into the numeric vector (1.0, 0.005, 0.3).
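One possible weighting function that combines both parameters is sketched below. The names `word_weight` and `vectorize` are hypothetical, and the digit penalty of 0.01 is an assumption chosen only because it turns the positional weight 0.5 of "10" into the 0.005 quoted in the example; the text does not state the actual function.

```python
def word_weight(position: int, word: str, digit_factor: float = 0.01) -> float:
    # Positional weight 1/n, where n is the 1-based position of the word.
    weight = 1.0 / position
    # Assumed penalty: scale the weight down when the word contains any digit.
    if any(ch.isdigit() for ch in word):
        weight *= digit_factor
    return weight


def vectorize(words: list[str]) -> list[float]:
    return [word_weight(n, word) for n, word in enumerate(words, start=1)]


vector = vectorize(["今日は", "10", "day."])
print(vector)
```

With these assumptions the three components are 1.0, 0.005, and 1/3; the example in the text rounds the last value to 0.3.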

FIG. 5 shows an example of the classification process. The configuration creation unit 310 performs clustering based on sampling. As the sampling algorithm it uses, for example, the boosting approach of ensemble learning. Specifically, the configuration creation unit 310 first acquires a message set from the log data storage unit 206. In the first sampling (denoted Phase.1 or Ph.1), it samples a plurality of messages from the message set and clusters the sampled log messages. As the clustering method, which is unsupervised learning in the message classification process, it uses, for example, DBSCAN (Density-Based Spatial Clustering of Applications with Noise).
Next, the configuration creation unit 310 samples log messages that do not overlap with the clusters reduced in Phase.1 (a reduced cluster contains only the words common to the messages belonging to it) (Phase.2), and clusters the sampled log messages. It then samples log messages that do not overlap with the clusters reduced in Phase.1 and Phase.2 (Phase.3), and clusters those. In general, the configuration creation unit 310 samples log messages that do not overlap with the clusters reduced in Phase.1 to Phase.x-1 (Phase.x) and clusters the sampled log messages. The configuration creation unit 310 obtains the final cluster set by gathering the X clusters together. Rather than clustering the whole message set in a single pass, it clusters each of several samples. As a result, even when the message set is skewed, messages can be classified with a small sample size, that is, with little memory.
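The phase loop can be sketched as below. To keep the sketch self-contained, DBSCAN is replaced by a deliberately simple stand-in (`toy_cluster`, which groups messages by word count and first word); that substitution, the wildcard representation of a reduced cluster, and every function name are assumptions made for illustration only.

```python
import random

WILDCARD = "*"


def reduce_cluster(group):
    # Keep the words shared by every member at each position; mask the rest.
    columns = zip(*(message.split() for message in group))
    return tuple(col[0] if len(set(col)) == 1 else WILDCARD for col in columns)


def toy_cluster(sample):
    # Stand-in for DBSCAN: group by (word count, first word).
    groups = {}
    for message in sample:
        words = message.split()
        groups.setdefault((len(words), words[0]), []).append(message)
    return [reduce_cluster(group) for group in groups.values()]


def covers(template, message):
    words = message.split()
    return len(words) == len(template) and all(
        t in (WILDCARD, w) for t, w in zip(template, words)
    )


def phased_clustering(messages, phases, sample_size, seed=0):
    rng = random.Random(seed)
    templates = []
    remaining = list(messages)
    for _ in range(phases):
        # Later phases only see messages not covered by earlier reduced clusters.
        remaining = [m for m in remaining if not any(covers(t, m) for t in templates)]
        if not remaining:
            break
        sample = rng.sample(remaining, min(sample_size, len(remaining)))
        templates.extend(toy_cluster(sample))
    return templates


logs = ["login user1 ok", "login user2 ok", "disk 91 full", "disk 95 full"]
templates = phased_clustering(logs, phases=3, sample_size=4)
print(templates)
```

Each phase holds only `sample_size` messages in memory at once, which is the point of sampling per phase instead of clustering the whole set.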

The configuration creation unit 310 adjusts the clustering threshold. The threshold is the value used in clustering such as DBSCAN to determine whether a target log message is included in a cluster. The configuration creation unit 310 calculates the threshold using either AIC (Akaike information criterion) or BIC (Bayesian information criterion). FIG. 6 shows the formulas for AIC and BIC. The configuration creation unit 310 calculates AIC or BIC from the number of log message data points, the number of parameters, and the deviation. The number of data points corresponds to the number of samples, and the number of parameters corresponds to the number of clusters produced by DBSCAN. The configuration creation unit 310 classifies the messages once for each of a plurality of threshold candidates (vth-1, vth-2, ...), calculates AIC or BIC for each classification, and adopts as the threshold the candidate (vth-min) for which AIC or BIC is smallest.
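The selection of the threshold candidate can be sketched as follows. The formulas of FIG. 6 are not reproduced in this text, so the sketch assumes the common least-squares forms AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln n, and it assumes that the "deviation" plays the role of the residual sum of squares RSS. The candidate values and the function names are hypothetical.

```python
import math


def aic(n: int, k: int, rss: float) -> float:
    # Assumed least-squares form of the Akaike information criterion.
    return n * math.log(rss / n) + 2 * k


def bic(n: int, k: int, rss: float) -> float:
    # Assumed least-squares form of the Bayesian information criterion.
    return n * math.log(rss / n) + k * math.log(n)


def select_threshold(results, criterion=aic):
    # results maps each threshold candidate to (samples, clusters, deviation).
    return min(results, key=lambda vth: criterion(*results[vth]))


# Hypothetical clustering outcomes for three threshold candidates.
results = {0.2: (100, 30, 5.0), 0.5: (100, 8, 12.0), 0.8: (100, 2, 90.0)}
best_by_aic = select_threshold(results, aic)
best_by_bic = select_threshold(results, bic)
print(best_by_aic, best_by_bic)
```

In this made-up data the two criteria disagree: BIC penalizes the 30-cluster result more heavily than AIC does, so it prefers the coarser candidate.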

FIG. 7 is a flowchart showing an example of the process of creating a configuration. The configuration creation unit 310 first repeats steps S120 to S126 for a predetermined number of phases. In step S120, the configuration creation unit 310 samples log messages from the log message set. In the second and subsequent phases, it samples from the log messages that remain after excluding those that already belong to a cluster from an earlier phase. Next, the configuration creation unit 310 converts each sampled log message into a word string and identifies the position of each word in the log message (step S122). Next, it vectorizes the log message using the weight of each word (step S124). Next, it clusters the log messages using DBSCAN (step S126).

Next, the configuration creation unit 310 repeats steps S129 and S130 once for each threshold candidate. It clusters the clusters from the predetermined number of phases using DBSCAN with the threshold candidate (vth-x) (step S129) and calculates AIC or BIC from the clustering result (step S130). The configuration creation unit 310 thus calculates as many AIC or BIC values as there are threshold candidates. Next, it adopts the clustering result with the smallest AIC or BIC (step S132). The configuration creation unit 310 then generates configuration information that includes message IDs and rules for identifying the log messages belonging to each resulting cluster. A rule for identifying log messages is, for example, information specifying which words appear in a log message; for example, it states that when a log message contains word A, word B, and word C, log message ID "a" is assigned to that log message.
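A rule of the kind described, "contains word A, word B, and word C, so assign ID a", can be sketched as a set-inclusion test. The rule format (a mapping from message ID to a set of required words) and the function name `assign_message_id` are assumptions; the text does not specify how the configuration information is encoded.

```python
def assign_message_id(message: str, rules: dict[str, set[str]]):
    # Return the first message ID whose required words all occur in the message.
    words = set(message.split())
    for message_id, required_words in rules.items():
        if required_words <= words:
            return message_id
    return None  # no rule matched


rules = {"a": {"A", "B", "C"}}
print(assign_message_id("A x B y C", rules))
print(assign_message_id("A B", rules))
```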

FIG. 8 is a flowchart showing an example of the process of setting the configuration. The anomaly detection device 300 transmits the configuration information created by the configuration creation unit 310 to the data processing device 200 (step S140), and the data processing device 200 updates its configuration with the configuration information received from the anomaly detection device 300 (step S142). The anomaly detection device 300 also writes the message ID, as the message type, into each temporarily registered log message (step S144).

FIG. 9 is a flowchart showing an example of the process of registering a log message. When the monitored system 100 transmits a new log message to the data processing device 200, the data processing device 200 analyzes the log message based on the rules included in the configuration information and adds a message ID to it (step S150). The data processing device 200 then transmits the log message with the added message ID to the anomaly detection device 300. Because the sequence estimation system 1 adds message IDs to log messages automatically according to the configuration information, it eliminates the manual configuration work otherwise needed to classify log messages.

[Message set estimation process]
The message set estimation process identifies the sets of log messages contained in log message IDs. Log messages with the same log message ID often appear on the same occasion, such as a series of operations or an abnormality, so log messages that appear on the same occasion are treated as forming a set. In the embodiment, "set of log messages" may be read as "sequence of log messages". The message set estimation process includes at least one of an ID designation process, in which the log message IDs serving as the key of a set of log messages are designated, and an automatic estimation process, in which no key is designated.

FIG. 10 is a flowchart showing an example of the procedure of the message set estimation process. First, the anomaly detection device 300 determines the target log messages X (step S302). When performing the ID designation process, the anomaly detection device 300 selects as processing targets the log messages X belonging to two log message IDs designated in advance. When performing the automatic estimation process, it selects the log messages X of all log message IDs. The anomaly detection device 300 repeats steps S304 to S320 once for each log message X so determined.

First, the anomaly detection device 300 acquires from the data processing device 200 a list of the occurrence times of the determined log message X (step S304). Next, the anomaly detection device 300 repeats the bootstrap procedure of steps S306 to S316 a specified number of times. The specified number corresponds to the number of pseudo data items created by the bootstrap method. FIG. 11 is a diagram for explaining an example of the process of calculating correlation coefficients. Using log message X as the original data, the anomaly detection device 300 generates a pseudo data set containing, for example, three pseudo data items (1) to (3). In this example the specified number is 3, and a correlation coefficient is calculated for each pseudo data item.

The anomaly detection device 300 acquires a predetermined number of occurrence times T from the occurrence time list (step S306) and creates a matrix M for calculating correlation coefficients (step S308). Next, the anomaly detection device 300 repeats steps S310 to S314 for every acquired occurrence time T. It acquires the log messages Y that appear within a predetermined period from the occurrence time T (step S310), marks 1 at each position of the matrix M where the row for occurrence time T meets a column for log message Y' (step S312), and marks 0 at each position where the row for occurrence time T meets a column for anything other than log message Y (step S314).
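The construction of the matrix M can be sketched as follows. The sketch assumes that "within a predetermined period from the occurrence time T" means the closed interval [T, T + window], that events are available as (timestamp, message ID) pairs, and that one column is kept per candidate message ID; the function name `build_matrix` and the sample data are hypothetical.

```python
def build_matrix(occurrence_times, events, column_ids, window):
    # One row per occurrence time T of log message X, one column per candidate ID.
    matrix = []
    for t in occurrence_times:
        # IDs observed within the assumed window [T, T + window].
        seen = {message_id for ts, message_id in events if t <= ts <= t + window}
        matrix.append([1 if message_id in seen else 0 for message_id in column_ids])
    return matrix


events = [(0, "X"), (1, "Y"), (10, "X"), (30, "Z")]
M = build_matrix(occurrence_times=[0, 10], events=events, column_ids=["Y", "Z"], window=5)
print(M)
```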

Next, the anomaly detection device 300 uses the matrix M to calculate a correlation coefficient C representing the degree of time-series correlation between log message X and log message Y (step S316). For each of the pseudo data items (1) to (3), the anomaly detection device 300 calculates the correlation coefficients C(1) to C(3) using the following equation. In the equation, x and y are the two log messages X and Y within the predetermined period in the pseudo data, n is the number of data points, x-bar is the arithmetic mean of x, and y-bar is the arithmetic mean of y; the correlation coefficient is obtained by dividing the sample covariance by the product of the sample standard deviations of x and y.
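The equation itself is not reproduced in this text. Based on the definitions given (sample covariance divided by the sample standard deviations, with n data points and arithmetic means x-bar and y-bar), it is presumably the standard Pearson sample correlation coefficient, which can be written as:

```latex
C = \frac{\dfrac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)\left(y_i-\bar{y}\right)}
         {\sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^{2}}\;
          \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}}},
\qquad
\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i,\quad
\bar{y}=\frac{1}{n}\sum_{i=1}^{n}y_i
```

The indexed symbols x_i and y_i are notation introduced here for the individual data points; the 1/n factors cancel, so the same value results if 1/(n-1) is used throughout.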

Next, the anomaly detection device 300 calculates the mean C' of the correlation coefficients C (step S318) and extracts the log message IDs (Z) for which the mean C' is equal to or greater than a predetermined value (step S320). In this way, for each log message X, the anomaly detection device 300 obtains the log message IDs (Z) that have a high time-series correlation with that log message X.
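Averaging the correlation coefficient over bootstrap resamples can be sketched as below. The sketch assumes that each pseudo data item is produced by resampling rows with replacement and that a resample with zero variance contributes a coefficient of 0; neither detail is stated in the text, and the function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def pearson(xs, ys):
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = mean([(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)])
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    # Assumption: treat a constant series as uncorrelated instead of dividing by zero.
    return covariance / (sd_x * sd_y) if sd_x and sd_y else 0.0


def bootstrap_mean_correlation(xs, ys, rounds=3, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    coefficients = []
    for _ in range(rounds):
        # One pseudo data item: n rows drawn with replacement.
        picks = [rng.randrange(n) for _ in range(n)]
        coefficients.append(pearson([xs[i] for i in picks], [ys[i] for i in picks]))
    return mean(coefficients)


c_mean = bootstrap_mean_correlation([1, 0, 1, 1, 0, 1], [1, 0, 1, 0, 0, 1])
print(c_mean)
```

A message ID would then be kept when this averaged value is at or above the predetermined value.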

Next, the anomaly detection device 300 determines whether it has been executing the automatic estimation process since step S302 (step S322). If it is not executing the automatic estimation process (step S322: NO), it ends this process. If it is executing the automatic estimation process (step S322: YES), it performs, for every log message ID (Z), a process that merges other log message IDs having an inclusion relationship with the log message ID (Z) (step S324). Next, it performs, for every log message ID (Z), a process that merges other log message IDs having a co-occurrence relationship with the log message ID (Z) (step S326).

FIG. 12 is a diagram for explaining the automatic estimation process within the message set estimation process. Using the log messages contained in each log message ID, the anomaly detection device 300 creates pseudo data sets by the bootstrap method and calculates correlation coefficients by an ensemble method. The anomaly detection device 300 thereby creates a matrix of correlation coefficients in which both the number of rows and the number of columns equal the number of log message IDs.

Even among sets of log messages with high time-series correlation coefficients, some sets substantially overlap, so the anomaly detection device 300 performs a correction. It corrects sets of log messages that are in an inclusion relationship so that they have the same log message ID. FIG. 13 shows an example of sets of log messages in an inclusion relationship. For example, the set of log messages for message No. 406 and the set of log messages for message No. 418 are in an inclusion relationship with respect to the log message numbers (405, 404, 407). An inclusion relationship is a relationship in which one set of log messages contains the other set. The anomaly detection device 300 corrects (merges) sets of log messages in an inclusion relationship so that they share the same log message ID.
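Merging by inclusion can be sketched as a subset test over the estimated sets. The choice of the larger set's ID as the representative, the function name `merge_included`, and the sample sets (loosely modeled on message Nos. 406 and 418 but otherwise invented) are assumptions for illustration.

```python
def merge_included(sets_by_id):
    # Visit larger sets first so that subsets fold into an already kept superset.
    kept = {}
    representative = {}
    ordered = sorted(sets_by_id.items(), key=lambda item: len(item[1]), reverse=True)
    for message_id, members in ordered:
        owner = next((rep for rep, rep_members in kept.items() if members <= rep_members), None)
        if owner is None:
            kept[message_id] = members
            owner = message_id
        representative[message_id] = owner
    return representative


# Hypothetical sets: the set for 418 is contained in the set for 406.
sets_by_id = {406: {404, 405, 407, 410}, 418: {404, 405, 407}, 500: {501, 502}}
merged = merge_included(sets_by_id)
print(merged)
```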

The anomaly detection device 300 corrects sets of log messages that are in a co-occurrence relationship so that they have the same log message ID. FIG. 14 shows an example of sets of log messages in a co-occurrence relationship. For example, the set of log messages for message No. 406 and the set of log messages for message No. 418 occur at the same time. A co-occurrence relationship is a relationship between sets of log messages that occur at the same timing. The anomaly detection device 300 corrects (merges) sets of log messages in a co-occurrence relationship so that they share the same log message ID.

[Model creation process]
FIG. 15 is a flowchart showing the overall model creation process, FIG. 16 is a flowchart showing an example of the procedure for collecting learning data, and FIG. 17 is a flowchart showing an example of learning data collection by kernel density estimation.

As shown in FIG. 15, the anomaly detection device 300 first collects learning data (step S400) and then creates a model using the learning data (step S402).

As shown in FIG. 16, when collecting learning data the anomaly detection device 300 repeats steps S410 to S420 once for each message set.
The anomaly detection device 300 first calculates the message set Y whose log message IDs overlap with those of the target log message set X (step S410). Next, it acquires from the data processing device 200 the occurrence times of log message set X and of log message set Y (step S412). Next, from the occurrence times acquired in step S412, it extracts the occurrence times T that have a gap of a predetermined interval before and after them (step S414). Next, using the log message IDs belonging to message set X as an index, it retrieves the log messages L that fall within a predetermined time of each occurrence time T extracted in step S414 (step S416). Next, it determines whether any log messages L within the predetermined time of the occurrence time T, that is, any learning data, were found in step S416 (step S418). If there is learning data (step S418: YES), the anomaly detection device 300 repeats the processing from step S410; if there is no learning data (step S418: NO), it collects data by kernel density estimation (step S420) and then repeats the processing from step S410. In this way the anomaly detection device 300 collects learning data (L) for each message set.

For example, when log messages occur at high density throughout the period containing a message set (across all dates and times), learning data cannot be obtained (step S418: NO). Log messages occur at high density when, for example, the intervals between log messages are so short and the messages so continuous that no occurrence time T can be identified in step S414. In this case, in the learning data collection by kernel density estimation shown in FIG. 17 (step S420), the anomaly detection device 300 repeats steps S4200 to S4203 shown in FIG. 18 a specified number of times. FIG. 18 shows an example of the process of collecting learning data by kernel density estimation.

First, the anomaly detection device 300 samples log message sets, each containing a target log message and log messages highly correlated with it (step S4200). For example, the anomaly detection device 300 samples a plurality of log messages having the same message ID (mid=A) as log message sets (sampling001, sampling002, ...).

Next, the anomaly detection device 300 calculates the time differences between the log messages in each sampled log message set (step S4202). For example, it takes a log message (mid-A) and calculates the time differences (tA1, tA2, ...) between that log message and the temporally nearby log messages mid-A1, mid-A2, and so on. Next, for each time difference (tA1, tA2, ...), the anomaly detection device 300 performs kernel density estimation to find the time difference with the highest probability (tA1-max, tA2-max, ...) (step S4204). Next, it sorts the highest-probability time differences (tA1-max, tA2-max, ...) and estimates a sequence (step S4203). The anomaly detection device 300 repeats steps S4200 to S4203 a specified number of times, for example for each message ID of the log messages. Next, based on the sorted result, the anomaly detection device 300 estimates that the log messages appearing in the top predetermined fraction of the time differences (tA1-max, tA2-max, ...) constitute the order (sequence) of log messages for message ID mid-A (step S4204). Next, it collects as learning data the log messages whose message IDs occur in the same order as that sequence (step S4205).

For example, as shown in FIG. 18, suppose that for samples 001 to 100, mid-A2, mid-A1, ... are sampled as log messages related to mid-A. The difference between mid-A and mid-A2 (tA2), the difference between mid-A and mid-A1 (tA1), and so on are calculated; the difference at which the probability density of the tA2 distribution peaks (tA2-max), the difference at which the probability density of the tA1 distribution peaks (tA1-max), and so on are found and sorted. Suppose the sort yields 91 sequences in the order mid-A, mid-A2, mid-A1, ... and 9 sequences in the order mid-A, mid-A1, mid-A2, .... The anomaly detection device 300 adopts the order mid-A, mid-A2, mid-A1, ... and can acquire sequences matching it as learning data. The anomaly detection device 300 can thus acquire learning data even when log messages occur at high density.
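The step of finding the most probable time difference per message ID and sorting the results can be sketched with a small Gaussian kernel density estimate. The kernel, the bandwidth of 0.5, the grid search for the density peak, the function names, and the sample time differences are all assumptions; the text only states that kernel density estimation is used.

```python
import math


def kde_mode(samples, bandwidth, grid_steps=200):
    # Evaluate an unnormalized Gaussian KDE on a grid and return the peak location.
    low, high = min(samples), max(samples)
    if low == high:
        return low

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    grid = [low + (high - low) * i / grid_steps for i in range(grid_steps + 1)]
    return max(grid, key=density)


def estimate_order(differences_by_id, bandwidth=0.5):
    # Sort message IDs by their most probable time difference from mid-A.
    modes = {mid: kde_mode(diffs, bandwidth) for mid, diffs in differences_by_id.items()}
    return sorted(modes, key=modes.get)


# Hypothetical time differences from mid-A, each with one outlier.
differences = {
    "mid-A1": [2.0, 2.1, 1.9, 2.0, 0.4],
    "mid-A2": [1.0, 1.1, 0.9, 1.0, 3.0],
}
order = estimate_order(differences)
print(order)
```

Using the density peak instead of the mean keeps a single outlier (0.4 for mid-A1, 3.0 for mid-A2) from changing the estimated order.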

As shown in FIG. 19, in the model creation process the anomaly detection device 300 repeats the processing from step S420 to step S424 once for each message set. FIG. 20 is a diagram showing an example of the process of creating a normal Markov model and a priority Markov model.
The anomaly detection device 300 first creates a normal Markov model M using the set of log messages L as learning data (step S430). The normal Markov model M includes, for example, the log messages that make up a sequence and information representing the transition probabilities between those log messages. The anomaly detection device 300 inputs the learning data to a machine learning algorithm for Markov models and adjusts the parameters of the algorithm so as to minimize its output error. Next, the anomaly detection device 300 calculates the duration value of each log message included in the created normal Markov model M (step S432). A duration value is information representing the time difference between log messages. FIG. 21 is a diagram showing an example of duration values. Next, the anomaly detection device 300 creates a priority model using the priority messages among the log messages L as learning data (step S434).
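As a rough illustration of what such a model holds, the sketch below estimates first-order transition probabilities by simple counting and collects the time gap for each transition. Counting is a stand-in chosen for brevity and is not the error-minimizing parameter adjustment the text describes. The function names and the input layout (lists of message IDs, or lists of `(message_id, timestamp)` pairs) are assumptions.

```python
from collections import Counter, defaultdict

def fit_markov(sequences):
    """Estimate first-order transition probabilities from message-ID sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }

def transition_durations(timed_sequences):
    """Collect the time gaps (duration values) observed for each state transition."""
    gaps = defaultdict(list)
    for seq in timed_sequences:
        for (a, ta), (b, tb) in zip(seq, seq[1:]):
            gaps[(a, b)].append(tb - ta)
    return dict(gaps)

# Hypothetical learning data.
model = fit_markov([["A", "B", "C"], ["A", "B", "D"], ["A", "C"]])
durations = transition_durations([[("A", 0), ("B", 5)], [("A", 10), ("B", 17)]])
```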

FIG. 22 is a diagram showing one set of learning data and a plurality of sets of learning data. Priority messages are a set of log messages that occur only a few times per sequence but have a high probability of occurring in each sequence. A "log message with a small number of occurrences per sequence" is a log message that is not repeated often within one set of learning data. A priority message is, for example, a log message whose number of occurrences is lower than that of any given log message. A "log message with a high occurrence probability per sequence" is a log message that appears in every set of learning data (L1, L2, …, Ln).

As shown in FIG. 20, the anomaly detection device 300 creates the normal Markov model M using the log messages L, which include message groups A, B, C, …, and creates the priority Markov model using the priority messages among the log messages L, which include message groups A and C. By repeating the processing from step S420 to step S424 once for each message set, the anomaly detection device 300 creates as many normal Markov models and priority Markov models as there are message sets.

(Creation of the priority model)
FIG. 23 is a flowchart showing an example of the procedure for creating the priority model.
The anomaly detection device 300 repeats the bootstrap method of steps S440 to S442 a predetermined number of times. First, the anomaly detection device 300 extracts a predetermined number of learning data L' from the learning data L of the target message set X (step S440), and extracts the messages L'' whose log message IDs are not repeated often within the learning data L' (step S441). The anomaly detection device 300 may, for example, extract the messages L'' of the log message ID with the fewest appearances in the learning data L'. By creating pseudo data containing the messages L'' for each set of the predetermined number of learning data L', the anomaly detection device 300 creates a pseudo data set containing a plurality of pseudo data (bootstrap method). Next, the anomaly detection device 300 calculates the appearance probability C of each log message ID per set of learning data in the learning data L'' (step S442). The anomaly detection device 300 thereby obtains the appearance probability C for each pseudo data.

Next, the anomaly detection device 300 calculates the average value C' of the appearance probabilities C (step S443), takes out the log message IDs (Z) whose average appearance probability C' is equal to or greater than a predetermined value (step S444), and extracts the messages L''' having the log message IDs (Z) from the learning data L of the target log message set X (step S445). Next, the anomaly detection device 300 calculates the AIC from the learning data L''' (step S446) and creates an n-th order Markov model M' from the learning data L''' (step S447). Next, the anomaly detection device 300 calculates the duration value of each log message included in the created Markov model M' (step S448).
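The bootstrap selection of priority message IDs can be sketched as below, under several assumptions. "Not repeated often" is read here as "appears at most `max_repeats` times in any one sampled sequence", which is only one possible reading of step S441. The function name, the parameters, and their default values are hypothetical.

```python
import random
from collections import Counter

def priority_ids(learning_data, n_rounds=100, sample_size=5,
                 max_repeats=1, min_prob=0.9, seed=0):
    """Pick message IDs that rarely repeat within a sequence yet appear in most sequences."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(n_rounds):
        # Step S440: draw a resample L' of the learning data (with replacement).
        sample = [rng.choice(learning_data) for _ in range(sample_size)]
        per_seq = [Counter(seq) for seq in sample]
        ids = set().union(*per_seq)
        # Step S441 (assumed reading): keep IDs never repeated more than max_repeats times.
        rare = {mid for mid in ids if max(c[mid] for c in per_seq) <= max_repeats}
        # Step S442: appearance probability C = share of sampled sequences containing the ID.
        for mid in rare:
            totals[mid] += sum(mid in c for c in per_seq) / sample_size
    # Steps S443 and S444: average C over the rounds and keep IDs at or above the threshold.
    return {mid for mid, total in totals.items() if total / n_rounds >= min_prob}

# Hypothetical learning data: A and C occur once in every sequence, B repeats, D is sporadic.
data = [
    ["A", "B", "B", "B", "C"],
    ["A", "B", "B", "C"],
    ["A", "B", "B", "B", "B", "C", "D"],
]
selected = priority_ids(data)
```

With this data the selection keeps A and C, which matches the example in which message groups A and C make up the priority messages.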

(Calculation of duration values)
FIG. 24 is a flowchart showing an example of the procedure for calculating duration values. The anomaly detection device 300 repeats the duration value calculation of steps S450 to S462 once for each state transition of the Markov model M'.

First, the anomaly detection device 300 repeats the processing of steps S450 to S458 a predetermined number of times using the bootstrap method and the ensemble method. To aggregate duration values on a daily basis, the anomaly detection device 300 repeats steps S450 to S454 once for each day of the learning period. The anomaly detection device 300 first extracts from the learning data the duration values for the target day in the target state transition (step S450) and clusters the duration values (step S452). Next, the anomaly detection device 300 takes a predetermined number of duration values C from each cluster of duration values (step S454). Next, the anomaly detection device 300 clusters the set of duration values C (step S456). Next, the anomaly detection device 300 discards those clusters of duration values C that contain no more than a predetermined number of duration values (step S458).

Next, the anomaly detection device 300 clusters the set of duration values C' (step S460) and calculates the mean and deviation of each cluster of duration values C' (step S462).

(Clustering of duration values)
FIG. 25 is a flowchart showing an example of the procedure for clustering duration values. First, the anomaly detection device 300 determines whether the stop condition is satisfied. If it is satisfied, the processing of this flowchart ends. If it is not satisfied, the anomaly detection device 300 uses k-means to split the set of duration values into two sets of duration values (D1 and D2) (step S472). Next, the anomaly detection device 300 recursively clusters each of the resulting sets of duration values (step S474). The anomaly detection device 300 repeats the two-way split of a set of duration values and the recursive clustering of each resulting set until the stop condition is satisfied. The anomaly detection device 300 can thereby generate a plurality of clusters of duration values.

FIG. 26 is a diagram showing an example of duration value clustering. Suppose, for example, that there are a plurality of duration values ranging from 1 to 1010 [μsec] and that the stop condition is CV < 0.5 and Z < 1. CV is the coefficient of variation, that is, the deviation σ divided by the mean μ, and Z is the maximum divergence from the mean (max|x − μ| / σ). The anomaly detection device 300 splits the duration values into two sets (D1, D2), splits the set D1 into two sets (D11, D12), splits the set D11 into two sets (D111, D112), and splits the set D12 into two sets (D121, D122). As a result, the anomaly detection device 300 can divide the duration values into five clusters (D2, D111, D112, D121, and D122). The anomaly detection device 300 can thus generate clusters that satisfy the stop condition without the number of clusters being set in advance.

Note that the duration value clustering may use processing other than the k-means-based clustering described above, as long as it clusters on the basis of multiple conditions, namely the magnitude and the variation of the duration values. For example, it suffices that duration values of 1000 msec and 1010 msec fall into the same cluster while 1 msec and 10 msec fall into different clusters.
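The recursive two-way split with the CV and Z stop condition can be sketched as follows. This is an illustration under stated assumptions: σ is taken to be the sample standard deviation (`statistics.stdev`), since the text does not say which deviation is meant; a set with fewer than two values or with all values equal is treated as already satisfying the stop condition; duration values are assumed positive so that the mean is nonzero; and the function names are hypothetical.

```python
import statistics

def two_means(values):
    """Split sorted 1-D values into two groups with Lloyd's k-means (k = 2)."""
    lo_c, hi_c = float(values[0]), float(values[-1])
    while True:
        low = [v for v in values if abs(v - lo_c) <= abs(v - hi_c)]
        high = [v for v in values if abs(v - lo_c) > abs(v - hi_c)]
        new_lo, new_hi = statistics.fmean(low), statistics.fmean(high)
        if (new_lo, new_hi) == (lo_c, hi_c):
            return low, high
        lo_c, hi_c = new_lo, new_hi

def should_stop(values, cv_max=0.5, z_max=1.0):
    """Stop condition: CV < cv_max and Z < z_max (degenerate sets always stop)."""
    if len(values) < 2 or values[0] == values[-1]:
        return True
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)  # sample standard deviation (assumption)
    cv = sigma / mu
    z = max(abs(v - mu) for v in values) / sigma
    return cv < cv_max and z < z_max

def cluster_durations(values, cv_max=0.5, z_max=1.0):
    """Recursively bisect duration values until every cluster meets the stop condition."""
    values = sorted(values)
    if should_stop(values, cv_max, z_max):
        return [values]
    low, high = two_means(values)
    return (cluster_durations(low, cv_max, z_max)
            + cluster_durations(high, cv_max, z_max))

clusters = cluster_durations([1000, 1, 1010, 10])
```

Because CV is relative to the mean, 1000 and 1010 stay together while 1 and 10 are separated, and no cluster count has to be supplied beforehand.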

(Elimination of abnormal duration values)
FIG. 27 is a diagram for explaining the clustering of duration values with abnormal values taken into account. The anomaly detection device 300 desirably performs processing to eliminate abnormal duration values. After clustering the duration values aggregated on a daily basis as described above, the anomaly detection device 300 uses the bootstrap method to adjust the number of samples in each cluster over the entire learning period to a predetermined number. When the anomaly detection device 300 then combines the daily per-cluster samples, duration values that occur often during the learning period accumulate to a large count, while duration values that occur rarely during the learning period accumulate little. Among the combined duration values, the anomaly detection device 300 decides to discard those whose accumulated count is below a predetermined threshold. The anomaly detection device 300 can thereby remove the duration values marked for discarding from the daily duration values. As a result, the anomaly detection device 300 can eliminate, as abnormal values, duration values that occur only rarely over the learning period and create clusters of duration values consisting of normal values. To eliminate abnormal values from the duration values with high accuracy, the anomaly detection device 300 desirably performs the following processing multiple times: adjusting the daily duration values to the predetermined number, accumulating the duration values over the learning period, and discarding the duration values whose count is below the predetermined threshold.

Because abnormal duration values may occur in bursts, it is desirable to judge whether a duration value is abnormal from its number of occurrences per day rather than from its total count over the learning period. However, because duration values are continuous, abnormal values cannot be eliminated accurately by simply applying a threshold to the duration values. Therefore, as described above, the numbers of occurrences of duration values clustered on a daily basis are compared across the learning period, so that duration values that occur on few days can be eliminated as abnormal values. The anomaly detection device 300 can also improve the accuracy of the duration values by repeating the above processing multiple times.

(Higher-order Markov model)
FIG. 28 is a diagram for explaining the process of raising the order of the priority Markov model.
The known simple Markov model estimates the next log message from the single preceding log message. To improve estimation accuracy, the anomaly detection device 300 may create a higher-order Markov model that estimates a log message by considering the log messages two or more steps back. However, applying a higher-order Markov model in place of the simple Markov model can reduce estimation accuracy. In particular, estimation accuracy tends to degrade when the sequence estimated from a message set is very long. The anomaly detection device 300 therefore limits the range over which the order is raised and performs partial order-raising, in which only the priority Markov model is made higher-order.

The anomaly detection device 300 creates the higher-order Markov model using the priority messages. The anomaly detection device 300 selects the order k using the following AIC (Akaike Information Criterion) formula and creates a k-th order Markov model. In the formula, kηm is the likelihood ratio statistic, expressed as "−2 × (LLk − LLm)"; LLk is the log likelihood of the k-th order Markov chain; LLm is the log likelihood of the m-th order Markov chain; S^m − S^k (S − 1) is the likelihood ratio test statistic; and S is the original number of states. As a result, the anomaly detection device 300 can obtain stable, high estimation accuracy even when the sequence is long.
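The formula itself does not appear in this text. One form that is consistent with the symbols defined above, and with the AIC commonly used to choose the order of a Markov chain, is sketched below. Grouping the state-count term as (S^m − S^k)(S − 1) and choosing the order that minimizes the criterion are assumptions, not statements taken from the source.

```latex
\mathrm{AIC}(k) = {}_{k}\eta_{m} - 2\,\bigl(S^{m} - S^{k}\bigr)(S - 1),
\qquad
{}_{k}\eta_{m} = -2\,\bigl(LL_{k} - LL_{m}\bigr),
\qquad
\hat{k} = \operatorname*{arg\,min}_{0 \le k < m} \mathrm{AIC}(k)
```

Under this reading, m is the highest order considered, and a larger k is adopted only when the gain in log likelihood outweighs the growth in the number of free parameters.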

[Sequence estimation process]
FIG. 29 is a sequence diagram showing an example of the sequence estimation process.
The monitored system 100 transmits log messages to the data processing device 200. As described above, the data processing device 200 analyzes each log message based on the configuration information (step S502) and adds a message ID to the log message (step S504). The data processing device 200 transmits the log message with the added message ID to the anomaly detection device 300. The anomaly detection device 300 estimates the sequence using the message ID (step S506) and transmits a sequence value indicating the sequence to the data processing device 200. The data processing device 200 can thereby provide the user terminal device 400 with information representing the sequence value and information about the sequence.

FIG. 30 is a sequence diagram showing an example of the procedure of the sequence estimation process. The anomaly detection device 300 first retrieves the log messages (step S510) and estimates the sequence. The sequence is estimated with Markov models in the following order: first with either the conflict-adjusted priority Markov model or the non-conflict-adjusted priority Markov model, and then with the normal Markov model. The conflict-adjusted Markov model is described later.

First, the anomaly detection device 300 creates either the conflict-adjusted priority Markov model or the non-conflict-adjusted priority Markov model (step S512) and performs estimation on the log message (step S513). The anomaly detection device 300 repeats the creation of the conflict-adjusted priority Markov model and the estimation once for each log message retrieved in step S510. If the conflict-adjusted priority Markov model is not used, it need not be created. The anomaly detection device 300 estimates the sequence by iterating over the log messages with either the conflict-adjusted or the non-conflict-adjusted priority Markov model, and then estimates the sequence by iterating over the log messages with the normal Markov model.

FIG. 31 is a flowchart showing an example of the procedure for creating the conflict-adjusted Markov model. First, the anomaly detection device 300 determines whether common messages, that is, log messages that are close in time, exist between the learning data of the normal Markov model and the learning data of the priority Markov model (step S520). If there are no common messages (step S520: NO), the anomaly detection device 300 ends the processing of this flowchart. If there are common messages (step S520: YES), it proceeds to step S522. In step S522, the anomaly detection device 300 removes the common messages from the learning data. Next, the anomaly detection device 300 creates the priority Markov model using the learning data with the common messages removed (step S524).

FIG. 32 is a diagram for explaining an example of the process of creating the conflict-adjusted priority Markov model. Suppose, for example, that the priority Markov model has been trained on the priority messages included in the set X of log messages and that the normal Markov model has been trained on the set Y of log messages. As described above, the anomaly detection device 300 estimates the sequence with the priority Markov model and then estimates the sequence with the normal Markov model.

However, as shown in FIG. 32, when the set X of log messages and the set Y of log messages overlap in time, there are log messages M that are included in the priority messages X' of set X but not in the priority messages Y' of set Y. In this case, the log messages M are biased toward set X. The anomaly detection device 300 therefore creates the priority Markov model after excluding the common messages M from the priority messages for set X and set Y. That is, the anomaly detection device 300 creates the conflict-adjusted priority Markov model using the messages M', which are the priority messages X' included in set X with the common messages M removed. More precisely, M' is obtained by removing from the priority messages X' the common log messages M that fall within the difference between set Y and the priority messages Y' included in set Y. In this way, the anomaly detection device 300 can prevent log messages that are close in time from being biased toward the priority messages of one of the log message sets.
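In set terms, the adjustment above amounts to M = X' ∩ (Y \ Y') and M' = X' \ M. The sketch below expresses that with message IDs held in Python sets. It deliberately leaves out the time-proximity check that decides whether X and Y overlap at all, and the function and parameter names are hypothetical.

```python
def conflict_adjusted_priority(x_priority, y_all, y_priority):
    """Return (M, M'): the common messages and the adjusted priority messages for set X."""
    # M: priority messages of X that also occur in Y but are not priority messages of Y.
    common = x_priority & (y_all - y_priority)
    # M': priority messages of X with the common messages removed.
    return common, x_priority - common

# Hypothetical message IDs.
common, adjusted = conflict_adjusted_priority(
    x_priority={"a", "b", "c"},
    y_all={"b", "d", "e"},
    y_priority={"d"},
)
```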

Note that the conflict-adjusted priority Markov model is desirably created at sequence estimation time rather than at Markov model creation time. This is because, when the Markov models are created, no processing is performed to determine whether sets of log messages occur at timings close to one another. If closeness in time to the log messages used for another Markov model were judged at model creation time, the number of priority messages would decrease and the sequence estimation accuracy of the priority Markov model would drop.

FIG. 33 is a flowchart showing an example of the procedure of sequence estimation for a log message. First, the anomaly detection device 300 determines whether the sequence has already been estimated for the target log message (step S530). If it has (step S530: YES), the processing of this flowchart ends. If it has not (step S530: NO), the anomaly detection device 300 extracts, for the target log message x, a log message y that precedes log message x in time and matches a state transition of the Markov model (step S532). Next, the anomaly detection device 300 determines whether a candidate for log message y was found in step S532 (step S534). If a candidate for log message y exists (step S534: YES), the anomaly detection device 300 assigns log message x the same sequence value as log message y (step S536). If no candidate for log message y exists (step S534: NO), the anomaly detection device 300 extracts the log message closest in time to log message x and assigns log message x the same sequence value as that extracted log message (step S538).
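The flow of steps S532 to S538 can be sketched as a single pass over time-ordered messages. Several points are assumptions because the text does not settle them: a hypothetical `start_ids` parameter marks the message IDs that open a new sequence, the most recent matching predecessor is chosen when several exist, and "closest in time" is limited to earlier messages because later ones have no sequence value yet.

```python
def assign_sequences(messages, transitions, start_ids):
    """Assign a sequence value to each (timestamp, message_id), sorted by timestamp.

    transitions: set of (prev_id, next_id) pairs present in the Markov model.
    start_ids:   IDs assumed to open a new sequence (hypothetical parameter).
    """
    seq_values = []
    next_value = 0
    for i, (_, mid) in enumerate(messages):
        if mid in start_ids or i == 0:
            # Assumption: a start ID (or the very first message) opens a new sequence.
            seq_values.append(next_value)
            next_value += 1
            continue
        # Step S532: look back for a message y whose ID can transition to this ID.
        match = next(
            (j for j in range(i - 1, -1, -1)
             if (messages[j][1], mid) in transitions),
            None,
        )
        if match is not None:
            seq_values.append(seq_values[match])   # step S536
        else:
            seq_values.append(seq_values[i - 1])   # step S538: nearest earlier message
    return seq_values

# Hypothetical interleaved log: two A -> B -> C flows plus an unknown message X.
log = [(0, "A"), (1, "B"), (2, "A"), (3, "X"), (4, "C"), (5, "B")]
values = assign_sequences(log, {("A", "B"), ("B", "C")}, start_ids={"A"})
```

In the sample log, message X matches no transition and so inherits the sequence value of its nearest earlier neighbour, while C is attached to the earlier flow through its B predecessor.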

FIG. 34 is a diagram for explaining the sequence estimation process using the priority Markov model and the normal Markov model.
The anomaly detection device 300 divides the retrieved set of log messages into priority messages and log messages other than priority messages. The anomaly detection device 300 inputs the priority messages only to the priority Markov model and estimates the sequences contained in the priority messages. Next, the anomaly detection device 300 inputs the log messages other than the priority messages to the normal Markov model and estimates the sequences contained in those log messages. The anomaly detection device 300 can thereby obtain a sequence estimation result from the priority Markov model and a sequence estimation result from the normal Markov model.

FIG. 35 is a flowchart showing another example of the sequence estimation process. The anomaly detection device 300 may use duration values to perform supplementary sequence estimation. The anomaly detection device 300 refers to the sequence estimation result, searches for log messages to which a plurality of sequence values have been assigned, and determines whether any log message has a plurality of sequence candidates (step S540). If there is no log message with a plurality of sequence values (step S540: NO), the anomaly detection device 300 ends the sequence estimation process.

If there is a log message with a plurality of sequence values (step S540: YES), the anomaly detection device 300 compares the duration values of the log messages with the duration values of the Markov models (step S542). The anomaly detection device 300 calculates the duration values for the state transitions between the log messages included in the estimated sequence, and compares the calculated duration values with the duration values in the Markov models corresponding to each of the estimated sequence values. The anomaly detection device 300 adopts the sequence value corresponding to the Markov model whose duration values are closest to those calculated for the log messages (step S544).

FIG. 36 is a diagram for explaining the process of deciding the sequence. Suppose, for example, that a message group (sequence S1) consisting of log messages m11, m12, and m13 in that order has been estimated to be both sequence S2 and sequence S3. In this case, the anomaly detection device 300 calculates the duration values d11 and d12 of sequence S1 and determines that the difference between the calculated d11 and d12 and the duration values d21 and d22 of sequence S2 is larger than the difference between the calculated d11 and d12 and the duration values d31 and d32 of sequence S3. As a result, the anomaly detection device 300 can estimate that the message group (sequence S1) is sequence S3.
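A minimal sketch of this tie-break follows. The text does not define how the "difference" between two sets of duration values is measured, so the sum of absolute differences used here is an assumption, as are the function name and the sample numbers.

```python
def closest_sequence(observed, candidates):
    """Pick the candidate whose model durations are closest to the observed ones.

    observed:   duration values computed for the message group, e.g. [d11, d12].
    candidates: mapping of sequence name to that model's duration values.
    """
    def distance(model_durations):
        # Assumed distance measure: sum of absolute differences per transition.
        return sum(abs(o - m) for o, m in zip(observed, model_durations))

    return min(candidates, key=lambda name: distance(candidates[name]))

# Hypothetical durations: S1 is observed, S2 and S3 are the competing candidates.
chosen = closest_sequence([10, 20], {"S2": [50, 80], "S3": [12, 19]})
```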

[Anomaly determination process]
FIG. 37 is a flowchart showing an example of the procedure of the anomaly determination process. The anomaly detection device 300 repeats the anomaly determination process (step S610) once for each log message X to which a sequence value has been assigned by the sequence estimation process. The anomaly detection device 300 writes an anomaly flag to the detection result accumulation unit 208 in association with each log message X determined to be anomalous by the anomaly determination process (step S612).

FIG. 38 is a flowchart showing an example of the content of the anomaly determination process. The anomaly detection device 300 determines whether the target log message X matches either the priority Markov model or the normal Markov model (step S610). If it matches one of the Markov models (step S610: YES), the anomaly detection device 300 determines whether it matches the duration value (step S612).

If the target log message X matches neither the priority Markov model nor the normal Markov model (step S610: NO), the anomaly detection device 300 sets the anomaly flag for the target log message X to ON (step S614). If the duration value between log messages having the same sequence value as the target log message X does not match the duration value in the matched Markov model (step S612: NO), the anomaly detection device 300 likewise sets the anomaly flag for the target log message X to ON (step S614).
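The two checks can be sketched as below. What it means for a duration value to "match" is not specified, so this sketch assumes a match when the observed duration lies within `k` deviations of the mean of any duration cluster learned for that transition (the mean and deviation of step S462). The model layout, the function name, and `k` are hypothetical.

```python
def is_anomalous(prev_id, msg_id, duration, model, k=3.0):
    """Return True when the anomaly flag should be set to ON for this message.

    model: mapping of (prev_id, msg_id) to a list of (mean, deviation) duration clusters.
    """
    clusters = model.get((prev_id, msg_id))
    if clusters is None:
        return True  # step S610: NO, the transition is unknown to the model
    # Assumed matching rule: within k deviations of some cluster mean.
    matches = any(abs(duration - mean) <= k * dev for mean, dev in clusters)
    return not matches  # step S612: NO leads to step S614

# Hypothetical model with two duration clusters for the A -> B transition.
model = {("A", "B"): [(10.0, 1.0), (1000.0, 5.0)]}
flags = [
    is_anomalous("A", "B", 11.5, model),
    is_anomalous("A", "B", 500.0, model),
    is_anomalous("A", "C", 10.0, model),
]
```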

<Effects of Embodiment>
As described above, the sequence estimation system 1 of the embodiment vectorizes each of the plurality of log messages collected by the data processing device 200, sets a threshold for classifying the vectorized log messages, classifies the plurality of log messages using the set threshold, and sets an identifier (ID) that identifies each group of classified log messages. The sequence estimation system 1 can thereby set, in the data processing device 200, a rule for assigning IDs. When the data processing device 200 acquires a new log message, the sequence estimation system 1 can then assign a message ID to the acquired new log message. As a result, the sequence estimation system 1 saves the effort of the configuration work needed to classify log messages.
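
The flow summarized above (vectorize, cluster with a threshold, set IDs, then assign IDs to new messages) can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the embodiment: it uses a plain bag-of-words vector, cosine similarity, a simple first-representative clustering, and a threshold supplied by hand, whereas the embodiment sets the threshold itself. All names (`vectorize`, `cosine`, `MessageClassifier`) are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Optional


def vectorize(message: str) -> Counter:
    """Bag-of-words vector (assumption: stands in for the embodiment's vectorization)."""
    return Counter(message.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class MessageClassifier:
    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.representatives: Dict[int, Counter] = {}  # message ID -> cluster representative

    def assign(self, message: str, learn: bool = False) -> Optional[int]:
        """Return the ID of the most similar cluster, or None if no cluster is close enough."""
        vec = vectorize(message)
        best_id, best_sim = None, -1.0
        for cid, rep in self.representatives.items():
            sim = cosine(vec, rep)
            if sim > best_sim:
                best_id, best_sim = cid, sim
        if best_id is not None and best_sim >= self.threshold:
            return best_id
        if learn:  # while clustering, an unmatched message starts a new cluster
            new_id = len(self.representatives) + 1
            self.representatives[new_id] = vec
            return new_id
        return None

    def fit(self, messages: Iterable[str]) -> None:
        for message in messages:
            self.assign(message, learn=True)


clf = MessageClassifier(threshold=0.6)
clf.fit([
    "connection from host1 closed",
    "connection from host2 closed",
    "disk usage at 91 percent",
])
```

After `fit`, the two connection messages share one ID and the disk message gets another, so a new message such as `"connection from host9 closed"` receives the first ID without any hand-written classification rule. A message that resembles no cluster returns `None` in this sketch; how the embodiment handles that case is not covered here.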

Although the embodiments and modifications have been described, they are examples and the invention is not limited to them. For example, any of the embodiments or modifications, or a part of any embodiment or modification, may be combined with one or more other embodiments or modifications to realize one aspect of the present invention.

A program for executing each process of the data processing device 200 and the anomaly detection device 300 in this embodiment may be recorded on a computer-readable recording medium, and the various processes described above for the data processing device 200 and the anomaly detection device 300 may be performed by having a computer system read and execute the program recorded on the recording medium.

The "computer system" referred to here may include an OS and hardware such as peripheral devices. If a WWW system is used, the "computer system" also includes a homepage providing environment (or display environment). The "computer-readable recording medium" refers to a flexible disk, a magneto-optical disk, a ROM, a writable non-volatile memory such as a flash memory, a portable medium such as a CD-ROM, or a storage device such as a hard disk built into a computer system.

Furthermore, the "computer-readable recording medium" also includes media that hold a program for a certain period of time, such as a volatile memory (for example, DRAM (Dynamic Random Access Memory)) inside a computer system that serves as a server or a client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line. The program may also be transmitted from a computer system that stores the program in a storage device or the like to another computer system via a transmission medium or by a transmission wave in the transmission medium.

Here, the "transmission medium" that transmits the program refers to a medium having a function of transmitting information, such as a network (communication network) like the Internet or a communication line such as a telephone line. The program may implement only part of the functions described above. It may also be a so-called difference file (difference program), which implements the functions described above in combination with a program already recorded in the computer system.

1 Sequence estimation system
100 Monitored system
200 Data processing device
202 Format conversion unit
204 Data processing unit
206 Log data storage unit
208 Detection result storage unit
210 Visualization unit
300 Anomaly detection device
310 Configuration creation unit
320 Message registration unit
330 Learning unit
332 Message set estimation unit
334 Model creation unit
340 Estimation unit
342 Sequence estimation unit
344 Anomaly determination unit
400 User terminal device

Claims

A message classification device comprising:
a collection unit that collects log messages from a monitored system;
a vectorization unit that vectorizes each of a plurality of log messages collected by the collection unit;
a threshold setting unit that sets a threshold for classifying the log messages vectorized by the vectorization unit;
a clustering unit that clusters the plurality of log messages using the threshold set by the threshold setting unit and sets an identifier for identifying the clustered log messages; and
a classification unit that, when the collection unit acquires a new log message, assigns the identifier set by the clustering unit to the acquired new log message.

The vectorization unit calculates a weight for each word included in a log message such that the weight is higher the closer the word appears to the beginning of the log message and lower when the word includes a numerical value, and performs vectorization based on the calculated weights.

The message classification device according to claim 1, wherein the clustering unit acquires a cluster set by performing, multiple times, a process of sampling some log messages that are not yet included in any cluster from the set of log messages and a process of clustering the sampled log messages, and sets identifiers for the acquired cluster set.

The message classification device according to claim 3, wherein the threshold setting unit causes the clustering unit to perform classification using each of a plurality of threshold candidates, calculates AIC (Akaike information criterion) or BIC (Bayesian information criterion) based on the number of log messages included in the set of log messages and the number of clusters included in each resulting cluster set, and adopts the threshold candidate with the minimum AIC or BIC as the threshold.

A message classification method comprising:
a step in which an information processing device collects log messages from a monitored system;
a step in which the information processing device vectorizes each of a plurality of collected log messages;
a step in which the information processing device sets a threshold for classifying the vectorized log messages;
a step in which the information processing device clusters the plurality of log messages using the threshold and sets an identifier for identifying the clustered log messages; and
a step in which the information processing device, when it acquires a new log message, assigns the identifier to the acquired new log message.

A program that causes a computer to execute:
collecting log messages from a monitored system;
vectorizing each of a plurality of collected log messages;
setting a threshold for classifying the vectorized log messages;
clustering the plurality of log messages using the threshold and setting an identifier for identifying the clustered log messages; and
when a new log message is acquired, assigning the identifier to the acquired new log message.