JP6823501B2 - Anomaly detection device, anomaly detection method and program

Info

Publication number: JP6823501B2
Application number: JP2017040589A
Authority: JP
Inventors: 池田 泰弘; 中野 雄介; 渡辺 敬志郎; 石橋 圭介; 川原 亮一
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-03-03
Filing date: 2017-03-03
Publication date: 2021-02-03
Anticipated expiration: 2037-03-03
Also published as: JP2018147172A

Description

The present invention relates to an anomaly detection device, an anomaly detection method, and a program.

In real-time anomaly detection, various kinds of data are observed periodically, and an "anomaly" is detected when the data shows a tendency different from that of the normal state. Here, the anomaly detection algorithm is assumed to learn by using, as teacher data, the data of a "learning period" defined in advance as normal, and, during the "test period" in which anomaly detection is performed, to compare the observed test data with the tendency of the learned teacher data. As such algorithms, methods have been proposed that learn the correlations among various kinds of data in the normal state and judge the state to be "anomalous" when the learned correlations break down during the test period (for example, Non-Patent Document 1, Non-Patent Document 2, and Non-Patent Document 3).

Such algorithms have the advantage that anomaly detection can be performed using only normal data, without using data from anomalous periods, for which it is difficult to determine whether the data is actually anomalous.

Non-Patent Document 1: Hodge, Victoria J., and Jim Austin. "A survey of outlier detection methodologies." Artificial Intelligence Review 22.2 (2004): 85-126.
Non-Patent Document 2: Mayu Sakurada and Takehisa Yairi. "Anomaly Detection of Spacecraft by Dimensionality Reduction Using Autoencoder." Proceedings of the Japanese Society for Artificial Intelligence National Convention 28, 1-3, 2014.
Non-Patent Document 3: Ringberg, Haakon, et al. "Sensitivity of PCA for traffic anomaly detection." ACM SIGMETRICS Performance Evaluation Review 35.1 (2007): 109-120.

However, when the input contains a large amount of weakly correlated data, the number of state patterns that the data can take in the normal state grows combinatorially. The amount of teacher data required for learning therefore increases, and accurate anomaly detection becomes difficult when sufficient teacher data is not available. This problem becomes more pronounced as the number of observed data types increases, because the amount of weakly correlated data increases with it.

The present invention has been made in view of the above, and its object is to suppress the increase in the data required for learning to detect anomalies.

To solve the above problem, the anomaly detection device has a learning unit and a detection unit. The learning unit learns the correlations among data elements of a plurality of types of data obtained from the detection target when the detection target is normal. It does so for each unit into which the data elements are classified based on the strength of the correlation among them, using a plurality of learners generated for the respective units, and outputs the learning results. The detection unit takes a group of data elements of a plurality of types of data obtained from the detection target at a plurality of timings and, for each unit, calculates a degree of anomaly based on the learning result for that unit. The degree of anomaly indicates how far the correlation among the data elements classified into that unit has broken down. The detection unit then detects an anomaly in the detection target based on the degree of anomaly of each unit.

The increase in the data required for learning to detect anomalies can be suppressed.

A diagram showing a system configuration example in the first embodiment.
A diagram showing a hardware configuration example of the anomaly detection device 10 in the first embodiment.
A diagram showing a functional configuration example of the anomaly detection device 10 in the first embodiment.
A flowchart for explaining an example of the processing procedure of the learning process in the first embodiment.
A flowchart for explaining an example of the processing procedure of the detection process in the first embodiment.
A diagram for explaining an autoencoder.
A flowchart for explaining the processing procedure additionally executed by the preprocessing unit 13 in the sixth embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a diagram showing a system configuration example in the first embodiment. In FIG. 1, the network N1 is the network in which anomalies are to be detected. The network N1 is formed by interconnecting a plurality of nodes such as routers and server devices, and packets are sent and received between arbitrary nodes in order to provide a predetermined service.

Measuring devices 20 are placed at a plurality of locations in the network N1. Each measuring device 20 collects, at a plurality of timings, observation data obtained by monitoring the location where it is placed. Examples of the collected observation data include MIB (Management Information Base) data, NetFlow flow data, and CPU usage rates.

MIB is a common policy shared among manufacturers for monitoring network devices. MIB data is aggregated, for example, in units of five minutes and includes items such as "time, host name, interface (IF) name, input data amount (ibps), and output data amount (obps)".

NetFlow is a technology for monitoring a network on a per-flow basis; information about a flow is output when its communication ends. A flow is a unit for grasping "how much" of "what kind of communication" takes place between "where" and "where". It is identified by five attributes: the sender's IP address (srcIP), the sender's port number (srcport), the receiver's IP address (dstIP), the receiver's port number (dstport), and the communication protocol (proto). Flow data includes items such as "flow start time, srcIP, srcport, dstIP, dstport, proto, flow duration, total number of transmitted packets, and total number of transmitted bytes".

The CPU usage rate is, for example, the usage rate of the CPU of a server device, router, or the like included in the network N1.

The observation data collected by the measuring devices 20 is gathered by the anomaly detection device 10. The anomaly detection device 10 is a computer that learns the characteristics of the normal state from the gathered observation data and, based on the learning result, detects the occurrence of an anomaly (determines whether an anomaly is present) in observation data that is input thereafter. The process in which the characteristics of the normal state are learned is called the "learning process". The process in which anomalies are detected based on the result of the learning process is called the "test process".

FIG. 2 is a diagram showing a hardware configuration example of the anomaly detection device 10 in the first embodiment. The anomaly detection device 10 of FIG. 2 has a drive device 100, an auxiliary storage device 102, a memory device 103, a CPU 104, an interface device 105, and the like, which are connected to one another by a bus B.

The program that implements the processing in the anomaly detection device 10 is provided on a recording medium 101 such as a CD-ROM. When the recording medium 101 storing the program is set in the drive device 100, the program is installed from the recording medium 101 into the auxiliary storage device 102 via the drive device 100. The program does not necessarily have to be installed from the recording medium 101, however, and may instead be downloaded from another computer via a network. The auxiliary storage device 102 stores the installed program as well as necessary files, data, and the like.

When an instruction to start the program is given, the memory device 103 reads the program from the auxiliary storage device 102 and stores it. The CPU 104 executes the functions of the anomaly detection device 10 according to the program stored in the memory device 103. The interface device 105 is used as an interface for connecting to a network.

FIG. 3 is a diagram showing a functional configuration example of the anomaly detection device 10 in the first embodiment. In FIG. 3, the anomaly detection device 10 has a receiving unit 11, a learning processing control unit 12, a preprocessing unit 13, a learning unit 14, a detection processing control unit 15, a detection unit 16, and the like. Each of these units is realized by processing that one or more programs installed in the anomaly detection device 10 cause the CPU 104 to execute. The anomaly detection device 10 also uses a teacher data storage unit 121, a parameter storage unit 122, an observation data storage unit 123, a learning result storage unit 124, a learning data storage unit 125, and the like. Each of these storage units can be realized by using, for example, the auxiliary storage device 102 or a storage device that can be connected to the anomaly detection device 10 via a network.

The teacher data storage unit 121 stores, as teacher data, observation data that has been confirmed in advance to have been collected in the normal state. The teacher data may, however, be created artificially instead of being selected from observation data.

The receiving unit 11 receives observation data from the measuring devices 20. The received observation data is stored in the observation data storage unit 123.

The learning processing control unit 12 controls the learning process.

The preprocessing unit 13 executes preprocessing on a set of teacher data, a set of observation data, or a set of learning data stored in the learning data storage unit 125. Preprocessing consists of operations such as extracting feature amounts for each unit time from a data set and normalizing the extracted feature amounts. The feature amounts are expressed in the form of a numerical vector. In the first round of learning, the teacher data group stored in the teacher data storage unit 121 is the target of preprocessing. Once the receiving unit 11 starts receiving observation data, the observation data group becomes a target of preprocessing. Furthermore, once the detection unit 16 has started detecting anomalies and the number of observation data items judged to be normal and stored as learning data in the learning data storage unit 125 reaches a predetermined number, that learning data group becomes a target of preprocessing.

When executing preprocessing on the teacher data group or the learning data group, the preprocessing unit 13 also generates or updates parameters for normalizing the observation data or the learning data (hereinafter referred to as "normalization parameters") and stores the generated or updated normalization parameters in the parameter storage unit 122.

The learning unit 14 executes learning based on the teacher data or the learning data. The learning result produced by the learning unit 14 is stored in the learning result storage unit 124.

The detection processing control unit 15 controls the detection process.

The detection unit 16 detects the occurrence of an anomaly based on the numerical vector generated when the preprocessing unit 13 preprocesses the observation data stored in the observation data storage unit 123, and on the learning result stored in the learning result storage unit 124. Specifically, the detection unit 16 calculates, as a degree of anomaly, how far the preprocessed numerical vector deviates from the learning result, and detects the occurrence of an anomaly by comparing that degree of anomaly with a threshold. The pre-normalization values of a numerical vector for which no anomaly was detected are stored in the learning data storage unit 125 as learning data.

The processing procedure executed by the anomaly detection device 10 is described below. FIG. 4 is a flowchart for explaining an example of the processing procedure of the learning process in the first embodiment. For convenience, the following describes an example in which flow data is the processing target.

When the learning process starts, the learning processing control unit 12 acquires the teacher data group from the teacher data storage unit 121 and inputs it to the preprocessing unit 13 (S101).

Next, the preprocessing unit 13 divides the input teacher data group into sets, one per unit time (S102). The teacher data storage unit 121 is assumed to store teacher data covering a period of unit time × U (hereinafter referred to as the "learning period"). The teacher data group is therefore divided into U sets.

Next, for each of the divided sets, the preprocessing unit 13 extracts feature amounts suited to the purpose and generates a multidimensional numerical vector whose elements are the extracted feature amounts, one per dimension (S103).

For example, suppose that the unit time is one minute and that the preprocessing unit 13 extracts feature amounts for each one-minute interval. Suppose also that the feature amounts are the total numbers of transmitted bytes for each protocol (TCP, UDP). In this case, if the flow start time of the first teacher data item is 12:00:00, the preprocessing unit 13 takes the set of teacher data (flow data) whose flow start time t satisfies 11:59:00 <= t < 12:00:00, calculates the total number of transmitted bytes over all flows whose protocol is TCP, the total number of transmitted bytes over all flows whose protocol is UDP, and so on, and generates a two-dimensional numerical vector with these feature amounts as its elements. Numerical vectors are generated in the same way for the other (U-1) sets.
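The per-minute feature extraction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout (dictionaries with `start`, `proto`, and `bytes` keys), the function name `extract_feature_vectors`, and the sample flows are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def extract_feature_vectors(flows, protocols=("TCP", "UDP")):
    """Sum transmitted bytes per protocol for each one-minute window.

    `flows` is a hypothetical list of dicts with keys "start" (datetime),
    "proto" (str) and "bytes" (int); this layout is assumed for illustration.
    """
    totals = defaultdict(lambda: [0] * len(protocols))
    for flow in flows:
        if flow["proto"] not in protocols:
            continue  # protocols outside the chosen features are ignored
        # Truncate the flow start time to the minute to identify its window.
        window = flow["start"].replace(second=0, microsecond=0)
        totals[window][protocols.index(flow["proto"])] += flow["bytes"]
    # One numerical vector per window, in chronological order.
    return [totals[window] for window in sorted(totals)]


flows = [
    {"start": datetime(2017, 3, 3, 11, 59, 5), "proto": "TCP", "bytes": 50},
    {"start": datetime(2017, 3, 3, 11, 59, 40), "proto": "TCP", "bytes": 30},
    {"start": datetime(2017, 3, 3, 11, 59, 59), "proto": "UDP", "bytes": 20},
    {"start": datetime(2017, 3, 3, 12, 0, 10), "proto": "TCP", "bytes": 90},
    {"start": datetime(2017, 3, 3, 12, 0, 30), "proto": "UDP", "bytes": 35},
]
vectors = extract_feature_vectors(flows)
print(vectors)
```

Each inner list is one two-dimensional numerical vector (TCP bytes, UDP bytes) for a single unit time. Windows that contain no matching flows produce no vector in this sketch; a real deployment would probably emit a zero vector for them.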

A feature attribute can also be specified as a combination, such as "TCP with sending port number 80". In addition, if each flow is regarded as having a value such as "number of flows: 1", the total number of flows having each attribute can be calculated in the same way and treated as a feature amount.

Next, the preprocessing unit 13 calculates the maximum value xmax_i of each metric i (each dimension i) across the numerical vectors and stores the calculated xmax_i in the parameter storage unit 122 (S104). In other words, in the first embodiment, the maximum value xmax_i of each metric i is the normalization parameter.

Here, let U = 3, and suppose that the numerical vectors generated in step S103 are {{80, 20}, {90, 35}, {100, 50}}. This means that the total numbers of transmitted TCP bytes and UDP bytes in three particular minutes were "TCP: 80 bytes, UDP: 20 bytes", "TCP: 90 bytes, UDP: 35 bytes", and "TCP: 100 bytes, UDP: 50 bytes", respectively. In this case, the maximum values xmax_i of the metrics of these numerical vectors are {100, 50} (that is, xmax_1 = 100 and xmax_2 = 50).

Next, the preprocessing unit 13 normalizes each numerical vector based on the normalization parameters (S105). Normalization is performed by dividing the value of metric i of each numerical vector by the maximum value xmax_i. The normalized numerical vectors are therefore {{0.8, 0.4}, {0.9, 0.7}, {1, 1}}.
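Steps S104 and S105 amount to a per-dimension maximum followed by an element-wise division, which can be sketched as below using the numbers from the text. The function names `compute_max` and `normalize` are hypothetical, and the handling of a zero maximum is an assumption the source does not address.

```python
def compute_max(vectors):
    """Return the per-metric maximum xmax_i over a list of numerical vectors (S104)."""
    return [max(column) for column in zip(*vectors)]


def normalize(vector, xmax):
    """Divide each metric by its maximum (S105).

    Mapping a metric whose maximum is 0 to 0.0 is an assumption made here
    to avoid division by zero; the source does not cover this case.
    """
    return [value / limit if limit else 0.0 for value, limit in zip(vector, xmax)]


teacher_vectors = [[80, 20], [90, 35], [100, 50]]
xmax = compute_max(teacher_vectors)            # normalization parameters
normalized = [normalize(v, xmax) for v in teacher_vectors]
print(xmax, normalized)
```

The same `normalize` call, with the stored `xmax`, would also serve the detection side, where a single newly observed vector is scaled by the parameters learned earlier.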

Next, the learning unit 14 learns from these numerical vectors using a learner (S106). The learning result is stored in the learning result storage unit 124.

Next, the learning processing control unit 12 waits until learning data covering the learning period has been stored (accumulated) in the learning data storage unit 125 (S107). That is, the wait continues until U pre-normalization numerical vectors have been stored in the learning data storage unit 125. The learning data storage unit 125 stores the numerical vectors that the detection unit 16 has judged to be normal (no anomaly has occurred).

When numerical vectors covering the learning period have been stored in the learning data storage unit 125 (Yes in S107), the learning processing control unit 12 acquires the numerical vector group from the learning data storage unit 125 and inputs it to the preprocessing unit 13 (S108). The acquired numerical vector group is deleted from the learning data storage unit 125. Step S104 and the subsequent steps are then executed for this numerical vector group. In the next step S105, normalization is therefore performed based on the newly calculated xmax_i.

FIG. 5 is a flowchart for explaining an example of the processing procedure of the detection process in the first embodiment. The processing procedure of FIG. 5 may be started at any time after step S106 of FIG. 4 has been executed at least once. That is, the processing procedure of FIG. 5 is executed in parallel with the processing procedure of FIG. 4.

In step S201, the detection processing control unit 15 waits for the unit time to elapse. This unit time has the same length as the unit time in the description of FIG. 4. During this wait, the observation data collected in real time and received by the receiving unit 11 is stored in the observation data storage unit 123.

When the unit time has elapsed (Yes in S201), the detection processing control unit 15 acquires the observation data group for the most recent unit time from the observation data storage unit 123 and inputs it to the preprocessing unit 13 (S202).

Next, the preprocessing unit 13 extracts feature amounts suited to the purpose from the observation data group and generates a multidimensional numerical vector whose elements are the extracted feature amounts (S203). For example, the total number of transmitted bytes over all flows whose protocol is TCP and the total number of transmitted bytes over all flows whose protocol is UDP are extracted, and a two-dimensional numerical vector with these as its elements is generated. Here, a single numerical vector is generated.

Next, the preprocessing unit 13 normalizes the generated numerical vector based on the maximum values xmax_i stored in the parameter storage unit 122 (S204). That is, each metric i of the numerical vector is divided by the maximum value xmax_i.

For example, if step S104 of FIG. 4 has been executed only once, based on the teacher data above, the maximum values xmax_i are {100, 50}. Therefore, if the numerical vector is {60, 40}, it is normalized to {0.6, 0.8}.

Next, the detection unit 16 executes the anomaly determination process (S205). In the anomaly determination process, whether the network N1 has an anomaly is determined based on the normalized numerical vector and the latest learning result stored in the learning result storage unit 124.

If it is determined that there is no anomaly (Yes in S206), the detection processing control unit 15 stores the pre-normalization version of the numerical vector in the learning data storage unit 125 as learning data (S207). If it is determined that there is an anomaly (No in S206), the pre-normalization version of the numerical vector is not stored in the learning data storage unit 125. Consequently, only numerical vectors from the normal state are stored in the learning data storage unit 125.

Step S201 and the subsequent steps are then repeated. While they are repeated, the normalization parameters used in step S204 are updated as needed in step S104 of FIG. 4, which is being executed in parallel. As a result, the numerical vectors can be normalized in a way that takes the trend of the incoming observation data into account.

For example, suppose that U = 3, that step S207 has been executed three times, and that {{60, 40}, {45, 20}, {30, 30}} has been stored in the learning data storage unit 125. In this case, the parameters are updated to xmax_1 = 60 and xmax_2 = 40, and the update is reflected in the parameter storage unit 122.
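The interplay between steps S207, S107, S108, and S104 can be sketched as a small buffer that accepts only vectors judged normal and is drained once U vectors have accumulated. The class `LearningBuffer` and its method names are hypothetical, and the anomalous sample vector is invented for illustration.

```python
class LearningBuffer:
    """Hypothetical stand-in for the learning data storage unit 125."""

    def __init__(self, period):
        self.period = period      # U: number of unit times in the learning period
        self.vectors = []

    def add_if_normal(self, raw_vector, is_normal):
        # S207: keep the pre-normalization vector only when no anomaly was found.
        if is_normal:
            self.vectors.append(list(raw_vector))

    def take_if_full(self):
        # S107/S108: hand over U vectors and delete them from the store.
        if len(self.vectors) < self.period:
            return None
        batch, self.vectors = self.vectors, []
        return batch


buffer = LearningBuffer(period=3)
observations = [([60, 40], True), ([500, 900], False), ([45, 20], True), ([30, 30], True)]
for raw_vector, is_normal in observations:
    buffer.add_if_normal(raw_vector, is_normal)

batch = buffer.take_if_full()
new_xmax = [max(column) for column in zip(*batch)]   # S104 on the new batch
print(batch, new_xmax)
```

Because the anomalous vector never enters the buffer, it cannot inflate the updated maxima; the parameters follow only the trend of traffic that was judged normal.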

Although the above describes an example in which the observation data is flow data, flow data, MIB data, and CPU usage rates may be received in parallel as observation data. In that case, each step of the processing procedures of FIGS. 4 and 5 is executed separately for each data type (flow data, MIB data, and CPU usage rate).

For MIB data given in a format such as {hostID, interfaceID, ibps, obps}, a numerical vector can be extracted with elements such as "ibps of host ID a in the unit time", "obps of host ID a in the unit time", "ibps of host ID b in the unit time", "obps of host ID b in the unit time", ..., "ibps of interface ID x in the unit time", "obps of interface ID x in the unit time", "ibps of interface ID y in the unit time", and "obps of interface ID y in the unit time".

Next, an example of step S106 of FIG. 4 and step S205 of FIG. 5 is described. In steps S106 and S205, a group of numerical vectors labeled with their data type is input to the learning unit 14 or the detection unit 16. In this embodiment, the label is one of "flow data", "MIB data", and "CPU usage rate". The label is attached to the teacher data and the observation data by, for example, the measuring device 20 or the receiving unit 11. That is, the label to be attached to an observation data item can be identified from the source from which it was collected. The label is carried over to the numerical vector generated by the preprocessing unit 13.

In step S106 of FIG. 4, the learning unit 14 generates a learner for each data type. The learning unit 14 classifies each input numerical vector according to the label attached to it and inputs the vector to the learner corresponding to the classification result. In this embodiment, a "flow data learner", an "MIB data learner", and a "CPU usage rate learner" are generated. As the learner, an autoencoder (Non-Patent Document 2), principal component analysis (Non-Patent Document 3), or another method that detects anomalies by learning the correlations among the metrics of a numerical vector can be used. This embodiment describes an example in which an autoencoder is used as the learner.

FIG. 6 is a diagram for explaining an autoencoder. An autoencoder is an anomaly detection algorithm based on deep learning. It exploits the fact that input data in the normal state is correlated across metrics and can therefore be compressed to a lower dimension. When an anomaly occurs, the correlations in the input data break down, so the compression does not work correctly and the difference between the input data and the output data becomes large.

As shown in (1) of FIG. 6, the learner (autoencoder) generated by the learning unit 14 is trained so that the output layer (Layer L_3) becomes close to the input layer (Layer L_1). Specifically, the learning unit 14 makes two copies of each numerical vector, applies one to the input layer and the other to the output layer, performs learning, and outputs the learning result. The learning result is stored in the learning result storage unit 124. The learning result is a group of parameters for the learner. Because a learner is generated for each data type, a learning result is also output for each data type and stored in the learning result storage unit 124.
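The idea of training the output layer to reproduce the input layer through a narrower middle layer can be illustrated with a deliberately small sketch. It is not the patent's learner: it is a linear autoencoder without biases or nonlinear activations, trained by plain full-batch gradient descent in pure Python, and the hidden size, learning rate, epoch count, initialization range, and all function names are assumptions chosen for readability.

```python
import random


def reconstruct(x, enc, dec):
    """Encode x into the hidden layer, then decode it back to the input size."""
    hidden = [sum(w * xi for w, xi in zip(row, x)) for row in enc]
    output = [sum(w * hj for w, hj in zip(row, hidden)) for row in dec]
    return hidden, output


def mse(x, y):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def train_autoencoder(vectors, hidden=1, epochs=2000, lr=0.1, seed=0):
    """Return (enc, dec) weight matrices; these play the role of the learning result."""
    rng = random.Random(seed)
    dim, n = len(vectors[0]), len(vectors)
    enc = [[rng.uniform(0.1, 0.5) for _ in range(dim)] for _ in range(hidden)]
    dec = [[rng.uniform(0.1, 0.5) for _ in range(hidden)] for _ in range(dim)]
    for _ in range(epochs):
        g_enc = [[0.0] * dim for _ in range(hidden)]
        g_dec = [[0.0] * hidden for _ in range(dim)]
        for x in vectors:
            h, y = reconstruct(x, enc, dec)
            # Gradient of the MSE with respect to each output element.
            err = [2.0 * (yi - xi) / dim for yi, xi in zip(y, x)]
            for i in range(dim):
                for j in range(hidden):
                    g_dec[i][j] += err[i] * h[j] / n
            for j in range(hidden):
                back = sum(err[i] * dec[i][j] for i in range(dim))
                for k in range(dim):
                    g_enc[j][k] += back * x[k] / n
        # Full-batch gradient descent step on both weight matrices.
        for i in range(dim):
            for j in range(hidden):
                dec[i][j] -= lr * g_dec[i][j]
        for j in range(hidden):
            for k in range(dim):
                enc[j][k] -= lr * g_enc[j][k]
    return enc, dec


def anomaly_score(x, enc, dec):
    """Degree of anomaly: reconstruction error of x under the trained weights."""
    return mse(x, reconstruct(x, enc, dec)[1])


normal = [[0.8, 0.4], [0.9, 0.7], [1.0, 1.0]]   # normalized vectors from the text
enc, dec = train_autoencoder(normal)
print(anomaly_score([0.9, 0.7], enc, dec), anomaly_score([0.1, 1.0], enc, dec))
```

A vector that follows the learned relationship between the two metrics reconstructs well, while one that breaks it (here a low first metric paired with a high second metric) yields a larger reconstruction error. A production system would more likely use a deep-learning library with nonlinear layers; the structure of "train on normal data, score by reconstruction error" stays the same.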

一方、検知部１６も、学習部１４と同様に、データ種別ごとに学習器を生成する。当該学習器には、学習部１４によって生成される学習器と同様にオートエンコーダ又は主成分分析等のうち、学習部１４が生成する学習器に対応する方法を用いることができる。 On the other hand, the detection unit 16 also generates a learning device for each data type, similarly to the learning unit 14. As the learning device, a method corresponding to the learning device generated by the learning unit 14 among the autoencoder, the principal component analysis, and the like can be used as in the learning device generated by the learning unit 14.

図５のステップＳ２０５において、検知部１６は、学習結果記憶部１２４に記憶されている学習結果に基づいて、「フローデータの学習器」、「ＭＩＢデータの学習器」、「ＣＰＵ使用率の学習器」を生成する。すなわち、検知部１６によって生成される学習器は、当該学習結果の出力時において学習部１４によって生成された学習器と同じである。検知部１６は、図６の（２）に示されるように、ステップＳ２０５において入力されたデータ種別ごとの数値ベクトルを当該数値ベクトルのデータ種別に対応する学習器へ入力し、学習器に対する入力データと出力データとの距離（メトリック間の相関関係の崩れの程度を示す指標）を異常度として計算する。本実施の形態ではオートエンコーダの入力層と出力層との距離である平均二乗誤差（ＭＳＥ：Mean Squared Error）が異常度として計算される。ＭＳＥの計算式は、以下の通りである。 In step S205 of FIG. 5, the detection unit 16 generates a "flow data learner", an "MIB data learner", and a "CPU usage rate learner" based on the learning results stored in the learning result storage unit 124. That is, each learner generated by the detection unit 16 is the same as the learner that the learning unit 14 had generated when that learning result was output. As shown in (2) of FIG. 6, the detection unit 16 inputs the numerical vector of each data type input in step S205 to the learner corresponding to that data type, and calculates the distance between the input data and the output data of the learner (an index of how far the correlation between the metrics has broken down) as the degree of abnormality. In the present embodiment, the mean squared error (MSE), which is the distance between the input layer and the output layer of the autoencoder, is calculated as the degree of abnormality. The calculation formula of MSE is as follows.
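For reference, the standard definition of the mean squared error between an input vector and its reconstruction can be written as below. The symbols x_i (i-th element of the input layer), \hat{x}_i (i-th element of the output layer), and N (number of metrics in the vector) are notation assumed here for illustration and are not taken from the specification.

```latex
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \hat{x}_i \right)^2
```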

本実施の形態では、フローデータのＭＳＥ、ＭＩＢデータのＭＳＥ、ＣＰＵ使用率のＭＳＥの３種のＭＳＥが得られる。検知部１６は、得られたＭＳＥの平均を、最終的な異常度として計算し、最終的な異常度が予め定められた閾値を超えていた場合に異常であると判定する。そうでない場合、検知部１６は、正常であると判定する。 In the present embodiment, three MSEs are obtained: the MSE of the flow data, the MSE of the MIB data, and the MSE of the CPU usage rate. The detection unit 16 calculates the average of the obtained MSEs as the final degree of abnormality, and determines that an abnormality has occurred when the final degree of abnormality exceeds a predetermined threshold value. Otherwise, the detection unit 16 determines that the state is normal.
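The decision rule of the first embodiment (average the per-type MSEs, then compare with one threshold) might be sketched as follows. The function names, the dictionary keys, the MSE values, and the threshold 0.04 are illustrative assumptions, not values from the specification.

```python
def mse(inputs, outputs):
    """Mean squared error between a learner's input and output vectors."""
    if not inputs or len(inputs) != len(outputs):
        raise ValueError("vectors must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(inputs, outputs)) / len(inputs)


def detect_by_mean(per_type_mse, threshold):
    """Return (final degree of abnormality, abnormal?).

    The final degree is the plain average of the per-type MSEs, and the
    state is judged abnormal only when it exceeds the threshold.
    """
    final = sum(per_type_mse.values()) / len(per_type_mse)
    return final, final > threshold


# Hypothetical per-type MSEs and threshold.
example_mse = mse([1.0, 2.0], [1.0, 4.0])  # (0 + 4) / 2
scores = {"flow": 0.02, "mib": 0.10, "cpu": 0.03}
final_score, abnormal = detect_by_mean(scores, threshold=0.04)
```

With these sample values the average is roughly 0.05, which is above the assumed threshold, so the state would be judged abnormal.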

上述したように、第１の実施の形態によれば、データの種別ごとに学習器が生成されて、学習及び異常の検知が行われる。ここで、同一のデータ種別に属するメトリック（データ要素）は、相関が高いことが推定される。したがって、相関の低いデータが同一の学習器に入力される可能性を低下させることができる。その結果、異常を検知するための学習に要するデータの増加を抑制することができる。 As described above, according to the first embodiment, a learning device is generated for each type of data, and learning and abnormality detection are performed. Here, it is presumed that the metrics (data elements) belonging to the same data type have a high correlation. Therefore, it is possible to reduce the possibility that data with low correlation is input to the same learner. As a result, it is possible to suppress an increase in data required for learning to detect an abnormality.

次に、第２の実施の形態について説明する。第２の実施の形態では第１の実施の形態と異なる点について説明する。第２の実施の形態において特に言及されない点については、第１の実施の形態と同様でもよい。 Next, the second embodiment will be described. The description of the second embodiment covers the points that differ from the first embodiment. Points not specifically mentioned in the second embodiment may be the same as in the first embodiment.

第２の実施の形態において、検知部１６は、各学習器から出力された異常度の重み付け平均を、最終的な異常度として算出する。この際、教師データ又は学習データに基づく数値ベクトル群を学習器に入力した際のＭＳＥの平均値が重みとして用いられる。 In the second embodiment, the detection unit 16 calculates the weighted average of the degrees of abnormality output from the learners as the final degree of abnormality. The weights are based on the average MSE obtained when the numerical vector group based on the teacher data or the learning data is input to each learner.

そこで、第２の実施の形態では、学習部１４が、図４のステップＳ１０６を実行するたびに、教師データ又は学習データの数値ベクトル群に基づくデータ種別ごとの学習器から出力される学習結果を学習結果記憶部１２４に記憶する際に、データ種別ごとに、当該学習結果に基づく学習器へ各数値ベクトルを入力したデータ種別ごとの数値ベクトルを入力する。そうすることで、学習部１４は、データ種別ごと、かつ、数値ベクトルごとに異常度を算出し、更に、データ種別ごとに異常度の平均を算出する。例えば、Ｕ＝３であれば、データ種別ごとに３つの異常度が算出され、データ種別ごとに異常度の平均が算出される。データ種別ごとの異常度の平均は、学習結果と共に学習結果記憶部１２４に記憶される。したがって、「フローデータのＭＳＥ平均」、「ＭＩＢデータのＭＳＥ平均」、「ＣＰＵ使用率のＭＳＥ平均」が記憶される。以下、それぞれを、β'＿｛ｔｒａｉｎ，１｝、β'＿｛ｔｒａｉｎ，２｝、β'＿｛ｔｒａｉｎ，３｝と表記する。 Therefore, in the second embodiment, each time the learning unit 14 executes step S106 of FIG. 4 and stores in the learning result storage unit 124 the learning result output from the learner of each data type based on the numerical vector group of the teacher data or the learning data, it inputs, for each data type, each numerical vector of that data type to the learner based on that learning result. The learning unit 14 thereby calculates a degree of abnormality for each data type and each numerical vector, and further calculates the average degree of abnormality for each data type. For example, if U = 3, three degrees of abnormality are calculated for each data type, and their average is calculated for each data type. The average degree of abnormality of each data type is stored in the learning result storage unit 124 together with the learning result. Accordingly, the "MSE average of flow data", the "MSE average of MIB data", and the "MSE average of CPU usage rate" are stored. Hereinafter, these are denoted β'_{train,1}, β'_{train,2}, and β'_{train,3}, respectively.

検知処理において、観測データに基づくデータ種別ごとの数値ベクトルを各学習器に入力することで得られるＭＳＥの平均を算出する際に、教師データ又は学習データに基づくＭＳＥの平均が大きいデータ種別ほど、観測データに基づくＭＳＥも大きくなることが考えられる。そこで、検知部１６は、学習結果記憶部１２４に記憶されている、教師データ又は学習データに基づくＭＳＥの平均を重みとして、データ種別ごとの異常度について重み付け平均を算出する。 In the detection process, when averaging the MSEs obtained by inputting the numerical vector of each data type based on the observation data to each learner, a data type whose average MSE on the teacher data or the learning data is larger can be expected to yield a larger MSE on the observation data as well. Therefore, the detection unit 16 calculates a weighted average of the degrees of abnormality of the data types, using as weights the averages of the MSE based on the teacher data or the learning data stored in the learning result storage unit 124.

具体的には、フローデータ、ＭＩＢデータ、ＣＰＵ使用率の観測データに基づく数値ベクトルを、学習結果に基づく学習器に入力した時のＭＳＥが、それぞれβ＿｛ｔｅｓｔ，１｝、β＿｛ｔｅｓｔ，２｝、β＿｛ｔｅｓｔ，３｝である場合、検知部１６は、最終的な異常度βを、以下の計算式に基づいて計算する。 Specifically, when the MSEs obtained by inputting the numerical vectors based on the observation data of the flow data, the MIB data, and the CPU usage rate to the learners based on the learning results are β_{test,1}, β_{test,2}, and β_{test,3}, respectively, the detection unit 16 calculates the final degree of abnormality β by the following formula.
β = (β_{test,1}/β'_{train,1} + β_{test,2}/β'_{train,2} + β_{test,3}/β'_{train,3}) / (1/β'_{train,1} + 1/β'_{train,2} + 1/β'_{train,3})
これは、重み係数を、教師データ又は学習データに基づくＭＳＥの平均の逆数（１／β'＿｛ｔｒａｉｎ，ｉ｝）とすることで、教師データ又は学習データに基づくＭＳＥが大きいほど、観測データに基づくＭＳＥの重みを小さくしていることを示す。 That is, the weighting coefficient is the reciprocal of the average MSE based on the teacher data or the learning data (1/β'_{train,i}), so the larger the MSE based on the teacher data or the learning data, the smaller the weight given to the MSE based on the observation data.
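The weighted average above can be worked through numerically. The sketch below assumes the per-type training averages β'_{train,i} are already available; the function name and all numeric values are hypothetical.

```python
def weighted_anomaly_score(test_mse, train_mse_mean):
    """Weighted average of per-type test MSEs.

    Each weight is the reciprocal of the average MSE observed for that
    data type on the teacher (training) data, so a type that reconstructs
    poorly even in normal operation contributes less to the final score.
    """
    if len(test_mse) != len(train_mse_mean):
        raise ValueError("one training average is needed per data type")
    if any(m <= 0 for m in train_mse_mean):
        raise ValueError("training MSE averages must be positive")
    weights = [1.0 / m for m in train_mse_mean]
    numerator = sum(w * b for w, b in zip(weights, test_mse))
    return numerator / sum(weights)


# Hypothetical values for flow data, MIB data, CPU usage rate.
beta_train = [0.5, 0.25, 1.0]   # weights become 2, 4, 1
beta_test = [1.0, 0.5, 2.0]
beta = weighted_anomaly_score(beta_test, beta_train)  # (2 + 2 + 2) / 7
```

In this example every test MSE is exactly twice its training average, so each type contributes equally to the numerator even though the raw MSEs differ by a factor of four.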

上述したように、第２の実施の形態によれば、正常時におけるデータ種別間の異常度の大きさの違いを考慮して、検知処理において最終的な異常度を算出することができる。 As described above, according to the second embodiment, the final degree of abnormality can be calculated in the detection process in consideration of the difference in the magnitude of the degree of abnormality between the data types in the normal state.

次に、第３の実施の形態について説明する。第３の実施の形態では第１の実施の形態と異なる点について説明する。第３の実施の形態において特に言及されない点については、第１の実施の形態と同様でもよい。 Next, the third embodiment will be described. The description of the third embodiment covers the points that differ from the first embodiment. Points not specifically mentioned in the third embodiment may be the same as in the first embodiment.

第３の実施の形態において、検知部１６は、学習器ごとに（すなわち、データ種別ごとに）異常有無の判定を行い、少なくともいずれか一つのデータ種別に関して異常が有ると判定した場合に、最終的な判定結果を「異常有り」とする。 In the third embodiment, the detection unit 16 determines the presence or absence of an abnormality for each learner (that is, for each data type), and sets the final determination result to "abnormal" when it determines that an abnormality is present for at least one of the data types.

検知処理において、データ種別ごとに、観測データに基づく数値ベクトルを当該データ種別に係る学習器に入力した際に得られるＭＳＥを、β＿｛ｔｅｓｔ，１｝、β＿｛ｔｅｓｔ，２｝、β＿｛ｔｅｓｔ，３｝とする。ここで、閾値は、データ種別ごとに予め定められているとし、それぞれθ＿１、θ＿２、θ＿３と表記する。この場合、検知部１６は、学習器ｉごとに、β＿｛ｔｅｓｔ，ｉ｝≧θ＿ｉの場合に異常有り、そうでない場合に異常無しと判定する。本実施の形態では、「フローデータ」、「ＭＩＢデータ」、「ＣＰＵ使用率」の３種の学習器についてそれぞれ異常有無の判定が行われ、少なくともいずれか一つについて「異常有り」と判定された場合に、最終的な異常有無の判断が「異常有り」とされ、そうでない場合に「異常無し」とされる。 In the detection process, for each data type, the MSE obtained when a numerical vector based on the observation data is input to the learner related to the data type is β_ {test, 1}, β_ {test, 2}, β_ {test. , 3}. Here, the threshold value is assumed to be predetermined for each data type, and is expressed as θ_1, θ_2, and θ_3, respectively. In this case, the detection unit 16 determines that there is an abnormality when β_ {test, i} ≧ θ_i for each learning device i, and that there is no abnormality when it is not. In the present embodiment, the presence or absence of an abnormality is determined for each of the three types of learners, "flow data", "MIB data", and "CPU usage rate", and at least one of them is determined to be "abnormal". If this is the case, the final judgment of the presence or absence of an abnormality is determined to be "abnormal", and if not, "no abnormality" is determined.
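A minimal sketch of this per-type thresholding follows, assuming the MSEs and thresholds are held in dictionaries keyed by data type. The names and values are hypothetical; the comparison uses ≧ as stated in the text.

```python
def detect_any(test_mse, thresholds):
    """Judge each data type separately, then combine with a logical OR.

    A type is flagged when its MSE is greater than or equal to its own
    threshold; the final result is abnormal if any type is flagged.
    """
    flags = {name: test_mse[name] >= thresholds[name] for name in test_mse}
    return flags, any(flags.values())


# Hypothetical MSEs (beta_test) and per-type thresholds (theta).
beta_test = {"flow": 0.08, "mib": 0.02, "cpu": 0.05}
theta = {"flow": 0.10, "mib": 0.05, "cpu": 0.05}
per_type_flags, abnormal = detect_any(beta_test, theta)
```

Here only the CPU usage rate reaches its threshold, which is enough for the final result to be "abnormal".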

上述したように、第３の実施の形態によれば、第１の実施の形態と同様の効果を得ることができる。 As described above, according to the third embodiment, the same effect as that of the first embodiment can be obtained.

次に、第４の実施の形態について説明する。第４の実施の形態では第３の実施の形態と異なる点について説明する。第４の実施の形態において特に言及されない点については、第３の実施の形態と同様でもよい。 Next, the fourth embodiment will be described. The description of the fourth embodiment covers the points that differ from the third embodiment. Points not specifically mentioned in the fourth embodiment may be the same as in the third embodiment.

第４の実施の形態において、検知部１６は、各データ種別の学習器ごとに異常有無の判定を行った後に、全ての学習器について「異常有り」と判定した場合にのみ、最終的な判定結果を「異常有り」とする。例えば、「フローデータ」、「ＭＩＢデータ」、「ＣＰＵ使用率」の３種の全ての学習器について「異常有り」と判定された場合にのみ、最終的な判定結果が「異常有り」となり、それ以外では最終的な判定結果が「異常無し」となる。 In the fourth embodiment, the detection unit 16 determines the presence or absence of an abnormality for each learning device of each data type, and then makes a final determination only when it is determined that all the learning devices have an abnormality. The result is "abnormal". For example, the final judgment result is "abnormal" only when it is judged as "abnormal" for all three types of learners of "flow data", "MIB data", and "CPU usage rate". Otherwise, the final judgment result will be "no abnormality".

上述したように、第４の実施の形態によれば、第１の実施の形態と同様の効果を得ることができる。 As described above, according to the fourth embodiment, the same effect as that of the first embodiment can be obtained.

次に、第５の実施の形態について説明する。第５の実施の形態では第３の実施の形態と異なる点について説明する。第５の実施の形態において特に言及されない点については、第３の実施の形態と同様でもよい。 Next, the fifth embodiment will be described. The description of the fifth embodiment covers the points that differ from the third embodiment. Points not specifically mentioned in the fifth embodiment may be the same as in the third embodiment.

第５の実施の形態において、検知部１６は、各データ種別の学習器ごとに異常有無の判定を行った後に、「異常有り」と判定した学習器の数と「異常無し」と判定した学習器の数との多数決によって、最終的な異常有無の判定を行う。例えば、「フローデータ」、「ＭＩＢデータ」、「ＣＰＵ使用率」の３種の学習器のうち、２つ以上について「異常有り」と判定された場合には、最終的な判定結果が「異常有り」となり、それ以外では最終的な判定結果が「異常無し」となる。学習器の数が偶数の場合、「異常有り」の数と「異常無し」の数が同数であった場合の取扱いは、「異常有り」とするか「異常無し」とするか、又はランダムに決定するか等、予め定められる。 In the fifth embodiment, the detection unit 16 determines the presence or absence of an abnormality for the learner of each data type, and then makes the final determination by a majority vote between the number of learners judged "abnormal" and the number of learners judged "no abnormality". For example, if two or more of the three learners for "flow data", "MIB data", and "CPU usage rate" are judged "abnormal", the final determination result is "abnormal"; otherwise, the final determination result is "no abnormality". When the number of learners is even, the handling of a tie between the number of "abnormal" and "no abnormality" judgments (treat it as "abnormal", treat it as "no abnormality", decide at random, and so on) is determined in advance.
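The majority vote, including a predetermined tie policy, could look like the sketch below. The function name and the `tie_is_abnormal` parameter are assumptions introduced for illustration; the random tie-breaking option mentioned in the text is not implemented here.

```python
def majority_vote(flags, tie_is_abnormal=True):
    """Combine per-learner judgments by majority.

    flags: iterable of booleans, True meaning "abnormal" for that learner.
    tie_is_abnormal: fixed policy applied when both counts are equal,
    which can only happen with an even number of learners.
    """
    flags = list(flags)
    abnormal = sum(1 for flag in flags if flag)
    normal = len(flags) - abnormal
    if abnormal == normal:
        return tie_is_abnormal
    return abnormal > normal


two_of_three = majority_vote([True, True, False])
one_of_three = majority_vote([True, False, False])
tie_default = majority_vote([True, False])
tie_as_normal = majority_vote([True, False], tie_is_abnormal=False)
```

For comparison, the rules of the third and fourth embodiments correspond to applying `any(flags)` and `all(flags)` to the same per-learner judgments.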

上述したように、第５の実施の形態によれば、第１の実施の形態と同様の効果を得ることができる。 As described above, according to the fifth embodiment, the same effect as that of the first embodiment can be obtained.

次に、第６の実施の形態について説明する。第６の実施の形態では上記各実施の形態と異なる点について説明する。第６の実施の形態において特に言及されない点については、上記各実施の形態と同様でもよい。 Next, the sixth embodiment will be described. In the sixth embodiment, the points different from each of the above embodiments will be described. The points not particularly mentioned in the sixth embodiment may be the same as those in the above-described embodiments.

第６の実施の形態では、データ種別ごとではなく、データ種別ごとの数値ベクトルのメトリック間の相関関係に基づくクラスタごとに、学習器が生成される例について説明する。すなわち、上記各実施の形態では、同一のデータ種別に属する各データ要素（メトリック）は相関が高いであろうという推定に基づいて、データ種別が、データ要素間（メトリック間）の相関の高さに基づいて分類される単位として用いられた。一方、第６の実施の形態では、斯かる推定に基づくのではなく、実際に各データ要素間（各メトリック間）の相関の高さに基づいて、データ要素群が複数の集合（以下のクラスタ）に分類され、当該集合が、データ要素間（メトリック間）の相関の高さに基づいて分類される単位とされる。 The sixth embodiment describes an example in which a learner is generated not for each data type but for each cluster based on the correlation between the metrics of the numerical vectors of the data types. That is, in each of the above embodiments, the data type was used as the unit of classification based on the strength of correlation between data elements (metrics), on the presumption that data elements (metrics) belonging to the same data type are highly correlated. In the sixth embodiment, by contrast, rather than relying on that presumption, the data elements are classified into a plurality of sets (hereinafter, clusters) based on the actual strength of correlation between the data elements (metrics), and each set serves as the unit of classification based on the strength of correlation between data elements (metrics).

まず、第６の実施の形態では、図４のステップＳ１０３及び図５のステップＳ２０３において、単位時間ごとに、データ種別ごとではなく、１つの数値ベクトルｘが生成される。例えば、フローデータの数値ベクトルの各メトリック、ＭＩＢデータの数値ベクトルの各メトリック、及びＣＰＵ使用率の各メトリックを要素として含む一つの数値ベクトルｘが生成される。単位時間ｔにおける数値ベクトルｘを、ｘ＿｛ｉ，ｔ｝（ｉ＝１，...，Ｎ，ｔ＝１，...，Ｕ）と表記する。 First, in the sixth embodiment, in step S103 of FIG. 4 and step S203 of FIG. 5, one numerical vector x is generated for each unit time, not for each data type. For example, one numerical vector x including each metric of the numerical vector of the flow data, each metric of the numerical vector of the MIB data, and each metric of the CPU usage rate is generated. The numerical vector x in the unit time t is expressed as x_ {i, t} (i = 1, ..., N, t = 1, ..., U).

また、前処理部１３は、図４のステップＳ１０３に続いて、図７に示される処理手順を実行する。 Further, the preprocessing unit 13 executes the processing procedure shown in FIG. 7 following step S103 in FIG.

図７は、第６の実施の形態において前処理部１３が追加的に実行する処理手順を説明するためのフローチャートである。 FIG. 7 is a flowchart for explaining a processing procedure additionally executed by the preprocessing unit 13 in the sixth embodiment.

ステップＳ３０１において、前処理部１３は、数値ベクトルｘの各メトリックに独立なＩＤを付与する。 In step S301, the preprocessing unit 13 assigns an independent ID to each metric of the numerical vector x.

続いて、前処理部１３は、２つのメトリックの全ての組ごとに、ピアソン相関係数を算出する（Ｓ３０２）。すなわち、メトリックｉ，ｊ間の相関係数α＿｛ｉ，ｊ｝が、（ｘ＿｛ｉ，１｝，...，ｘ＿｛ｉ，Ｕ｝）と、（ｘ＿｛ｊ，１｝，...，ｘ＿｛ｊ，Ｕ｝）とのピアソン相関係数により算出される（ｉ＝１，...，Ｎ、ｊ＝１，...，Ｎ、ｉ＜ｊ）。 Subsequently, the preprocessing unit 13 calculates the Pearson correlation coefficient for every pair of metrics (S302). That is, the correlation coefficient α_{i,j} between metrics i and j is calculated as the Pearson correlation coefficient between (x_{i,1}, ..., x_{i,U}) and (x_{j,1}, ..., x_{j,U}) (i = 1, ..., N, j = 1, ..., N, i < j).
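Step S302 can be illustrated with a small pure-Python computation of the pairwise Pearson coefficients. The function names and the three sample series are hypothetical, and metric indices are 0-based here, whereas the text counts from 1.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(series):
    """series[i] holds the U observations of metric i.

    Returns {(i, j): alpha_ij} for every pair with i < j.
    """
    n = len(series)
    return {
        (i, j): pearson(series[i], series[j])
        for i in range(n)
        for j in range(i + 1, n)
    }


# Three hypothetical metrics observed over U = 4 unit times.
metrics = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
alpha = pairwise_correlations(metrics)
```

The first two metrics move together and the third moves against them, so the coefficients come out at +1 and -1 respectively.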

続いて、前処理部１３は、ピアソン相関係数α＿｛ｉ，ｊ｝に基づいて、多次元尺度構成法を用いて、予め定めたグループ数Ｋに各メトリックのＩＤをクラスタリングする（Ｓ３０３）。続いて、前処理部１３は、各ＩＤが、いずれのクラスタに分類されたのかを示す、ＩＤとクラスタとの対応情報を学習結果記憶部１２４に記憶する（Ｓ３０４）。 Subsequently, the preprocessing unit 13 clusters the IDs of each metric into a predetermined number of groups K using a multidimensional scaling method based on the Pearson correlation coefficient α_ {i, j} (S303). Subsequently, the preprocessing unit 13 stores the correspondence information between the ID and the cluster, which indicates which cluster each ID is classified into, in the learning result storage unit 124 (S304).

なお、図７の処理手順は、１回実行されればよい。すなわち、図４のステップＳ１０８に続いて実行されなくてよい。 The processing procedure of FIG. 7 may be executed once. That is, it does not have to be executed following step S108 of FIG.

その他においては、上記各実施の形態におけるデータ種別が、クラスタに置き換えられればよい。例えば、図４のステップＳ１０６において、学習部１４は、学習結果記憶部１２４に記憶されている対応情報に基づいて、クラスタごとに学習器を生成し、学習を行う。各学習器には、正規化された数値ベクトルのうち、当該学習器が対応するクラスタに分類されたＩＤに対応するメトリックが入力される。学習結果は、クラスタごとに学習結果記憶部１２４に記憶される。 In other cases, the data type in each of the above embodiments may be replaced with a cluster. For example, in step S106 of FIG. 4, the learning unit 14 generates a learning device for each cluster based on the correspondence information stored in the learning result storage unit 124, and performs learning. Among the normalized numerical vectors, the metric corresponding to the ID classified into the cluster to which the learner corresponds is input to each learner. The learning result is stored in the learning result storage unit 124 for each cluster.
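Splitting one normalized numerical vector into per-cluster inputs, using the stored ID-to-cluster correspondence, might be sketched as follows. The dictionary layout, the function name, and the sample values are assumptions for illustration; the specification does not prescribe a data structure.

```python
def split_by_cluster(vector, id_to_cluster):
    """Split a numerical vector into one sub-vector per cluster.

    vector: dict mapping metric ID -> normalized value.
    id_to_cluster: dict mapping metric ID -> cluster number (the
    correspondence information stored after clustering).
    Returns a dict mapping cluster number -> values ordered by metric ID.
    """
    parts = {}
    for metric_id in sorted(vector):
        cluster = id_to_cluster[metric_id]
        parts.setdefault(cluster, []).append(vector[metric_id])
    return parts


# Hypothetical correspondence information and one normalized vector.
correspondence = {1: 0, 2: 1, 3: 0, 4: 1}
x = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}
inputs_per_cluster = split_by_cluster(x, correspondence)
```

Each sub-vector would then be fed to the learner generated for its cluster, both in learning and in detection.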

また、図５のステップＳ２０５において、検知部１６は、学習結果記憶部１２４に記憶されているクラスタごとの学習結果に基づいて、クラスタごとに学習器を生成する。検知部１６は、各学習器に、正規化された数値ベクトルのうち、当該学習器が対応するクラスタに分類されたＩＤに対応するメトリックを入力する。なお、検知処理（図５）において、例えば、ステップＳ２０３に続いて、前処理部１３は、図７のステップＳ３０１と同様に、数値ベクトルの各メトリックに独立したＩＤを付与すればよい。 Further, in step S205 of FIG. 5, the detection unit 16 generates a learner for each cluster based on the per-cluster learning results stored in the learning result storage unit 124. To each learner, the detection unit 16 inputs those metrics of the normalized numerical vector whose IDs were classified into the cluster corresponding to that learner. In the detection process (FIG. 5), for example, following step S203, the preprocessing unit 13 may assign an independent ID to each metric of the numerical vector, as in step S301 of FIG. 7.

上述したように、第６の実施の形態によれば、より相関の高いメトリック群ごとに学習器を生成することができる。 As described above, according to the sixth embodiment, the learner can be generated for each metric group having a higher correlation.

なお、上記各実施の形態は、ネットワーク以外から収集されるデータに関して適用されてもよい。例えば、コンピュータシステムから収集されるデータに関して上記各実施の形態が適用されてもよい。 It should be noted that each of the above embodiments may be applied to data collected from other than the network. For example, each of the above embodiments may be applied to data collected from a computer system.

なお、上記各実施の形態において、前処理部１３は、分類部の一例である。 In each of the above embodiments, the preprocessing unit 13 is an example of a classification unit.

以上、本発明の実施例について詳述したが、本発明は斯かる特定の実施形態に限定されるものではなく、特許請求の範囲に記載された本発明の要旨の範囲内において、種々の変形・変更が可能である。 Although the examples of the present invention have been described in detail above, the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

１０異常検知装置 10 Anomaly detection device
１１受信部 11 Receiving unit
１２学習処理制御部 12 Learning processing control unit
１３前処理部 13 Preprocessing unit
１４学習部 14 Learning unit
１５検知処理制御部 15 Detection processing control unit
１６検知部 16 Detection unit
２０測定装置 20 Measuring device
１００ドライブ装置 100 Drive device
１０１記録媒体 101 Recording medium
１０２補助記憶装置 102 Auxiliary storage device
１０３メモリ装置 103 Memory device
１０４ＣＰＵ 104 CPU
１０５インタフェース装置 105 Interface device
１２１教師データ記憶部 121 Teacher data storage unit
１２２パラメータ記憶部 122 Parameter storage unit
１２３観測データ記憶部 123 Observation data storage unit
１２４学習結果記憶部 124 Learning result storage unit
１２５学習データ記憶部 125 Learning data storage unit
Ｂバス B Bus
Ｎ１ネットワーク N1 Network

Claims

An anomaly detection device comprising:
a learning unit that, for each unit into which data elements are classified based on the strength of correlation between the data elements, learns the correlation between data elements of a plurality of types of data obtained from an anomaly detection target while the detection target is normal, using a plurality of learners generated one per unit, and outputs learning results; and
a detection unit that, for a group of data elements of the plurality of types of data obtained from the detection target at a plurality of timings, calculates for each unit, based on the learning result for that unit, a degree of abnormality indicating the extent to which the correlation of the data elements classified into that unit has broken down, and detects an abnormality of the detection target based on the degree of abnormality of each unit,
wherein the data elements constituting each of the units differ from one another.

The anomaly detection device according to claim 1, wherein the unit is a unit for each of the types.

The anomaly detection device according to claim 1, further comprising a classification unit that classifies the group of data elements of the plurality of types of data, obtained from the detection target while the detection target is normal, into a plurality of sets based on the strength of correlation,
wherein the unit is a unit for each of the sets.

The anomaly detection device according to any one of claims 1 to 3, wherein the detection unit detects the abnormality of the detection target based on:
the average of the degrees of abnormality of the units;
or a weighted average of the degrees of abnormality of the units, weighted by the degree of abnormality calculated, based on the learning result of each unit, for the group of data elements obtained while the detection target is normal;
or whether the degree of abnormality of at least one of the units exceeds a threshold value;
or whether the degrees of abnormality of all the units exceed a threshold value;
or the number of units whose degree of abnormality exceeds a threshold value.

An anomaly detection method in which a computer executes:
a learning procedure of, for each unit into which data elements are classified based on the strength of correlation between the data elements, learning the correlation between data elements of a plurality of types of data obtained from an anomaly detection target while the detection target is normal, using a plurality of learners generated one per unit, and outputting learning results; and
a detection procedure of, for a group of data elements of the plurality of types of data obtained from the detection target at a plurality of timings, calculating for each unit, based on the learning result for that unit, a degree of abnormality indicating the extent to which the correlation of the data elements classified into that unit has broken down, and detecting an abnormality of the detection target based on the degree of abnormality of each unit,
wherein the data elements constituting each of the units differ from one another.

The anomaly detection method according to claim 5, wherein the unit is a unit for each of the types.

The anomaly detection method according to claim 5, further comprising a classification procedure of classifying the group of data elements of the plurality of types of data, obtained from the detection target while the detection target is normal, into a plurality of sets based on the strength of correlation,
wherein the unit is a unit for each of the sets.

A program for causing a computer to function as each unit according to any one of claims 1 to 4.