JP2020113216A

JP2020113216A - Analyzer and method for analysis

Info

Publication number: JP2020113216A
Application number: JP2019005589A
Authority: JP
Inventors: 千絵増田; Chie Masuda; 和三村; Kazu Mimura; 幸三池上; Kozo Ikegami
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2019-01-16
Filing date: 2019-01-16
Publication date: 2020-07-27
Anticipated expiration: 2039-01-16
Also published as: JP7033560B2; WO2020148934A1

Abstract

To reduce the analysis load on an analyst caused by an increase in the number of alerts to be analyzed. SOLUTION: The analyzer can access behavior history information, behavior classification information, and behavior sequence information. For each alert in an alert group generated in a monitoring target, it acquires the classification result of the behavior of the communication partner of the monitoring target before the alert was generated. It determines whether a specific communication partner of the monitoring target before generation of a specific alert in the alert group exists in the behavior classification information. If the specific communication partner does not exist in the behavior classification information, it generates a specific behavior sequence, which is the time-series behavior of the specific communication partner, on the basis of the classification result of the behavior of the specific communication partner and the past behavior history of the specific communication partner stored in the behavior history information, and uses the behavior sequence information to determine whether the specific behavior sequence is a part of another behavior sequence. If the specific behavior sequence is a part of another behavior sequence, it selects the classification result of the behavior of the specific communication partner. SELECTED DRAWING: Figure 1

Description

The present invention relates to an analysis device and an analysis method for analyzing data.

In cyberspace, attackers hold a structural advantage, and attacks grow more sophisticated, more numerous, and more varied by the day. Targets are expanding from traditional financial businesses and IT (Information Technology) service providers to infrastructure operators. The cost of countermeasures keeps rising, but investment has not kept pace. Security experts are also in short supply, and securing personnel for the future is a challenge. Because not enough security experts can be secured, there is concern that the operation of the SOC (Security Operation Center) will be hindered. Social infrastructure operators in particular monitor their entire systems, so a significant improvement in SOC operational performance over current levels is required.

The most labor-intensive task in SOC operations is judging the importance of security alerts raised by an FW (Firewall)/IPS (Intrusion Prevention System), that is, manually deciding whether an alert is an incident or a false positive. Conventionally, when an alert occurred, an SOC expert consulted the logs of each device in the system and external threat information (for example, risk ratings of URLs (Uniform Resource Locators) and malware) and judged the importance of the alert based on experience and intuition. To keep SOC operation sustainable in the face of ever-increasing cyber attacks and ever-larger monitored systems, automating this alert importance judgment is essential. However, defining all of the judgment logic as rules is difficult in terms of cost. A technique is therefore needed that uses machine learning to derive the alert importance judgment logic from data accumulated from experts' importance judgments.

Patent Document 1 discloses a teacher data collection device that collects data on a specific field for use as teacher data in machine learning. The device includes: a feature calculation unit that calculates a first feature vector, which is a feature vector of pre-registered data on the specific field; a generation unit that generates, from the first feature vector, a search condition used to collect data on the specific field; a collection unit that collects data on the specific field based on the generated search condition; a similarity calculation unit that, once the feature calculation unit has calculated a second feature vector, which is a feature vector of the collected data, calculates the similarity between the second feature vector and the first feature vector; and an extraction unit that extracts, as the teacher data, the collected data whose similarity is within a predetermined range.

JP 2018-124617 A

Cyber attacks change their methods over time, and continuous additional learning is essential to keep up. For alerts that could not be judged mechanically and whose judgment confidence is low, a security analyst, who is an expert, must therefore enter a new judgment (called a label). However, as the number of such entries grows, so does the burden on the expert.

The disclosed technique aims to reduce the analysis burden placed on security analysts by the growing number of alerts to be analyzed.

An analysis device and an analysis method according to one aspect of the disclosed technique involve an analysis device having a processor that executes a program and a storage device that stores the program. The processor can access behavior history information, behavior classification information, and behavior sequence information. The behavior history information stores the history of past behaviors that communication partners of a monitoring target performed against the monitoring target. The behavior classification information associates each communication partner with the classified group of behaviors of that communication partner. The behavior sequence information associates a behavior sequence, which is the time-series behavior of a communication partner, with inclusion information indicating whether that behavior sequence is a part of another behavior sequence longer than itself. The processor executes: an acquisition process of acquiring, for each alert in a group of alerts generated in the monitoring target, the classification result of the behavior of the communication partner of the monitoring target within a first predetermined period before the alert occurred; a first determination process of determining whether a specific communication partner of the monitoring target within the first predetermined period before the occurrence of a specific alert in the alert group exists in the behavior classification information; a second determination process of, when the first determination process determines that the specific communication partner does not exist in the behavior classification information, generating a specific behavior sequence, which is the time-series behavior of the specific communication partner, based on the classification result of the behavior of the specific communication partner and the past behavior history of the specific communication partner stored in the behavior history information, and determining, using the behavior sequence information, whether the specific behavior sequence is a part of the other behavior sequence; a selection process of selecting the classification result of the behavior of the specific communication partner when the second determination process determines that the specific behavior sequence is a part of the other behavior sequence; and an output process of outputting the classification result selected by the selection process.

An analysis device and an analysis method according to another aspect of the disclosed technique involve an analysis device having a processor that executes a program and a storage device that stores the program. The processor can access behavior history information, behavior classification information, and behavior sequence information. The behavior history information stores the history of past behaviors that communication partners of a monitoring target performed against the monitoring target. The behavior classification information associates each communication partner with the classified group of behaviors of that communication partner. The behavior sequence information associates a behavior sequence, which is the time-series behavior of a communication partner, with the appearance frequency of that behavior sequence. The processor executes: an acquisition process of acquiring, for each alert in a group of alerts generated in the monitoring target, the classification result of the behavior of the communication partner of the monitoring target within a first predetermined period before the alert occurred; a first determination process of determining whether a specific communication partner of the monitoring target within the first predetermined period before the occurrence of a specific alert in the alert group exists in the behavior classification information; a second determination process of, when the first determination process determines that the specific communication partner does not exist in the behavior classification information, generating a specific behavior sequence, which is the time-series behavior of the specific communication partner, based on the classification result of the behavior of the specific communication partner and the past behavior history of the specific communication partner stored in the behavior history information, and determining, using the behavior sequence information, whether the appearance frequency of the specific behavior sequence is equal to or less than a predetermined frequency; a selection process of selecting the classification result of the behavior of the specific communication partner when the second determination process determines that the appearance frequency of the specific behavior sequence is equal to or less than the predetermined frequency; and an output process of outputting the classification result selected by the selection process.

According to a representative embodiment of the present invention, the narrowing down of alerts that are highly relevant to changes in attack methods can be automated. Problems, configurations, and effects other than those described above will become clear from the following description of the embodiments.

FIG. 1 is an explanatory diagram showing an example of cyber attack analysis.
FIG. 2 is a block diagram showing a system configuration example of the monitoring system.
FIG. 3 is a block diagram showing a hardware configuration example of the various computers shown in FIG. 2.
FIG. 4 is an explanatory diagram showing an example of the feature amount table.
FIG. 5 is an explanatory diagram showing an example of the label table.
FIG. 6 is an explanatory diagram showing an example of the behavior classification table.
FIG. 7 is an explanatory diagram showing an example of the behavior sequence table.
FIG. 8 is an explanatory diagram showing an example of the ratio table.
FIG. 9 is a block diagram showing a functional configuration example of the analysis device.
FIG. 10 is a sequence diagram showing the operation of the analysis device.
FIG. 11 is an explanatory diagram showing an output screen display example of the analyst terminal.
FIG. 12 is a flowchart showing a detailed processing procedure example of the behavior classification prediction (step S1005) shown in FIG. 10.
FIG. 13 is a flowchart showing a detailed processing procedure example of the communication-partner-specific abnormal behavior determination (step S1007) shown in FIG. 10.
FIG. 14 is a flowchart showing a detailed processing procedure example of the behavior-specific abnormal behavior determination (step S1009) shown in FIG. 10.
FIG. 15 is a flowchart showing a detailed processing procedure example of the feedback processing (step S1012) shown in FIG. 11.
FIG. 16 is a flowchart showing a detailed processing procedure example of the additional learning processing (step S1013) shown in FIG. 10.

<Example of cyber attack analysis>
FIG. 1 is an explanatory diagram showing an example of cyber attack analysis. FIG. 1 illustrates a case in which the monitored system 110 comes under a cyber attack from a communication terminal 120 (sometimes referred to simply as the attacker). When the monitored system 110 is attacked by the communication terminal 120, it sends an alert to the SOC 100. The alert includes, for example, an alert identifier that uniquely identifies the alert, the date and time the alert occurred, and the communication partner (the communication terminal 120 in this example).

The SOC 100 includes an alert management device 101, a log collection device 102, and an analysis device 103. The alert management device 101 receives alerts from the monitored system 110 and forwards them to the log collection device 102. The log collection device 102 acquires logs indicating the behavior of the monitored system 110 (for example, the number of cache misses and the number of abnormal responses) from the monitored system 110 and sends them to the analysis device 103.

The analysis device 103 has a DB 130. It acquires from the log collection device 102 the logs for a predetermined period before an alert occurred, attaches to those logs the alert identifier of the alert that occurred within that period, and stores them in the DB 130. A log with an alert identifier attached is referred to as "alert feature data". Even when the same alert identifier is attached to different logs, each of those logs is a separate piece of alert feature data.

The analysis device 103 executes a machine learning process 131 to generate a classifier for classifying alert feature data. The classifier is, for example, a set of learning parameters. Two kinds of learning data are used in the machine learning process 131: unsupervised alert feature data and supervised alert feature data.

Unsupervised alert feature data is a log of the behavior of the monitored system 110 or its communication partner before a past alert occurred. Supervised alert feature data is the combination of such a log and a classification label that the security analyst 141 assigned to it. Clustering the unsupervised alert feature data in the machine learning process 131 divides it into a plurality of clusters. The security analyst 141 assigns each cluster a label indicating the category of cyber attack, which turns the unsupervised alert feature data into supervised alert feature data. The analysis device 103 then generates the classifier by executing the machine learning process 131 on the supervised alert feature data.
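The labeling flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the embodiment: the source names neither a clustering algorithm nor a classifier, so cluster assignments are taken as given and a nearest-centroid classifier stands in for the learned classifier. The function names, the two-dimensional feature vectors, and the labels "scan" and "exfiltration" are all hypothetical.

```python
from collections import defaultdict
from math import dist


def label_clusters(features, cluster_ids, cluster_labels):
    """Turn unsupervised data into supervised data: every member of a
    cluster receives the label the analyst assigned to that cluster."""
    return [(x, cluster_labels[c]) for x, c in zip(features, cluster_ids)]


def train_nearest_centroid(labeled):
    """Stand-in 'classifier': one mean vector (centroid) per label."""
    grouped = defaultdict(list)
    for x, label in labeled:
        grouped[label].append(x)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, x):
    """Return the label whose centroid is closest to feature vector x."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


# Hypothetical feature vectors and cluster assignments (clustering output).
features = [(1.0, 0.0), (1.2, 0.1), (9.0, 8.0), (8.5, 9.0)]
cluster_ids = [0, 0, 1, 1]
# Labels the analyst gives to each cluster (assumed names).
cluster_labels = {0: "scan", 1: "exfiltration"}

labeled = label_clusters(features, cluster_ids, cluster_labels)
model = train_nearest_centroid(labeled)
predicted = classify(model, (1.1, 0.0))
```

Labeling whole clusters rather than individual logs is what keeps the analyst's effort proportional to the number of clusters instead of the number of logs.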

When a new group of prediction target alert feature data, covering a predetermined period before the occurrence of an alert whose cyber attack is to be predicted, is obtained from the log collection device 102, the analysis device 103 uses the already generated classifier in an alert classification process 132 to classify that group and stores the classification results in the DB 130.

In a sampling process 133, the analysis device 103 samples, from the labels output by the classifier as classification results, those labels that are strongly related to a change in attack methods, according to viewpoints (1) and (2) below.

(1) In a communication-partner-specific abnormal behavior determination process 134, the analysis device 103 determines whether the assigned label, or the time-series sequence of labels, is appearing for the first time or only infrequently for the communication partner that caused the prediction target alert. If so, the analysis device 103 samples, as a selected alert, the prediction target alert corresponding to the prediction target alert feature data to which that label was assigned.

(2) In a behavior-specific abnormal behavior determination process 135, the analysis device 103 determines whether, across the population of communication partners, the assigned label or the time-series behavior sequence of labels has previously been part of a long-term attack. If so, the analysis device 103 samples, as a selected alert, the prediction target alert corresponding to the prediction target alert feature data to which that label was assigned.

Viewpoints (1) and (2) let the analysis device 103 automatically narrow the prediction target alerts down to the selected alerts, which are highly relevant to changes in attack methods. This reduces the analysis burden on the security analyst 141.
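Viewpoints (1) and (2) can be sketched as two small predicates combined into a sampling step. This is an illustrative sketch only. The threshold `max_count`, the reading of "part of" as a contiguous run inside a strictly longer known sequence, the dictionary layout of an alert, and the label names are assumptions that do not come from the source.

```python
from collections import Counter


def is_rare_for_partner(history, label, max_count=1):
    """Viewpoint (1): the label is new or low-frequency for this partner.
    `max_count` is an assumed threshold for 'low frequency'."""
    return Counter(history)[label] <= max_count


def is_part_of_longer(sequence, known_sequences):
    """Viewpoint (2): the sequence occurs as a contiguous run inside a
    strictly longer known sequence (contiguity is an assumption)."""
    seq = tuple(sequence)
    n = len(seq)
    if n == 0:
        return False
    for known in map(tuple, known_sequences):
        if len(known) > n and any(
            known[i:i + n] == seq for i in range(len(known) - n + 1)
        ):
            return True
    return False


def sample_alerts(alerts, partner_history, known_sequences):
    """Return the ids of alerts that satisfy viewpoint (1) or (2)."""
    selected = []
    for alert in alerts:
        history = partner_history.get(alert["partner"], [])
        if is_rare_for_partner(history, alert["label"]) or is_part_of_longer(
            alert["sequence"], known_sequences
        ):
            selected.append(alert["id"])
    return selected


# Hypothetical data: past labels per partner and one known long-term attack.
partner_history = {
    "198.51.100.9": ["login", "login", "login"],
    "203.0.113.7": ["scan", "exploit", "scan", "exploit"],
}
known_sequences = [("scan", "exploit", "exfiltrate")]
alerts = [
    {"id": "A1", "partner": "198.51.100.9", "label": "login",
     "sequence": ("login", "login")},
    {"id": "A2", "partner": "203.0.113.7", "label": "exploit",
     "sequence": ("scan", "exploit")},
    {"id": "A3", "partner": "192.0.2.5", "label": "probe",
     "sequence": ("probe",)},
]
selected_ids = sample_alerts(alerts, partner_history, known_sequences)
```

In this example A1 is routine for its partner and is dropped, A2 is kept because its sequence is a prefix of a known long-term attack, and A3 is kept because the partner has no history for that label.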

The analyst terminal 140 receives the selected alerts sampled in the sampling process 133, and the security analyst 141 analyzes the selected alerts displayed on the analyst terminal 140. For example, the security analyst 141 changes the label of a selected alert. The analyst terminal 140 then sends the feedback information created by the security analyst 141 to the analysis device 103. The feedback information includes, for example, the changed label of the selected alert. Based on the feedback information, the analysis device 103 updates the label of the selected alert in the DB 130. The security analyst 141 may instead analyze selected alerts displayed on the analysis device 103 itself, or operate the analysis device 103 to create the feedback information.

When retraining the classifier, the analysis device 103 also uses an adjustment process 137 to suppress bias in the distribution of the additional learning data given to the classifier. Additional learning data is, for example, the combination of prediction target alert feature data that has gone through the sampling process 133 and a label, newly assigned to that data by the security analyst 141, indicating its classification. Specifically, for example, the adjustment process 137 adjusts the number of selected alerts per label from viewpoint (3) below so that the labels of the additional learning data do not concentrate on a specific label.

(3) The analysis device 103 supplements labels that have few selected alerts with unselected alerts, that is, alerts not sampled in the sampling process 133, so that every label has the same number of selected alerts.

Viewpoint (3) suppresses bias in the distribution of the additional learning data. As a result, when retraining on the additional learning data, the analysis device 103 can minimize the number of times it must consult the security analyst 141 about adjusting that bias. Suppressing the bias also suppresses overfitting caused by concentration on a specific label. The device can therefore keep up with cyber attacks whose methods change over time.
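The replenishment in viewpoint (3) can be sketched as below. It is a minimal illustration under assumptions: the target count is taken to be the largest per-label count among the selected alerts, unselected alerts are drawn in list order, and the function name and the `(alert_id, label)` pair layout are hypothetical.

```python
from collections import defaultdict


def balance_by_label(selected, unselected):
    """Top up under-represented labels with unselected alerts.

    selected, unselected: lists of (alert_id, label) pairs.
    Returns {label: [alert_id, ...]}.
    """
    by_label = defaultdict(list)
    for alert_id, label in selected:
        by_label[label].append(alert_id)

    pool = defaultdict(list)
    for alert_id, label in unselected:
        pool[label].append(alert_id)

    # Assumed target: the largest per-label count among selected alerts.
    target = max((len(ids) for ids in by_label.values()), default=0)

    for label in list(pool):
        need = target - len(by_label[label])
        if need > 0:
            # A label stays short if the pool cannot cover the gap.
            by_label[label].extend(pool[label][:need])
    return dict(by_label)


selected = [("A1", "scan"), ("A2", "scan"), ("A3", "scan"), ("A4", "exploit")]
unselected = [("A5", "exploit"), ("A6", "exploit"), ("A7", "exploit"),
              ("A8", "scan")]
balanced = balance_by_label(selected, unselected)
```

Here "exploit" starts with one selected alert and is topped up with two unselected ones, so both labels end with three alerts each.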

<System configuration example>
FIG. 2 is a block diagram showing a system configuration example of the monitoring system. The monitoring system 200 includes the monitored system 110 and the SOC 100, which are communicably connected.

The monitored system 110 is a system monitored by the SOC 100. The monitored system 110 has, for example, a first network 210, one or more client terminals 211, a business server 212, a network monitoring device 213, a first FW/IPS 214, and a proxy server 216.

The first network 210 is, for example, a bus, and communicably connects the one or more client terminals 211, the business server 212, the network monitoring device 213, the first FW/IPS 214, the proxy server 216, the second FW/IPS 223, and the SOC 100. The first FW/IPS 214 is communicably connected to an external network 202. The external network 202 is, for example, a LAN (Local Area Network), a WAN (Wide Area Network), or the Internet.

The monitored system 110 also has, for example, a second network 220, a control device 221, a controller 222, and a second FW/IPS 223. The second network 220 is, for example, a bus, and communicably connects the control device 221, the controller 222, and the second FW/IPS 223.

The SOC 100 includes, for example, the alert management device 101, the log collection device 102, the analysis device 103, the analyst terminal 140, and a third network 203. The third network 203 is, for example, a bus, and communicably connects the alert management device 101, the log collection device 102, the analysis device 103, the analyst terminal 140, and an external threat information database 201.

As examples of events, the alert management device 101 acquires and stores alerts from the monitored system 110, such as virus detection, abnormal behavior detection, and detection of a connection to an unregistered device. An alert is information that includes, for example, the date and time the alert occurred, the alert target (the computer in the monitored system 110 from which the alert originated), and the communication partner of the alert target (for example, an IP address identifying the communication terminal 120). The alert management device 101 notifies the analysis device 103 of each acquired alert.

The log collection device 102 acquires and stores logs from the monitored system 110. A log is history information indicating when which computer 300 in the monitored system 110 sent or received what data to or from which communication partner.

The analysis device 103 analyzes alerts using the alerts managed by the alert management device 101, the logs managed by the log collection device 102, and the threat information registered in the external threat information database 201. The external threat information database 201 is, for example, a database that publishes threat information on the Internet. Threat information covers, for example, malware, program vulnerabilities, spam, and malicious URLs. The analyst terminal 140 is a terminal operated by the security analyst 141.

<Computer hardware configuration example>
FIG. 3 is a block diagram showing a hardware configuration example of the various computers shown in FIG. 2 (the client terminal 211, business server 212, network monitoring device 213, first FW/IPS 214, proxy server 216, control device 221, controller 222, second FW/IPS 223, alert management device 101, log collection device 102, analysis device 103, and analyst terminal 140).

The computer 300 includes a processor 301, a storage device 302, an input device 303, an output device 304, and a communication interface (communication IF) 305. The processor 301, storage device 302, input device 303, output device 304, and communication IF 305 are connected by a bus 306. The processor 301 controls the computer 300.

The storage device 302 serves as a work area for the processor 301. The storage device 302 is also a non-transitory or transitory recording medium that stores various programs and data. Examples of the storage device 302 include ROM (Read Only Memory), RAM (Random Access Memory), an HDD (Hard Disk Drive), and flash memory. The input device 303 inputs data. Examples of the input device 303 include a keyboard, a mouse, a touch panel, a numeric keypad, and a scanner. The output device 304 outputs data. Examples of the output device 304 include a display and a printer. The communication IF 305 connects to a network and sends and receives data.

＜特徴量テーブル４００＞
図４は、特徴量テーブル４００の一例を示す説明図である。特徴量テーブル４００は、アラートの特徴量を示すアラート特徴量を記憶するテーブルであり、分析装置１０３によってログ収集装置１０２から収集され、たとえば、分析装置１０３の記憶デバイス３０２に記憶される。特徴量テーブル４００は、ログ収集装置１０２に記憶されていてもよい。 <Feature amount table 400>
FIG. 4 is an explanatory diagram showing an example of the feature amount table 400. The feature amount table 400 stores alert feature amounts, that is, the feature amounts of alerts. The analysis device 103 collects it from the log collection device 102 and stores it in, for example, the storage device 302 of the analysis device 103. The feature amount table 400 may be stored in the log collection device 102.

特徴量テーブル４００は、アラート識別子４０１と、集計日時４０２と、プロキシサーバログ４０３と、業務サーバログ４０４と、外部脅威情報４０５と、をフィールドとして有する。ある行における各フィールドの値の組み合わせがアラート特徴データを構成するエントリである。 The feature amount table 400 has an alert identifier 401, an aggregation date/time 402, a proxy server log 403, a business server log 404, and external threat information 405 as fields. The combination of the values of the fields in a row is an entry constituting alert feature data.

プロキシサーバログ４０３、業務サーバログ４０４、および外部脅威情報４０５以外にも監視対象システム１１０内の他のコンピュータ（クライアント端末２１１やＦＷ／ＩＰＳ２１４，２２３、ネットワーク監視装置２１３など）についてのログがあってもよいが、図４では省略する。 In addition to the proxy server log 403, the business server log 404, and the external threat information 405, the table may hold logs for other computers in the monitored system 110 (such as the client terminal 211, the FW/IPS 214 and 223, and the network monitoring device 213), but these are omitted from FIG. 4.

アラート識別子４０１は、アラートを一意に特定する識別情報である。なお、アラート管理装置１０１からのアラートは、アラート識別子４０１、発生日時、アラート対象、および通信相手を含む。発生日時は、そのアラートが発生した日付時刻である。アラート対象は、アラートの発生元、ここでは、監視対象システム１１０内のコンピュータである。通信相手は、アラート対象が送信したデータの宛先またはアラート対象にデータを送信した送信元である。なお、分析装置１０３は、そのアラートの発生日時前から所定時間遡った日時から発生日時までの時間帯に集計日時４０２が含まれるエントリに、そのアラートのアラート識別子４０１を付与する。 The alert identifier 401 is identification information that uniquely identifies an alert. An alert from the alert management device 101 includes the alert identifier 401, the occurrence date/time, the alert target, and the communication partner. The occurrence date/time is the date and time at which the alert occurred. The alert target is the source of the alert, here a computer in the monitored system 110. The communication partner is the destination of data transmitted by the alert target, or the source of data transmitted to the alert target. The analysis device 103 assigns the alert identifier 401 of an alert to every entry whose aggregation date/time 402 falls within the period from a predetermined time before the alert's occurrence date/time up to that occurrence date/time.

集計日時４０２は、アラート識別子４０１で特定されるアラートの発生日時から所定時間遡った日時から一定時間間隔で発生日時までログ収集装置１０２がログを集計した日付時刻である。本例では、所定時間を１時間とし、一定時間間隔を１０分とする。集計日時４０２は、一定時間間隔の終了日時を示す。たとえば、集計日時４０２が「１０／１０１２：５７」のエントリは、１０／１０の１２：４８から１２：５７までの１０分間で集計されたログで特定されるアラート特徴データを示す。 The aggregation date/time 402 is a date and time at which the log collection device 102 aggregated logs, taken at fixed time intervals from a predetermined time before the occurrence date/time of the alert identified by the alert identifier 401 up to that occurrence date/time. In this example, the predetermined time is 1 hour and the fixed time interval is 10 minutes. The aggregation date/time 402 indicates the end of a fixed time interval. For example, the entry whose aggregation date/time 402 is “10/10 12:57” represents the alert feature data determined from the logs aggregated over the 10 minutes from 12:48 to 12:57 on 10/10.

たとえば、アラート識別子４０１が「Ａｌｅｒｔ＿００５」であるアラートの発生日時を「１０／１０１３：５７」とすると、アラート識別子４０１が「Ａｌｅｒｔ＿００５」であるアラートの集計日時４０２は、発生日時である「１０／１０１３：５７」から１時間遡った「１０／１０１２：５７」と、「１０／１０１３：５７」から１０分刻みの「１０／１０１３：０７」、「１０／１０１３：１７」、「１０／１０１３：２７」、「１０／１０１３：３７」、「１０／１０１３：４７」、および「１０／１０１３：５７」となる。このようにして、アラート発生時のアラート特徴データの収集タイミングが設定される。 For example, if the occurrence date/time of the alert whose alert identifier 401 is "Alert_005" is "10/10 13:57", the aggregation date/times 402 for that alert are "10/10 12:57", which is one hour before the occurrence date/time "10/10 13:57", followed at 10-minute intervals by "10/10 13:07", "10/10 13:17", "10/10 13:27", "10/10 13:37", "10/10 13:47", and "10/10 13:57". In this way, the timing at which alert feature data is collected for an alert is set.
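
The aggregation timing in the Alert_005 example can be sketched in Python as follows. This is only an illustration: the function name `aggregation_times` and its parameters are hypothetical, and the year 2019 is an assumption, since the text gives only the month, day, and time.

```python
from datetime import datetime, timedelta

def aggregation_times(occurred_at, lookback=timedelta(hours=1),
                      step=timedelta(minutes=10)):
    """Return the aggregation date/times from (occurred_at - lookback)
    up to occurred_at inclusive, spaced by `step`."""
    times = []
    t = occurred_at - lookback
    while t <= occurred_at:
        times.append(t)
        t += step
    return times

# Alert_005 occurs at 10/10 13:57 (the year is assumed for illustration).
times = aggregation_times(datetime(2019, 10, 10, 13, 57))
print([t.strftime("%m/%d %H:%M") for t in times])
```

With a 1-hour lookback and a 10-minute step, the list holds seven date/times, from 10/10 12:57 to 10/10 13:57.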

プロキシサーバログ４０３は、サブフィールドとして、キャッシュミス回数４３１と異常応答回数４３２とを有する。キャッシュミス回数４３１は、集計日時４０２においてプロキシサーバ１１６がキャッシュミスした回数である。異常応答回数４３２は、集計日時４０２においてプロキシサーバ１１６が異常応答を受信した回数である。なお、プロキシサーバログ４０３のサブフィールドは、キャッシュミス回数４３１や異常応答回数４３２以外（たとえば、通信バイト数）であってもよいが、図４では省略する。 The proxy server log 403 has a cache miss count 431 and an abnormal response count 432 as subfields. The number of cache misses 431 is the number of times the proxy server 116 has made a cache miss at the total time 402. The abnormal response count 432 is the number of times the proxy server 116 has received an abnormal response at the total date/time 402. The subfield of the proxy server log 403 may be other than the cache miss count 431 and the abnormal response count 432 (for example, the communication byte count), but it is omitted in FIG.

業務サーバログ４０４は、サブフィールドとして、異常応答回数４４１とアクセス回数４４２とを有する。異常応答回数４４１は、集計日時４０２において業務サーバ１１２が異常応答を受信した回数である。アクセス回数４４２は、集計日時４０２で特定される一定時間間隔の集計期間において業務サーバ１１２が他のコンピュータ３００にアクセスされた回数である。なお、業務サーバログ４０４のサブフィールドは、異常応答回数４４１やアクセス回数４４２以外（たとえば、認証失敗回数）であってもよいが、図４では省略する。 The business server log 404 has an abnormal response count 441 and an access count 442 as subfields. The abnormal response count 441 is the number of times the business server 112 received an abnormal response at the aggregation date/time 402. The access count 442 is the number of times the business server 112 was accessed by another computer 300 during the fixed-interval aggregation period identified by the aggregation date/time 402. The business server log 404 may have subfields other than the abnormal response count 441 and the access count 442 (for example, an authentication failure count), but these are omitted from FIG. 4.

外部脅威情報４０５は、サブフィールドとして、ＩＰアドレス危険度４５１とＵＲＬ危険度４５２とを有する。ＩＰアドレス危険度４５１は、集計日時４０２におけるアラート対象の通信相手がＩＰアドレスで特定された場合に、外部脅威情報データベース２０１において当該ＩＰアドレスの危険度を段階的に示した指標値である。本例では、０〜５の６段階とし、５が最も危険度が高いことを示す。 The external threat information 405 has an IP address risk 451 and a URL risk 452 as subfields. The IP address risk level 451 is an index value that indicates the risk level of the IP address in the external threat information database 201 in a stepwise manner when the communication target of the alert at the aggregation date/time 402 is specified by the IP address. In this example, 6 levels from 0 to 5 are set, and 5 indicates the highest risk.

ＵＲＬ危険度４５２は、集計日時４０２におけるアラート対象の通信相手がＵＲＬで特定された場合に、外部脅威情報データベース２０１において当該ＵＲＬの危険度を段階的に示した指標値である。本例では、０〜５の６段階とし、５が最も危険度が高いことを示す。なお、外部脅威情報４０５のサブフィールドは、ＩＰアドレス危険度４５１やＵＲＬ危険度４５２以外であってもよいが、図４では省略する。 The URL risk level 452 is an index value that indicates the risk level of the URL in the external threat information database 201 in a stepwise manner when the communication target of the alert at the aggregation date/time 402 is specified by the URL. In this example, 6 levels from 0 to 5 are set, and 5 indicates the highest risk. The subfield of the external threat information 405 may be other than the IP address risk 451 or the URL risk 452, but is omitted in FIG.

＜ラベルテーブル５００＞
図５は、ラベルテーブル５００の一例を示す説明図である。ラベルテーブル５００は、分類のラベルを記憶するテーブルであり、たとえば、分析装置１０３の記憶デバイス３０２に記憶される。ラベルテーブル５００は、ログ収集装置１０２に記憶されていてもよい。 <Label table 500>
FIG. 5 is an explanatory diagram showing an example of the label table 500. The label table 500 is a table that stores classification labels, and is stored in, for example, the storage device 302 of the analyzer 103. The label table 500 may be stored in the log collection device 102.

ラベルテーブル５００は、アラート識別子４０１と、集計日時４０２と、通信相手分類情報５０３と、行動分類（ラベル）５０４と、確度５０５と、行動分類（旧ラベル）５０６と、サンプリング５０７と、追加学習済み５０８と、をフィールドとして有する。ある行における各フィールドの値の組み合わせが、ラベルデータを構成するエントリである。 The label table 500 has an alert identifier 401, an aggregation date/time 402, communication partner classification information 503, a behavior classification (label) 504, an accuracy 505, a behavior classification (old label) 506, sampling 507, and additional learning completed 508 as fields. The combination of the values of the fields in a row is an entry constituting label data.

通信相手分類情報５０３は、通信相手を分類する情報であり、アラートに含まれる監視対象システム１１０の通信相手である。行動分類（ラベル）５０４は、通信相手の行動、換言すれば、通信端末１２０からのサイバー攻撃を分類する最新のラベルであり、たとえば、マルウェア、プログラムの脆弱性、スパム、不正ＵＲＬなどの脅威情報である。本実施例では、便宜上、行動分類（ラベル）５０４の個々の値をアルファベット大文字で示す。 The communication partner classification information 503 is information for classifying communication partners, and identifies the communication partner of the monitored system 110 included in the alert. The behavior classification (label) 504 is the latest label classifying the behavior of the communication partner, in other words, a cyber attack from the communication terminal 120; it is threat information such as malware, a program vulnerability, spam, or a malicious URL. In this embodiment, for convenience, the individual values of the behavior classification (label) 504 are denoted by uppercase letters.

確度５０５は、行動分類（ラベル）５０４の値の確からしさを示す数値であり、値が大きいほど行動分類（ラベル）５０４が確からしいことを示す。確度５０５の値は、０．０以上１．０以下の範囲を取る。確度５０５の値は、機械学習処理１３１による学習結果である。行動分類（ラベル）５０４の値で特定されるクラスタの中心に近いほど確度５０５の値は大きくなる。 The accuracy 505 is a numerical value indicating the likelihood of the value of the action classification (label) 504, and the larger the value, the more likely the action classification (label) 504 is. The value of the accuracy 505 ranges from 0.0 to 1.0. The value of the accuracy 505 is a learning result by the machine learning processing 131. The closer to the center of the cluster specified by the value of the action classification (label) 504, the larger the value of the accuracy 505.

行動分類（旧ラベル）５０６は、過去のラベルを示す。行動分類（ラベル）５０４にあらたなラベルが記録されると、記録される前のラベルは、旧ラベルとして行動分類（旧ラベル）５０６に格納される。 The action classification (old label) 506 indicates a past label. When a new label is recorded in the action classification (label) 504, the label before recording is stored in the action classification (old label) 506 as an old label.

サンプリング５０７は、そのアラートが分析対象としてサンプリングされたか否かを示すフラグであり、初期値は、サンプリングされていないことを示す「Ｎ」である。サンプリングされると、サンプリングされたことを示す「Ｙ」に更新される。 The sampling 507 is a flag indicating whether or not the alert is sampled as an analysis target, and the initial value is “N” indicating that the alert is not sampled. When sampled, it is updated to "Y" indicating that it has been sampled.

追加学習済み５０８は、調整処理１３７により追加学習データの偏りを抑制する場合に、そのアラートのアラート識別子および行動分類（ラベル）が、追加学習データとして選択されたか否かを示すフラグである。初期値は、追加学習データでないことを示す「Ｎ」であり、追加学習データとして追加されると「Ｙ」に更新される。追加学習データについては後述する。 The additional learning completed 508 is a flag indicating whether or not the alert identifier and the action classification (label) of the alert are selected as the additional learning data when the bias of the additional learning data is suppressed by the adjustment processing 137. The initial value is “N” indicating that it is not additional learning data, and is updated to “Y” when added as additional learning data. The additional learning data will be described later.

＜行動分類テーブル６００＞
図６は、行動分類テーブル６００の一例を示す説明図である。行動分類テーブル６００は、攻撃者の行動を記憶するテーブルであり、たとえば、分析装置１０３の記憶デバイス３０２に記憶される。行動分類テーブル６００は、ログ収集装置１０２に記憶されていてもよい。 <Behavior classification table 600>
FIG. 6 is an explanatory diagram showing an example of the action classification table 600. The action classification table 600 is a table that stores the action of the attacker, and is stored in, for example, the storage device 302 of the analysis device 103. The action classification table 600 may be stored in the log collection device 102.

行動分類テーブル６００は、通信相手分類情報５０３と、行動分類６０１と、をフィールドとして有する。ある行における各フィールドの値の組み合わせが行動分類データを構成するエントリである。 The behavior classification table 600 has communication partner classification information 503 and behavior classification 601 as fields. The combination of the values of the fields in a certain row is the entry that constitutes the action classification data.

行動分類６０１は、サブフィールドとして、ラベルを有する。したがって、通信相手分類情報５０３で特定される通信相手のラベルごとに、当該ラベルの出現頻度および正常／異常が特定される。ラベルの出現頻度は、その通信相手分類情報５０３で特定される攻撃者からの当該ラベルに該当する通信相手からのアクセス回数ｎで規定される。本実施例では、「Ｈ」（多い）、「Ｍ」（普通）、「Ｌ」（少ない）、空白の４段階で出現頻度が規定される。 The action classification 601 has labels as subfields. Therefore, for each label of the communication partner identified by the communication partner classification information 503, the appearance frequency of the label and whether the label is normal or abnormal are specified. The appearance frequency of a label is defined by the number of accesses n, corresponding to that label, from the communication partner (attacker) identified by the communication partner classification information 503. In this embodiment, the appearance frequency is defined in four levels: "H" (high), "M" (medium), "L" (low), and blank.

空白は、１度もその通信相手のアクセス回数ｎが０（ｎ＝０）であることを示す。「Ｌ」は、その通信相手によるそのラベルの行動でのアクセス回数ｎが１回以上ｎａ−１回（１≦ｎ≦ｎａ−１）であることを示す。「Ｍ」は、その通信相手によるそのラベルの行動でのアクセス回数ｎがｎａ回以上ｎｂ−１回（ｎａ≦ｎ≦ｎｂ−１）であることを示す。「Ｈ」は、その通信相手によるそのラベルの行動でのアクセス回数ｎがｎｂ回以上（ｎｂ≦ｎ）であることを示す。ｎａ、ｎｂは、あらかじめ設定される値である。 A blank indicates that the communication partner has never made an access with that label, that is, the access count n is 0 (n=0). “L” indicates that the number of accesses n by the communication partner with the behavior of that label is at least 1 and at most na−1 (1≦n≦na−1). “M” indicates that the number of accesses n by the communication partner with the behavior of that label is at least na and at most nb−1 (na≦n≦nb−1). “H” indicates that the number of accesses n by the communication partner with the behavior of that label is nb or more (nb≦n). na and nb are preset values.

なお、通信相手分類情報５０３から行動分類６０１の中のあるラベルに相当するサイバー攻撃などの挙動があったことが検出されると、当該ラベルに該当するアクセス回数ｎが１回加算される。加算後のアクセス回数ｎがｎａになるとラベルの出現頻度が「Ｌ」から「Ｍ」に更新され、加算後の回数ｎからｎｂになると「Ｍ」から「Ｈ」に更新される。また、ラベルが正常であるか異常であるかは、セキュリティアナリスト１４１が決定する。具体的には、たとえば、アナリスト端末１４０からのフィードバック情報に、ラベルが正常であるか異常であるかを示す情報が含まれており、図１に示したフィードバック処理１３６により、更新される。 When it is detected from the communication party classification information 503 that there is a behavior such as a cyber attack corresponding to a certain label in the behavior classification 601, the number of times of access n corresponding to the relevant label is added once. When the number of accesses n after the addition becomes na, the appearance frequency of the label is updated from "L" to "M", and when the number of times n after the addition becomes nb, it is updated from "M" to "H". Further, the security analyst 141 determines whether the label is normal or abnormal. Specifically, for example, the feedback information from the analyst terminal 140 includes information indicating whether the label is normal or abnormal, and is updated by the feedback processing 136 shown in FIG.
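
The four-level frequency and its update on each detected access can be sketched as follows. The names `frequency_level` and `record_access` and the sample thresholds are hypothetical. The sketch assumes the boundaries implied by the update rule above: "L" for 1 ≤ n < na, "M" for na ≤ n < nb, and "H" for n ≥ nb.

```python
def frequency_level(n, na, nb):
    """Map an access count n to the level: "" (blank), "L", "M", or "H"."""
    if not 1 <= na < nb:
        raise ValueError("thresholds must satisfy 1 <= na < nb")
    if n == 0:
        return ""          # blank: no access with this label so far
    if n < na:
        return "L"
    if n < nb:
        return "M"
    return "H"

def record_access(counts, partner, label, na, nb):
    """Increment the access count for (partner, label) and return the new level."""
    key = (partner, label)
    counts[key] = counts.get(key, 0) + 1
    return frequency_level(counts[key], na, nb)

# Illustrative thresholds only; the text says na and nb are preset values.
counts = {}
levels = [record_access(counts, "partner-1", "A", na=3, nb=6) for _ in range(6)]
print(levels)  # the level after each of six accesses
```

Keeping the raw count and deriving the level from it means the thresholds na and nb can be changed later without losing history.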

＜行動シーケンステーブル７００＞
図７は、行動シーケンステーブル７００の一例を示す説明図である。行動シーケンステーブル７００は、通信相手の異常行動を記憶するテーブルであり、たとえば、分析装置１０３の記憶デバイス３０２に記憶される。行動シーケンステーブル７００は、ログ収集装置１０２に記憶されていてもよい。 <Behavior sequence table 700>
FIG. 7 is an explanatory diagram showing an example of the action sequence table 700. The action sequence table 700 is a table that stores abnormal actions of communication partners, and is stored in the storage device 302 of the analysis device 103, for example. The action sequence table 700 may be stored in the log collection device 102.

行動シーケンステーブル７００は、行動シーケンス７０１と、頻度７０２と、長期攻撃の一部７０３と、をフィールドとして有する。ある行における各フィールドの値の組み合わせが行動シーケンスデータを構成するエントリである。なお、行動シーケンステーブル７００は、頻度７０２および長期攻撃の一部７０３のうち少なくとも一方を有していればよい。 The action sequence table 700 has an action sequence 701, a frequency 702, and a long-term attack part 703 as fields. The combination of the values of the fields in a certain row is the entry that constitutes the action sequence data. The action sequence table 700 may have at least one of the frequency 702 and the part 703 of the long-term attack.

行動シーケンス７０１は、１以上のラベルの時系列である。なお、同じラベルが２個以上連続する場合は、１個に縮約される。たとえば、ラベルの時系列が「Ａ→Ｂ→Ｂ→Ｂ→Ｃ」である場合、「Ｂ」が３回連続で出現しているため、行動シーケンス７０１は、「Ａ→Ｂ→Ｃ」となる。 The action sequence 701 is a time series of one or more labels. When the same label occurs two or more times in succession, the run is collapsed into a single label. For example, when the time series of labels is “A→B→B→B→C”, “B” appears three times in a row, so the action sequence 701 is “A→B→C”.
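
Collapsing runs of the same label can be sketched with the standard library's `itertools.groupby`, which groups consecutive equal items. The function name `to_action_sequence` is hypothetical.

```python
from itertools import groupby

def to_action_sequence(labels):
    """Collapse each run of identical consecutive labels into one label."""
    return [label for label, _run in groupby(labels)]

sequence = to_action_sequence(["A", "B", "B", "B", "C"])
print("→".join(sequence))  # A→B→C
```

Only consecutive repeats are merged, so a label that reappears later in the time series (for example A, B, A) is kept.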

頻度７０２は、行動分類６０１で示した場合と同様、「Ｈ」（多い）、「Ｍ」（普通）、「Ｌ」（少ない）、空白の４段階で規定される行動シーケンス７０１の出現頻度である。 The frequency 702 is the appearance frequency of the action sequence 701, defined, as with the action classification 601, in four levels: “H” (high), “M” (medium), “L” (low), and blank.

長期攻撃の一部７０３とは、その行動シーケンス７０１が長期攻撃の一部であるか否かを示す被包含情報である。長期攻撃の一部７０３の初期値は、長期攻撃の一部７０３ではないことを示す「Ｎ」であり、長期攻撃の一部７０３に含まれていれば、「Ｙ」に更新される。長期攻撃とは、過去に出現した行動シーケンスであり、行動シーケンス７０１よりもラベル数が１個以上多い。 The part of long-term attack 703 is containment information indicating whether the action sequence 701 is part of a long-term attack. Its initial value is “N”, indicating that the action sequence 701 is not part of a long-term attack, and it is updated to “Y” if the action sequence 701 is contained in a long-term attack. A long-term attack is an action sequence that appeared in the past and has at least one more label than the action sequence 701.

たとえば、過去の行動シーケンスとして「Ａ→Ｂ→Ｃ→Ｄ」が長期攻撃として規定されている場合、「Ａ」、「Ｂ」、「Ｃ」、「Ｄ」、「Ａ→Ｂ」、「Ｂ→Ｃ」、「Ｃ→Ｄ」、「Ａ→Ｂ→Ｃ」、「Ｂ→Ｃ→Ｄ」の各行動シーケンス７０１は、長期攻撃の一部７０３に該当する。したがって、これらの行動シーケンス７０１の長期攻撃の一部７０３は、「Ｙ」に更新される。また、長期攻撃の一部７０３のラベル数は、２個以上であってもよい。たとえば、長期攻撃の一部７０３のラベル数が３個以上である場合、上記の例では、「Ａ→Ｂ→Ｃ」、「Ｂ→Ｃ→Ｄ」の各行動シーケンス７０１が、長期攻撃の一部７０３に該当する。 For example, when the past action sequence “A→B→C→D” is defined as a long-term attack, each of the action sequences 701 “A”, “B”, “C”, “D”, “A→B”, “B→C”, “C→D”, “A→B→C”, and “B→C→D” is part of the long-term attack. Therefore, the part of long-term attack 703 of each of these action sequences 701 is updated to “Y”. The number of labels required for an action sequence to count as part of a long-term attack may also be set to two or more. For example, when three or more labels are required, only the action sequences 701 “A→B→C” and “B→C→D” in the above example count as part of the long-term attack.
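
The containment check in this example can be sketched as enumerating the contiguous sub-sequences of a long-term attack that are shorter than the attack itself. The function names and the `min_labels` parameter are hypothetical; treating "part of" as a contiguous run is an assumption drawn from the sequences listed in the example.

```python
def contiguous_parts(long_term, min_labels=1):
    """List the contiguous sub-sequences of `long_term` that have at least
    `min_labels` labels and at least one label fewer than `long_term`."""
    n = len(long_term)
    parts = []
    for length in range(min_labels, n):          # strictly shorter than n
        for start in range(n - length + 1):
            parts.append(tuple(long_term[start:start + length]))
    return parts

def is_part_of_long_term_attack(sequence, long_term, min_labels=1):
    """Return True if `sequence` is one of the qualifying parts of `long_term`."""
    return tuple(sequence) in contiguous_parts(long_term, min_labels)

attack = ["A", "B", "C", "D"]
print(len(contiguous_parts(attack)))           # 9 sequences, as in the example
print(contiguous_parts(attack, min_labels=3))  # the two 3-label parts
```

Under this reading, a sequence such as A→C, which skips B, would not be marked "Y".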

なお、どの行動シーケンス７０１が長期攻撃の一部７０３であるかは、セキュリティアナリスト１４１が決定する。具体的には、たとえば、アナリスト端末１４０からのフィードバック情報に、どの行動シーケンス７０１が長期攻撃の一部７０３であるかを示す情報が含まれる。この場合、図１に示したフィードバック処理１３６により、長期攻撃の一部７０３に該当する行動シーケンス７０１があれば、その長期攻撃の一部７０３が「Ｙ」に更新される。 The security analyst 141 determines which action sequence 701 is part of the long-term attack 703. Specifically, for example, the feedback information from the analyst terminal 140 includes information indicating which action sequence 701 is part of the long-term attack 703. In this case, if there is an action sequence 701 corresponding to a part 703 of the long-term attack by the feedback processing 136 shown in FIG. 1, the part 703 of the long-term attack is updated to “Y”.

＜比率テーブル８００＞
図８は、比率テーブル８００の一例を示す説明図である。比率テーブル８００は、ラベルごとのアラート数の比率を記憶するテーブルであり、たとえば、分析装置１０３の記憶デバイス３０２に記憶される。比率テーブル８００は、ログ収集装置１０２に記憶されていてもよい。 <Ratio table 800>
FIG. 8 is an explanatory diagram showing an example of the ratio table 800. The ratio table 800 is a table that stores the ratio of the number of alerts for each label, and is stored in the storage device 302 of the analyzer 103, for example. The ratio table 800 may be stored in the log collection device 102.

比率テーブル８００は、行動分類（ラベル）５０４と、アラート数の比率８０１と、をフィールドとして有する。アラート数の比率８０１は、追加学習データとして追加されるアラート数（すなわち、ラベル数）の比率を示す。追加学習データとして追加されるアラートは、サンプリング５０７が「Ｙ」であるラベルのアラートである。図８では、サンプリング５０７が「Ｙ」であるラベルは「Ａ」、「Ｂ」、「Ｃ」とする。各ラベルのアラート数の比率８０１が均等に近づくほど、機械学習処理１３１での追加学習における学習偏り、すなわち、分類精度低下を抑制することができる。 The ratio table 800 has an action classification (label) 504 and an alert number ratio 801 as fields. The ratio 801 of the number of alerts shows the ratio of the number of alerts (that is, the number of labels) added as additional learning data. The alert added as the additional learning data is an alert with a label of which sampling 507 is “Y”. In FIG. 8, labels whose sampling 507 is “Y” are “A”, “B”, and “C”. As the ratio 801 of the number of alerts of each label becomes closer to equal, the learning bias in the additional learning in the machine learning process 131, that is, the deterioration of the classification accuracy can be suppressed.
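
The ratio of the number of alerts per label can be sketched as below. The function name `alert_ratio` and the sample labels are illustrative only; how the adjustment step then evens out these ratios is not shown here.

```python
from collections import Counter

def alert_ratio(labels):
    """Return, for each label, its share of the sampled alerts."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Labels of sampled alerts (sampling 507 is "Y"); the values are made up.
sampled_labels = ["A", "A", "B", "C"]
ratios = alert_ratio(sampled_labels)
print(ratios)  # {'A': 0.5, 'B': 0.25, 'C': 0.25}
```

A distribution far from uniform, such as one label holding most of the share, is the bias that the text says degrades classification accuracy in additional learning.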

＜分析装置１０３の機能的構成例＞
図９は、分析装置１０３の機能的構成例を示すブロック図である。分析装置１０３は、特徴量テーブル４００と、ラベルテーブル５００と、行動分類テーブル６００と、行動シーケンステーブル７００と、比率テーブル８００と、学習部９０１と、予測部９０２と、第１判定部９０３と、第２判定部９０４と、通信部９０５と、サンプリング候補リスト９１１と、追加学習用アラートリスト９１２と、を有する。 <Example of functional configuration of analyzer 103>
FIG. 9 is a block diagram showing a functional configuration example of the analysis device 103. The analysis device 103 includes a feature amount table 400, a label table 500, an action classification table 600, an action sequence table 700, a ratio table 800, a learning unit 901, a prediction unit 902, a first determination unit 903, It has a second determination unit 904, a communication unit 905, a sampling candidate list 911, and an additional learning alert list 912.

学習部９０１、予測部９０２、第１判定部９０３、第２判定部９０４、および通信部９０５は、具体的には、たとえば、図３に示した記憶デバイス３０２に記憶されたプログラムをプロセッサ３０１に実行させることにより実現される。また、サンプリング候補リスト９１１および追加学習用アラートリスト９１２は、具体的には、たとえば、図３に示した記憶デバイス３０２により実現される。 The learning unit 901, the prediction unit 902, the first determination unit 903, the second determination unit 904, and the communication unit 905, specifically, for example, in the processor 301, a program stored in the storage device 302 illustrated in FIG. It is realized by executing it. Further, the sampling candidate list 911 and the additional learning alert list 912 are specifically realized by, for example, the storage device 302 illustrated in FIG. 3.

学習部９０１は、図１に示した機械学習処理を実行する。予測部９０２は、図１に示したアラート分類処理１３２を実行する。第１判定部９０３は、図１に示した通信相手別異常行動判定処理１３４を実行する。第２判定部９０４は、図１に示した行動別異常行動判定処理１３５およびフィードバック処理１３６を実行する。通信部９０５は、通信ＩＦ３０５により、アナリスト端末１４０とデータを送受信する。 The learning unit 901 executes the machine learning process shown in FIG. 1. The prediction unit 902 executes the alert classification process 132 shown in FIG. 1. The first determination unit 903 executes the communication partner-specific abnormal behavior determination process 134 shown in FIG. 1. The second determination unit 904 executes the behavior-specific abnormal behavior determination process 135 and the feedback process 136 shown in FIG. 1. The communication unit 905 transmits and receives data to and from the analyst terminal 140 via the communication IF 305.

＜分析装置１０３の動作シーケンス＞
図１０は、分析装置１０３の動作を示すシーケンス図である。学習部９０１は、ログ収集装置１０２から特徴量テーブル４００に記憶されたログを取得する（ステップＳ１００１）。学習部９０１は、当該取得したログを教師なしアラート特徴データ群として入力し、機械学習を実行する（ステップＳ１００２）。これにより、教師なしアラート特徴データ群は、複数のクラスタに分類される。各クラスタは、セキュリティアナリスト１４１によって分類を示すラベルが付与される。これにより、分類器９００が生成される。 <Operation sequence of analyzer 103>
FIG. 10 is a sequence diagram showing the operation of the analyzer 103. The learning unit 901 acquires the log stored in the feature amount table 400 from the log collection device 102 (step S1001). The learning unit 901 inputs the acquired log as an unsupervised alert feature data group and executes machine learning (step S1002). As a result, the unsupervised alert feature data group is classified into a plurality of clusters. A label indicating the classification is given to each cluster by the security analyst 141. As a result, the classifier 900 is generated.

なお、学習部９０１は、この教師なしアラート特徴データ群にラベルが付与された教師ありアラート特徴データ群を用いて再学習し、分類器９００を更新してもよい。なお、分類器９００は、セキュリティアナリスト１４１が作成してもよい。 The learning unit 901 may retrain using a supervised alert feature data group, obtained by attaching labels to this unsupervised alert feature data group, and update the classifier 900. Alternatively, the classifier 900 may be created by the security analyst 141.

つぎに、予測部９０２は、アラート管理装置１０１から監視対象システム１１０で発生した予測対象アラートを取得し、ログ収集装置１０２から特徴量テーブル４００に記憶された予測対象アラート発生前のログを予測対象アラート特徴データとして取得する（ステップＳ１００３）。予測部９０２は、分類器９００を取得する（ステップＳ１００４）。 Next, the prediction unit 902 acquires from the alert management device 101 the prediction target alerts that occurred in the monitored system 110, and acquires from the log collection device 102, as prediction target alert feature data, the logs stored in the feature amount table 400 from before each prediction target alert occurred (step S1003). The prediction unit 902 acquires the classifier 900 (step S1004).

予測部９０２は、予測対象アラート特徴データ群の各々を分類器９００に与えることで行動分類予測を実行して（ステップＳ１００５）、分類結果（予測対象アラート特徴データが分類されたラベルおよびその確度の組み合わせ）を第１判定部９０３に出力する（ステップＳ１００６）。行動分類予測（ステップＳ１００５）の詳細は、図１２で後述する。なお、行動分類予測（ステップＳ１００５）を実行せず、セキュリティアナリスト１４１が、アナリスト端末１４０で、予測対象アラート特徴データを参照して分類結果を作成してもよい。 The prediction unit 902 executes behavior classification prediction by giving each item of the prediction target alert feature data group to the classifier 900 (step S1005), and outputs the classification result (the combination of the label into which the prediction target alert feature data was classified and its accuracy) to the first determination unit 903 (step S1006). Details of the behavior classification prediction (step S1005) will be described later with reference to FIG. 12. Alternatively, the behavior classification prediction (step S1005) may be skipped, and the security analyst 141 may create the classification result at the analyst terminal 140 by referring to the prediction target alert feature data.

つぎに、第１判定部９０３は、分類結果を用いて、通信相手別異常行動判定を実行して（ステップＳ１００７）、初出または低頻度の分類結果を第２判定部９０４に出力する（ステップＳ１００８）。通信相手別異常行動判定（ステップＳ１００７）の詳細は、図１３で後述する。 Next, the first determination unit 903 executes the communication partner-specific abnormal behavior determination using the classification results (step S1007), and outputs the first-appearing or low-frequency classification results to the second determination unit 904 (step S1008). Details of the communication partner-specific abnormal behavior determination (step S1007) will be described later with reference to FIG. 13.

つぎに、第２判定部９０４は、初出または低頻度の分類結果を用いて、行動別異常行動判定を実行して（ステップＳ１００９）、予測対象アラート群の中からサンプリングすべきアラート（以下、選択アラート）を決定する。選択アラートは、たとえば、選択アラートを特定するアラート識別子４０１と、選択アラートの分類結果と、の組み合わせである。選択アラートは、選択アラートの行動分類（旧ラベル）５０６を含んでもよい。第２判定部９０４は、選択アラートを、通信部９０５を介してアナリスト端末１４０に送信する（ステップＳ１０１０）。行動別異常行動判定（ステップＳ１００９）の詳細は、図１３で後述する。 Next, the second determination unit 904 executes the behavior-specific abnormal behavior determination using the first-appearing or low-frequency classification results (step S1009), and determines, from the prediction target alert group, the alerts to be sampled (hereinafter, selected alerts). A selected alert is, for example, a combination of the alert identifier 401 identifying the selected alert and the classification result of the selected alert. A selected alert may include its behavior classification (old label) 506. The second determination unit 904 transmits the selected alerts to the analyst terminal 140 via the communication unit 905 (step S1010). Details of the behavior-specific abnormal behavior determination (step S1009) will be described later with reference to FIG. 13.

アナリスト端末１４０は、通信部９０５から選択アラートを受信してディスプレイに表示する。出力画面表示例については、図１１で後述する。セキュリティアナリスト１４１は、出力画面から選択アラートのラベルや行動属性（正常／異常）などの情報を入力する。これにより、アナリスト端末１４０は、フィードバック情報を生成して、分析装置１０３に送信する（ステップＳ１０１１）。 The analyst terminal 140 receives the selected alerts from the communication unit 905 and displays them on its display. An example of the output screen will be described later with reference to FIG. 11. The security analyst 141 inputs information such as the label and the behavior attribute (normal/abnormal) of each selected alert on the output screen. The analyst terminal 140 then generates feedback information and transmits it to the analysis device 103 (step S1011).

分析装置１０３は、フィードバック情報を受信すると、フィードバック処理を実行する（ステップＳ１０１２）。具体的には、たとえば、第２判定部９０４は、フィードバック情報にしたがってラベルテーブル５００を更新する（ステップＳ１０２１）。また、第１判定部９０３も、フィードバック情報にしたがって行動分類テーブル６００を更新する（ステップＳ１０２２）。 Upon receiving the feedback information, the analysis device 103 executes the feedback process (step S1012). Specifically, for example, the second determination unit 904 updates the label table 500 according to the feedback information (step S1021). The first determination unit 903 also updates the action classification table 600 according to the feedback information (step S1022).

また、学習部９０１は、追加学習処理を実行する（ステップＳ１０１３）。具体的には、たとえば、学習部９０１は、追加学習データを選択し（ステップＳ１０３１）、比率調整を実行する（ステップＳ１０３２）。これにより、追加学習データの偏りを抑制する。比率調整後、学習部９０１は、追加学習データに基づいて分類器９００を再学習して更新する（ステップＳ１０３３）。これにより、その後あらたにログが取得された場合（ステップＳ１００３）、予測部９０２は、最新の分類器９００で入力アラート群の行動分類を実行することができる。 The learning unit 901 also executes additional learning processing (step S1013). Specifically, for example, the learning unit 901 selects additional learning data (step S1031) and executes ratio adjustment (step S1032). This suppresses the bias of the additional learning data. After adjusting the ratio, the learning unit 901 re-learns and updates the classifier 900 based on the additional learning data (step S1033). Thereby, when a log is newly acquired after that (step S1003), the prediction unit 902 can execute the action classification of the input alert group with the latest classifier 900.

＜アナリスト端末１４０の出力画面表示例＞
図１１は、アナリスト端末１４０の出力画面表示例を示す説明図である。出力画面１１００は、分析アラートリスト１１０１を表示する。分析アラートリストは、アラート識別子４０１と、発生日時１１０２と、サンプリング結果１１０３と、分析優先度１１０４と、を含む。分析アラートリスト１１０１におけるリスト番号ごとのエントリが選択アラートを規定する。 <Example of output screen display of analyst terminal 140>
FIG. 11 is an explanatory diagram showing an output screen display example of the analyst terminal 140. The output screen 1100 displays the analysis alert list 1101. The analysis alert list includes an alert identifier 401, an occurrence date and time 1102, a sampling result 1103, and an analysis priority 1104. An entry for each list number in the analysis alert list 1101 defines a selection alert.

発生日時１１０２は、アラート識別子４０１で特定される選択アラートがアラート管理装置１０１で発生した日付時刻である。サンプリング結果１１０３は、アラート識別子４０１で特定される選択アラートが選択アラートとしてサンプリングされた経緯を示す文字列である。サンプリング結果１１０３は、たとえば、その選択アラートの行動分類（ラベル）５０４と、行動分類（旧ラベル）５０６と、行動分類（旧ラベル）５０６で特定される最高出現頻度のラベルの出現確率と、を用いて、第１判定部９０３により作成される。たとえば、リスト番号が「１」の選択アラートのサンプリング結果１１０３の場合、行動分類（ラベル）５０４が「Ａ」、行動分類（旧ラベル）５０６が最高出現頻度のラベルである「Ｂ」を含み、「Ｂ」の出現確率が「０．７」である。 The occurrence date and time 1102 is the date and time when the selected alert specified by the alert identifier 401 occurred in the alert management apparatus 101. The sampling result 1103 is a character string indicating the background of sampling of the selected alert identified by the alert identifier 401 as the selected alert. The sampling result 1103 includes, for example, the behavior classification (label) 504 of the selected alert, the behavior classification (old label) 506, and the appearance probability of the label having the highest appearance frequency specified by the behavior classification (old label) 506. It is created by the first determination unit 903. For example, in the case of the sampling result 1103 of the selection alert with the list number “1”, the action classification (label) 504 includes “A”, and the action classification (old label) 506 includes the label “B” having the highest appearance frequency, The appearance probability of “B” is “0.7”.

分析優先度１１０４とは、アラート識別子４０１で特定される選択アラートをセキュリティアナリスト１４１に分析してもらう際の優先度を示す文字列である。分析優先度１１０４は、たとえば、その選択アラートの行動分類６０１と、行動分類６０１に該当するアラートについての攻撃者の割合と、該当する行動シーケンス７０１についての長期攻撃の一部７０３と、を用いて、第２判定部９０４により作成される。 The analysis priority 1104 is a character string indicating the priority when the security analyst 141 analyzes the selected alert specified by the alert identifier 401. The analysis priority 1104 uses, for example, the action category 601 of the selected alert, the attacker ratio for the alert corresponding to the action category 601, and the long-term attack part 703 for the corresponding action sequence 701. , The second determination unit 904.

たとえば、リスト番号が「１」の選択アラートの分析優先度１１０４の場合、行動分類６０１が「Ａ」（「正常／異常」はまだセキュリティアナリスト１４１から付与されていない）である。また、行動分類６０１に該当するアラートについての攻撃者の割合が、通信相手分類情報５０３の全エントリ数で、行動分類６０１の「Ａ」の頻度が「Ｈ」、「Ｍ」、「Ｌ」のいずれかに該当するエントリの数を割った値である。また、該当する行動シーケンス７０１についての長期攻撃の一部７０３が「Ｙ」である。 For example, in the analysis priority 1104 of the selected alert whose list number is “1”, the action classification 601 is “A” (the security analyst 141 has not yet assigned “normal/abnormal”). The ratio of attackers for alerts corresponding to the action classification 601 is the number of entries whose frequency for “A” in the action classification 601 is “H”, “M”, or “L”, divided by the total number of entries of the communication partner classification information 503. In addition, the part of long-term attack 703 for the corresponding action sequence 701 is “Y”.
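
The attacker ratio described here can be sketched as follows. The dictionary layout standing in for the action classification table 600 (communication partner mapped to a per-label frequency level), the function name `attacker_ratio`, and the sample data are all assumptions made for illustration.

```python
def attacker_ratio(action_table, label):
    """Share of communication partners whose frequency for `label`
    is "H", "M", or "L" (that is, not blank)."""
    if not action_table:
        return 0.0
    hits = sum(1 for freqs in action_table.values()
               if freqs.get(label, "") in {"H", "M", "L"})
    return hits / len(action_table)

# Hypothetical table: partner -> {label: frequency level}; "" means blank.
table = {
    "partner-1": {"A": "H", "B": ""},
    "partner-2": {"A": "",  "B": "L"},
    "partner-3": {"A": "L", "B": "M"},
    "partner-4": {"A": "",  "B": ""},
}
ratio_a = attacker_ratio(table, "A")
print(ratio_a)  # 0.5
```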

セキュリティアナリスト１４１は、分析アラートリスト１１０１を参照して、分析アラートリスト１１０１を表示するタブ（ラベル付け依頼アラート）とは別のタブ（不図示）で、入力デバイス３０３を操作することにより、選択アラートのラベルを付与したり、選択アラートのラベルに正常および異常のいずれかを付与したり、長期攻撃の一部７０３であるか否かを示すフラグ（ＹまたはＮ）を付与したりする。これら付与した情報がフィードバック情報となる。そして、アナリスト端末１４０は、フィードバック情報を分析装置１０３に送信する（ステップＳ１０１１）。 The security analyst 141 refers to the analysis alert list 1101 and, by operating the input device 303 in a tab (not shown) separate from the tab (labeling request alerts) that displays the analysis alert list 1101, assigns a label to a selected alert, marks the label of a selected alert as either normal or abnormal, or assigns a flag (Y or N) indicating whether the alert is part of the long-term attack 703. The information assigned in this way becomes the feedback information. The analyst terminal 140 then transmits the feedback information to the analysis device 103 (step S1011).

＜行動分類予測（ステップＳ１００５）＞
図１２は、図１０に示した行動分類予測（ステップＳ１００５）の詳細な処理手順例を示すフローチャートである。行動分類予測（ステップＳ１００５）は、たとえば、特徴量テーブル４００のエントリである予測対象アラート特徴データ毎に実行される。予測部９０２は、特徴量テーブル４００のエントリである予測対象アラート特徴データを取得する（ステップＳ１２０１）。予測部９０２は、学習部９０１から分類器９００を取得する（ステップＳ１２０２）。予測部９０２は、予測対象アラート特徴データを分類器９００に入力する（ステップＳ１２０３）。 <Behavior classification prediction (step S1005)>
FIG. 12 is a flowchart showing a detailed example of the processing procedure of the action classification prediction (step S1005) shown in FIG. 10. The action classification prediction (step S1005) is executed, for example, for each item of prediction target alert feature data, which is an entry of the feature amount table 400. The prediction unit 902 acquires prediction target alert feature data, which is an entry of the feature amount table 400 (step S1201). The prediction unit 902 acquires the classifier 900 from the learning unit 901 (step S1202). The prediction unit 902 inputs the prediction target alert feature data to the classifier 900 (step S1203).

予測部９０２は、予測対象アラート特徴データを分類器９００に入力して、通信相手の行動を分類する（ステップＳ１２０４）。予測部９０２は、分類器９００から分類結果であるラベルおよび確度を取得する（ステップＳ１２０５）。予測部９０２は、ラベルおよび確度５０５をそれぞれラベルテーブル５００の行動分類（ラベル）５０４および確度５０５に格納する（ステップＳ１２０６）。なお、当該格納前に行動分類（ラベル）５０４に格納されていたラベルは、行動分類（旧ラベル）５０６の末尾に格納される。これにより、当該予測対象アラート特徴データについての行動分類予測（ステップＳ１００５）が終了する。 The prediction unit 902 inputs the prediction target alert feature data to the classifier 900 and classifies the behavior of the communication partner (step S1204). The prediction unit 902 acquires the label and the confidence, which together form the classification result, from the classifier 900 (step S1205). The prediction unit 902 stores the label and the confidence in the action classification (label) 504 and the confidence 505 of the label table 500, respectively (step S1206). The label that was stored in the action classification (label) 504 before this storage is appended to the end of the action classification (old label) 506. This completes the action classification prediction (step S1005) for the prediction target alert feature data.
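The storage rule of steps S1203 to S1206 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: `LabelData`, `predict_and_store`, and the field names are hypothetical, and the classifier 900 is assumed to be any callable that returns a label and a confidence (確度) value.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class LabelData:
    """One entry of the label table 500 (field names are illustrative)."""
    alert_id: str                        # alert identifier 401
    label: Optional[str] = None          # action classification (label) 504
    confidence: Optional[float] = None   # confidence 505
    old_labels: List[str] = field(default_factory=list)  # old label 506


# Assumption: the classifier 900 maps a feature vector to (label, confidence).
Classifier = Callable[[Sequence[float]], Tuple[str, float]]


def predict_and_store(entry: LabelData,
                      features: Sequence[float],
                      classifier: Classifier) -> LabelData:
    """Classify one alert and record the result (steps S1203 to S1206)."""
    new_label, confidence = classifier(features)
    # The label held before this prediction moves to the end of the history.
    if entry.label is not None:
        entry.old_labels.append(entry.label)
    entry.label = new_label
    entry.confidence = confidence
    return entry
```

Keeping the previous labels in order is what later allows an action sequence 701 to be built from the action classification (old label) 506 and the current label.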

＜通信相手別異常行動判定（ステップＳ１００７）＞
図１３は、図１０に示した通信相手別異常行動判定（ステップＳ１００７）の詳細な処理手順例を示すフローチャートである。通信相手別異常行動判定（ステップＳ１００７）は、たとえば、行動分類予測（ステップＳ１００５）で新たに分類結果が追加されたラベルテーブル５００のラベルデータ毎に実行される。 <Abnormal behavior determination by communication partner (step S1007)>
FIG. 13 is a flowchart showing a detailed example of the processing procedure of the abnormal behavior determination by communication partner (step S1007) shown in FIG. 10. The abnormal behavior determination by communication partner (step S1007) is executed, for example, for each item of label data in the label table 500 to which a classification result was newly added by the action classification prediction (step S1005).

第１判定部９０３は、ラベルテーブル５００からラベルデータを取得する（ステップＳ１３０１）。第１判定部９０３は、ステップＳ１３０１で取得したアラートに対応する行動分類６０１を行動分類テーブル６００から特定する（ステップＳ１３０２）。第１判定部９０３は、当該行動分類６０１の値があるか否かを判定する（ステップＳ１３０３）。 The first determination unit 903 acquires label data from the label table 500 (step S1301). The first determination unit 903 identifies the action classification 601 corresponding to the alert acquired in step S1301 from the action classification table 600 (step S1302). The first determination unit 903 determines whether or not there is a value of the action classification 601 (step S1303).

たとえば、取得したラベルデータに属する通信相手分類情報５０３が「ｗｗｗ．ａａａａ．ｂｂ」で行動分類（ラベル）５０４が「Ａ」である場合、行動分類テーブル６００における通信相手分類情報５０３が「ｗｗｗ．ａａａａ．ｂｂ」のエントリに、行動分類６０１が「Ａ」である場合の値「Ｌ／異常」が存在する。一方、取得したアラートに属する通信相手分類情報５０３が「１２３．４５．６．７」で行動分類（ラベル）５０４が「Ａ」である場合、行動分類テーブル６００における通信相手分類情報５０３が「１２３．４５．６．７」のエントリに、行動分類６０１が「Ａ」である場合の値は存在しない（空欄）。 For example, when the communication partner classification information 503 of the acquired label data is “www.aaaa.bb” and the action classification (label) 504 is “A”, the entry of the action classification table 600 whose communication partner classification information 503 is “www.aaaa.bb” contains the value “L/abnormal” for the action classification 601 “A”. On the other hand, when the communication partner classification information 503 of the acquired alert is “123.45.6.7” and the action classification (label) 504 is “A”, the entry of the action classification table 600 whose communication partner classification information 503 is “123.45.6.7” contains no value for the action classification 601 “A” (the cell is blank).

特定した行動分類６０１の値がない場合（ステップＳ１３０３：Ｎｏ）、第１判定部９０３は、ステップＳ１３０１で取得したラベルデータ（当該ラベルデータへのポインタでもよい）をサンプリング候補リスト９１１に追加する（ステップＳ１３０４）。また、この場合、第１判定部９０３は、通信相手分類情報５０３および行動分類（ラベル）５０４を行動分類テーブル６００に新規登録する。これにより、ステップＳ１３０１で取得したラベルデータについての通信相手別異常行動判定（ステップＳ１００７）が終了する。 When the identified action classification 601 has no value (step S1303: No), the first determination unit 903 adds the label data acquired in step S1301 (or a pointer to that label data) to the sampling candidate list 911 (step S1304). In this case, the first determination unit 903 also newly registers the communication partner classification information 503 and the action classification (label) 504 in the action classification table 600. This completes the abnormal behavior determination by communication partner (step S1007) for the label data acquired in step S1301.

また、特定した行動分類６０１の値がある場合（ステップＳ１３０３：Ｙｅｓ）、第１判定部９０３は、そのエントリ内の行動属性が「異常」であるか否かを判断する（ステップＳ１３０５）。なお、このとき、アクセス回数ｎも１回加算され、出現頻度が更新される場合がある。異常である場合（ステップＳ１３０５：Ｙｅｓ）、第１判定部９０３は、ステップＳ１３０１で取得したラベルデータを、通信部９０５を介してアナリスト端末１４０に送信する（ステップＳ１３０６）。ラベルデータではなく、ラベルデータとアラート識別子４０１および集計日時４０２が同一なアラート特徴データを送信してもよい。これにより、取得したラベルデータについての通信相手別異常行動判定（ステップＳ１００７）が終了する。 When the identified action classification 601 has a value (step S1303: Yes), the first determination unit 903 determines whether the action attribute in that entry is “abnormal” (step S1305). At this time, the access count n is also incremented by one, and the appearance frequency may be updated as a result. If the attribute is abnormal (step S1305: Yes), the first determination unit 903 transmits the label data acquired in step S1301 to the analyst terminal 140 via the communication unit 905 (step S1306). Instead of the label data, the alert feature data whose alert identifier 401 and aggregation date/time 402 match those of the label data may be transmitted. This completes the abnormal behavior determination by communication partner (step S1007) for the acquired label data.

一方、異常でない場合（ステップＳ１３０５：Ｎｏ）、第１判定部９０３は、特定した行動分類６０１の値に含まれる頻度が「Ｌ」であるか否かを判断する（ステップＳ１３０７）。頻度が「Ｌ」である場合（ステップＳ１３０７：Ｙｅｓ）、第１判定部９０３は、ステップＳ１３０１で取得したラベルデータ（当該ラベルデータへのポインタでもよい）をサンプリング候補リスト９１１に追加する（ステップＳ１３０４）。これにより、取得したラベルデータについての通信相手別異常行動判定（ステップＳ１００７）が終了する。一方、頻度が「Ｌ」でない場合（ステップＳ１３０７：Ｎｏ）、頻度は「Ｌ」よりも高い「Ｍ」または「Ｈ」である。したがって、取得したラベルデータについての通信相手別異常行動判定（ステップＳ１００７）が終了する。 On the other hand, if the attribute is not abnormal (step S1305: No), the first determination unit 903 determines whether the frequency included in the value of the identified action classification 601 is “L” (step S1307). If the frequency is “L” (step S1307: Yes), the first determination unit 903 adds the label data acquired in step S1301 (or a pointer to that label data) to the sampling candidate list 911 (step S1304). This completes the abnormal behavior determination by communication partner (step S1007) for the acquired label data. If the frequency is not “L” (step S1307: No), it is “M” or “H”, both of which are higher than “L”. In that case the label data is not added to the sampling candidate list 911, and the abnormal behavior determination by communication partner (step S1007) for the acquired label data ends.
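The branching of steps S1301 to S1307 can be summarized in a short sketch. The dictionary layout of the action classification table 600, the names `Outcome` and `judge_by_partner`, and the placeholder stored on new registration are assumptions made for illustration; the description does not specify how a newly registered cell is initialized.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Outcome(Enum):
    SAMPLING_CANDIDATE = "add to sampling candidate list 911"   # S1304
    SEND_TO_ANALYST = "send to analyst terminal 140"            # S1306
    NO_ACTION = "no action"

# Assumed layout of the action classification table 600:
# {communication partner: {label: (frequency "H"/"M"/"L", attribute)}}
ActionTable = Dict[str, Dict[str, Tuple[str, Optional[str]]]]


def judge_by_partner(table: ActionTable, partner: str, label: str) -> Outcome:
    """Decide how one label data item is handled (steps S1302 to S1307)."""
    cells = table.setdefault(partner, {})
    value = cells.get(label)
    if value is None:                      # S1303: No (blank cell)
        # New registration; the initial ("L", None) is a placeholder assumption.
        cells[label] = ("L", None)
        return Outcome.SAMPLING_CANDIDATE
    frequency, attribute = value
    if attribute == "abnormal":            # S1305: Yes
        return Outcome.SEND_TO_ANALYST
    if frequency == "L":                   # S1307: Yes
        return Outcome.SAMPLING_CANDIDATE
    return Outcome.NO_ACTION               # frequency is "M" or "H"
```

With the values used in the text, the pair “www.aaaa.bb” and “A” (value “L/abnormal”) is sent to the analyst terminal 140, while “123.45.6.7” and “A” (blank) becomes a sampling candidate.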

＜行動別異常行動判定（ステップＳ１００９）＞
図１４は、図１０に示した行動別異常行動判定（ステップＳ１００９）の詳細な処理手順例を示すフローチャートである。行動別異常行動判定（ステップＳ１００９）は、たとえば、通信相手別異常行動判定（ステップＳ１００７）でサンプリング候補リスト９１１に登録されたラベルデータ毎に実行される。 <Abnormal Behavior Judgment by Behavior (Step S1009)>
FIG. 14 is a flowchart showing a detailed example of the processing procedure of the abnormal behavior determination by behavior (step S1009) shown in FIG. 10. The abnormal behavior determination by behavior (step S1009) is executed, for example, for each item of label data registered in the sampling candidate list 911 by the abnormal behavior determination by communication partner (step S1007).

第２判定部９０４は、サンプリング候補リスト９１１からラベルデータを取得する（ステップＳ１４０１）。第２判定部９０４は、ステップＳ１４０１で取得したラベルデータに含まれている行動分類（ラベル）５０４および行動分類（旧ラベル）５０６から行動シーケンス７０１を生成する（ステップＳ１４０２）。 The second determination unit 904 acquires label data from the sampling candidate list 911 (step S1401). The second determination unit 904 generates an action sequence 701 from the action classification (label) 504 and the action classification (old label) 506 included in the label data acquired in step S1401 (step S1402).

第２判定部９０４は、行動シーケンステーブル７００を参照して、ステップＳ１４０２で生成した行動シーケンス７０１のエントリを特定する（ステップＳ１４０３）。当該エントリが特定されない場合（ステップＳ１４０４：Ｎｏ）、すなわち、当該エントリが存在しない場合、第２判定部９０４は、行動シーケンステーブル７００に新規エントリを作成し、ステップＳ１４０１で取得したラベルデータをサンプリング候補リスト９１１から削除する（ステップＳ１４０５）。これにより、当該取得したラベルデータについての行動別異常行動判定（ステップＳ１００９）が終了する。 The second determination unit 904 refers to the action sequence table 700 and identifies the entry for the action sequence 701 generated in step S1402 (step S1403). If no such entry is identified (step S1404: No), that is, if the entry does not exist, the second determination unit 904 creates a new entry in the action sequence table 700 and deletes the label data acquired in step S1401 from the sampling candidate list 911 (step S1405). This completes the abnormal behavior determination by behavior (step S1009) for the acquired label data.

一方、ステップＳ１４０２で生成した行動シーケンス７０１のエントリが特定された場合（ステップＳ１４０４：Ｙｅｓ）、第２判定部９０４は、特定された行動シーケンス７０１のエントリの長期攻撃の一部７０３が「Ｙ」であるか否かを判定する（ステップＳ１４０６）。「Ｙ」である場合（ステップＳ１４０６：Ｙｅｓ）、第２判定部９０４は、ステップＳ１４０１で取得したラベルデータのサンプリング５０７を「Ｙ」に更新する（ステップＳ１４０７）。これにより、当該取得したラベルデータについての行動別異常行動判定（ステップＳ１００９）が終了する。 On the other hand, when the entry of the action sequence 701 generated in step S1402 is identified (step S1404: Yes), the second determination unit 904 determines that part of the long-term attack 703 of the identified action sequence 701 entry is “Y”. Or not (step S1406). In the case of “Y” (step S1406: Yes), the second determination unit 904 updates the sampling 507 of the label data acquired in step S1401 to “Y” (step S1407). As a result, the behavior-specific abnormal behavior determination (step S1009) for the acquired label data ends.

一方、特定された行動シーケンス７０１のエントリの長期攻撃の一部７０３が「Ｙ」でない場合（ステップＳ１４０６：Ｎｏ）、第２判定部９０４は、当該特定された行動シーケンス７０１のエントリ内の頻度が「Ｌ」であるか否かを判断する（ステップＳ１４０８）。頻度が「Ｌ」である場合（ステップＳ１４０８：Ｙｅｓ）、ステップＳ１４０７に移行する。すなわち、特定された行動シーケンス７０１が長期攻撃の一部７０３でない場合でも頻度が「Ｌ」であれば、特定された行動シーケンス７０１内の行動分類（ラベル）５０４は、サンプリング対象となる。 On the other hand, when a part of the long-term attack 703 of the entry of the identified action sequence 701 is not “Y” (step S1406: No), the second determination unit 904 determines that the frequency in the entry of the identified action sequence 701 is It is determined whether or not it is “L” (step S1408). When the frequency is “L” (step S1408: Yes), the process proceeds to step S1407. That is, even if the identified action sequence 701 is not part of the long-term attack 703, if the frequency is “L”, the action classification (label) 504 in the identified action sequence 701 becomes a sampling target.

一方、頻度が「Ｌ」でない場合（ステップＳ１４０８：Ｎｏ）、第２判定部９０４は、ステップＳ１４０１で取得したラベルデータをサンプリング候補から削除する（ステップＳ１４０９）。これにより、当該取得したラベルデータについての行動別異常行動判定（ステップＳ１００９）が終了する。このあと、サンプリング候補リスト９１１に登録された全ラベルデータについて通信相手別異常行動判定（ステップＳ１００７）が終了した場合、通信部９０５は、サンプリング候補リスト９１１内のラベルデータ群を選択アラート群としてアナリスト端末１４０に送信する。これにより、図１１に示したように、アナリスト端末１４０は、サンプリング候補リスト９１１の各アラートを分析して、分析アラートリスト１１０１を表示する。 On the other hand, if the frequency is not “L” (step S1408: No), the second determination unit 904 deletes the label data acquired in step S1401 from the sampling candidates (step S1409). This completes the abnormal behavior determination by behavior (step S1009) for the acquired label data. After that, when the abnormal behavior determination by communication partner (step S1007) has finished for all label data registered in the sampling candidate list 911, the communication unit 905 transmits the group of label data in the sampling candidate list 911 to the analyst terminal 140 as the selected alert group. As shown in FIG. 11, the analyst terminal 140 then analyzes each alert in the sampling candidate list 911 and displays the analysis alert list 1101.
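Steps S1402 to S1409 can likewise be sketched as a single decision function. The way the action sequence 701 is assembled here (old labels in stored order followed by the current label), the dictionary layout of the action sequence table 700, and the initial values of a new entry are all assumptions for illustration; `judge_by_sequence` is a hypothetical name.

```python
from typing import Dict, Sequence, Tuple

# Assumed layout of the action sequence table 700:
# {action sequence 701: {"frequency": "H"/"M"/"L", "long_term": "Y"/"N"}}
SequenceTable = Dict[Tuple[str, ...], Dict[str, str]]


def judge_by_sequence(table: SequenceTable,
                      old_labels: Sequence[str],
                      label: str) -> bool:
    """Return True if the label data stays a sampling target (sampling 507 = "Y")."""
    # S1402: build the sequence from the old labels 506 and the label 504.
    sequence = tuple(old_labels) + (label,)
    entry = table.get(sequence)            # S1403
    if entry is None:                      # S1404: No
        # S1405: register the sequence and drop the candidate.
        # The initial field values are a placeholder assumption.
        table[sequence] = {"frequency": "L", "long_term": "N"}
        return False
    if entry["long_term"] == "Y":          # S1406: Yes -> S1407
        return True
    if entry["frequency"] == "L":          # S1408: Yes -> S1407
        return True
    return False                           # S1409: remove from candidates
```

A caller would set the sampling 507 field to “Y” when the function returns True and remove the label data from the sampling candidate list 911 otherwise.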

＜フィードバック処理（ステップＳ１０１２）＞
図１５は、図１１に示したフィードバック処理（ステップＳ１０１２）の詳細な処理手順例を示すフローチャートである。フィードバック処理（ステップＳ１０１２）は、フィードバック情報に含まれる選択アラート毎に実行される。分析装置１０３は、アナリスト端末１４０からフィードバック情報を取得する（ステップＳ１５０１）。分析装置１０３は、フィードバック情報の中の選択アラートについてラベル更新があるか否かを判断する（ステップＳ１５０２）。ラベル更新がない場合（ステップＳ１５０２：Ｎｏ）、当該選択アラートについては、フィードバック処理が終了する。 <Feedback processing (step S1012)>
FIG. 15 is a flowchart showing a detailed example of the processing procedure of the feedback processing (step S1012) shown in FIG. 11. The feedback processing (step S1012) is executed for each selected alert included in the feedback information. The analysis device 103 acquires the feedback information from the analyst terminal 140 (step S1501). The analysis device 103 determines whether the label of the selected alert in the feedback information has been updated (step S1502). If there is no label update (step S1502: No), the feedback processing for that selected alert ends.

一方、ラベル更新がある場合（ステップＳ１５０２：Ｙｅｓ）、分析装置１０３は、第２判定部９０４により、選択アラートであるラベルテーブル５００のラベルデータの行動分類（ラベル）５０４を、フィードバック情報に含まれる同一選択アラートの変更後のラベルに更新し、行動分類（旧ラベル）５０６の末尾に、行動分類（ラベル）５０４に記録されていたラベルを格納する（ステップＳ１５０３）。 On the other hand, when there is a label update (step S1502: Yes), the analysis apparatus 103 causes the second determination unit 904 to include the action classification (label) 504 of the label data of the label table 500, which is the selection alert, in the feedback information. The same selected alert is updated to the changed label, and the label recorded in the action classification (label) 504 is stored at the end of the action classification (old label) 506 (step S1503).

そして、分析装置１０３は、第１判定部９０３により、行動分類テーブル６００を更新する（ステップＳ１５０４）。具体的には、たとえば、分析装置１０３は、選択アラートの通信相手分類情報５０３が行動分類テーブル６００にない場合、行動分類テーブル６００に新規エントリを作成する。 Then, the analysis apparatus 103 updates the action classification table 600 by the first determination unit 903 (step S1504). Specifically, for example, when the communication partner classification information 503 of the selected alert does not exist in the action classification table 600, the analysis device 103 creates a new entry in the action classification table 600.

分析装置１０３は、選択アラートの通信相手分類情報５０３が行動分類テーブル６００にある場合、その行動分類６０１の値の頻度を更新する。また、その行動分類６０１の値に登録されている行動属性（正常／異常）が、フィードバック情報に含まれる選択アラートの行動属性（正常／異常）と異なる場合、分析装置１０３は、その行動分類６０１の値の行動属性を、フィードバック情報に含まれる選択アラートの行動属性（正常／異常）に更新する。 When the communication partner classification information 503 of the selected alert is in the behavior classification table 600, the analysis device 103 updates the frequency of the value of the behavior classification 601. If the behavior attribute (normal/abnormal) registered in the value of the behavior classification 601 is different from the behavior attribute (normal/abnormal) of the selected alert included in the feedback information, the analysis device 103 causes the behavior classification 601. The action attribute of the value of is updated to the action attribute (normal/abnormal) of the selected alert included in the feedback information.

分析装置１０３は、行動シーケンステーブル７００を参照して、ステップＳ１５０３で更新後の選択アラートから得られる行動シーケンス７０１が長期攻撃の一部７０３であるか否かを判断する（ステップＳ１５０５）。長期攻撃の一部７０３でない場合（ステップＳ１５０５：Ｎｏ）、当該選択アラートについては、フィードバック処理が終了する。 The analysis apparatus 103 refers to the action sequence table 700 and determines whether or not the action sequence 701 obtained from the updated selected alert in step S1503 is a part of the long-term attack 703 (step S1505). When it is not part of the long-term attack 703 (step S1505: No), the feedback process ends for the selected alert.

一方、長期攻撃の一部７０３である場合（ステップＳ１５０５：Ｙｅｓ）、分析装置１０３は、行動シーケンステーブル７００の当該行動シーケンス７０１の長期攻撃の一部７０３を「Ｙ」に更新する。そして、当該選択アラートについては、フィードバック処理が終了する。 On the other hand, when it is the long-term attack part 703 (step S1505: Yes), the analysis device 103 updates the long-term attack part 703 of the action sequence 701 of the action sequence table 700 to “Y”. Then, the feedback process ends for the selected alert.
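The label and table updates of steps S1502 to S1504 can be sketched as below. The dictionary shapes, the function name `apply_feedback`, and the keys `"label"` and `"attribute"` in the feedback record are hypothetical. How the frequency is recalculated is not specified in this passage, so the sketch leaves an existing frequency unchanged and uses “L” as a placeholder for a new cell; the long-term attack check of step S1505 is omitted.

```python
from typing import Dict, Optional, Tuple

# Assumed layout of the action classification table 600:
# {communication partner: {label: (frequency, attribute)}}
ActionTable = Dict[str, Dict[str, Tuple[str, Optional[str]]]]


def apply_feedback(label_data: dict, table: ActionTable, feedback: dict) -> bool:
    """Apply one analyst feedback record; return False if no label update."""
    new_label = feedback.get("label")
    if new_label is None or new_label == label_data["label"]:   # S1502: No
        return False
    # S1503: the previous label moves to the end of the old-label history.
    label_data["old_labels"].append(label_data["label"])
    label_data["label"] = new_label
    # S1504: create the partner entry if missing, then update the cell.
    cells = table.setdefault(label_data["partner"], {})
    existing = cells.get(new_label)
    frequency = existing[0] if existing else "L"   # placeholder assumption
    attribute = feedback.get("attribute")
    if attribute is None and existing:
        attribute = existing[1]                    # keep the stored attribute
    cells[new_label] = (frequency, attribute)
    return True
```

The analyst's normal/abnormal judgment overwrites the stored action attribute only when the feedback actually carries one, which mirrors the rule that the table is updated when the two values differ.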

＜追加学習処理（ステップＳ１０１３）＞
図１６は、図１０に示した追加学習処理（ステップＳ１０１３）の詳細な処理手順例を示すフローチャートである。追加学習処理（ステップＳ１０１３）は一定期間ごとに実行される。一定期間とは、予測対象アラートの発生日時から所定期間遡った日時から予測対象アラートの発生日時までの期間でもよく、あらかじめ設定された予定日時から所定期間遡った日時から予定日時までの期間でもよい。 <Additional learning process (step S1013)>
FIG. 16 is a flowchart showing a detailed example of the processing procedure of the additional learning process (step S1013) shown in FIG. 10. The additional learning process (step S1013) is executed at fixed intervals. The fixed period may be the period from a date and time a predetermined period before the occurrence of the prediction target alert up to the occurrence of that alert, or the period from a date and time a predetermined period before a preset scheduled date and time up to that scheduled date and time.

図１６では、一例として、ラベルテーブル５００には、Ａ〜Ｅのラベルが登録されており、一定期間分のラベルデータの数を１０００個とする。また、この１０００個のラベルデータのうち、サンプリング５０７が「Ｙ」でかつ追加学習済み５０８が「Ｎ」ある選択アラートは、３０個とする。この３０個の内訳は、行動分類（ラベル）５０４の値「Ａ」が１５個、「Ｂ」が１０個、「Ｃ」が５個、「Ｄ」および「Ｅ」がそれぞれ０個とする。 In FIG. 16, as an example, the labels A to E are registered in the label table 500, and the number of label data for a certain period is 1000. Further, among the 1000 pieces of label data, the number of selection alerts in which the sampling 507 is “Y” and the additional learning completed 508 is “N” is 30. Of these 30 items, the value “A” of the action classification (label) 504 is 15, “B” is 10, “C” is 5, and “D” and “E” are 0.

学習部９０１は、ラベルテーブル５００から一定期間分のラベルデータ（本例では１０００個）を取得する（ステップＳ１６０１）。学習部９０１は、一定期間分のラベルデータからサンプリング５０７が「Ｙ」である選択アラート（本例では３０個）を取得し、追加学習済み５０８を「Ｙ」に更新して、追加学習用アラートリスト９１２に追加する（ステップＳ１６０２）。 The learning unit 901 acquires label data for a certain period (1000 pieces in this example) from the label table 500 (step S1601). The learning unit 901 acquires the selected alerts (30 in this example) whose sampling 507 is “Y” from the label data for a fixed period, updates the additional learned 508 to “Y”, and alerts for additional learning. It is added to the list 912 (step S1602).

学習部９０１は、追加学習用アラートリスト９１２内の選択アラートを行動分類（ラベル）５０４別に集計し、比率を計算する（ステップＳ１６０３）。比率とは、選択アラートの行動分類（ラベル）５０４ごとに、選択アラートが占める割合である。たとえば、選択アラートは３０個である。そのうち、行動分類（ラベル）５０４が「Ａ」である選択アラートは１５個である。したがって、ラベルＡの比率は、１５／３０＝１／２である。同様に、行動分類（ラベル）５０４が「Ｂ」の比率は１０／３０＝１／３であり、行動分類（ラベル）５０４が「Ｃ」の比率は５／３０＝１／６である。行動分類（ラベル）５０４が「Ｄ」、「Ｅ」の比率は、選択アラートが０個であるため、０である。 The learning unit 901 aggregates the selected alerts in the additional learning alert list 912 for each action classification (label) 504 and calculates the ratio (step S1603). The ratio is the ratio occupied by the selected alert for each action classification (label) 504 of the selected alert. For example, there are 30 selected alerts. Among them, there are 15 selection alerts whose action classification (label) 504 is “A”. Therefore, the ratio of label A is 15/30=1/2. Similarly, the ratio of “B” in the behavior classification (label) 504 is 10/30=1/3, and the ratio of “C” in the behavior classification (label) 504 is 5/30=1/6. The ratio of the action classification (label) 504 to “D” and “E” is 0 because the number of selected alerts is 0.

学習部９０１は、一定期間分のアラートに含まれるすべての行動分類（ラベル）５０４が下記式（１）を満たすか否か、すなわち、目標比率からステップＳ１６０３で引いた値の絶対値が閾値以下であるか否かを判断する（ステップＳ１６０４）。 The learning unit 901 determines whether every action classification (label) 504 included in the alerts for the fixed period satisfies the following formula (1), that is, whether the absolute value of the difference between the target ratio and the ratio calculated in step S1603 is less than or equal to the threshold (step S1604).

｜目標比率−比率｜≦閾値・・・（１） |Target ratio-Ratio |≦threshold value (1)

ここで、目標比率とは、選択アラートの行動分類（ラベル）５０４の種類数の逆数である。本例では、選択アラートの行動分類（ラベル）５０４の種類数はＡ〜Ｅの５個であるため、目標比率は１／５である。また、閾値を０．１５とする。１つでも式（１）を充足しない行動分類（ラベル）５０４が存在する場合（ステップＳ１６０４：Ｎｏ）、ステップＳ１６０５に移行する。すべてのラベルが式（１）を充足する場合（ステップＳ１６０４：Ｙｅｓ）、ステップＳ１６０８に移行する。 Here, the target ratio is the reciprocal of the number of types of the action classification (label) 504 of the selected alerts. In this example, there are five types of the action classification (label) 504, A to E, so the target ratio is 1/5. The threshold is set to 0.15. If at least one action classification (label) 504 does not satisfy formula (1) (step S1604: No), the process proceeds to step S1605. If all labels satisfy formula (1) (step S1604: Yes), the process proceeds to step S1608.

ラベルＡの場合、その比率は１／２であるため、式（１）の左辺は０．３０となり、式（１）を充足しない。ラベルＢの場合、その比率は１／３であるため、式（１）の左辺は０．１３となり、式（１）を充足する。ラベルＣの場合、その比率は１／６であるため、式（１）の左辺は０．０３となり、式（１）を充足する。 In the case of label A, the ratio is 1/2, so the left side of equation (1) is 0.30, which does not satisfy equation (1). In the case of label B, the ratio is 1/3, so the left side of equation (1) is 0.13, which satisfies equation (1). In the case of label C, the ratio is 1/6, so the left side of equation (1) is 0.03, which satisfies equation (1).

ラベルＤ，Ｅの場合、その比率は０であるため、式（１）の左辺は０．２０となり、式（１）を充足しない。本例では、ステップＳ１６０５に移行する。 In the case of labels D and E, since the ratio is 0, the left side of Expression (1) is 0.20, which does not satisfy Expression (1). In this example, the process proceeds to step S1605.
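The ratio calculation of step S1603 and the check of formula (1) in step S1604 can be reproduced with the numbers of this example. The function name `labels_violating_formula_1` is hypothetical, and exact fractions are used only to avoid rounding effects in the comparison with the threshold.

```python
from collections import Counter
from fractions import Fraction
from typing import List, Sequence


def labels_violating_formula_1(selected_labels: Sequence[str],
                               all_labels: Sequence[str],
                               threshold: Fraction) -> List[str]:
    """Return the labels for which |target ratio - ratio| > threshold."""
    if not selected_labels or not all_labels:
        raise ValueError("at least one selected alert and one label type are required")
    counts = Counter(selected_labels)
    total = len(selected_labels)
    # Target ratio: reciprocal of the number of label types.
    target = Fraction(1, len(all_labels))
    violating = []
    for label in all_labels:
        ratio = Fraction(counts[label], total)       # step S1603
        if abs(target - ratio) > threshold:          # formula (1) not satisfied
            violating.append(label)
    return violating


# Worked example: 30 selected alerts (A: 15, B: 10, C: 5, D: 0, E: 0).
selected = ["A"] * 15 + ["B"] * 10 + ["C"] * 5
result = labels_violating_formula_1(selected, "ABCDE", Fraction(15, 100))
print(result)  # ['A', 'D', 'E']
```

The left-hand sides are 3/10 for A, 2/15 for B, 1/30 for C, and 1/5 for D and E, so only B and C fall within the threshold of 0.15, which matches the values 0.30, 0.13, 0.03, and 0.20 given in the text.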

ステップＳ１６０５において、学習部９０１は、ステップＳ１６０４で閾値を超えたラベルを１つ選択する（ステップＳ１６０５）。たとえば、閾値を超えたラベルＡ，Ｄ，ＥからＡが選択されたとする。選択されたラベルを「選択ラベル」と称す。 In step S1605, the learning unit 901 selects one label that has exceeded the threshold in step S1604 (step S1605). For example, assume that A is selected from the labels A, D, and E that exceed the threshold. The selected label is called a "selected label".

学習部９０１は、ステップＳ１６０１で取得した一定期間分のラベルデータのうち、行動分類（ラベル）５０４が選択ラベルでかつサンプリング５０７が「Ｎ」である選択アラートを１つ選択する（ステップＳ１６０６）。すなわち、１０００個ある一定期間分のラベルデータのうち、サンプリング５０７が「Ｙ」である３０個の選択アラートを除いた、９７０個のラベルデータから１つ選択される。学習部９０１は、ステップＳ１６０６での選択アラート（選択アラートへのポインタでもよい）を追加学習用アラートリスト９１２に追加する（ステップＳ１６０７）。そして、ステップＳ１６０３に戻り、比率を再計算して、一定期間分のラベルデータに含まれるすべてのラベルが式（１）を充足するか否かを判断する。 The learning unit 901 selects one selection alert in which the action classification (label) 504 is the selected label and the sampling 507 is “N” from the label data for the certain period acquired in step S1601 (step S1606). That is, one piece is selected from 970 pieces of label data excluding 30 pieces of selection alerts for which the sampling 507 is “Y”, out of the 1000 pieces of label data for a certain period. The learning unit 901 adds the selected alert in step S1606 (which may be a pointer to the selected alert) to the additional learning alert list 912 (step S1607). Then, returning to step S1603, the ratio is recalculated, and it is determined whether or not all the labels included in the label data for a certain period satisfy the expression (1).

一定期間分のラベルデータに含まれるすべての行動分類（ラベル）５０４が式（１）を充足する場合、ラベルＡ〜Ｅの比率が、追加学習処理（ステップＳ１０１３）の開始当初に比べて均等に近づいている。この時点で追加学習用アラートリスト９１２に存在する選択アラートが追加学習データの元になる。そして、学習部９０１は、追加学習用アラートリスト９１２内の選択アラートの追加学習済み５０８を「Ｙ」に更新する（ステップＳ１６０８）。 When all the action classifications (labels) 504 included in the label data for a certain period satisfy the expression (1), the ratios of the labels A to E become even compared to the beginning of the additional learning process (step S1013). It is approaching. At this point, the selected alert existing in the additional learning alert list 912 becomes the source of the additional learning data. Then, the learning unit 901 updates the additional learning completed 508 of the selected alert in the additional learning alert list 912 to “Y” (step S1608).

そして、学習部９０１は、追加学習用アラートリスト９１２内の選択アラートからアラート識別子４０１、集計日時４０２、および行動分類（ラベル）５０４を取得し、取得したアラート識別子４０１および集計日時４０２に一致するアラート特徴データを特徴量テーブル４００から取得して、追加学習データを生成する。そして、学習部９０１は、追加学習データを分類器９００に与えて再学習する（ステップＳ１６０９）。これにより、分類器９００が更新されて過学習が抑制され、分類精度低下が抑制される。 Then, the learning unit 901 acquires the alert identifier 401, the aggregation date/time 402, and the action classification (label) 504 from the selected alerts in the additional learning alert list 912, acquires from the feature amount table 400 the alert feature data that matches the acquired alert identifier 401 and aggregation date/time 402, and generates the additional learning data. The learning unit 901 then gives the additional learning data to the classifier 900 and retrains it (step S1609). The classifier 900 is thereby updated, overfitting is suppressed, and a decline in classification accuracy is suppressed.

（１）以上説明したように、本実施例にかかる分析装置１０３は、ラベルテーブル５００と行動分類テーブル６００と行動シーケンステーブル７００とにアクセス可能である。ラベルテーブル５００は、監視対象システム１１０との通信相手分類情報５０３が監視対象システム１１０に対して実行した行動分類（旧ラベル）５０６を記憶した情報である。行動分類テーブル６００は、通信相手分類情報５０３と、分類された通信相手分類情報５０３の行動分類６０１と、を対応付けた情報である。行動シーケンステーブル７００は、通信相手分類情報５０３の時系列な行動である行動シーケンス７０１と、行動シーケンス７０１が行動シーケンス７０１よりも長い他の行動シーケンスの一部であるか否かを示す長期攻撃の一部７０３と、を対応付けた情報である。 (1) As described above, the analysis device 103 according to the present embodiment can access the label table 500, the action classification table 600, and the action sequence table 700. The label table 500 is information in which the action classification (old label) 506 executed on the monitoring target system 110 by the communication partner classification information 503 with the monitoring target system 110 is stored. The action classification table 600 is information in which the communication partner classification information 503 and the action classification 601 of the classified communication partner classification information 503 are associated with each other. The action sequence table 700 includes the action sequence 701, which is a time-series action of the communication partner classification information 503, and a long-term attack indicating whether the action sequence 701 is part of another action sequence longer than the action sequence 701. This is information in which the part 703 is associated with each other.

プロセッサ３０１は、監視対象システム１１０で発生したアラート群の各々のアラートの発生前の第１所定期間内における監視対象システム１１０との通信相手分類情報５０３の行動の分類結果を取得する（ステップＳ１００３）。 The processor 301 acquires the classification result of the action of the communication partner classification information 503 with the monitoring target system 110 within the first predetermined period before the occurrence of each alert of the alert group generated in the monitoring target system 110 (step S1003). ..

プロセッサ３０１は、アラート群の中の特定のアラートの発生前の第１所定期間内における監視対象システム１１０との特定の通信相手分類情報５０３が行動分類テーブル６００に存在するか否かを判定する（ステップＳ１００７）。 The processor 301 determines whether or not the action classification table 600 has the specific communication partner classification information 503 with the monitored system 110 within the first predetermined period before the specific alert in the alert group is generated ( Step S1007).

プロセッサ３０１は、特定の通信相手分類情報５０３が行動分類テーブル６００に存在しないと判定された場合、特定の通信相手分類情報５０３の行動の分類結果と、ラベルテーブル５００に記憶された特定の通信相手分類情報５０３の行動分類（旧ラベル）５０６と、に基づいて、特定の通信相手分類情報５０３の時系列な行動である特定の行動シーケンス７０１を生成し、行動シーケンステーブル７００を用いて、特定の行動シーケンス７０１が長期攻撃の一部７０３であるか否かを判定する（ステップＳ１００９）。 When it is determined that the specific communication partner classification information 503 does not exist in the behavior classification table 600, the processor 301 determines the behavior classification result of the specific communication partner classification information 503 and the specific communication partner stored in the label table 500. Based on the action classification (old label) 506 of the classification information 503, a specific action sequence 701, which is a time-series action of the specific communication partner classification information 503, is generated, and a specific action sequence table 700 is used. It is determined whether the action sequence 701 is part of the long-term attack 703 (step S1009).

プロセッサ３０１は、特定の行動シーケンス７０１が長期攻撃の一部７０３であると判定された場合、特定の通信相手分類情報５０３の行動の分類結果を選択する（ステップＳ１４０５、Ｓ１４０７、Ｓ１４０９）。 When it is determined that the specific action sequence 701 is part of the long-term attack 703, the processor 301 selects the action classification result of the specific communication partner classification information 503 (steps S1405, S1407, S1409).

プロセッサ３０１は、選択された特定の通信相手分類情報５０３の行動の分類結果を出力する（ステップＳ１０１０）。 The processor 301 outputs the action classification result of the selected specific communication partner classification information 503 (step S1010).

選択された特定の通信相手分類情報５０３の行動の分類結果は、その通信相手にとって初出の行動であり、かつ、その通信相手を含む通信相手集団の過去の行動において過去に長期攻撃の一部であるため、手口変化の関連性が高いアラートに属する。したがって、分析装置１０３は、アラート群から手口変化の関連性が高いアラートを自動的に絞り込むことができる。このように、どのアラートが手口変化の関連性が高いかをセキュリティアナリスト１４１が判断する必要がないため、セキュリティアナリスト１４１の負担軽減を図ることができる。 The selected action classification result of the specific communication partner classification information 503 is a first-time action for that communication partner and has been part of a long-term attack in the past behavior of the group of communication partners that includes it, so it belongs to an alert that is highly relevant to a change in attack tactics. The analysis device 103 can therefore automatically narrow the alert group down to the alerts that are highly relevant to a change in tactics. Because the security analyst 141 does not need to judge which alerts are highly relevant to a change in tactics, the burden on the security analyst 141 can be reduced.

（２）また、本実施例にかかる分析装置１０３において、プロセッサ３０１は、特定の通信相手分類情報５０３が行動分類テーブル６００に存在しないと判定された場合、特定の通信相手分類情報５０３の行動の分類結果と、ラベルテーブル５００に記憶された特定の通信相手分類情報５０３の行動分類（旧ラベル）５０６と、に基づいて、特定の通信相手分類情報５０３の時系列な行動である特定の行動シーケンス７０１を生成し、行動シーケンステーブル７００を用いて、特定の行動シーケンス７０１の頻度７０２が所定頻度以下であるか否かを判定する（ステップＳ１４０８）。 (2) Further, in the analysis device 103 according to the present exemplary embodiment, when the processor 301 determines that the specific communication partner classification information 503 does not exist in the behavior classification table 600, the processor 301 determines the behavior of the specific communication partner classification information 503. Based on the classification result and the action classification (old label) 506 of the specific communication partner classification information 503 stored in the label table 500, a specific action sequence that is a time-series action of the specific communication partner classification information 503. 701 is generated, and the action sequence table 700 is used to determine whether or not the frequency 702 of the specific action sequence 701 is equal to or lower than a predetermined frequency (step S1408).

プロセッサ３０１は、特定の行動シーケンス７０１の頻度７０２が所定頻度以下であると判定された場合、特定の通信相手分類情報５０３の行動の分類結果を選択する（ステップＳ１４０７、Ｓ１４０９）。 When it is determined that the frequency 702 of the specific action sequence 701 is less than or equal to the predetermined frequency, the processor 301 selects the action classification result of the specific communication partner classification information 503 (steps S1407 and S1409).

選択された特定の通信相手分類情報５０３の行動の分類結果は、その通信相手にとって初出の行動であり、かつ、その通信相手を含む通信相手集団の過去の行動において低頻度の行動シーケンスであるため、手口変化の関連性が高いアラートに属する。したがって、分析装置１０３は、アラート群から手口変化の関連性が高いアラートを自動的に絞り込むことができる。このように、どのアラートが手口変化の関連性が高いかをセキュリティアナリスト１４１が判断する必要がないため、セキュリティアナリスト１４１の負担軽減を図ることができる。 The selected action classification result of the specific communication partner classification information 503 is a first-time action for that communication partner and forms a low-frequency action sequence in the past behavior of the group of communication partners that includes it, so it belongs to an alert that is highly relevant to a change in attack tactics. The analysis device 103 can therefore automatically narrow the alert group down to the alerts that are highly relevant to a change in tactics. Because the security analyst 141 does not need to judge which alerts are highly relevant to a change in tactics, the burden on the security analyst 141 can be reduced.

(3) In the analysis device 103 of (1) above, the behavior classification table 600 is information that associates a behavior of the communication partner classification information 503 with a behavior attribute indicating whether that behavior is normal or abnormal and with the appearance frequency of the behavior of the communication partner classification information 503.

When the processor 301 determines that the behavior attribute of the specific communication partner classification information 503 is normal and that the appearance frequency of the behavior of the specific communication partner classification information 503 is equal to or lower than a predetermined frequency, it determines whether the specific behavior sequence 701 is part of a long-term attack 703.

(4) In the analysis device 103 of (1) above, the behavior sequence table 700 is information that associates a behavior sequence 701 with the frequency 702 of the behavior sequence 701 and with the part of long-term attack 703.

When the processor 301 determines that the specific behavior sequence 701 is not part of a long-term attack 703, it determines whether the frequency 702 of the specific behavior sequence 701 is equal to or lower than a predetermined frequency (step S1408).

When the frequency 702 of the specific behavior sequence 701 is determined to be equal to or lower than the predetermined frequency, the processor 301 selects the classification result of the behavior of the specific communication partner classification information 503.

The selected classification result of the behavior of the specific communication partner classification information 503 is a low-frequency behavior for that communication partner and, among the past behaviors of the group of communication partners that includes that partner, was part of a long-term attack in the past, so it belongs to the alerts that are highly relevant to a change in attack method. The analysis device 103 can therefore automatically narrow the alert group down to the alerts that are highly relevant to a change in attack method. Because the security analyst 141 does not need to judge which alerts are highly relevant to a change in attack method, the burden on the security analyst 141 can be reduced.
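Items (3) and (4) together describe a two-stage check, which can be sketched as below. The record layouts, the function name `should_select`, both threshold defaults, and the handling of unregistered sequences are assumptions made for illustration. The sketch covers only the branch described above (normal attribute, low appearance frequency) and returns `None` for every other case rather than guessing what the embodiment does there.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class BehaviorEntry:
    """One row of behavior classification table 600 (assumed layout)."""
    attribute: str   # "normal" or "abnormal"
    frequency: int   # appearance frequency of the behavior


@dataclass
class SequenceEntry:
    """One row of behavior sequence table 700 (assumed layout)."""
    frequency: int                   # frequency 702
    part_of_long_term_attack: bool   # part of long-term attack 703


def should_select(
    behavior: BehaviorEntry,
    sequence: Tuple[str, ...],
    sequence_table: Dict[Tuple[str, ...], SequenceEntry],
    behavior_threshold: int = 3,   # assumed "predetermined frequency" for behaviors
    sequence_threshold: int = 1,   # assumed "predetermined frequency" for sequences
) -> Optional[bool]:
    # Item (3): only normal, low-frequency behaviors reach the sequence checks.
    if behavior.attribute != "normal" or behavior.frequency > behavior_threshold:
        return None  # branch not covered by this sketch
    entry = sequence_table.get(sequence)
    if entry is None:
        return True  # assumption: an unregistered sequence counts as rare
    # A sequence that is part of a long-term attack is selected.
    if entry.part_of_long_term_attack:
        return True
    # Item (4): otherwise fall back to the sequence frequency (step S1408).
    return entry.frequency <= sequence_threshold


table = {
    ("recon", "login"): SequenceEntry(50, True),
    ("recon", "browse"): SequenceEntry(50, False),
    ("recon", "upload"): SequenceEntry(1, False),
}
rare_normal = BehaviorEntry("normal", 1)
```

With this table, `("recon", "login")` is selected because of the long-term attack flag, `("recon", "upload")` because of its low frequency, and `("recon", "browse")` is not selected.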

(5) In the analysis device 103 of (1) above, when the processor 301 determines that the specific communication partner classification information 503 does not exist in the behavior classification table 600, it registers the specific communication partner classification information 503 and the classification result of the behavior of the specific communication partner classification information 503 in the behavior classification table 600 in association with each other.

As a result, the behavior classification table 600 can be updated automatically, and the burden on the administrator of the analysis device 103 (who may be the security analyst 141) can be reduced.

(6) In the analysis device 103 of (1) above, when the specific behavior sequence 701 does not exist in the behavior sequence table 700, the processor 301 registers the specific behavior sequence 701 in the behavior sequence table 700.

As a result, the behavior sequence table 700 can be updated automatically, and the burden on the administrator of the analysis device 103 (who may be the security analyst 141) can be reduced.
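The automatic registration in items (5) and (6) amounts to an insert-if-absent on each table. A minimal sketch follows; the dictionary layouts, the function names, and the initial frequency of 1 for a newly registered sequence are assumptions, since the description does not state which initial values are stored.

```python
from typing import Dict, Set, Tuple


def register_behavior(table_600: Dict[str, Set[str]], partner: str, label: str) -> bool:
    """Item (5): register a partner and its classification result if the partner is absent."""
    if partner in table_600:
        return False
    table_600[partner] = {label}
    return True


def register_sequence(table_700: Dict[Tuple[str, ...], int], sequence: Tuple[str, ...]) -> bool:
    """Item (6): register a behavior sequence if it is absent."""
    if sequence in table_700:
        return False
    table_700[sequence] = 1  # assumption: the first observation counts as frequency 1
    return True


table_600: Dict[str, Set[str]] = {}
table_700: Dict[Tuple[str, ...], int] = {}
first = register_behavior(table_600, "198.51.100.7", "scan")
second = register_behavior(table_600, "198.51.100.7", "exfil")
```

The boolean return value is only a convenience for the caller; the second call leaves the existing entry untouched because the partner is already registered.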

(7) In the analysis device 103 of (1) above, the processor 301 can access the feature amount table 400, which stores feature amounts indicating the behavior of the monitored system 110.

Based on combinations of the feature amounts within a second predetermined period that precedes the first predetermined period and the classification results of the behavior of the communication partner classification information 503 within the second predetermined period, the processor 301 generates a classifier 900 that predicts the classification result of the behavior of the communication partner classification information 503 (step S1002).

The processor 301 acquires, from the feature amount table 400, the feature amounts indicating the behavior of the monitored system 110 within the first predetermined period and gives the acquired feature amounts to the classifier 900, thereby obtaining the classification result of the behavior of the communication partner classification information 503.

(8) In the analysis device 103 of (7) above, the processor 301 retrains the classifier 900 based on the selected classification result of the behavior of the specific communication partner classification information 503 and on the feature amounts that correspond to that classification result and indicate the behavior of the monitored system 110 within the first predetermined period (step S1013).

As a result, the classification accuracy of the classifier 900 can be improved.
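The learn, predict, and retrain cycle of items (7) and (8) can be illustrated with a toy classifier. The description does not say which learning algorithm the classifier 900 uses, so the nearest-centroid rule below is a stand-in chosen only because it fits in a few lines of standard-library Python; the class name, method names, and sample data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature amounts, behavior label)


class CentroidClassifier:
    """Toy stand-in for classifier 900: predicts the label of the nearest class centroid."""

    def __init__(self) -> None:
        self.samples: List[Sample] = []
        self.centroids: Dict[str, List[float]] = {}

    def fit(self, samples: Sequence[Sample]) -> None:
        """Learn from the second predetermined period (loosely, step S1002)."""
        self.samples = list(samples)
        self._update()

    def refit(self, extra: Sequence[Sample]) -> None:
        """Retrain with the selected classification results added (loosely, step S1013)."""
        self.samples.extend(extra)
        self._update()

    def _update(self) -> None:
        grouped: Dict[str, List[Sequence[float]]] = defaultdict(list)
        for features, label in self.samples:
            grouped[label].append(features)
        # Centroid = column-wise mean of the feature vectors of each label.
        self.centroids = {
            label: [sum(column) / len(rows) for column in zip(*rows)]
            for label, rows in grouped.items()
        }

    def predict(self, features: Sequence[float]) -> str:
        def squared_distance(centroid: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(features, centroid))

        return min(self.centroids, key=lambda label: squared_distance(self.centroids[label]))


clf = CentroidClassifier()
clf.fit([([0.0, 0.0], "browse"), ([10.0, 10.0], "scan")])
before = clf.predict([6.0, 6.0])       # nearest centroid is "scan"
clf.refit([([6.0, 6.0], "exfil")])     # add one selected, newly labeled sample
after = clf.predict([6.0, 6.0])        # the new label is now the nearest centroid
```

The same feature vector is classified differently before and after retraining, which is the effect item (8) relies on to follow changes in attack method.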

(9) In the analysis device 103 of (8) above, the processor 301 tallies the classification results of the behavior of the specific communication partner classification information 503 by classification result as addition target classification results, and calculates, for each classification result, the ratio of its count to the total number of addition target classification results (step S1603).

The processor 301 determines whether the difference between each ratio and a predetermined target ratio is within an allowable range (step S1604).

When the difference between each ratio and the predetermined target ratio is within the allowable range (step S1604: Yes), the processor 301 retrains the classifier 900 based on the addition target classification results and on the feature amounts that correspond to the addition target classification results and indicate the behavior of the monitored system 110 within the first predetermined period (step S1609).

When the difference between any of the ratios and the predetermined target ratio is not within the allowable range (step S1604: No), the processor 301 adds classification results of the behavior of the specific communication partner classification information 503 that were not selected to the addition target classification results and recalculates each ratio (steps S1605 to S1607, S1603).

As a result, the count for each classification result is adjusted so that the addition target classification results are not concentrated in a particular classification result. Therefore, when retraining the classifier 900 with the addition target classification results, the analysis device 103 can keep the number of inquiries to the security analyst 141 to a minimum while responding to cyber attacks whose methods change over time.
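The ratio adjustment loop of item (9) can be sketched as follows. The description states only that unselected classification results are added until every ratio is within the allowable range; the rule used here for choosing which unselected result to add (the label that is furthest below its target) is an assumption, as are the function names, the target ratios, and the tolerance. The sketch also assumes that the target ratios cover every label that appears.

```python
from collections import Counter
from typing import Dict, List


def ratios(labels: List[str]) -> Dict[str, float]:
    """Step S1603: share of each classification result in the addition targets."""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


def within_tolerance(current: Dict[str, float], target: Dict[str, float], tolerance: float) -> bool:
    """Step S1604: is every ratio close enough to its target ratio?"""
    return all(abs(current.get(label, 0.0) - goal) <= tolerance for label, goal in target.items())


def balance(selected: List[str], unselected: List[str],
            target: Dict[str, float], tolerance: float) -> List[str]:
    """Add unselected results until the ratios are acceptable or no useful candidate is left."""
    additional = list(selected)
    pool = list(unselected)
    while additional and not within_tolerance(ratios(additional), target, tolerance):
        current = ratios(additional)
        deficits = {label: goal - current.get(label, 0.0) for label, goal in target.items()}
        # Assumed policy: take the most under-represented label still available in the pool.
        candidate = next(
            (label for label in sorted(deficits, key=deficits.get, reverse=True)
             if deficits[label] > 0 and label in pool),
            None,
        )
        if candidate is None:
            break  # nothing left that would move the ratios toward the target
        pool.remove(candidate)
        additional.append(candidate)
    return additional


target = {"scan": 0.5, "browse": 0.5}
balanced = balance(["scan", "scan", "scan"], ["browse", "browse", "browse", "scan"], target, 0.15)
```

Starting from three `scan` results, the loop adds `browse` results one at a time and stops as soon as both ratios are within 0.15 of the 0.5 target, so not every unselected result is consumed.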

The present invention is not limited to the embodiments described above and includes various modifications and equivalent configurations within the spirit of the appended claims. For example, the embodiments have been described in detail to explain the present invention clearly, and the present invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations may be added, deleted, or substituted.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be implemented in hardware, for example by designing them as an integrated circuit, or may be implemented in software by having the processor 301 interpret and execute a program that implements each function.

Information such as the programs, tables, and files that implement each function can be stored in a storage device such as a memory, a hard disk, or an SSD (Solid State Drive), or on a recording medium such as an IC (Integrated Circuit) card, an SD card, or a DVD (Digital Versatile Disc).

The control lines and information lines shown are those considered necessary for the explanation, and they do not necessarily represent all the control lines and information lines required in an implementation. In practice, almost all of the configurations may be considered to be interconnected.

101 Alert management device
102 Log collection device
103 Analysis device
110 Monitored system
120 Communication terminal
131 Machine learning process
132 Alert classification process
133 Sampling process
134 Per-communication-partner abnormal behavior determination process
135 Per-behavior abnormal behavior determination process
136 Feedback process
137 Adjustment process
140 Analyst terminal
141 Security analyst
201 External threat information database
300 Computer
301 Processor
302 Storage device
400 Feature amount table
401 Alert identifier
500 Label table
503 Communication partner classification information
600 Behavior classification table
700 Behavior sequence table
701 Behavior sequence
702 Frequency
703 Part of long-term attack
800 Ratio table
900 Classifier
901 Learning unit
902 Prediction unit
903 First determination unit
904 Second determination unit
911 Sampling candidate list
912 Alert list for additional learning

Claims

An analysis device comprising a processor that executes a program and a storage device that stores the program, wherein
the processor can access behavior history information, behavior classification information, and behavior sequence information; the behavior history information stores a past history of behaviors that a communication partner of a monitoring target executed on the monitoring target; the behavior classification information associates the communication partner with a group of classified behaviors of the communication partner; and the behavior sequence information associates a behavior sequence, which is a time series of behaviors of the communication partner, with inclusion information indicating whether the behavior sequence is part of another behavior sequence longer than the behavior sequence, and
the processor executes:
an acquisition process of acquiring, for each alert of an alert group generated in the monitoring target, a classification result of the behavior of the communication partner with the monitoring target within a first predetermined period before the alert occurred;
a first determination process of determining whether a specific communication partner with the monitoring target within the first predetermined period before a specific alert of the alert group occurred exists in the behavior classification information;
a second determination process of, when the first determination process determines that the specific communication partner does not exist in the behavior classification information, generating a specific behavior sequence, which is a time series of behaviors of the specific communication partner, based on the classification result of the behavior of the specific communication partner and the past behavior history of the specific communication partner stored in the behavior history information, and determining, using the behavior sequence information, whether the specific behavior sequence is part of the other behavior sequence;
a selection process of selecting the classification result of the behavior of the specific communication partner when the second determination process determines that the specific behavior sequence is part of the other behavior sequence; and
an output process of outputting the classification result of the behavior of the specific communication partner selected by the selection process.

An analysis device comprising a processor that executes a program and a storage device that stores the program, wherein
the processor can access behavior history information, behavior classification information, and behavior sequence information; the behavior history information stores a past history of behaviors that a communication partner of a monitoring target executed on the monitoring target; the behavior classification information associates the communication partner with a group of classified behaviors of the communication partner; and the behavior sequence information associates a behavior sequence, which is a time series of behaviors of the communication partner, with an appearance frequency of the behavior sequence, and
the processor executes:
an acquisition process of acquiring, for each alert of an alert group generated in the monitoring target, a classification result of the behavior of the communication partner with the monitoring target within a first predetermined period before the alert occurred;
a first determination process of determining whether a specific communication partner with the monitoring target within the first predetermined period before a specific alert of the alert group occurred exists in the behavior classification information;
a second determination process of, when the first determination process determines that the specific communication partner does not exist in the behavior classification information, generating a specific behavior sequence, which is a time series of behaviors of the specific communication partner, based on the classification result of the behavior of the specific communication partner and the past behavior history of the specific communication partner stored in the behavior history information, and determining, using the behavior sequence information, whether the appearance frequency of the specific behavior sequence is equal to or lower than a predetermined frequency;
a selection process of selecting the classification result of the behavior of the specific communication partner when the second determination process determines that the appearance frequency of the specific behavior sequence is equal to or lower than the predetermined frequency; and
an output process of outputting the classification result of the behavior of the specific communication partner selected by the selection process.

The analysis device according to claim 1, wherein
the behavior classification information associates a behavior of the communication partner with a behavior attribute indicating whether the behavior is normal or abnormal and with an appearance frequency of the behavior of the communication partner, and
in the second determination process, the processor determines whether the specific behavior sequence is part of the other behavior sequence when the first determination process determines that the behavior attribute of the specific communication partner is normal and that the appearance frequency of the behavior of the specific communication partner is equal to or lower than a predetermined frequency.

The analysis device according to claim 1, wherein
the behavior sequence information associates the behavior sequence with an appearance frequency of the behavior sequence and with the inclusion information,
in the second determination process, when the processor determines that the specific behavior sequence is not part of the other behavior sequence, the processor determines whether the appearance frequency of the specific behavior sequence is equal to or lower than a predetermined frequency, and
in the selection process, the processor selects the classification result of the behavior of the specific communication partner when the second determination process determines that the appearance frequency of the specific behavior sequence is equal to or lower than the predetermined frequency.

The analysis device according to claim 1, wherein
in the first determination process, when the processor determines that the specific communication partner does not exist in the behavior classification information, the processor registers the specific communication partner and the classification result of the behavior of the specific communication partner in the behavior classification information in association with each other.

The analysis device according to claim 1, wherein
in the second determination process, the processor registers the specific behavior sequence in the behavior sequence information when the specific behavior sequence does not exist in the behavior sequence information.

The analysis device according to claim 1, wherein
the processor can access feature amount information that stores feature amounts indicating the behavior of the monitoring target,
the processor executes a learning process of generating a classifier that predicts the classification result of the behavior of the communication partner, based on a combination of the feature amounts within a second predetermined period preceding the first predetermined period and the classification result of the behavior of the communication partner within the second predetermined period, and
in the acquisition process, the processor acquires, from the feature amount information, the feature amounts indicating the behavior of the monitoring target within the first predetermined period and obtains the classification result of the behavior of the communication partner by giving the acquired feature amounts to the classifier generated by the learning process.

The analysis device according to claim 7, wherein
in the learning process, the processor re-learns the classifier based on the classification result of the behavior of the specific communication partner selected by the selection process and on the feature amounts that correspond to that classification result and indicate the behavior of the monitoring target within the first predetermined period.

The analysis device according to claim 8, wherein
in the learning process, the processor executes:
a calculation process of tallying the classification results of the behavior of the specific communication partner by classification result as addition target classification results and calculating, for each classification result, a ratio of its count to the total number of the addition target classification results;
a determination process of determining whether the difference between each of the ratios and a predetermined target ratio is within an allowable range;
a re-learning process of, when the difference between each of the ratios and the predetermined target ratio is within the allowable range, re-learning the classification result based on the addition target classification results and on the feature amounts that correspond to the addition target classification results and indicate the behavior of the monitoring target within the first predetermined period; and
a recalculation process of, when the difference between any one of the ratios and the predetermined target ratio is not within the allowable range, adding classification results of the behavior of the specific communication partner that were not selected by the selection process to the addition target classification results and recalculating each of the ratios.

An analysis method executed by an analysis device having a processor that executes a program and a storage device that stores the program, wherein
the processor can access behavior history information, behavior classification information, and behavior sequence information; the behavior history information stores a past history of behaviors that a communication partner of a monitoring target executed on the monitoring target; the behavior classification information associates the communication partner with a group of classified behaviors of the communication partner; and the behavior sequence information associates a behavior sequence, which is a time series of behaviors of the communication partner, with inclusion information indicating whether the behavior sequence is part of another behavior sequence longer than the behavior sequence, and
the processor executes:
an acquisition process of acquiring, for each alert of an alert group generated in the monitoring target, a classification result of the behavior of the communication partner with the monitoring target within a first predetermined period before the alert occurred;
a first determination process of determining whether a specific communication partner with the monitoring target within the first predetermined period before a specific alert of the alert group occurred exists in the behavior classification information;
a second determination process of, when the first determination process determines that the specific communication partner does not exist in the behavior classification information, generating a specific behavior sequence, which is a time series of behaviors of the specific communication partner, based on the classification result of the behavior of the specific communication partner and the past behavior history of the specific communication partner stored in the behavior history information, and determining, using the behavior sequence information, whether the specific behavior sequence is part of the other behavior sequence;
a selection process of selecting the classification result of the behavior of the specific communication partner when the second determination process determines that the specific behavior sequence is part of the other behavior sequence; and
an output process of outputting the classification result of the behavior of the specific communication partner selected by the selection process.