JP2015222471A

JP2015222471A - Malicious communication pattern detecting device, malicious communication pattern detecting method, and malicious communication pattern detecting program

Info

Publication number: JP2015222471A
Application number: JP2014106027A
Authority: JP
Inventors: 一史青木; Kazufumi Aoki; 剛男針生; Takeo Hario
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2014-05-22
Filing date: 2014-05-22
Publication date: 2015-12-10
Anticipated expiration: 2034-05-22
Also published as: JP6174520B2

Abstract

PROBLEM TO BE SOLVED: To detect, as highly malicious communication, an attacker's attempt to send a command to malware even via a legitimate site, and to shorten the analysis time required for the detection. SOLUTION: A malicious communication pattern detecting device extracts communication patterns, each indicating a transition between communication destinations, from the communication logs of a malware-infected terminal and of a monitoring target NW terminal. The device then extracts a feature quantity of each malware communication pattern extracted from the communication log of the malware-infected terminal and a feature quantity of each monitoring target NW communication pattern extracted from the monitoring target NW communication log, and registers the feature quantities in a detection model. The device applies the detection model to the communication of the monitoring target NW that is the detection target, and calculates a degree of maliciousness of the communication based on its similarity to the malware communication patterns and its similarity to the monitoring target NW communication patterns.

Description

The present invention relates to a malicious communication pattern detection device, a malicious communication pattern detection method, and a malicious communication pattern detection program.

In recent years, malicious programs (hereinafter "malware") that cause threats such as information leakage and unauthorized access have been rampant. After infecting a terminal, malware receives commands from an attacker via a server or the like and carries out attacks, information leakage, and other harmful actions. Attackers have been observed sending commands to malware through commonly used hosting services and social networking services (see Non-Patent Document 1).

The number of newly discovered malware samples is also growing sharply; it has been reported that a new piece of malware appears every few seconds (see Non-Patent Document 2). Endpoint countermeasures such as antivirus software alone therefore cannot fully prevent malware threats. For this reason, techniques that reduce the threat by analyzing communication data and identifying malware-infected terminals have attracted attention (see Non-Patent Document 3).

A common technique for detecting malware-infected terminals is to keep a blacklist of the communication destinations that malware contacts when it is executed, and to detect an infected terminal according to whether it communicates with a blacklisted destination. Another technique detects infected terminals based on the state transitions of communication data observed during malware analysis (see Patent Document 1).

Patent Document 1: Japanese Patent No. 5009244

Non-Patent Document 1: Chasing CnC Servers - False positives, [online], [retrieved March 12, 2014], Internet <URL: http://www.fireeye.com/blog/technical/botnet-activities-research/2010/09/chasing-cnc-servers-part-2.html>

Non-Patent Document 2: McAfee Threat Report: First Quarter 2013, [online], [retrieved September 3, 2013], Internet <URL: http://www.mcafee.com/japan/media/mcafeeb2b/international/japan/pdf/threatreport/threatreport13q1.pdf>

Non-Patent Document 3: Sebastian Garcia et al., Survey on network-based botnet detection methods, Security and Communication Networks 2013, [online], [retrieved March 13, 2014], Internet <URL: http://onlinelibrary.wiley.com/doi/10.1002/sec.800/full>

However, the conventional techniques described above have the following problems. The communication destinations that malware contacts when executed include harmless destinations that are commonly accessed. Blacklisting every destination obtained from malware analysis would therefore cause frequent false positives. One possible countermeasure is to use commonly accessed sites as a whitelist to keep them off the blacklist, but this would miss communication with an attacker who sends commands to malware via a commonly accessed site. In addition, the technique described in Patent Document 1 analyzes the enormous volume of communication data itself, so the analysis takes a great deal of time.

Accordingly, an object of the present invention is to solve the above problems: to detect, as highly malicious communication, an attacker's attempt to send a command to malware even via a commonly accessed site, and to reduce the time required for the analysis used in detection.

To solve the above problems, the present invention comprises: a communication pattern extraction unit that extracts, from the communication logs of a malware-infected terminal and of a terminal in a monitoring target network, communication patterns each indicating a pair consisting of a first communication destination of the terminal and a second communication destination contacted afterwards, among the terminal's series of communication destinations; a feature quantity extraction unit that obtains feature quantities of malware communication patterns, which are the communication patterns extracted from the communication log of the malware-infected terminal, and feature quantities of monitoring target network communication patterns, which are the communication patterns extracted from the communication log of the terminal in the monitoring target network, and registers the obtained feature quantities in a detection model for detecting malicious communication; an input unit that accepts a communication log of the communication to be examined; and a maliciousness calculation unit that obtains from the detection model, for a detection target communication pattern extracted from that communication log, the feature quantity of the malware communication pattern and the feature quantity of the monitoring target network communication pattern, and calculates a higher maliciousness value for the detection target communication pattern the larger the obtained malware communication pattern feature quantity is and the smaller the obtained monitoring target network communication pattern feature quantity is.

According to the present invention, even when an attacker attempts to send a command to malware via a commonly accessed site, this can be detected as highly malicious communication, and the time required for the analysis used in detection can be reduced.

FIG. 1 is a diagram showing an overview of the processing of the malicious communication pattern detection device.
FIG. 2 is a diagram showing the configuration of the malicious communication pattern detection device.
FIG. 3 is a diagram explaining the extraction of communication patterns.
FIG. 4 is a diagram showing a configuration example of the detection model.
FIG. 5 is a diagram explaining the extraction of the malware repeated-pattern feature quantity.
FIG. 6 is a diagram explaining communication pairs and communication occurrence intervals.
FIG. 7 is a diagram showing the processing procedure of the communication pattern extraction unit.
FIG. 8 is a diagram showing the processing procedure of the model data extraction unit.
FIG. 9 is a diagram showing the processing procedure of the occurrence interval similarity calculation unit.
FIG. 10 is a diagram showing the processing procedure of the maliciousness calculation unit.
FIG. 11 is a diagram showing a computer that executes the malicious communication pattern detection program.

(Overview)
Modes for carrying out the present invention (embodiments) are described below. The present invention is not limited to these embodiments. First, an overview of the malicious communication pattern detection device 10 of this embodiment is given with reference to FIG. 1.

The malicious communication pattern detection device 10 outputs, as its detection result, the maliciousness of communication by terminals in the monitoring target network (NW) under examination. The maliciousness is a value indicating how likely the communication is to originate from a malware-infected terminal. It is calculated as follows.

First, the malicious communication pattern detection device 10 acquires a malware analysis log, which is the communication log of malware-infected terminals, and a monitoring target NW log (the model monitoring target NW log), which is the communication log of terminals in the monitoring target NW. A communication log is obtained, for example, by a communication device in the monitoring target NW, and contains information such as the series of communication destinations contacted by each source terminal and the time at which communication with each destination occurred. The device then generates a detection model by referring to the malware analysis log and the monitoring target NW log. The detection model is used to detect malicious communication patterns and contains three kinds of data: the feature quantities of the communication patterns of malware communication, that is, communication from malware-infected terminals (malware communication pattern feature quantities); the feature quantities of the communication patterns of the monitoring target NW (monitoring target NW communication pattern feature quantities); and information on the occurrence intervals of each kind of communication pattern (communication pattern occurrence interval information).

Next, using the generated detection model, the malicious communication pattern detection device 10 calculates the maliciousness of the communication patterns that appear in the monitoring target NW log to be examined (that is, a new monitoring target NW log). Specifically, referring to the detection model, it scores how closely each communication pattern in the log under examination resembles the communication patterns of malware communication, and how closely it resembles the communication patterns of the communication recorded in the model monitoring target NW log, and calculates the maliciousness of the pattern from these scores. It then outputs the calculated maliciousness as the detection result.

Here, the malicious communication pattern detection device 10 raises the maliciousness the more a communication pattern in the log under examination resembles the communication patterns of malware communication, and lowers it the more the pattern resembles the communication patterns of the communication recorded in the model monitoring target NW log (normal communication). The device can thus detect, as highly malicious, communication patterns that resemble malware communication and do not resemble normal communication.
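The direction of this scoring can be illustrated with a small sketch. The formula below is not taken from the patent; it is a hypothetical scoring function chosen only because it rises with similarity to malware communication and falls with similarity to normal communication. The function name and the assumption that both similarities lie in the range 0 to 1 are introduced here for illustration.

```python
def maliciousness(malware_similarity: float, normal_similarity: float) -> float:
    """Hypothetical score, NOT the patent's formula.

    Both inputs are assumed to be similarities in [0, 1]. The result grows
    with malware_similarity and shrinks with normal_similarity.
    """
    for value in (malware_similarity, normal_similarity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("similarities are assumed to lie in [0, 1]")
    return malware_similarity * (1.0 - normal_similarity)


# A pattern that looks like malware traffic and unlike normal traffic
# scores higher than one that also looks like normal traffic.
suspicious = maliciousness(0.9, 0.1)
ambiguous = maliciousness(0.9, 0.8)
```

Any function with the same monotonic behaviour would serve equally well for this illustration.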

When analyzing how closely the monitoring target NW log under examination resembles malware communication and the communication recorded in the model monitoring target NW log (normal communication), the malicious communication pattern detection device 10 uses communication patterns. A communication pattern describes, for the series of communication destinations contacted by a source terminal in the communication log, which destination the terminal contacted after which other destination, together with information such as the time interval between those communications. Because the analysis takes into account both the transitions between destinations and the time those transitions take, the maliciousness can be calculated more accurately. As a result, even when an attacker attempts to send a command to malware via a commonly accessed site, for example, this can be detected as highly malicious communication. Furthermore, because the analysis uses communication logs rather than the communication data itself, the time required for the analysis is reduced.

(Configuration)
Next, the configuration of the malicious communication pattern detection device 10 is described. As shown in FIG. 2, the malicious communication pattern detection device 10 includes a detection model generation unit 11 and a detection unit 12.

The detection model generation unit 11 generates the detection model using the malware analysis log and the monitoring target NW log.

The malware analysis log is the communication log of malware-infected terminals obtained through malware analysis, for example by actually running malware collected in a honeypot. Each entry of the communication log in the malware analysis log contains, for example, an analysis ID assigned to each malware analysis run (the malware analysis ID), the communication destination contacted by the malware, and the time at which the communication occurred; the entries are ordered by time of occurrence. The malware analysis ID is a character string based on random numbers or the time, such as one produced by the Linux (registered trademark) uuidgen command. In addition to the communication log, the malware analysis log may include logs of registry accesses, file accesses, and the like obtained through malware analysis.

The monitoring target NW log is a communication log recorded by communication devices of the monitoring target NW, such as a firewall or web proxy, when terminals in the monitoring target NW communicate with the outside. It contains the IP (Internet Protocol) address of the source terminal, the communication destination, the time at which the communication occurred, and so on, and its entries are ordered by time of occurrence. The monitoring target NW log is assumed to have been accumulated in advance for a predetermined period. The malware analysis log and the monitoring target NW log are stored in a predetermined area of a storage unit (not shown) of the malicious communication pattern detection device 10.

The communication destination recorded in the malware analysis log and the monitoring target NW log is, for example, the destination's IP address, the entire URL (Uniform Resource Locator), the URL without its query string, the FQDN (Fully Qualified Domain Name), or the path portion of the URL.
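As a rough illustration of these alternative destination representations, the following sketch derives several of them from a single URL using Python's standard `urllib.parse` module. The function name `destination_forms` and the dictionary keys are assumptions made for this example; the patent does not prescribe how the forms are computed, and the IP address form is omitted because it would require a DNS lookup or a log field.

```python
from urllib.parse import urlsplit


def destination_forms(url: str) -> dict:
    """Return several candidate 'communication destination' keys for one URL.

    Hypothetical helper: the key names are illustrative only.
    """
    parts = urlsplit(url)
    return {
        "url": url,                                                   # entire URL
        "url_without_query": f"{parts.scheme}://{parts.netloc}{parts.path}",
        "fqdn": parts.hostname,                                       # host name only
        "path": parts.path,                                           # path portion
    }


forms = destination_forms("http://example.com/a/b.php?id=1")
```

Which representation is chosen determines how finely destinations are distinguished: keying on the FQDN groups all pages of a site together, while keying on the full URL separates them.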

The detection model generation unit 11 includes a communication pattern extraction unit 111 and a model data extraction unit 112.

The communication pattern extraction unit 111 extracts communication patterns from the malware analysis log and the monitoring target NW log. A communication pattern extracted from the malware analysis log is called a malware communication pattern, and one extracted from the monitoring target NW log is called a monitoring target NW communication pattern.

For example, suppose that, as shown in FIG. 3, the communication pattern extraction unit 111 obtains a communication log in which the source terminal contacts communication destinations A, B, and C in the order A → B → C. From this log it extracts, as communication patterns, the pairs of destinations that appear within a predetermined time window (T), each pair consisting of a trigger communication destination and a related communication destination (a communication pair). The time window starts at the time when the source terminal communicated with the trigger communication destination. The trigger communication destination is the first destination of the communication pair (the first communication destination), and a related communication destination is any destination other than the trigger communication destination that appears within the time window (the second communication destination). For example, by sliding the time window (T) over the communication log shown in FIG. 3, the communication pattern extraction unit 111 extracts the three communication pairs A → B, A → C, and B → C contained in the window as communication patterns. Although not shown in the figure, the unit also records in each communication pattern information on the time interval of the corresponding communication pair (A → B, A → C, B → C). If, as the window slides, it comes to contain only one destination, that is, the trigger communication destination has no related communication destination, the unit extracts a communication pattern consisting of the trigger communication destination with no related communication destination. The value of the time window (T) can be set as appropriate by, for example, the administrator of the malicious communication pattern detection device 10.
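The sliding-window extraction described above can be sketched as follows for the log of a single source terminal. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `extract_patterns`, the tuple layouts, timestamps expressed in seconds, and the choice to skip later entries whose destination equals the trigger destination are all assumptions made here.

```python
from typing import List, Optional, Tuple

Entry = Tuple[float, str]                               # (time in seconds, destination)
Pattern = Tuple[str, Optional[str], Optional[float]]    # (trigger, related, interval)


def extract_patterns(entries: List[Entry], window: float) -> List[Pattern]:
    """Extract (trigger, related, interval) patterns from a time-ordered log.

    Each entry in turn becomes the trigger; every later entry within
    `window` seconds of it yields one communication pair. A trigger with
    no related destination yields (trigger, None, None).
    """
    patterns: List[Pattern] = []
    for i, (t0, trigger) in enumerate(entries):
        related = [
            (t, dest)
            for t, dest in entries[i + 1:]
            if t - t0 <= window and dest != trigger  # assumption: skip repeats of the trigger
        ]
        if not related:
            patterns.append((trigger, None, None))
        for t, dest in related:
            patterns.append((trigger, dest, t - t0))
    return patterns


# The A -> B -> C example from the text, with all three inside one window.
log = [(0.0, "A"), (1.0, "B"), (2.0, "C")]
patterns = extract_patterns(log, window=5.0)
```

With this input the sketch yields the pairs A → B, A → C, and B → C together with their intervals, plus a pattern for C with no related destination, since C is the last entry.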

The communication pattern extraction unit 111 performs the above processing on both the malware analysis log and the monitoring target NW log.

The model data extraction unit 112 obtains the feature quantities of the malware communication patterns and of the monitoring target NW communication patterns extracted by the communication pattern extraction unit 111, and registers them in the detection model as model data. The feature quantity of a malware communication pattern is called a malware communication pattern feature quantity, and that of a monitoring target NW communication pattern is called a monitoring target NW communication pattern feature quantity.

The detection model holds, as model data, the malware communication pattern feature quantities, the monitoring target NW communication pattern feature quantities, and the communication pattern occurrence interval information. As illustrated in FIG. 4, the malware communication pattern feature quantities include, for example, a malware repeated-pattern feature quantity (M_freq), and the monitoring target NW communication pattern feature quantities include an NW log frequent-pattern feature quantity (L_freq) and an intra-pattern relevance strength feature quantity (L_redir). The communication pattern occurrence interval information comprises malware communication pattern occurrence interval information and monitoring target NW communication pattern occurrence interval information. The former is expressed, for example, by the mean (μ_m) and standard deviation (σ_m) of the occurrence intervals of malware communication, and the latter by the mean (μ_l) and standard deviation (σ_l) of the occurrence intervals of communication in the monitoring target NW. The model data of the detection model is stored in a predetermined area of the storage unit (not shown) of the malicious communication pattern detection device 10.
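One way to picture the model data is as a table keyed by communication pair, with one record per pair holding the fields named above. The sketch below is illustrative only: the class `ModelEntry`, the helper `register_intervals`, and the use of the population standard deviation are assumptions, and the feature quantities M_freq, L_freq, and L_redir are left as plain placeholder fields because their computation is not described in this part of the text.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, Optional[str]]  # (trigger destination, related destination)


@dataclass
class ModelEntry:
    """Hypothetical per-pair record mirroring the fields named in the text."""
    m_freq: float = 0.0               # malware repeated-pattern feature quantity
    l_freq: float = 0.0               # NW log frequent-pattern feature quantity
    l_redir: float = 0.0              # intra-pattern relevance strength feature quantity
    mu_m: Optional[float] = None      # mean interval in malware communication
    sigma_m: Optional[float] = None   # std. dev. of interval in malware communication
    mu_l: Optional[float] = None      # mean interval in monitoring target NW
    sigma_l: Optional[float] = None   # std. dev. of interval in monitoring target NW


def register_intervals(
    model: Dict[Pair, ModelEntry],
    pair: Pair,
    malware_intervals: List[float],
    nw_intervals: List[float],
) -> ModelEntry:
    """Store interval statistics for one pair (population std. dev. assumed)."""
    entry = model.setdefault(pair, ModelEntry())
    if malware_intervals:
        entry.mu_m, entry.sigma_m = mean(malware_intervals), pstdev(malware_intervals)
    if nw_intervals:
        entry.mu_l, entry.sigma_l = mean(nw_intervals), pstdev(nw_intervals)
    return entry


model: Dict[Pair, ModelEntry] = {}
entry = register_intervals(model, ("A", "B"), [1.0, 3.0], [2.0])
```

A pair seen only in one of the two logs simply keeps `None` for the statistics of the other log.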

The detection unit 12 in FIG. 2 refers to the detection model and calculates the maliciousness of the communication patterns in the communication log to be examined. It includes a communication pattern extraction unit 121, an occurrence interval similarity calculation unit 122, and a maliciousness calculation unit 123.

The communication pattern extraction unit 121 extracts communication patterns from the communication log to be examined. This communication log is input through an input unit (not shown) of the malicious communication pattern detection device 10. The extraction method is the same as that of the communication pattern extraction unit 111 of the detection model generation unit 11, so its description is omitted.

The occurrence interval similarity calculation unit 122 refers to the communication pattern occurrence interval information in the detection model and calculates values (occurrence interval similarities) indicating how closely the occurrence interval of a communication pattern in the communication log to be examined resembles that of the malware communication patterns, and how closely it resembles that of the communication patterns of the monitoring target NW.

The maliciousness calculation unit 123 refers to the detection model and calculates the maliciousness of the communication patterns in the communication log to be examined. Specifically, referring to the detection model, the detection unit 12 obtains the magnitude of the malware communication pattern feature quantity and the magnitude of the monitoring target NW communication pattern feature quantity for each communication pattern in the communication log to be examined (a communication log of the monitoring target NW). Using these feature quantities and the occurrence interval similarities calculated by the occurrence interval similarity calculation unit 122, the detection unit 12 calculates the maliciousness of the communication pattern. The maliciousness calculation unit 123 then outputs the calculated maliciousness as the detection result.

The details of the detection model generation unit 11 and the detection unit 12 are described later with reference to flowcharts.

(Processing procedure)
The processing procedure of the malicious communication pattern detection device 10 is described below.

(Communication pattern extraction unit of the detection model generation unit)
First, the processing procedure of the communication pattern extraction unit 111 of the detection model generation unit 11 is described with reference to FIG. 7. The communication pattern extraction unit 111 performs the following processing on the malware analysis log and the monitoring target NW log.

First, the communication pattern extraction unit 111 determines whether the log to be processed is a malware analysis log or a monitoring target NW log (S1). If it is a malware analysis log (Yes in S1), the unit sets the malware analysis ID as the source identifier (the identification information of the source terminal) (S4). If it is a monitoring target NW log (No in S1, then Yes in S2), the unit sets the source IP address as the source identifier (S3). If it is neither a malware analysis log nor a monitoring target NW log (No in S2), the processing ends.

If there is a source identifier for which communication patterns have not yet been extracted (Yes in S5), the communication pattern extraction unit 111 selects one such identifier as the source identifier to be read (S6) and reads the communication log for that source identifier (S7). When it reads the communication log, it sets an index on the first entry of the log to mark the starting point for extracting communication patterns (S8). If no source identifier remains for which communication patterns have not been extracted (No in S5), the unit ends the processing.
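Steps S1 to S7 amount to choosing a source identifier according to the log type and then processing the log one source at a time. A minimal sketch of that grouping follows. The log type strings, the field names `analysis_id`, `src_ip`, and `time`, and the representation of entries as dictionaries are hypothetical; the patent specifies only which value serves as the source identifier for each kind of log.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def source_identifier_field(log_type: str) -> Optional[str]:
    """Pick the field used as the source identifier (S1 to S4)."""
    if log_type == "malware_analysis":
        return "analysis_id"   # malware analysis ID
    if log_type == "monitoring_target_nw":
        return "src_ip"        # source IP address
    return None                # neither log type: nothing to process


def group_by_source(entries: List[dict], log_type: str) -> Dict[str, List[dict]]:
    """Group entries per source identifier, each group ordered by time (S5 to S7)."""
    field = source_identifier_field(log_type)
    if field is None:
        return {}
    groups: Dict[str, List[dict]] = defaultdict(list)
    for entry in entries:
        groups[entry[field]].append(entry)
    for group in groups.values():
        group.sort(key=lambda e: e["time"])
    return dict(groups)


nw_log = [
    {"src_ip": "10.0.0.2", "dest": "B", "time": 3.0},
    {"src_ip": "10.0.0.1", "dest": "A", "time": 1.0},
    {"src_ip": "10.0.0.2", "dest": "A", "time": 2.0},
]
per_source = group_by_source(nw_log, "monitoring_target_nw")
```

Each resulting group corresponds to one pass through S6 to S8: its first entry is where the extraction index would initially be set.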

After S8, the communication pattern extraction unit 111 reads the communication destination and time recorded in the entry on which the index is set, and sets that communication destination as the trigger communication destination (S9). Next, the communication pattern extraction unit 111 reads the entries following the indexed entry. It then sets every communication destination that appears within a predetermined time (for example, a time window (T)) from the time of the communication with the trigger communication destination as a related communication destination, and sets each communication pair of the trigger communication destination and a related communication destination as a communication pattern (S10).

At this time, for each communication pattern, the communication pattern extraction unit 111 acquires the occurrence interval (time interval) between the trigger communication destination and the related communication destination of that pattern, and transfers each communication pattern together with its occurrence interval to the model data extraction unit 112 (S11).

For example, if communications with communication destinations B and C appear within the time window (T) after a communication with communication destination A, then A is the trigger communication destination and the two communication pairs A→B and A→C are set as communication patterns. The communication pair of each of these two communication patterns and its occurrence interval are transferred to the model data extraction unit 112.
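The extraction loop of S9 to S13 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Entry` record, the function name `extract_patterns`, and the use of seconds as the time unit are assumptions introduced here.

```python
from collections import namedtuple

# Hypothetical log-entry shape: (timestamp in seconds, communication destination).
Entry = namedtuple("Entry", ["time", "dest"])

def extract_patterns(entries, window):
    """Return (trigger, related, interval) tuples for one source identifier.

    Each entry in turn is taken as the trigger communication destination;
    every later entry within `window` seconds becomes a related destination.
    """
    entries = sorted(entries, key=lambda e: e.time)
    patterns = []
    for i, trigger in enumerate(entries):       # index advances one entry at a time
        for later in entries[i + 1:]:
            interval = later.time - trigger.time
            if interval > window:
                break  # entries are time-ordered, so nothing further qualifies
            patterns.append((trigger.dest, later.dest, interval))
    return patterns

log = [Entry(0, "A"), Entry(2, "B"), Entry(5, "C"), Entry(60, "D")]
print(extract_patterns(log, window=10))
```

With a 10-second window, the sample log yields the pairs A→B, A→C, and B→C, each with its occurrence interval; destination D falls outside every window and forms no pair.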

After S11, if the communication log contains an entry on which the index has not yet been set (Yes in S12), the communication pattern extraction unit 111 increments the index by 1 (S13) and returns to S9. If no such entry remains (No in S12), the process returns to S5.

In this way, the communication pattern extraction unit 111 extracts communication patterns from the malware analysis log and the monitoring target NW communication log.

When reading the monitoring target NW log, the communication pattern extraction unit 111 may first acquire the communication patterns of the malware communication log and then extract from the monitoring target NW log only those communication patterns made up of communication pairs that appear in the malware communication patterns. Limiting the communication patterns extracted from the monitoring target NW log in this way reduces the time required for extraction from that log.

Although its description has been omitted, the communication pattern extraction unit 121 of the detection unit 12 extracts communication patterns by the same procedure from the monitoring target NW log that is the detection target. The communication pattern extraction unit 121 then transfers the extracted communication patterns to the occurrence interval similarity calculation unit 122 and the maliciousness calculation unit 123.

(Model data extraction unit)
Next, the processing procedure of the model data extraction unit 112 of the detection model generation unit 11 is described with reference to FIG. 8. First, the model data extraction unit 112 extracts the malware communication pattern features from the malware communication patterns.

The model data extraction unit 112 reads the malware communication patterns extracted by the communication pattern extraction unit 111 (S21). For each communication pattern among them, it takes the number of source identifiers in the malware communication log for which the pattern was confirmed to occur at least a predetermined number of times within a predetermined time, divides it by the number of source identifiers for which the pattern was confirmed at least once, and acquires the result as the malware repetition pattern feature (M_freq) (S22).

For example, consider the case where the model data extraction unit 112 extracts the malware repetition pattern feature (M_freq) from the malware communication log illustrated in FIG. 5. In FIG. 5, T1 is the time window for acquiring communication pairs, and T2 is the time window for counting the number of appearances of a communication pair. In this case, the model data extraction unit 112 acquires, from the malware communication log, the number of source identifiers for which a communication pattern (for example, the communication pair A→B appearing within the time window (T1)) was confirmed to occur at least a predetermined number of times (for example, three times) within a predetermined time (for example, the time window (T2)). The model data extraction unit 112 also acquires, from the malware communication log, the number of source identifiers for which that communication pattern (for example, the communication pair A→B) was confirmed. It then divides the former count by the latter. The model data extraction unit 112 performs this processing for each communication pair of the malware communication patterns and acquires the results as the malware repetition pattern feature (M_freq). The values of the time windows (T1, T2) and the required number of occurrences (for example, three) can be set as appropriate by, for example, the administrator of the malicious communication pattern detection apparatus 10.

The malware repetition pattern feature (M_freq) acquired in this way captures the behavior of malware repeatedly attempting to communicate with a server in order to receive commands from an attacker. A large value of this feature means that the communication pattern is characteristic of malware.
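The ratio behind M_freq for a single communication pattern can be sketched as below. The input layout (a dictionary from source identifier to the timestamps at which the pattern occurred), the function names, and the sliding check over sorted timestamps are assumptions made for illustration; the source does not specify how the count within T2 is carried out.

```python
def repeats_within(times, t2, min_count):
    """True if at least `min_count` occurrences fall within some span of length t2."""
    times = sorted(times)
    for i in range(len(times) - min_count + 1):
        # Compare the first and last of `min_count` consecutive occurrences.
        if times[i + min_count - 1] - times[i] <= t2:
            return True
    return False

def malware_repetition_feature(occurrences, t2, min_count):
    """occurrences: dict of source identifier -> timestamps of one pattern (hypothetical layout)."""
    seen = [src for src, times in occurrences.items() if times]
    if not seen:
        return 0.0
    repeated = [src for src in seen
                if repeats_within(occurrences[src], t2, min_count)]
    return len(repeated) / len(seen)

# Three malware analysis IDs showed the pattern; two repeated it 3+ times within T2.
occ = {"m1": [0, 30, 60], "m2": [0, 500], "m3": [10, 20, 25, 400], "m4": []}
m_freq = malware_repetition_feature(occ, t2=100, min_count=3)
print(m_freq)
```

Here `m4` never showed the pattern and is excluded from the denominator, so the feature is 2 out of 3.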

Returning to the description of FIG. 8: after S22, the model data extraction unit 112 acquires, for each malware communication pattern, the mean (μ_m) and standard deviation (σ_m) of the occurrence interval of that pattern as malware communication pattern occurrence interval information (S23). The model data extraction unit 112 then registers the malware repetition pattern feature (M_freq) acquired in S22 and the malware communication pattern occurrence interval information (μ_m, σ_m) acquired in S23 in the detection model (S24). After S24, if no unprocessed malware communication pattern remains (No in S25), the process proceeds to S26; if an unprocessed malware communication pattern remains (Yes in S25), the process returns to S22.

In S26, the model data extraction unit 112 reads the monitoring target NW communication patterns. For each of these communication patterns, it then divides the number of source identifiers for which the pattern was confirmed by the number of source identifiers contained in the monitoring target NW log, and acquires the result as the NW log frequent pattern feature (L_freq) (S27).

For example, the model data extraction unit 112 acquires the number of source identifiers for which the communication pattern (A→B) was confirmed in the monitoring target NW communication patterns. It then divides that number by the number of source identifiers contained in the monitoring target NW log. The model data extraction unit 112 performs this processing for each of the monitoring target NW communication patterns and acquires the results as the NW log frequent pattern feature (L_freq).

The NW log frequent pattern feature (L_freq) acquired in this way quantifies how commonly each communication pattern appearing in the monitoring target NW communication patterns occurs in the monitoring target NW. A large value of this feature means that the communication pattern is observed from many terminals in the monitoring target NW. In the example above, a large NW log frequent pattern feature (L_freq) for the communication pattern (A→B) means that the communication pattern (A→B) is observed from many terminals in the monitoring target NW. Accordingly, assuming that few terminals in the monitoring target NW are infected with malware, a large value of this feature means that the communication pattern is unlikely to be one that arises from malware infection.
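As a minimal sketch of S27, L_freq can be computed as the fraction of source identifiers in which each pair was seen. The dictionary layout keyed by source IP address and the function name are assumptions for illustration only.

```python
def nw_frequent_pattern_feature(patterns_by_source):
    """patterns_by_source: dict of source IP -> set of (trigger, related) pairs.

    Returns a dict mapping each pair to the fraction of all source
    identifiers in the log from which that pair was observed.
    """
    total = len(patterns_by_source)
    counts = {}
    for pairs in patterns_by_source.values():
        for pair in set(pairs):  # count each source at most once per pair
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: n / total for pair, n in counts.items()}

observed = {
    "10.0.0.1": {("A", "B")},
    "10.0.0.2": {("A", "B"), ("A", "C")},
    "10.0.0.3": set(),            # terminal in the log that showed neither pair
    "10.0.0.4": {("A", "B")},
}
l_freq = nw_frequent_pattern_feature(observed)
print(l_freq[("A", "B")], l_freq[("A", "C")])
```

In this sample, A→B is seen from three of four terminals (0.75) and A→C from one of four (0.25), so A→B would be treated as the more ordinary pattern.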

Returning to the description of FIG. 8: for each of the monitoring target NW communication patterns, the model data extraction unit 112 divides the number of times the communication pattern was confirmed by the number of times its trigger communication destination was confirmed, and acquires the result as the intra-pattern relevance strength feature (L_redir) in the monitoring target NW log (S28).

For example, the model data extraction unit 112 acquires the number of times the communication pattern (A→B) was confirmed in the monitoring target NW communication patterns. It also acquires the number of times the trigger communication destination (A) of that communication pattern (A→B) was confirmed as a communication destination in the monitoring target NW communication patterns. It then divides the number of times the communication pattern was confirmed by the number of times the trigger communication destination was confirmed. The model data extraction unit 112 performs this processing for each of the monitoring target NW communication patterns and acquires the results as the intra-pattern relevance strength feature (L_redir). This feature expresses the probability with which communication with the related communication destination (for example, B) is confirmed once the trigger communication destination (for example, A) has been confirmed, and so serves as an indicator of how likely communication with the trigger communication destination is to cause communication with the related communication destination. In other words, the intra-pattern relevance strength feature (L_redir) quantifies, within the monitoring target NW, the relationship to a communication destination that is always contacted after visiting a certain communication destination, as with HTTP (HyperText Transfer Protocol) redirects or image loading.

Assuming that few terminals in the monitoring target NW are infected with malware, a large value of this feature indicates a communication pattern that is likely to occur in normal communication and unlikely to occur as a result of malware infection.
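The division in S28 can be sketched as follows. The two flat input lists (every destination observed, and every extracted pair) and the function name are assumptions for illustration; the source does not describe how the counts are stored.

```python
from collections import Counter

def intra_pattern_relevance(dest_log, pair_log):
    """Ratio of pair occurrences to occurrences of the pair's trigger destination.

    dest_log: list of communication destinations observed in the log.
    pair_log: list of (trigger, related) pairs extracted from the same log.
    """
    dest_counts = Counter(dest_log)
    pair_counts = Counter(pair_log)
    return {
        pair: n / dest_counts[pair[0]]
        for pair, n in pair_counts.items()
        if dest_counts[pair[0]] > 0  # skip pairs whose trigger was never counted
    }

dests = ["A", "B", "A", "B", "A", "C", "A"]          # A observed four times
pairs = [("A", "B"), ("A", "B"), ("A", "C")]
l_redir = intra_pattern_relevance(dests, pairs)
print(l_redir)
```

In this sample, visiting A is followed by B in half of the cases (0.5) and by C in a quarter (0.25). A value near 1 would suggest a relationship such as a redirect, where the second destination follows almost every visit to the first.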

For each of the monitoring target NW communication patterns, the model data extraction unit 112 calculates the mean (μ_l) and standard deviation (σ_l) of the occurrence interval of that pattern and acquires them as monitoring target NW communication pattern occurrence interval information (S29). For example, for each communication pattern (for example, A→B, A→C, B→C), the model data extraction unit 112 calculates the mean (μ_l) and standard deviation (σ_l) of the occurrence interval of that pattern and acquires them as monitoring target NW communication pattern occurrence interval information.
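A minimal sketch of the interval statistics follows. The source does not say whether the population or the sample standard deviation is intended; the population form (`statistics.pstdev`) is used here purely as an assumption, and the function name and input layout are likewise illustrative.

```python
import statistics

def interval_stats(intervals_by_pair):
    """Map each (trigger, related) pair to (mean, standard deviation) of its intervals.

    Uses the population standard deviation, which is an assumption;
    the source only says "standard deviation".
    """
    return {
        pair: (statistics.mean(values), statistics.pstdev(values))
        for pair, values in intervals_by_pair.items()
        if values  # a pair with no recorded interval has no statistics
    }

intervals = {("A", "B"): [2, 4, 6], ("A", "C"): [5], ("B", "C"): []}
stats = interval_stats(intervals)
print(stats[("A", "B")])
```

A pair observed only once, such as A→C here, has a standard deviation of 0, which any later similarity calculation needs to handle.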

The model data extraction unit 112 registers the acquired NW log frequent pattern feature (L_freq), intra-pattern relevance strength feature (L_redir), and monitoring target NW communication pattern occurrence interval information (μ_l, σ_l) in the detection model (S30). If an unprocessed monitoring target NW communication pattern remains (Yes in S31), the model data extraction unit 112 returns to S27; if none remains (No in S31), it ends the process.

(Occurrence interval similarity calculation unit)
Next, the processing procedure of the occurrence interval similarity calculation unit 122 is described with reference to FIG. 9. The occurrence interval similarity calculation unit 122 reads, from the communication pattern extraction unit 121 of the detection unit 12, the communication patterns (communication pairs and their occurrence intervals) of the monitoring target NW log to be inspected as detection target communication patterns (S41). The occurrence interval similarity calculation unit 122 also reads from the detection model the malware communication pattern occurrence interval information (μ_m, σ_m) and the monitoring target NW communication pattern occurrence interval information (μ_l, σ_l) corresponding to each detection target communication pattern (S42).

Using the malware communication pattern occurrence interval information (μ_m, σ_m) and the monitoring target NW communication pattern occurrence interval information (μ_l, σ_l) thus read, the occurrence interval similarity calculation unit 122 then calculates, for the detection target communication pattern, its occurrence interval similarity to the malware communication pattern and to the monitoring target NW communication pattern (S43).

Specifically, the occurrence interval similarity calculation unit 122 uses the malware communication pattern occurrence interval information (μ_m, σ_m) to calculate the similarity of the detection target communication pattern to the malware communication pattern occurrence interval. Likewise, it uses the monitoring target NW communication pattern occurrence interval information (μ_l, σ_l) to calculate the similarity of the detection target communication pattern to the monitoring target NW communication pattern occurrence interval. Each similarity is calculated, for example, by the following equation (1).

Here, Sim(d) is the similarity when the occurrence interval of a communication pattern is d, σ is the standard deviation of the occurrence interval of the respective communication pattern, and μ is the mean of that occurrence interval. The maximum value of Sim(d) is 1; when the computed value exceeds 1, Sim(d) = 1. When d − μ is 0, Sim(d) = 1 as well.

In this way, the occurrence interval similarity calculation unit 122 can calculate the similarity of communication patterns while taking into account the occurrence intervals of the communications that make up each pattern. For example, as illustrated in FIG. 6, when two communication logs contain the same communication pairs (A→B, A→C, B→C) but with different occurrence intervals, the occurrence interval similarity calculation unit 122 takes the intervals into account and calculates a low similarity between the two communication patterns.
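Equation (1) itself is not reproduced in this text. The sketch below therefore assumes a form, Sim(d) = σ / |d − μ|, chosen only because it is consistent with the two stated rules (values above 1 are capped at 1, and Sim(d) = 1 when d − μ is 0). It should be read as an illustration of those rules, not as the equation of the source.

```python
def interval_similarity(d, mu, sigma):
    """Similarity of an observed interval d to a pattern with mean mu and std sigma.

    ASSUMED form: sigma / |d - mu|, capped at 1. Only the cap and the
    zero-difference rule come from the source; the ratio is a guess.
    """
    diff = abs(d - mu)
    if diff == 0:
        return 1.0          # stated rule: d - mu == 0 gives similarity 1
    return min(1.0, sigma / diff)  # stated rule: maximum value is 1

# Compare one observed interval against a malware model and an NW model
# (the model values below are made up for the example).
sim_m = interval_similarity(10, mu=9, sigma=2)    # within one sigma of the malware mean
sim_l = interval_similarity(10, mu=2, sigma=2)    # four sigma from the NW mean
print(sim_m, sim_l)
```

Under this assumed form, any interval within one standard deviation of the mean scores 1, and the score decays with distance beyond that.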

(Maliciousness calculation unit)
Next, the processing procedure of the maliciousness calculation unit 123 is described with reference to FIG. 10. The maliciousness calculation unit 123 reads, from the communication pattern extraction unit 121, the communication patterns (communication pairs and their occurrence intervals) of the monitoring target NW log to be inspected as detection target communication patterns (S51).

The maliciousness calculation unit 123 also acquires, from the occurrence interval similarity calculation unit 122, the occurrence interval similarities corresponding to the detection target communication pattern, that is, its similarity to the malware communication pattern occurrence interval and its similarity to the monitoring target NW communication pattern occurrence interval (S52).

Furthermore, the maliciousness calculation unit 123 reads from the detection model the malware repetition pattern feature (M_freq), the NW log frequent pattern feature (L_freq), and the intra-pattern relevance strength feature (L_redir) corresponding to the communication pattern (S53). Based on these features, the maliciousness calculation unit 123 calculates the degree of maliciousness of the detection target communication pattern (S54). The degree of maliciousness is calculated so that it becomes larger as the occurrence interval similarity to the malware communication pattern is larger, as the occurrence interval similarity to the monitoring target NW communication pattern is smaller, as the malware repetition pattern feature (M_freq) is larger, as the NW log frequent pattern feature (L_freq) is smaller, and as the intra-pattern relevance strength feature (L_redir) is smaller. The degree of maliciousness is calculated, for example, by the following equation (2).

In equation (2), Score is the degree of maliciousness, Sim_m is the occurrence interval similarity to the malware communication pattern, Sim_l is the occurrence interval similarity to the monitoring target NW communication pattern, M_freq is the malware repetition pattern feature, L_freq is the NW log frequent pattern feature, and L_redir is the intra-pattern relevance strength feature.

In short, when calculating the degree of maliciousness of a detection target communication pattern, the maliciousness calculation unit 123 calculates a higher value the more the pattern resembles a malware communication pattern, and a lower value the more it resembles a monitoring target NW communication pattern.
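Equation (2) is not reproduced in this text, so its exact form is unknown here. The sketch below is a hypothetical scoring function that only satisfies the stated directions of change: it rises with Sim_m and M_freq and falls as Sim_l, L_freq, or L_redir rise. The product form, the clamping to the range 0 to 1, and the function name are all assumptions.

```python
def maliciousness_score(sim_m, sim_l, m_freq, l_freq, l_redir):
    """HYPOTHETICAL score; not equation (2) of the source.

    Increases with sim_m and m_freq, decreases as sim_l, l_freq,
    or l_redir increase. Inputs are clamped to [0, 1].
    """
    def clamp(x):
        return max(0.0, min(1.0, x))

    malware_like = clamp(sim_m) * clamp(m_freq)
    not_ordinary = (1 - clamp(sim_l)) * (1 - clamp(l_freq)) * (1 - clamp(l_redir))
    return malware_like * not_ordinary

# A pattern that matches malware timing and is never seen in ordinary traffic...
high = maliciousness_score(sim_m=1.0, sim_l=0.0, m_freq=0.5, l_freq=0.0, l_redir=0.0)
# ...versus the same pattern when it is also common in the monitored network.
low = maliciousness_score(sim_m=1.0, sim_l=0.5, m_freq=0.5, l_freq=0.5, l_redir=0.5)
print(high, low)
```

A multiplicative form was chosen for the sketch so that strong evidence of ordinary use on any one feature pulls the score toward zero; an additive or weighted form would satisfy the same stated properties.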

The maliciousness calculation unit 123 may weight each feature when calculating the degree of maliciousness. In addition, the communication pattern extraction unit 121 may extract communication patterns using the URL, FQDN, path, or the like shown in the communication log as the communication destination, and the maliciousness calculation unit 123 may calculate the degree of maliciousness using these URL, FQDN, and path communication patterns. The URL used here may be a URL without its query string.

After calculating the degree of maliciousness for each of the URL, FQDN, and path communication patterns, the maliciousness calculation unit 123 may output the sum of these values as the degree of maliciousness of the detection target communication pattern. The degree of maliciousness in this case is calculated, for example, by the following equation (3).

Score_m in equation (3) is the degree of maliciousness of each of the URL, FQDN, and Path communication patterns. When calculating Score (the degree of maliciousness) by equation (3), the maliciousness calculation unit 123 may weight each Score_m. This yields a degree of maliciousness that reflects an overall judgment across communication patterns extracted from various viewpoints, such as URL, FQDN, and Path.
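The summation described for equation (3), with the optional weighting, can be sketched as below. The dictionary keys, the default weight of 1, and the function name are assumptions for illustration; the source states only that the per-viewpoint values are added and may be weighted.

```python
def combined_score(scores, weights=None):
    """Sum per-viewpoint scores, optionally weighted.

    scores:  dict such as {"url": ..., "fqdn": ..., "path": ...}
    weights: optional dict with the same keys; missing keys default to 1.0.
    """
    weights = weights or {}
    return sum(weights.get(view, 1.0) * value for view, value in scores.items())

per_view = {"url": 0.25, "fqdn": 0.5, "path": 0.125}
plain = combined_score(per_view)
weighted = combined_score(per_view, weights={"fqdn": 2.0})
print(plain, weighted)
```

Raising the weight of one viewpoint, as with `fqdn` here, lets an operator emphasize the granularity that has proven most reliable in their network.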

(Other embodiments)
In the embodiment described above, the malicious communication pattern detection apparatus 10 may calculate the degree of maliciousness without using the malware communication pattern occurrence interval information (μ_m, σ_m) or the monitoring target NW communication pattern occurrence interval information (μ_l, σ_l). That is, the detection unit 12 may omit the occurrence interval similarity calculation unit 122, and the maliciousness calculation unit 123 may calculate the degree of maliciousness of the detection target communication pattern using the malware repetition pattern feature (M_freq), the NW log frequent pattern feature (L_freq), and the intra-pattern relevance strength feature (L_redir) corresponding to that pattern.

Furthermore, although the detection model generation unit 11 and the detection unit 12 have been described as being provided in the same apparatus, they may each be implemented by separate apparatuses.

(Program)
It is also possible to create a program in which the processing executed by the malicious communication pattern detection apparatus 10 according to the above embodiment is written in a computer-executable language. In this case, a computer executing the program obtains the same effects as the above embodiment. Furthermore, the same processing as in the above embodiment may be realized by recording the program on a computer-readable recording medium and having a computer read and execute the program recorded on that medium. An example of a computer that executes a malicious communication pattern detection program realizing the same functions as the malicious communication pattern detection apparatus 10 is described below.

FIG. 11 is a diagram illustrating a computer that executes the malicious communication pattern detection program. As shown in FIG. 11, the computer 1000 includes, for example, a memory 1010, a CPU (Central Processing Unit) 1020, a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These units are connected by a bus 1080.

The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM (Random Access Memory) 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. A mouse 1110 and a keyboard 1120, for example, are connected to the serial port interface 1050. A display 1130, for example, is connected to the video adapter 1060.

As shown in FIG. 11, the hard disk drive 1090 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. Each table described in the above embodiment is stored in, for example, the hard disk drive 1090 or the memory 1010.

The malicious communication pattern detection program is stored in the hard disk drive 1090 as, for example, a program module in which the instructions to be executed by the computer 1000 are written. Specifically, a program module describing each process executed by the malicious communication pattern detection apparatus 10 described in the above embodiment is stored in the hard disk drive 1090.

Data used for information processing by the malicious communication pattern detection program is stored as program data in, for example, the hard disk drive 1090. The CPU 1020 reads the program module 1093 and the program data 1094 stored in the hard disk drive 1090 into the RAM 1012 as needed and executes each of the procedures described above.

The program module 1093 and the program data 1094 of the malicious communication pattern detection program need not be stored in the hard disk drive 1090. For example, they may be stored on a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, they may be stored on another computer connected via a network such as a LAN (Local Area Network) or a WAN (Wide Area Network) and read by the CPU 1020 via the network interface 1070.

DESCRIPTION OF SYMBOLS
10 Malicious communication pattern detection apparatus
11 Detection model generation unit
12 Detection unit
111, 121 Communication pattern extraction unit
112 Model data extraction unit
122 Occurrence interval similarity calculation unit
123 Maliciousness calculation unit

Claims

A malicious communication pattern detection apparatus comprising:
a communication pattern extraction unit that extracts, from the communication logs of a terminal infected with malware and of a terminal of a monitored network, a communication pattern indicating a pair of a first communication destination of the terminal and a second communication destination that is a subsequent communication destination, among the series of communication destinations of the respective terminal;
a feature amount extraction unit that acquires a feature amount of a malware communication pattern, which is a communication pattern extracted from the communication log of the terminal infected with the malware, and a feature amount of a monitored network communication pattern, which is a communication pattern extracted from the communication log of the terminal of the monitored network, and registers the acquired feature amount of each communication pattern in a detection model for detecting malicious communication;
an input unit that receives input of a communication log of communication to be detected; and
a maliciousness calculation unit that, for a detection target communication pattern, which is a communication pattern extracted from the communication log of the communication to be detected, acquires the feature amount of the malware communication pattern and the feature amount of the monitored network communication pattern from the detection model, and calculates a higher maliciousness value for the detection target communication pattern as the acquired feature amount of the malware communication pattern is larger and as the acquired feature amount of the monitored network communication pattern is smaller.
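
The flow of this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not fix a concrete feature amount or scoring formula, so the sketch assumes that a "subsequent" destination means the immediately following one, that the feature amount is the fraction of terminals whose log contains the pattern, and that the score is the product `m * (1 - n)`. The function names `extract_patterns`, `build_model`, and `maliciousness` are hypothetical.

```python
from collections import Counter


def extract_patterns(destinations):
    """Return (first, second) pairs of consecutive destinations.

    Assumption: "subsequent" is read as "immediately following".
    """
    return list(zip(destinations, destinations[1:]))


def build_model(malware_logs, network_logs):
    """Register one feature amount per pattern for each log source.

    Assumed feature amount: fraction of terminals whose log contains the pattern.
    Each log is a list of destinations in the order one terminal visited them.
    """
    def fraction(logs):
        if not logs:
            return {}
        counts = Counter()
        for dests in logs:
            # set() so that a pattern counts once per terminal
            counts.update(set(extract_patterns(dests)))
        return {pattern: c / len(logs) for pattern, c in counts.items()}

    return {"malware": fraction(malware_logs), "network": fraction(network_logs)}


def maliciousness(pattern, model):
    """Higher when the malware feature is larger and the network feature is smaller."""
    m = model["malware"].get(pattern, 0.0)
    n = model["network"].get(pattern, 0.0)
    return m * (1.0 - n)  # assumed scoring formula
```

Under these assumptions, a pattern seen on every infected terminal and on a quarter of the monitored terminals scores 0.75, while a pattern never seen in malware logs scores 0.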

The malicious communication pattern detection apparatus according to claim 1, wherein
the communication pattern further includes information on the occurrence interval of communication between the first communication destination and the second communication destination,
the feature amount extraction unit further acquires, as the feature amount of the malware communication pattern in the detection model, a feature amount of the communication occurrence interval in the malware communication pattern, and, as the feature amount of the monitored network communication pattern, a feature amount of the communication occurrence interval in the monitored network communication pattern,
the malicious communication pattern detection apparatus further comprises an occurrence interval similarity calculation unit that, with reference to the feature amounts of the communication occurrence intervals in the detection model, calculates the similarity between the communication occurrence interval of the detection target communication pattern and that of the malware communication pattern, and the similarity between the communication occurrence interval of the detection target communication pattern and that of the monitored network communication pattern, and
the maliciousness calculation unit further calculates a higher maliciousness value as the calculated similarity to the communication occurrence interval of the malware communication pattern is higher and as the calculated similarity to the communication occurrence interval of the monitored network communication pattern is lower.
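
One way to picture the occurrence interval comparison is sketched below in Python. The claim does not define the interval feature amount or the similarity measure, so both are assumptions here: the feature is the mean interval in seconds, and the similarity is `1 / (1 + |difference|)`. The names `interval_feature`, `interval_similarity`, and `interval_score` are hypothetical.

```python
from statistics import mean


def interval_feature(intervals):
    """Summarize observed intervals (seconds) by their mean (assumed feature)."""
    return mean(intervals)


def interval_similarity(observed, reference):
    """Return 1.0 for identical intervals, approaching 0 as they diverge (assumed measure)."""
    return 1.0 / (1.0 + abs(observed - reference))


def interval_score(observed, malware_intervals, network_intervals):
    """Higher when the interval resembles malware traffic and differs from network traffic."""
    sim_malware = interval_similarity(observed, interval_feature(malware_intervals))
    sim_network = interval_similarity(observed, interval_feature(network_intervals))
    return sim_malware * (1.0 - sim_network)
```

With this choice, an observed interval that matches the malware mean exactly and lies far from the monitored network mean yields a score close to 1.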

The malicious communication pattern detection apparatus according to claim 1, wherein the feature amount extraction unit acquires, as the feature amount of the malware communication pattern in the detection model, a malware repetitive pattern feature amount, which is a value indicating, for each communication pattern included in the malware communication pattern, the degree to which that pattern appears repeatedly within a predetermined time in the communication log of the terminal infected with the malware.
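
As a rough illustration of a repetition feature, the following Python sketch counts the largest number of occurrences of one pattern within any time window of a given length. This is only one plausible reading of "degree of repeated appearance within a predetermined time"; the function name `repetition_feature` and the sliding-window definition are assumptions.

```python
def repetition_feature(timestamps, window):
    """Largest number of occurrences that fall within any span of `window` seconds.

    `timestamps` are the times (seconds) at which one communication pattern
    appeared in the log of an infected terminal.
    """
    ts = sorted(timestamps)
    best = 0
    start = 0
    for end, t in enumerate(ts):
        # Shrink the window from the left until it spans at most `window` seconds
        while t - ts[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best
```

A pattern that a piece of malware polls every few seconds would score high under this definition, whereas a pattern seen once would score 1.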

The malicious communication pattern detection apparatus according to claim 1, wherein the feature amount extraction unit acquires, as the feature amount of the monitored network communication pattern in the detection model, a network log frequent pattern feature amount, which is a value indicating, for each communication pattern included in the monitored network communication pattern, the degree to which that pattern appears in the communication log of the terminal of the monitored network.

The malicious communication pattern detection apparatus according to any one of claims 1 to 4, wherein the feature amount extraction unit acquires, as the feature amount of the monitored network communication pattern in the detection model, an intra-pattern relevance strength feature amount, which is a value indicating, for each communication pattern included in the monitored network communication pattern, the rate at which, in the communication log of the terminal of the monitored network, communication to the first communication destination indicated by the communication pattern is followed by communication to the second communication destination indicated by the communication pattern.
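
The two monitored network features named in the claims (how often a pattern appears, and how often the first destination is followed by the second) can be sketched as follows. The normalizations are assumptions, since the claims only speak of a "degree" and a "rate": frequency is taken as the pattern's share of all consecutive pairs, and relevance as the share of visits to the first destination that are immediately followed by the second. The function name `network_features` is hypothetical.

```python
from collections import Counter


def network_features(logs):
    """Compute assumed frequency and relevance-strength features per pattern.

    Each log is a list of destinations in the order one terminal visited them.
    """
    pair_counts = Counter()
    first_counts = Counter()
    for dests in logs:
        for first, second in zip(dests, dests[1:]):
            pair_counts[(first, second)] += 1
            first_counts[first] += 1  # only counts visits that have a successor
    total = sum(pair_counts.values())
    return {
        pair: {
            "frequency": count / total,
            "relevance": count / first_counts[pair[0]],
        }
        for pair, count in pair_counts.items()
    }
```

A pattern with high relevance in ordinary traffic, such as a portal page that nearly always loads the same content host next, would lower the resulting maliciousness even if malware happens to produce the same pair.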

The malicious communication pattern detection apparatus according to any one of claims 1 to 5, wherein the communication pattern extraction unit, when extracting the monitored network communication pattern, extracts the communication patterns that appear in the previously acquired malware communication pattern.

The malicious communication pattern detection apparatus described in 1, wherein the maliciousness calculation unit calculates the maliciousness for at least one of a communication pattern in which a URL (Uniform Resource Locator) is extracted from the communication log as the communication destination, a communication pattern in which an FQDN (Fully Qualified Domain Name) is extracted as the communication destination, and a communication pattern in which a URL path is extracted as the communication destination, and calculates a value obtained by integrating the calculated maliciousness values of those communication patterns.
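
The three destination granularities can be derived from a logged URL with Python's standard `urllib.parse.urlsplit`, as in the sketch below. How the per-granularity scores are integrated is not specified in the claim; the arithmetic mean used here is an assumption, as are the function names `destination_views` and `integrate`.

```python
from urllib.parse import urlsplit


def destination_views(url):
    """Return the URL, FQDN, and URL path views of one communication destination."""
    parts = urlsplit(url)
    return {"url": url, "fqdn": parts.hostname, "path": parts.path}


def integrate(scores):
    """Combine per-granularity maliciousness values (assumed: arithmetic mean)."""
    return sum(scores.values()) / len(scores)
```

For example, `destination_views("http://www.example.com/cmd/get.php?id=1")` yields the FQDN `www.example.com` and the path `/cmd/get.php`, so the same log entry can be matched against patterns built at each granularity.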

A malicious communication pattern detection method comprising the steps of:
extracting, from the communication logs of a terminal infected with malware and of a terminal of a monitored network, a communication pattern indicating a pair of a first communication destination of the terminal and a second communication destination that is a subsequent communication destination, among the series of communication destinations of the respective terminal;
acquiring a feature amount of a malware communication pattern, which is a communication pattern extracted from the communication log of the terminal infected with the malware, and a feature amount of a monitored network communication pattern, which is a communication pattern extracted from the communication log of the terminal of the monitored network, and registering the acquired feature amount of each communication pattern in a detection model for detecting malicious communication;
receiving input of a communication log of communication to be detected; and
for a detection target communication pattern, which is a communication pattern extracted from the communication log of the communication to be detected, acquiring the feature amount of the malware communication pattern and the feature amount of the monitored network communication pattern from the detection model, and calculating a higher maliciousness value for the detection target communication pattern as the acquired feature amount of the malware communication pattern is larger and as the acquired feature amount of the monitored network communication pattern is smaller.

A malicious communication pattern detection program causing a computer to execute the steps of:
extracting, from the communication logs of a terminal infected with malware and of a terminal of a monitored network, a communication pattern indicating a pair of a first communication destination of the terminal and a second communication destination that is a subsequent communication destination, among the series of communication destinations of the respective terminal;
acquiring a feature amount of a malware communication pattern, which is a communication pattern extracted from the communication log of the terminal infected with the malware, and a feature amount of a monitored network communication pattern, which is a communication pattern extracted from the communication log of the terminal of the monitored network, and registering the acquired feature amount of each communication pattern in a detection model for detecting malicious communication;
receiving input of a communication log of communication to be detected; and
for a detection target communication pattern, which is a communication pattern extracted from the communication log of the communication to be detected, acquiring the feature amount of the malware communication pattern and the feature amount of the monitored network communication pattern from the detection model, and calculating a higher maliciousness value for the detection target communication pattern as the acquired feature amount of the malware communication pattern is larger and as the acquired feature amount of the monitored network communication pattern is smaller.