JP4948359B2

JP4948359B2 - Unauthorized access detection device, unauthorized access detection method and program

Info

Publication number: JP4948359B2
Application number: JP2007278507A
Authority: JP
Inventors: 裕之榊原; 竜太坂口; 清人河内
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2007-10-26
Filing date: 2007-10-26
Publication date: 2012-06-06
Anticipated expiration: 2027-10-26
Also published as: JP2009111448A

Description

The present invention relates to a technique for analyzing network traffic data in order to detect unauthorized access.

Unauthorized access is sometimes detected by applying principal component analysis (hereinafter also referred to as PCA) to time-series data of network traffic.
One conventional example of PCA-based time-series analysis of unauthorized access is the technique described in Non-Patent Document 1.
Non-Patent Document 1 describes a method that cuts out network traffic data in windows of a fixed length, shifting the window by one unit time at a time, arranges the windows into a matrix, performs PCA, and detects anomalies using the resulting feature quantities. In this method, network traffic data whose feature quantity deviates from the feature quantities corresponding to the steady state is judged to be abnormal.
As another example of time-series analysis of unauthorized access, fluctuations in the number of packets addressed to a specific port in TCP/IP (Transmission Control Protocol/Internet Protocol) or UDP/IP (User Datagram Protocol/Internet Protocol) may be monitored (for example, Patent Document 1).
Patent Document 1: JP 2006-314077 A
Non-Patent Document 1: "Unauthorized access analysis system by fixed point observation", IPSJ 2006-CSEC-35

As shown in Non-Patent Document 1, one way to detect unauthorized access with PCA, and worm activity in particular, is to detect fluctuations in the number of accesses to a specific destination port.
Conventional worms often tried to spread through one specific port, but recent worms connect to a variety of destination ports to carry out backdoor activity or to attempt to spread.
It is therefore important to monitor all ports at all times.
However, when conventional PCA is used to detect fluctuations in the time-series data of access counts at destination ports, a single PCA run takes a long time, so it has been difficult to run PCA individually for every port number.
Port numbers range from 0 to 65,535. If, in one implementation, PCA for a single port takes 0.4 seconds on a Pentium 4 (registered trademark) 2.4 GHz processor, covering all ports takes more than 7 hours of computation. It has therefore been common to apply PCA only to the time-series data of a subset of ports that are frequently used for unauthorized access.
However, given that recent worms access a variety of ports as described above, detecting worm activity requires detecting fluctuations on all ports simultaneously, and in a short time, with PCA.
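The "more than 7 hours" figure follows directly from the two numbers quoted above. A minimal check, using only the port range and the per-port PCA time stated in the text (the variable names are illustrative):

```python
# Figures quoted in the text: ports 0 to 65,535 and about 0.4 s per PCA run.
NUM_PORTS = 65_536
SECONDS_PER_PCA = 0.4

total_seconds = NUM_PORTS * SECONDS_PER_PCA
total_hours = total_seconds / 3600

print(f"{total_seconds:.1f} s is about {total_hours:.2f} h")
```

Running PCA once per port therefore costs a little over 7 hours per pass, which is far longer than the 5-minute aggregation period used later in the embodiment.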

One of the main objects of the present invention is to solve the problems described above; its main object is to analyze network traffic data for unauthorized access detection efficiently and accurately in a short time.

The unauthorized access detection device according to the present invention comprises:
a steady-state appearance count counting unit that receives steady-state communication log data containing a plurality of communication identifiers and counts, for each communication identifier, the number of appearances in the steady-state communication log data;
a communication identifier designation unit that, based on the counting result of the steady-state appearance count counting unit, designates communication identifiers with a large number of appearances as individual correspondence communication identifiers and communication identifiers with a small number of appearances as combined correspondence communication identifiers;
a detection target appearance count counting unit that receives communication log data subject to unauthorized access detection containing a plurality of communication identifiers and counts, for each communication identifier, the number of appearances in that communication log data;
an appearance count summing unit that sums, among the appearance counts counted by the detection target appearance count counting unit, the appearance counts of the communication identifiers designated as combined correspondence communication identifiers; and
an unauthorized access detection analysis unit that performs unauthorized access detection analysis for the communication identifiers designated as individual correspondence communication identifiers using the individual appearance counts counted by the detection target appearance count counting unit, and performs unauthorized access detection analysis for the communication identifiers designated as combined correspondence communication identifiers using the summed appearance count calculated by the appearance count summing unit.

In the present invention, unauthorized access detection analysis for the combined correspondence communication identifiers, which appear only rarely in the steady state, is performed on the sum of their appearance counts. This reduces the number of analyses that must be run, so unauthorized access detection analysis can be performed efficiently and accurately in a short time.

Embodiment 1.
To solve the problem described above, the present embodiment focuses on reducing the number of PCA runs needed to capture fluctuations on all ports.
To that end, PCA is performed individually, as before, for ports with a large number of appearances (also called the appearance count or the number of accesses) in the communication log data, while for ports with a small number of appearances the counts are first added together (merged) and PCA is then performed on the sum. This merging reduces the number of PCA runs.

This approach rests on the following expectation. When a worm breaks out, infections increase over time and so does the number of accesses to the affected port. Consequently, even if the access counts of ports that normally see almost no traffic are merged, the increase in accesses caused by a worm on just one of those ports will still appear in the merged result.
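The expectation can be illustrated with a toy example. The port numbers and counts below are invented for illustration and are not taken from the patent; the sketch only shows that a rise on one quiet port survives summation with other quiet ports.

```python
# Hypothetical per-interval access counts for three normally quiet ports.
baseline = {
    1025: [0, 1, 0, 0, 1, 0],
    2000: [0, 0, 1, 0, 0, 0],
    31337: [1, 0, 0, 0, 0, 1],
}

# Same traffic, except that a worm starts probing port 2000 in the last
# three intervals (again, invented numbers).
infected = dict(baseline)
infected[2000] = [0, 0, 1, 5, 20, 80]


def merge(per_port):
    """Sum the counts of all ports, interval by interval."""
    return [sum(column) for column in zip(*per_port.values())]


merged_baseline = merge(baseline)
merged_infected = merge(infected)

print(merged_baseline)
print(merged_infected)
```

Because the other merged ports contribute only 0 or 1 per interval, the growth on the single affected port dominates the merged series.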

FIG. 1 is a diagram showing a configuration example of the unauthorized access detection device 100 according to the present embodiment.
The unauthorized access detection device 100 shown in FIG. 1 forms, for example, part of an IDS (Intrusion Detection System).

In FIG. 1, a data acquisition unit 101 takes in communication log data 150 of a network device (hereinafter referred to as the network device log 150 or the log 150).
As described later, the unauthorized access detection device 100 operates in two stages, a preparation stage and a detection stage, and the data acquisition unit 101 acquires the log 150 in each of them.

The port totaling unit 102 receives the network device log 150 acquired by the data acquisition unit 101, counts how often each destination port number (hereinafter also simply called a destination port or port) appears in the log within the aggregation period, and outputs the count for each destination port number as the port-specific count 151.
In other words, the network device log 150 contains a plurality of destination port numbers (communication identifiers), and the port totaling unit 102 counts the number of appearances of each destination port number in the log over a fixed period and outputs the result as the port-specific count 151.
In the preparation stage, the port totaling unit 102 receives the steady-state log 150 and counts, for each destination port number, the number of appearances in the steady-state log 150. The steady-state log 150 is communication log data covering a fixed period that is considered to represent the normal (average) communication state of the target network.
In the detection stage, the port totaling unit 102 receives the log 150 subject to unauthorized access detection, which contains a plurality of destination port numbers, and counts, for each destination port number, the number of appearances in that log.
The port totaling unit 102 is an example of the steady-state appearance count counting unit and the detection target appearance count counting unit.

The port area dividing unit 103 divides the port range on the basis of the port-specific counts 151 and merges port-specific counts 151. It outputs the port-specific counts 151' that were not merged and the port-specific time-series data 152, which is their time-series data.
It also outputs the merge port count 153, which is the merged port-specific count 151, and the merge port time-series data 154, which is its time-series data.

Based on the counting result of the port totaling unit 102 for the steady-state log 150, the port area dividing unit 103 designates destination port numbers with a large number of appearances as individual correspondence ports (individual correspondence communication identifiers) and destination port numbers with a small number of appearances as merge-compatible ports (combined correspondence communication identifiers).
An individual correspondence port is a port number for which principal component analysis is performed on the appearance count of that single port number. That is, for an individual correspondence port, the analysis targets the appearance count of that port alone. A merge-compatible port, by contrast, is a port number for which principal component analysis is performed on the sum of the appearance counts of several port numbers. That is, for merge-compatible ports, the analysis targets the sum of the appearance counts of all port numbers designated as merge-compatible ports.
The port area dividing unit 103 also sums, among the appearance counts counted by the port totaling unit 102 for the log 150 subject to unauthorized access detection, the appearance counts of the destination ports designated as merge-compatible ports, and outputs the sum as the merge port count 153.
The port-specific time-series data 152 is time-series data of the individual appearance counts of the individual correspondence ports in the steady-state log 150, and the merge port time-series data 154 is time-series data of the summed appearance count of the merge-compatible ports in the steady-state log 150.
The port-specific count 151' is data indicating the individual appearance counts of the individual correspondence ports in the log 150 subject to unauthorized access detection, and the merge port count 153 is data indicating the summed appearance count of the merge-compatible ports in that log.
The port area dividing unit 103 is an example of the communication identifier designation unit and the appearance count summing unit.

The analysis unit 104 performs PCA individually on the port-specific counts 151' and calculates a feature quantity (principal component score) 155 for each individual correspondence port.
It also performs PCA on the merge port count 153 and calculates a feature quantity (principal component score) 156 for the merge-compatible ports.
In other words, the analysis unit 104 performs unauthorized access detection analysis for the destination port numbers designated as individual correspondence ports using the individual appearance counts counted by the port totaling unit 102 (the port-specific counts 151'), and performs unauthorized access detection analysis for the destination port numbers designated as merge-compatible ports using the summed appearance count calculated by the port area dividing unit 103.
The analysis unit 104 is an example of the unauthorized access detection analysis unit.

The abnormality detection unit 105 determines, from the result of the principal component analysis by the analysis unit 104, whether the feature quantity of the current count deviates from the steady region (abnormal) or not (steady).

The stationary area data definition unit 106 works with the port area dividing unit 103 to hold the aggregated steady-state data (the port-specific time-series data 152 and the merge port time-series data 154).

Next, the operation of the unauthorized access detection device 100 according to the present embodiment will be described.

First, the preparation operation of the unauthorized access detection device 100 in the preparation stage, before detection starts, will be described with reference to the flowchart of FIG. 11.
The data acquisition unit 101 receives network traffic data for the period to be treated as the steady region, that is, steady-state network traffic data (for example, one week of the network device log 150) (S1101), and passes the data to the port totaling unit 102.
As the network traffic data, a log file such as the one shown in FIG. 2 is read in, for example.
The log file of FIG. 2 lists, for each packet capture date and time (Date), the source IP address (SrcIP), the destination IP address (DstIP), the destination port number (DstPort), and the alert number (AlertID).

Next, the port totaling unit 102 counts the number of appearances of each destination port number in the network traffic data (the steady-state log 150) received from the data acquisition unit 101, and generates time-series data of the appearance count for each destination port number (S1102) (steady-state appearance count counting step). In this example, time-series data of the appearance count is generated for each destination port number from 0 to 65535.
For a given aggregation period, for example 5 minutes, the port totaling unit 102 counts in order the "number of appearances of destination port 0", the "number of appearances of destination port 1", the "number of appearances of destination port 2", and so on up to the "number of appearances of destination port 65535" in the log. Doing this for each aggregation period in turn produces time-series data of the appearance count for each port.

In the log example of FIG. 2, for instance, during the 5 minutes starting at 06/04/18:12:00, port number 445 appears three times, port number 135 twice, port number 123 once, and port number 1434 once; all other ports appear zero times.
Repeating this every 5 minutes produces, for each port, time-series data aggregated in 5-minute units.
If the log data covers one week, that is 60 minutes × 24 hours × 7 days = 10080 minutes, and dividing by 5 minutes gives time-series data consisting of 2016 aggregated values.
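The per-period counting can be sketched as follows. FIG. 2 itself is not reproduced in the text, so the log rows below are hypothetical and were invented only to match the counts quoted for the first 5-minute period. Reading "06/04/18:12:00" as 12:00 on 18 April 2006 is likewise an assumption, and the function and variable names are illustrative.

```python
from collections import Counter
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=5)
START = datetime(2006, 4, 18, 12, 0)  # assumed reading of "06/04/18:12:00"

# Hypothetical (capture time, destination port) rows. A real log row would
# also carry SrcIP, DstIP and AlertID, which the counting does not need.
rows = [
    (START + timedelta(seconds=5), 445),
    (START + timedelta(seconds=40), 135),
    (START + timedelta(seconds=75), 445),
    (START + timedelta(seconds=120), 123),
    (START + timedelta(seconds=180), 135),
    (START + timedelta(seconds=240), 1434),
    (START + timedelta(seconds=290), 445),
    (START + timedelta(seconds=310), 445),  # falls into the next period
]


def count_by_period(rows, start=START, interval=INTERVAL):
    """Return {j: Counter({port: count})}, with j the 1-based period index."""
    counts = {}
    for captured_at, dst_port in rows:
        j = (captured_at - start) // interval + 1
        counts.setdefault(j, Counter())[dst_port] += 1
    return counts


counts = count_by_period(rows)
periods_per_week = timedelta(weeks=1) // INTERVAL

print(dict(counts[1]))
print(periods_per_week)
```

A `Counter` returns 0 for ports that never appear, which matches the statement that all other ports have an appearance count of zero, and `periods_per_week` evaluates to the 2016 aggregated values mentioned above.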

FIG. 3 shows this processing as a flowchart. Here, j is the index of the aggregation period. For example, if the aggregation period is 5 minutes starting at 06/04/18:12:00 and one week of log data is supplied, the indices are assigned as follows.
5 minutes from 06/04/18:12:00: j = 1
5 minutes from 06/04/18:12:05: j = 2
5 minutes from 06/04/18:12:10: j = 3
...
5 minutes from 06/04/25:12:10: j = 2016
This corresponds to setting k = 1 and n = 2016 + 1 in FIG. 3.

In the flow, for j = 1 the port totaling unit 102 first sets the index i, which corresponds to the port number, to 0 and counts the number of times destination port 0 appears in the log during the aggregation period j = 1 (S303).
Next, it stores the count value in Count_0_1 (S304).
Next, it increments i by 1 (S305). This prepares for counting the appearances of the next port number.
In the same way, it counts the number of times destination port 1 appears (S303) and stores the value in Count_1_1 (S304).
Repeating this up to i = 65535 (S302) yields Count_0_1 to Count_65535_1.
Count_0_1 to Count_65535_1 are the appearance frequencies of each port in the log for aggregation period index j = 1, that is, for the 5 minutes starting at 06/04/18:12:00.

The port totaling unit 102 repeats the above for j = 1 to j = 2016 (S301).
As a result, time-series data for j = 1 to 2016 is generated for each port.
FIG. 4 shows an example in which time-series data is generated for each port number from 0 to 65535 (graphs are shown only for DstPort = 0, 10000, and 65535).
The port totaling unit 102 passes the time-series data of each port to the port area dividing unit 103 as the port-specific time-series data 152. That is, it passes Count_i_j (i = 0 to 65535, j = 1 to 2016).

Based on the port-specific time-series data 152 it receives, the port area dividing unit 103 divides the ports into those whose average number of accesses in the steady state is high and those whose average is low, and designates the individual correspondence ports and the merge-compatible ports (S1103) (communication identifier designation step). This is called the port area division processing.

The port area division processing is described in detail here.
First, the port area dividing unit 103 calculates the mean μ_i and the variance σ^2_i of the received time-series data for each port.
For port number 0, for example, the calculation is as follows.
μ_0 = ( Σ_{j=1}^{2016} Count_0_j ) / 2016
σ^2_0 = ( Σ_{j=1}^{2016} (Count_0_j - μ_0)^2 ) / 2016
In the same way, the port area dividing unit 103 obtains (μ_0, σ^2_0), ..., (μ_{65535}, σ^2_{65535}).
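As a small illustration of the per-port statistics, the sketch below computes the mean and the variance of one port's time series by dividing by the number of samples, as in the formulas above. The function name and the sample series are hypothetical.

```python
def mean_and_variance(series):
    """Return (mu, sigma^2) of a count series, dividing by len(series)."""
    n = len(series)
    mu = sum(series) / n
    variance = sum((x - mu) ** 2 for x in series) / n
    return mu, variance


# Hypothetical short series standing in for Count_0_j (j = 1 to 2016).
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mu, variance = mean_and_variance(sample)
print(mu, variance)
```

Applying the same function to each of the 65,536 series gives the pairs (μ_i, σ^2_i) used in the next step.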

Next, the port area dividing unit 103 sorts the port numbers by the mean μ, in descending order of average number of accesses.
The result is the histogram of FIG. 5 (the port numbers in FIG. 4 and FIG. 5 do not correspond). In this example, destination port 445 in group (1) averages 1,000 accesses, and destination ports 135 and 139 average 700 or more, whereas the remaining destination ports in group (2) average only about 0 to 1.
In the sorted histogram, ports in a group with many accesses, such as (1), are designated individual correspondence ports, to which PCA is applied individually. Conversely, ports in a group whose normal number of accesses is small or almost zero, such as (2), are designated merge-compatible ports, whose access counts within the group are all merged.
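The text does not state how the boundary between groups (1) and (2) is chosen. The sketch below assumes a simple cut-off on the mean; the `threshold` parameter, the function name, and the exact means for ports 135, 139, 80, and 1434 are assumptions made for illustration.

```python
def split_ports(mean_by_port, threshold):
    """Sort ports by mean access count (descending) and split at threshold.

    Returns (individual, merged): ports analysed one by one, and ports
    whose counts are summed before analysis.
    """
    ranked = sorted(mean_by_port, key=mean_by_port.get, reverse=True)
    individual = [p for p in ranked if mean_by_port[p] >= threshold]
    merged = [p for p in ranked if mean_by_port[p] < threshold]
    return individual, merged


# Hypothetical means, loosely following the magnitudes described for FIG. 5.
mean_by_port = {0: 0.0, 80: 0.8, 135: 750.0, 139: 720.0, 445: 1000.0, 1434: 0.2}

individual_ports, merge_ports = split_ports(mean_by_port, threshold=10.0)
print(individual_ports)
print(merge_ports)
```

With these inputs the three busy ports end up in the individual group and everything else in the merge group, mirroring the example in the text. Other splitting rules (for instance a fixed number of top ports) would fit the description equally well.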

Next, the port area dividing unit 103 merges the appearance counts of the merge-compatible ports and generates the merge port time-series data 154, which is the time-series data of the merged appearance count (S1104).

Merging is defined as follows.
Let the set of port numbers to be merged, as in group (2), be MergePort = {0 to 65535, excluding 445, 135, and 139}. To keep the example simple, these ports are assumed to receive almost no accesses (about 0 to 1).
The merged count is calculated as
MergeCount_j = Σ_{i ∈ MergePort} Count_i_j
where i ranges over {0 to 65535, excluding 445, 135, and 139}.
In this example, the merge is performed for each j from j = 1 to 2016. FIG. 6 shows this as a flow, for the case k = 1 and n = 2017.
The result of this processing is therefore MergeCount_j [j = 1 to 2016].
Arranging MergeCount_j in chronological order from j = 1 yields the time-series data of the merged data, as shown in FIG. 7.
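The merge itself is a per-period sum over the merge-compatible ports. A minimal sketch, using a handful of hypothetical ports and three periods instead of 65,536 ports and 2016 periods (the data layout and names are assumptions):

```python
def merge_counts(count, merge_ports):
    """Return [MergeCount_1, ..., MergeCount_n].

    count[i][j - 1] plays the role of Count_i_j in the text.
    """
    n = len(next(iter(count.values())))
    return [sum(count[i][j] for i in merge_ports) for j in range(n)]


# Hypothetical counts for six ports over three aggregation periods.
count = {
    445: [1000, 990, 1010],
    135: [700, 720, 710],
    139: [705, 698, 702],
    0: [0, 0, 0],
    80: [1, 0, 1],
    1434: [0, 1, 0],
}
individual_ports = {445, 135, 139}
merge_ports = [port for port in count if port not in individual_ports]

merge_count = merge_counts(count, merge_ports)
print(merge_count)
```

The three busy ports keep their own series, and the remaining ports collapse into a single series, so four PCA runs replace six in this toy case.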

The port area dividing unit 103 passes the port-specific time-series data 152, which is the time-series data of single port numbers, and the merge port time-series data 154, which is the merged time-series data, to the stationary area data definition unit 106.
In this example, the port-specific time-series data 152 consists of the time series Count_445_j, Count_135_j, and Count_139_j [j = 1 to 2016].
In this example, the merge port time-series data 154 is MergeCount_j [j = 1 to 2016].

The stationary area data definition unit 106 stores the port-specific time-series data 152 and the merge port time-series data 154 received from the port area dividing unit 103 (S1105).
This completes the preparation operation performed before detection starts.

Next, the detection operation of the unauthorized access detection device 100 in the detection stage will be described with reference to the flowchart of FIG. 12.

The data acquisition unit 101 receives 5 minutes of the network device log 150 (S1201) and passes it to the port totaling unit 102.
These 5 minutes of the network device log 150 received by the data acquisition unit 101 are the log subject to unauthorized access detection.

The port totaling unit 102 counts the number of appearances in the log for each destination port (S1202) (detection target appearance count counting step) and passes the result to the port area dividing unit 103 as the port-specific count 151.
This processing is equivalent to setting k = 2017 and n = 2018 in the processing of FIG. 3, and the resulting Count_i_2017 [i = 0 to 65535] is the port-specific count 151.

Next, the port area dividing unit 103 performs the following two processes: (1) processing for the individual correspondence ports and (2) processing for the merge-compatible ports.

As (1) the processing for the individual correspondence ports, the port area dividing unit 103 uses, for port numbers 445, 135, and 139 (the individual correspondence ports that were selected in the preparation operation to receive PCA on their own), the counts of those port numbers passed from the port totaling unit 102.
That is, it outputs Count_445_2017, Count_135_2017, and Count_139_2017 to the analysis unit 104 as the port-specific counts 151'.

As (2) the processing for the merge-compatible ports, the port area dividing unit 103 adds together the counts passed to it for the port numbers that were designated as merge-compatible ports in the preparation operation, in the same way as in the preparation operation (S1203) (appearance count summing step).
That is, it calculates
MergeCount_2017 = Σ_{i ∈ MergePort} Count_i_2017
where i ranges over {0 to 65535, excluding 445, 135, and 139}.
Next, the port area dividing unit 103 outputs MergeCount_2017 to the analysis unit 104 as the merge port count 153.

The analysis unit 104 performs the following two processes, (1) processing for the individual correspondence ports and (2) processing for the merge-compatible ports, carrying out principal component analysis (unauthorized access detection analysis) using the port-specific counts 151', the merge port count 153, the port-specific time-series data 152, and the merge port time-series data 154 (S1204) (unauthorized access detection analysis step).

First, as (1) the processing for the individual correspondence ports, the analysis unit 104 retrieves the port-specific time-series data 152 for port number 445 from the stationary area data definition unit 106. That is, it retrieves Count_445_j [j = 1 to 2016].
Next, the analysis unit 104 appends Count_445_2017 to the end of this time-series data. That is, it generates Count_445_j [j = 1 to 2017].
The analysis unit 104 then applies sliding-window PCA to this time-series data and calculates the feature quantity 155.
The feature quantity 155 is sent to the abnormality detection unit 105. The same processing is performed for port numbers 135 and 139.

また、（２）マージ対応ポートに対する処理として、分析部１０４は、定常域データ定義部１０６からマージデータに対応するマージポート時系列データ１５４を取り出す。つまり、ＭｅｒｇｅＣｏｕｎｔ＿ｊ［ｊ＝１〜２０１６］を取り出す。
次に、分析部１０４は、この時系列データの末尾にＭｅｒｇｅＣｏｕｎｔ＿２０１７を追加する。つまり、ＭｅｒｇｅＣｏｕｎｔ＿ｊ［ｊ＝１〜２０１７］を生成する。
そして、分析部１０４は、時系列データに対してスライディングウィンドウ方式のＰＣＡをかけ、特徴量１５６を算出する。
特徴量１５６は異常検知部１０５へ送られる。 (2) As processing for the merge-compatible port, the analysis unit 104 extracts merge port time-series data 154 corresponding to the merge data from the steady-state area data definition unit 106. That is, MergeCount_j [j = 1 to 2016] is taken out.
Next, the analysis unit 104 adds MergeCount_2017 to the end of the time series data. That is, MergeCount_j [j = 1 to 2017] is generated.
Then, the analysis unit 104 performs the sliding window method PCA on the time-series data to calculate the feature quantity 156.
The feature quantity 156 is sent to the abnormality detection unit 105.
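The text does not spell out the sliding-window PCA itself, so the following is only a sketch under stated assumptions: the window width, the use of the first principal component score as the "feature quantity", and the power-iteration solver are all choices made for this example. It uses only the standard library.

```python
import math

def sliding_windows(series, width):
    """Cut the series into overlapping windows, shifting by one time unit."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]

def first_pc_scores(rows, iterations=100):
    """Project each window onto the first principal component.

    Assumption: the score on the first component stands in for the feature
    quantity. The sign of a principal component is arbitrary in general.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    # Power iteration for the dominant eigenvector of the covariance matrix.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # constant series: no variance to analyse
            break
        v = [x / norm for x in w]
    return [sum(c[k] * v[k] for k in range(d)) for c in centered]

# Steady counts followed by one spike in the newest aggregation period.
series = [10] * 20 + [100]
scores = first_pc_scores(sliding_windows(series, 4))
```

In this toy series the score of the last window stands far apart from the others, which is the kind of deviation the abnormality detection unit is described as looking for.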

異常検知部１０５では、以下の処理（（１）個別対応ポートに対する処理及び（２）マージ対応ポートに対する処理）を行って、異常検知を行う（Ｓ１２０５）。 The abnormality detection unit 105 performs the following processing ((1) processing for individual correspondence port and (2) processing for merge correspondence port) to detect abnormality (S1205).

異常検知部１０５は、（１）個別対応ポートに対する処理として、ポート番号４４５に対応する特徴量において、Ｃｏｕｎｔ＿４４５＿２０１７に対応する特徴量が、Ｃｏｕｎｔ＿４４５＿ｊ［ｊ＝１〜２０１６］に対応する特徴量に対して乖離しているか否かを判定する。乖離していれば異常と判定する。
異常と判定されなかった場合は、定常域データ定義部はＣｏｕｎｔ＿４４５＿ｊ［ｊ＝１〜２０１７］を定常時の時系列データとして扱い、次の集計データの判定に備える。
同様に、ポート番号１３５、１３９についても同じ処理を行う。 As (1) processing for individually corresponding ports, the abnormality detection unit 105 determines, among the feature amounts for port number 445, whether the feature amount corresponding to Count_445_2017 deviates from the feature amounts corresponding to Count_445_j [j = 1 to 2016]. If it deviates, the data is judged abnormal.
If it is not determined to be abnormal, the steady-state data definition unit treats Count_445_j [j = 1 to 2017] as time-series data at the normal time and prepares for the determination of the next aggregated data.
Similarly, the same processing is performed for the port numbers 135 and 139.

また、異常検知部１０５は、（２）マージ対応ポートに対する処理として、マージされたポートの時系列データに対応する特徴量において、ＭｅｒｇｅＣｏｕｎｔ＿２０１７に対応する特徴量が、ＭｅｒｇｅＣｏｕｎｔ＿ｊ［ｊ＝１〜２０１６］に対応する特徴量に対して乖離しているか否かを判定する。乖離していれば異常と判定する。
異常と判定されなかった場合は、定常域データ定義部はＭｅｒｇｅＣｏｕｎｔ＿ｊ［ｊ＝１〜２０１７］を定常時の時系列データとして扱い、次の集計データの判定に備える。 In addition, as (2) processing for merge-compatible ports, the abnormality detection unit 105 determines, among the feature amounts for the time-series data of the merged ports, whether the feature amount corresponding to MergeCount_2017 deviates from the feature amounts corresponding to MergeCount_j [j = 1 to 2016]. If it deviates, the data is judged abnormal.
If it is not determined to be abnormal, the steady-state data definition unit treats MergeCount_j [j = 1 to 2017] as time-series data at the normal time and prepares for the determination of the next aggregated data.
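The decision and the update of the steady-state data can be sketched as below. The text does not define how "deviation" is measured, so the rule used here (distance from the baseline mean exceeding k standard deviations) is purely an assumption, as are the function names and the default k = 3.

```python
import statistics

def deviates(baseline_features, new_feature, k=3.0):
    """Assumed rule: abnormal if the new feature is more than k population
    standard deviations away from the mean of the steady-state features."""
    mean = statistics.mean(baseline_features)
    spread = statistics.pstdev(baseline_features)
    return abs(new_feature - mean) > k * max(spread, 1e-9)

def judge_and_update(steady_series, new_count, baseline_features, new_feature, k=3.0):
    """Return (is_abnormal, steady_series for the next judgment).

    The new count joins the steady-state series only when it is judged normal.
    """
    if deviates(baseline_features, new_feature, k):
        return True, steady_series
    return False, steady_series + [new_count]

baseline = [1.0, 1.1, 0.9, 1.0]   # hypothetical features for j = 1 to 2016
abnormal, series_after = judge_and_update([3, 4, 3], 50, baseline, 5.0)
```

The same routine would be applied once per individually corresponding port and once per merged series.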

マージ対応ポートの時系列データについて異常が検出された場合は、実際に出現回数が増加したポートを調べる必要がある。
そのためには、図８の様に、検知したタイミングでのマージ対応ポートの出現回数のヒストグラムを生成し、マージ対応ポートの中から、出現回数が増加したポートを見つければよい。
例えば、ｊ＝２０１７で異常が判定された場合、ｊ＝２０１７におけるマージ対応ポートのアクセス数のヒストグラムを作成する。
図８では、ポート５５５５番が増加しているため、このポート番号において異常が発生したと判断する。
アクセス数の増加したポートの見つけ方としては、マージ対応ポートにおけるＣｏｕｎｔ＿ｉ＿２０１７［ｉ＝０〜６５５３５｜４４５，１３５，１３９を除く］を大きい順にソートし、一番大きいＣｏｕｎｔ＿ｉ＿２０１７のｉが原因となるポートと判断する。この例ではｉ＝５５５５である。 When an abnormality is detected in the time-series data of the merge-compatible port, it is necessary to examine the port where the number of appearances has actually increased.
For this purpose, as shown in FIG. 8, a histogram of the number of appearances of merge-compatible ports at the detected timing is generated, and a port with an increased number of appearances is found from the merge-compatible ports.
For example, when an abnormality is determined at j = 2017, a histogram of the number of accesses of merge-compatible ports at j = 2017 is created.
In FIG. 8, since the port 5555 has increased, it is determined that an abnormality has occurred at this port number.
As a way of finding the port whose access count has increased, Count_i_2017 [i = 0 to 65535, excluding 445, 135, 139] for the merge-compatible ports is sorted in descending order, and the i of the largest Count_i_2017 is judged to be the port responsible. In this example, i = 5555.
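The sort-and-pick step described above can be sketched as follows. The function name and the dictionary input (port number mapped to its count in the flagged period) are assumptions for illustration.

```python
def suspect_port(counts_at_j, individual_ports=frozenset({445, 135, 139})):
    """Return the merge-compatible port with the largest count in the period
    where the merged series was judged abnormal, or None if there is none."""
    merged = {p: c for p, c in counts_at_j.items() if p not in individual_ports}
    ranked = sorted(merged.items(), key=lambda pc: pc[1], reverse=True)
    return ranked[0][0] if ranked else None

# Hypothetical per-port counts at j = 2017.
counts_2017 = {445: 900, 135: 300, 80: 2, 5555: 120, 22: 1}
culprit = suspect_port(counts_2017)
```

Ports 445 and 135 are ignored even though their counts are larger, because they are monitored individually rather than through the merged series.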

マージ対応ポートは、元々、個々のアクセス数が無いか、非常に小さいため、多数個足しこんだ場合において、仮に１つのポートでワームが発生してもその影響はマージポートカウント中にも顕著に現われる可能性がある。 Because each merge-compatible port originally has no accesses or very few, even when many such ports are summed, a worm outbreak on a single port may still show up clearly in the merge port count.

このように、本実施の形態に係る不正アクセス検知装置では、アクセス数の少ないポートについてアクセス数をマージすることでＰＣＡの個数を減らすことが可能となる。 As described above, in the unauthorized access detection apparatus according to the present embodiment, it is possible to reduce the number of PCAs by merging the access numbers for ports with a small number of accesses.

以上、本実施の形態では、アクセス数の多いポートと少ないポートを分け、アクセス数の多いポートにはその時系列データに対して単独でスライディングウィンドウのＰＣＡによる分析を行い、アクセス数の少ないポートはその時系列データを足し合わせた結果に対してスライディングウィンドウのＰＣＡによる分析を行うことにより、ＰＣＡの処理個数を減らし計算時間を短縮する不正アクセス検知装置について説明した。 As described above, in this embodiment, a port having a large number of accesses is divided from a port having a small number of accesses, and the port having a large number of accesses is analyzed by PCA of a sliding window independently for the time series data. An unauthorized access detection apparatus has been described in which the PCA of the sliding window is analyzed with respect to the result of adding the series data, thereby reducing the number of PCA processes and the calculation time.

実施の形態２．
次に、ポート域分割部１０３における分割の方法について説明する。
実施の形態１では、アクセス数の多いポート番号４４５、１３５、１３９の３つを個別対応ポートとし、残りのポートは全てアクセス数が極めて少ないか０と仮定した。
つまり、実施の形態１は、マージ対応ポートを１つしか設けず、個別対応ポート以外のポートはすべて単一のマージ対応ポートに分類していた。
しかし、実際には６５，５３６−３＝６５，５３３個のポートの全てのアクセス数が１または０とは限らず、６５，５３３個のポートのアクセス数を全部足しこむと、１つの集計時間で１０，０００を超える可能性もある。
ここで、仮に１つのポートでワームが発生し、アクセス数が１から１００に増えたとする。しかし、マージされたデータでは、１０，１００程度であり、この値を増加として捉えられない可能性がある。つまり、誤差の範囲として捉えられてしまう可能性があるということである。 Embodiment 2.
Next, a dividing method in the port area dividing unit 103 will be described.
In the first embodiment, three port numbers 445, 135, and 139 having a large number of accesses are set as individually corresponding ports, and it is assumed that all the remaining ports have a very small number of accesses or 0.
That is, in the first embodiment, only one merge-compatible port is provided, and all ports other than the individual-compatible ports are classified as a single merge-compatible port.
However, in reality, the access counts of the 65,536 − 3 = 65,533 ports are not necessarily all 1 or 0, and if the access counts of all 65,533 ports are summed, the total may exceed 10,000 in a single aggregation period.
Here, suppose a worm breaks out on one port and its access count rises from 1 to 100. The merged value is then only about 10,100, and this change may not be recognized as an increase. In other words, it may be treated as falling within the margin of error.
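The dilution effect can be made concrete with a small calculation. The helper name is hypothetical, and the smaller group total of 500 is an illustrative value, not one taken from the text.

```python
def relative_increase(merged_total, port_before, port_after):
    """Fractional change in a merged count when one port's count changes."""
    return (port_after - port_before) / merged_total

# One port rising from 1 to 100 inside a merged total of 10,000 ...
single_group = relative_increase(10_000, 1, 100)
# ... versus the same rise inside a smaller group totalling 500 (illustrative).
small_group = relative_increase(500, 1, 100)
```

The same rise is a change of about 1% in the single large group but about 20% in the smaller one, which is the motivation for splitting the merge-compatible ports into several groups.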

そこで、本実施の形態では、ワームによるアクセス数の増加を捉えるために、マージ対応ポートを複数のグループに分割し１つのグループのマージ結果が小さくなるようにする。
具体的には、不正アクセス検知装置１００において性能的に同時に実施可能なＰＣＡの数をＮとおく。不正アクセスに用いられる既知のポート（過去のインターネットの調査結果によれば統計的には１０個程度である）数をＭとおく。
ポート域分割部１０３は、ポート別集計部１０２による定常状態のログ１５０に対する集計結果に従い、アクセス数の多い順にＭ個のポートを個別対応ポートとして指定する。
そして、Ｋ＝Ｎ−Ｍはマージした時系列データをＰＣＡで処理できる上限値となり、マージ対応ポートの分割数となる。
このため、ポート域分割部１０３は、上記のＭ個の個別対応ポート以外のポートをマージ対応ポートとして指定するとともに、マージ対応ポートをＫ個のグループに分類する。 Therefore, in this embodiment, in order to catch the increase in the number of accesses due to the worm, the merge-compatible ports are divided into a plurality of groups so that the merge result of one group becomes small.
Specifically, N is the number of PCAs that can be implemented simultaneously in terms of performance in the unauthorized access detection apparatus 100. Let M be the number of known ports used for unauthorized access (statistically about 10 according to past Internet survey results).
The port area dividing unit 103 designates M ports as individually corresponding ports in descending order of the number of accesses, in accordance with the result of counting the steady state log 150 by the port counting unit 102.
K = N−M is an upper limit value at which merged time-series data can be processed by the PCA, and is the number of divisions of merge-compatible ports.
For this reason, the port area dividing unit 103 specifies ports other than the M individual corresponding ports as merge corresponding ports and classifies the merge corresponding ports into K groups.

本実施の形態では、図５の様にソートされたヒストグラムにおいて、単独でＰＣＡを実施する上位Ｍ個のポート（個別対応ポート）を除いた残りを、単純に（６５５３６−Ｍ）／Ｋとして分割する。
つまり、例えば、Ｎ＝３０、Ｍ＝１０の場合、Ｋ＝２０であるから、（６５５３６−１０）／２０≒３２７６となる。
つまり、ヒストグラムで、上位１１位以降のポートを３２７６個ずつ２０のグループに分類し、グループごとにアクセス数をマージし、各グループにおいてＭｅｒｇｅＣｏｕｎｔ＿ｊを生成する方式である。 In this embodiment, in the histogram sorted as shown in FIG. 5, the ports remaining after the top M ports (individually corresponding ports, on which PCA is performed independently) are excluded are simply divided into K groups of (65536 − M) / K ports each.
For example, when N = 30 and M = 10, K = 20, so (65536 − 10) / 20 ≈ 3276.
In other words, in the histogram, the ports ranked 11th and lower are classified into 20 groups of 3276 ports each, the access counts are merged within each group, and MergeCount_j is generated for each group.
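A minimal sketch of this fixed-size split follows. The function name and the dictionary input are assumptions. Because 65,526 is not an exact multiple of 20, the sketch spreads the 6 leftover ports over the first groups so that exactly K groups result; the text itself only gives the approximate size of 3276.

```python
def split_ports(avg_counts, n_parallel, m_individual):
    """Rank ports by average access count, keep the top M as individual ports,
    and cut the rest into K = N - M nearly equal groups in ranked order."""
    k = n_parallel - m_individual
    if k <= 0:
        raise ValueError("N must exceed M so that at least one merged group exists")
    ranked = sorted(avg_counts, key=lambda p: avg_counts[p], reverse=True)
    individual, rest = ranked[:m_individual], ranked[m_individual:]
    base, extra = divmod(len(rest), k)
    groups, start = [], 0
    for g in range(k):
        size = base + (1 if g < extra else 0)  # first `extra` groups get one more
        groups.append(rest[start:start + size])
        start += size
    return individual, groups

# Hypothetical steady-state averages: three busy ports, everything else idle.
avg_counts = {p: 0 for p in range(65536)}
avg_counts.update({445: 900, 135: 300, 139: 250})
individual, groups = split_ports(avg_counts, n_parallel=30, m_individual=10)
```

Each resulting group would then get its own merged time series and its own PCA.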

マージする単位（グループ）が決まった場合の不正アクセス検知装置１００の各構成の働きは、実施の形態１におけるマージするデータの処理と同じ処理を各マージ単位に実施するものである。
つまり、検知開始前の準備動作においては、ポート域分割部１０３は、定常状態のログ１５０に関して、マージ対応ポートのグループごとに共通するグループに分類されているポートの出現回数を合算する。このため、マージポート時系列データ１５４がグループ数（上記の例では２０）だけ生成される。
また、検知動作についても、マージポートカウント１５３が２０あり各々に実施の形態１のときと同じ処理を行う。つまり、ポート域分割部１０３は、不正アクセス検知対象のログ１５０に関して、マージ対応ポートのグループごとに共通するグループに分類されているポートの出現回数を合算する。このため、マージポートカウント１５３がグループ数（上記の例では２０）だけ生成される。そして、分析部１０４は、マージ対応ポートについては、グループごとに、マージポートカウント１５３及びマージポート時系列データ１５４を用いて主成分分析を行う。 The operation of each component of the unauthorized access detection device 100 when the unit (group) to be merged is determined is to perform the same processing as that of the data to be merged in the first embodiment for each merge unit.
That is, in the preparatory operation before the start of detection, the port area dividing unit 103 adds up the appearance counts of the ports classified into a common group for each group of merge-compatible ports with respect to the steady state log 150. Therefore, the merge port time series data 154 is generated for the number of groups (20 in the above example).
As for the detection operation, there are 20 merge port counts 153, and the same processing as in the first embodiment is performed on each. That is, for the log 150 subject to unauthorized access detection, the port area dividing unit 103 sums, for each group of merge-compatible ports, the appearance counts of the ports classified into that group. As a result, as many merge port counts 153 as there are groups (20 in the above example) are generated. The analysis unit 104 then performs principal component analysis for the merge-compatible ports, group by group, using the merge port count 153 and the merge port time-series data 154.

このように、本実施の形態では、マージ対応ポートを複数のグループに分類するため、単一のマージ対応ポートする場合に比べて各グループのマージされた出現回数が小さくなり、マージ対応ポートにおける出現回数の増加を検出しやすくなり、異常検知の精度が向上する。 As described above, in this embodiment the merge-compatible ports are classified into a plurality of groups, so the merged appearance count of each group is smaller than when a single merge-compatible port is used. This makes an increase in the appearance count of a merge-compatible port easier to detect and improves the accuracy of abnormality detection.

以上、本実施の形態では、監視するポートをまとめることでＰＣＡの数を減らし処理を効率化するが、そのまとめ方の方式として、不正アクセス検知装置のＰＣＡの同時処理個数（同時に処理可能な個数）に基づき、アクセス数の平均でソートされたポートについて、上位から固定数で区切ってマージする方式について説明した。 As described above, in this embodiment the monitored ports are consolidated to reduce the number of PCAs and make processing more efficient. As the consolidation method, a scheme was described in which, based on the number of PCAs the unauthorized access detection device can process simultaneously, the ports sorted by average access count are divided from the top into fixed-size groups and merged.

実施の形態３．
実施の形態２では、マージ対応ポートにおいて、ソートされた上位から固定個数（例として３２７６）ずつ分割したのに対し、実施の形態３では、ポート別集計部１０２による定常状態のログ１５０に対する集計結果について、単独でＰＣＡを実施するＭ個のポート（個別対応ポート）以外は、ランダムに各ポートを並べ、ランダムに並べた最初のポートから、３２７６ずつ分割する。
実施の形態２では、ソートされた上位側のマージ単位程、ＭｅｒｇｅＣｏｕｎｔ＿ｊが大きくなる可能性があったが、本実施の形態では、ランダムにグループ分けをするため、グループ別のＭｅｒｇｅＣｏｕｎｔ＿ｊの偏りが無くなる。 Embodiment 3.
In the second embodiment, the merge-compatible ports are divided from the top of the sorted order into groups of a fixed size (3276 in the example). In the third embodiment, by contrast, for the aggregation result of the steady state log 150 produced by the port-by-port totaling unit 102, the ports other than the M ports on which PCA is performed independently (individually corresponding ports) are arranged in random order and divided into groups of 3276, starting from the first port in that random order.
In the second embodiment, MergeCount_j could become larger for merge units nearer the top of the sorted order. In this embodiment, because the grouping is random, the bias in MergeCount_j between groups is removed.
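The random grouping can be sketched as follows. The function name, the optional `seed` argument (added only so the example is reproducible), and the use of ceiling division for the group size are assumptions of this sketch.

```python
import random

def random_groups(merge_ports, k, seed=None):
    """Shuffle the merge-compatible ports and cut them into at most k groups
    of equal size (the last group may be smaller)."""
    shuffled = list(merge_ports)
    if not shuffled:
        return []
    random.Random(seed).shuffle(shuffled)
    size = -(-len(shuffled) // k)  # ceiling division
    return [shuffled[i:i + size] for i in range(0, len(shuffled), size)]

# Small illustration: 100 merge-compatible ports into 4 groups.
groups = random_groups(range(100), k=4, seed=0)
```

Since busy and idle ports are scattered across groups, no single group systematically collects the highest counts, unlike the ranked split of the second embodiment.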

なお、本実施の形態においても、検知動作は実施の形態１に示したとおりであり、ポート域分割部１０３は、不正アクセス検知対象のログ１５０に関して、マージ対応ポートのグループごとに共通するグループに分類されているポートの出現回数を合算する。そして、分析部１０４は、マージ対応ポートについては、グループごとに、マージポートカウント１５３及びマージポート時系列データ１５４を用いて主成分分析を行う。 In this embodiment as well, the detection operation is as described in the first embodiment: for the log 150 subject to unauthorized access detection, the port area dividing unit 103 sums, for each group of merge-compatible ports, the appearance counts of the ports classified into that group. The analysis unit 104 then performs principal component analysis for the merge-compatible ports, group by group, using the merge port count 153 and the merge port time-series data 154.

このように、本実施の形態では、ポート域分割部１０３は、マージ対応ポートに指定された複数の宛先ポートからランダムに宛先ポートを選択し、選択した宛先ポートから順にグループに分類していくため、グループ間の出現回数の偏りが減少し、このため、出現回数の増加を検出しやすくなり、異常検知の精度が向上する。 As described above, in this embodiment, the port area dividing unit 103 randomly selects a destination port from a plurality of destination ports designated as merge-compatible ports, and sequentially classifies them into groups from the selected destination port. , The deviation in the number of appearances between groups is reduced, which makes it easier to detect an increase in the number of appearances and improves the accuracy of abnormality detection.

以上、本実施の形態では、監視するポートをまとめることでＰＣＡの数を減らし処理を効率化するが、そのまとめ方の方式として、不正アクセス検知装置のＰＣＡの同時処理個数（同時に処理可能な個数）に基づき、ランダムに選んだポートを固定数でマージする方式について説明した。 As described above, in this embodiment the monitored ports are consolidated to reduce the number of PCAs and make processing more efficient. As the consolidation method, a scheme was described in which, based on the number of PCAs the unauthorized access detection device can process simultaneously, randomly selected ports are merged in groups of a fixed size.

実施の形態４．
本実施の形態では、図９のようにアクセス数の平均でソートしたポート番号に対して、アクセス数の全体に対する上位のポートの割合で、単独でＰＣＡを実施する個別対応ポートとマージするマージ対応ポートを決める方式について説明する。 Embodiment 4.
This embodiment describes a method in which, for the port numbers sorted by average access count as shown in FIG. 9, the individually corresponding ports (on which PCA is performed independently) and the merge-compatible ports (which are merged) are determined by the share of the total access count accounted for by the top-ranked ports.

本実施の形態では、ポート域分割部１０３は、ポート別集計部１０２による定常状態のログ１５０に対する集計結果について、アクセス数の上位のポートからアクセス数を足し合わせていき、全体のアクセス数に対して一定の割合に達した段階を個別対応ポートとマージ対応ポートとの境界と決定する。
例えば、図９の（１）に含まれるポートのアクセス数を全て足すと全体の９０％に相当するのであれば、残りの（２）に属するポートについてはトラフィックの少ないポートと見なす事ができ、マージを行う。
この場合は（１）に含まれるポートの数が、最高でＮ−１である必要がある（Ｎは、不正アクセス検知装置１００が並列処理可能なＰＣＡの数）。 In the present embodiment, the port area dividing unit 103 adds the number of accesses from the upper ports of the number of accesses for the totaled result for the steady state log 150 by the port-by-port totaling unit 102, and the total number of accesses The stage at which a certain ratio is reached is determined as the boundary between the individual corresponding port and the merge corresponding port.
For example, if the total number of port accesses included in (1) in FIG. 9 corresponds to 90% of the total, the remaining ports (2) can be regarded as ports with low traffic. Perform a merge.
In this case, the number of ports included in (1) needs to be N-1 at the maximum (N is the number of PCAs that can be processed in parallel by the unauthorized access detection apparatus 100).
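The cumulative-share rule can be sketched as below. The function and parameter names are assumptions; the share is passed as an integer percentage so that the comparison stays in integer arithmetic, and the cap of N − 1 individual ports follows the constraint stated above.

```python
def split_by_share(avg_counts, share_percent=90, n_parallel=30):
    """Add ports in descending order of access count until their cumulative
    count reaches share_percent of the total, using at most N - 1 ports."""
    ranked = sorted(avg_counts, key=lambda p: avg_counts[p], reverse=True)
    total = sum(avg_counts.values())
    individual, running = [], 0
    for port in ranked:
        reached = running * 100 >= share_percent * total
        if reached or len(individual) >= n_parallel - 1:
            break
        individual.append(port)
        running += avg_counts[port]
    return individual, ranked[len(individual):]

# Hypothetical steady-state averages (total = 1000).
avg_counts = {445: 500, 135: 300, 139: 100, 80: 50, 22: 30, 5555: 20}
individual, merged = split_by_share(avg_counts)
```

With these numbers the first three ports account for exactly 90% of the traffic, so the remaining three are treated as low-traffic ports and merged.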

このように、本実施の形態では、ポート域分割部１０３が、ポート別集計部１０２による定常状態のログ１５０に対する計数結果に基づき出現回数の多い宛先ポート番号から順に個別対応ポートに指定するとともに、個別対応ポートに指定された宛先ポート番号の出現回数の合計値が定常状態のログ１５０における宛先ポート番号の総出現回数に対して所定の割合（例えば、９０％）となった際に個別対応ポートに指定されていない宛先ポート番号をマージ対応ポートに指定する。
このため、マージ対応ポートに指定される各ポート番号の出現回数が少数となり、マージ対応ポートにおける出現回数の増加を検出しやすくなり、異常検知の精度が向上する。 As described above, in the present embodiment, the port area dividing unit 103 designates the individual port corresponding to the port number in descending order of the appearance number based on the counting result for the log 150 in the steady state by the port totaling unit 102, When the total value of the number of appearances of the destination port number designated as the individual correspondence port becomes a predetermined ratio (for example, 90%) with respect to the total number of appearances of the destination port number in the steady state log 150, the individual correspondence port Specify a destination port number not specified in the port as a merge-compatible port.
For this reason, the number of appearances of each port number designated as a merge-compatible port becomes small, and it becomes easy to detect an increase in the number of appearances in the merge-compatible port, and the accuracy of abnormality detection is improved.

以上、本実施の形態では、監視するポートをまとめることでＰＣＡの数を減らし処理を効率化するが、そのまとめ方の方式として、マージするポート数を全体のトラフィック量に対する固定の割合で決める方式について説明した。 As described above, in this embodiment the monitored ports are consolidated to reduce the number of PCAs and make processing more efficient. As the consolidation method, a scheme was described in which the ports to be merged are determined by a fixed proportion of the total traffic volume.

実施の形態５．
本実施の形態では、ポート別集計部１０２による定常状態のログ１５０に対する集計結果について、ソートされたポートの隣同士の差分を計算していき差分が一定の値より小さくなった場合に、個別対応ポートとマージ対応ポートとの境界として定める。
但し、偶然上位にある２つのポートのアクセス数の差が小さかった場合は、誤って判定してしまうので、アクセス数についても条件を設ける。
例えば、アクセス数が１０個以下で、ソートした隣のポート番号（直近に個別対応ポートに指定されたポート番号）のアクセス数との差が２以下になった時点で境界と定める。
本実施の形態に係る方式は、図１０に例示するように、トラフィック量が上位の方のポートでは差分が大きくバラつきがあるが（図１０の（１）の領域）、下位のポートになるにつれて差分が小さくなる（差分が収束していく）（図１０の（２）の領域）という経験的な予想を前提としている。
そして、図１０では、（１）の領域に含まれるポートを個別対応ポートとし、（２）の領域に含まれるポートをマージ対応ポートとしている。 Embodiment 5.
In the present embodiment, for the totaling result for the log 150 in the steady state by the port totaling unit 102, the difference between the sorted ports is calculated, and the individual difference is handled when the difference becomes smaller than a certain value. Determined as the boundary between the port and the merge-enabled port.
However, if there is a small difference in the number of accesses between two ports that are in the upper rank by accident, a determination is made erroneously, so a condition is also set for the number of accesses.
For example, the boundary is determined when the number of accesses is 10 or less and the difference between the number of access of the sorted adjacent port number (port number designated as the individual corresponding port most recently) is 2 or less.
As illustrated in FIG. 10, the method according to the present embodiment has a large difference in the traffic volume at the higher port (area (1) in FIG. 10). This is based on an empirical expectation that the difference becomes smaller (the difference converges) (region (2) in FIG. 10).
In FIG. 10, the ports included in the area (1) are the individually corresponding ports, and the ports included in the area (2) are the merge corresponding ports.
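The boundary rule can be sketched as follows, using the example thresholds from the text (access count of 10 or less, difference of 2 or less from the neighbouring port). The function name and the dictionary input are assumptions.

```python
def split_by_gap(avg_counts, max_count=10, max_gap=2):
    """Walk down the ranked ports and place the boundary at the first port whose
    count is small AND differs little from the port ranked just above it."""
    ranked = sorted(avg_counts, key=lambda p: avg_counts[p], reverse=True)
    for idx in range(1, len(ranked)):
        prev_c = avg_counts[ranked[idx - 1]]
        cur_c = avg_counts[ranked[idx]]
        if cur_c <= max_count and prev_c - cur_c <= max_gap:
            # ranked[idx] and everything below it become merge-compatible ports.
            return ranked[:idx], ranked[idx:]
    return ranked, []

# Hypothetical averages: 445 and 135 happen to be close, but both are busy.
avg_counts = {445: 500, 135: 498, 139: 100, 80: 9, 22: 8, 5555: 7}
individual, merged = split_by_gap(avg_counts)
```

The small gap between ports 445 and 135 does not trigger the boundary because their counts are far above 10, which is the accidental case the access-count condition is meant to exclude.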

このように、本実施の形態では、ポート別集計部１０２による定常状態のログ１５０に対する計数結果に基づき出現回数の多い宛先ポート番号から順に個別対応ポートに指定するとともに、直近に個別対応ポートに指定された宛先ポート番号の出現回数との差異が所定数以下（例えば、２以下）となる宛先ポート番号及び当該宛先ポート番号よりも出現回数が少ない宛先ポート番号をマージ対応ポートに指定する。
このような方法によっても、マージ対応ポートに指定される各ポート番号の出現回数が少数となり、マージ対応ポートにおける出現回数の増加を検出しやすくなり、異常検知の精度が向上する。 As described above, according to the present embodiment, the port number is designated as the individual corresponding port in order from the destination port number with the most appearances based on the counting result for the steady state log 150 by the port totaling unit 102, and the individual corresponding port is designated most recently. A destination port number whose difference from the number of appearances of the destination port number is a predetermined number or less (for example, 2 or less) and a destination port number whose appearance count is smaller than the destination port number are designated as merge-compatible ports.
Even with such a method, the number of appearances of each port number designated as a merge-compatible port becomes small, and it becomes easy to detect an increase in the number of appearances in a merge-compatible port, and the accuracy of abnormality detection is improved.

以上、本実施の形態では、監視するポートをまとめることでＰＣＡの数を減らし処理を効率化するが、そのまとめ方の方式として、アクセス数の平均でソートされたポートの隣同士の差分を計算していき差分が一定の値より小さくなった場合に、個別にＰＣＡを行うポートと１つにマージするポートの境界を決める方式について説明した。 As described above, in this embodiment the monitored ports are consolidated to reduce the number of PCAs and make processing more efficient. As the consolidation method, a scheme was described in which the differences between adjacent ports in the list sorted by average access count are calculated in order and, when a difference falls below a fixed value, the boundary between the ports on which PCA is performed individually and the ports merged into one is set.

実施の形態６．
本実施の形態では、ワームの発生時のアクセス数の増加が、ＭｅｒｇｅＣｏｕｎｔ＿ｊにおける時系列データに埋もれないようにするためにＭｅｒｇｅＣｏｕｎｔ＿ｊの上限を決める。その手段として、ワームのシュミレーションデータを用いる。
つまり、本実施の形態では、ポート域分割部１０３は、マージ対応ポートを複数のグループに分割するが、分割の際に、不正アクセスによる影響をシミュレートするためのシミュレーションデータを用いて、各グループにおける出現回数の合算値の上限を計算して、分割するグループ数を決定する。 Embodiment 6.
In this embodiment, an upper limit on MergeCount_j is set so that the increase in access count when a worm breaks out is not buried in the MergeCount_j time-series data. Worm simulation data is used for this purpose.
In other words, in this embodiment the port area dividing unit 103 divides the merge-compatible ports into a plurality of groups, and when doing so it uses simulation data that models the effect of unauthorized access to calculate an upper limit on the summed appearance count of each group and thereby determine the number of groups.

文献
Ｃ．Ｃ．Ｚｏｕ，Ｌ．Ｇａｏ，Ｗ．Ｇｒｏｎｇ，ａｎｄＤ．Ｔｏｗｓｌｅｙ，“ＭｏｎｉｔｏｒｉｎｇａｎｄＥａｒｌｙＷａｒｎｉｎｇｆｏｒＩｎｔｅｒｎｅｔＷｏｒｍｓ”，ＩｎＰｒｏｃｅｅｄｉｎｇｓｏｆ１０ｔｈＡＣＭＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒａｎｄＣｏｍｍｕｎｉｃａｔｉｏｎｓＳｅｃｕｒｉｔｙ（ＣＣＳ’０３），２００３．
によれば、下記の式で全感染ホスト数の時間変化が表現される。
Ｉ_ｔ＝（１＋α）Ｉ_ｔ−１−（α／Ｎ）Ｉ^２ _ｔ−１
Ｚ_ｔ＝ｍ／２^３２ηＩ_ｔ−１
Ｎ：脆弱性を持つ全ホスト数、Ｉ_ｔ：時刻ｔにおける全感染ホスト数、α：感染レート、Ｚ_ｔ：時刻ｔで観測されるアクセス数に対するワームによる影響、ｍ：監視ホスト数、η：ワームからの単位時間あたりの平均アクセス数。 Reference: C. C. Zou, L. Gao, W. Gong, and D. Towsley, "Monitoring and Early Warning for Internet Worms", in Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS'03), 2003.
According to this reference, the change over time in the total number of infected hosts is expressed by the following formulas.
$I_t = (1 + \alpha) I_{t-1} - (\alpha / N) I_{t-1}^2$
$Z_t = (m / 2^{32}) \, \eta \, I_{t-1}$
N: total number of vulnerable hosts, $I_t$: total number of infected hosts at time t, α: infection rate, $Z_t$: effect of the worm on the number of accesses observed at time t, m: number of monitored hosts, η: average number of accesses from the worm per unit time.
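The two formulas translate directly into code. The sketch below only restates them; the function names are assumptions, and the parameter values in the usage lines are illustrative numbers chosen for easy checking, not values from the text.

```python
def infected_hosts(i0, alpha, n_vulnerable, steps):
    """Iterate I_t = (1 + alpha) * I_{t-1} - (alpha / N) * I_{t-1}**2."""
    series = [float(i0)]
    for _ in range(steps):
        prev = series[-1]
        series.append((1 + alpha) * prev - (alpha / n_vulnerable) * prev ** 2)
    return series

def observed_worm_accesses(i_prev, eta, m_monitored):
    """Z_t = (m / 2**32) * eta * I_{t-1}: worm accesses seen by m monitored hosts."""
    return m_monitored / 2 ** 32 * eta * i_prev

# Illustrative parameters only.
growth = infected_hosts(i0=10, alpha=0.5, n_vulnerable=1000, steps=1)
z = observed_worm_accesses(i_prev=2 ** 16, eta=100, m_monitored=2 ** 16)
```

The recurrence is logistic: growth is close to exponential while few hosts are infected and levels off as the number of infected hosts approaches N.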

ある例によると、１週間にわたって３０分間隔の平均アクセス数を調べたところ、全ポートの内の約６０，０００個のポートがアクセス数０であり、約５，０００ポートについてアクセス数が１〜３であった。このアクセス数が０である６０，０００ポートに関してはまとめてしまってもワームからの検知が容易にできる（全部まとめてしまってもアクセス数が０になるので、ワームからのアクセス数が鋭敏にでる）。
アクセス数が１〜３のポートに関しては、ワームのアクセス数が埋もれてしまわない範囲内で足しあわさなければならない（アクセス数が１〜３のポートでも５，０００個のポートを足せばアクセス数が５，０００〜１５，０００になってしまう）。
そこで、ポート域の足し合わせの指標として上記のワームアクセス数のシミュレーションデータを用いる。 According to one example, when the average number of accesses per 30-minute interval was examined over one week, about 60,000 of all ports had 0 accesses and about 5,000 ports had 1 to 3 accesses. The 60,000 ports with 0 accesses can be merged together and a worm can still be detected easily (even when all of them are merged the access count remains 0, so accesses from a worm stand out sharply).
Ports with 1 to 3 accesses must be summed only within a range in which the worm's accesses are not buried (even with 1 to 3 accesses per port, summing 5,000 ports yields an access count of 5,000 to 15,000).
Therefore, the simulation data of the number of worm accesses is used as an index for adding the port areas.

上記の式において、ワームがＣｏｄｅＲｅｄ型であると仮定すると各パラメータは以下のようになる。
時刻ｔ−１での全感染ホスト数Ｉ_ｔ−１＝５０，０００、監視ホスト数ｍ＝１
ワームからの３０分間あたりの平均アクセス数η＝１，０７１
従って、時刻（ｔ−１）からｔでのワームからのパケット数Ｚ_ｔは、
Ｚ_ｔ＝１／２^３２×１０７１×５０，０００＝８，１７１となる（時間間隔は３０分）。 In the above equation, assuming that the worm is a CodeRed type, the parameters are as follows.
Total number of infected hosts at time t−1: $I_{t-1}$ = 50,000; number of monitored hosts: m = 1
Average number of accesses from the worm per 30 minutes: η = 1,071
Therefore, the number of packets $Z_t$ from the worm between time (t−1) and t is
$Z_t = (1 / 2^{32}) \times 1071 \times 50{,}000 = 8{,}171$ (the time interval is 30 minutes).

この時、ポート域のアクセス数が８，０００になるまで、ポート毎のアクセス数を足し合わせたとする。
ここで、足し合わせたポートのうち一つが感染した場合に、ワームのアクセス数８１７１が加わるので、全体として約２倍のアクセス数になる。この場合は、明らかにアクセス数に変化があるので、精度よくワームの検知が出来る。
従って、アクセス数が３のポートを５，０００個監視するためには、８，０００／３≒２，６６６より、５，０００個のポートを２，６６６個と２，３３４個ずつ２つのポート域に分割して、各々をＰＣＡすればよい。
この方法によって５，０００個のポートのアクセス数の時系列データを５，０００個、個別にＰＣＡにかけなくても、２つのマージしたポート域をＰＣＡにかけることによってワームの検知を行う事ができる。 At this time, it is assumed that the number of accesses for each port is added until the number of accesses in the port area reaches 8,000.
Here, if one of the summed ports is infected, the worm's 8,171 accesses are added, so the total becomes roughly twice the usual access count. In this case the change in access count is obvious, and the worm can be detected accurately.
Therefore, to monitor 5,000 ports with 3 accesses each, since 8,000 / 3 ≈ 2,666, the 5,000 ports can be divided into two port areas of 2,666 and 2,334 ports, and PCA applied to each.
With this method, a worm can be detected by applying PCA to the two merged port areas, without applying PCA individually to 5,000 time series of per-port access counts.

このように、本実施の形態では、ポート域分割部１０３は、個別対応ポートを実施の形態１〜５のいずれかに示す方法により決定するとともに、マージ対応ポートについては、上記のＺ_ｔ＝８，１７１という計算結果に基づき、マージ対応ポートの各グループのアクセス数の合算値の上限を８，０００と決定する。そして、この８，０００をマージ対応ポートの各ポートに想定されるアクセス数（３アクセス）で除した値（２，６６６）を各グループのポート数の上限として各ポートを複数のグループに分類する。
グループへの分類の方法は、例えば、実施の形態２に示すように平均アクセス数でソートされた上位のポート番号から順に分類してもよいし、又は実施の形態３に示すようにランダムに並べられたポート番号の出現順に分類してもよい。 As described above, in the present embodiment, the port area dividing unit 103 determines the individual corresponding port by the method described in any of the first to fifth embodiments, and for the merge corresponding port, Z _t = 8 described above. , 171, the upper limit of the sum of the access counts of each group of merge-compatible ports is determined to be 8,000. Then, each port is classified into a plurality of groups with a value (2,666) obtained by dividing this 8,000 by the number of accesses (3 accesses) assumed for each port of the merge-compatible ports as the upper limit of the number of ports in each group. .
As a method of classifying into groups, for example, as shown in the second embodiment, the higher port numbers sorted by the average number of accesses may be sorted in order, or randomly arranged as shown in the third embodiment. The port numbers may be classified in the order of appearance.
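The sizing arithmetic for the CodeRed example can be sketched as follows. The function name is an assumption; the sketch rounds the per-group port count down so that a group's steady-state total never exceeds the cap.

```python
def size_groups(n_ports, accesses_per_port, group_cap):
    """Return (max ports per group, list of group sizes) so that each group's
    expected steady-state access count stays at or below group_cap."""
    ports_per_group = group_cap // accesses_per_port
    if ports_per_group < 1:
        raise ValueError("group_cap is smaller than the per-port access count")
    sizes, remaining = [], n_ports
    while remaining > 0:
        take = min(ports_per_group, remaining)
        sizes.append(take)
        remaining -= take
    return ports_per_group, sizes

# 5,000 ports with about 3 accesses each, cap of 8,000 accesses per group.
ports_per_group, sizes = size_groups(5000, 3, 8000)
```

This reproduces the split into two port areas of 2,666 and 2,334 ports given in the text. Which ports go into which group is a separate choice (ranked order or random order).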

上記の例ではＣｏｄｅＲｅｄの例を挙げたが、より緩やかなアクセス数の増加を見せるワームもある。
その場合は、検知対象とすべきワームの種類に基づき、パラメータを変更する。例えばηを小さくする。η＝１００とした場合は、Ｚ_ｔ＝８１７となる。この場合は足し合わせた数が８００になるように５，０００のポートを分割する。アクセス数が３のポートを５０００個監視するためには、８００／３アクセス＝２６７ポート数の足しあわせであればアクセス数の合計は８００を超えない。従って、例えば、５０００／２６７＝１８個のポート域（グループ）に分割する。このことにより、マージしたポート域において感染が発生しても通常の２倍のアクセス数が観測される。 In the above example, CodeRed was used, but there is a worm that shows a more moderate increase in the number of accesses.
In that case, the parameter is changed based on the type of worm to be detected. For example, η is reduced. When η = 100, Z _t = 817. In this case, 5,000 ports are divided so that the total number becomes 800. In order to monitor 5000 ports with 3 accesses, the total number of accesses does not exceed 800 if 800/3 accesses = 267 ports. Therefore, for example, it is divided into 5000/267 = 18 port areas (groups). As a result, even if infection occurs in the merged port area, the number of accesses twice the normal number is observed.

また、通常の２倍で無ければＰＣＡでは検知できないかというと、ＰＣＡで検知する際のパラメータチューニングの方式により、実験的には１．５倍以下でも検知が可能である。
仮に１．５倍を目安とした場合は、η＝１００とした場合は、Ｚ_ｔ＝８１７となるため、アクセス数の合計が１，６００になるまでマージが可能である。
つまり、ワームが発生しても１，６００＋８１７≒１，６００＊１．５となる。
この場合は、１，６００／３アクセス＝５３３ポート数より、５０００／５３３＝９となり、９個のポート分割（９グループへの分割）ですむことになる。 As for whether PCA can detect an anomaly only when the count is twice the normal level, experiments show that, depending on how the detection parameters are tuned, detection is possible even at 1.5 times or less.
Assuming 1.5 times as a guide, if η = 100, Z _t = 817, so merging is possible until the total number of accesses reaches 1,600.
That is, even if a worm is generated, 1,600 + 817≈1,600 * 1.5.
In this case, since 1,600 / 3 access = 533 ports, 5000/533 = 9, and 9 port division (division into 9 groups) is sufficient.

このように、実施の形態では、ワームの拡散状況を想定し分割するポート域を決定することが可能である。
本実施の形態に係る方法では、シミュレーションデータに基づき、ワームによるアクセス数の増加があっても当該アクセス数の増加が埋もれてしまわないようにマージ対応ポートを分割しているため、いずれかのグループのポートにおいてワームによりアクセス数が増加した場合に、的確に異常を検知することができる。 As described above, in the embodiment, it is possible to determine a port area to be divided on the assumption of a worm diffusion state.
In the method according to this embodiment, the merge-compatible ports are divided based on the simulation data so that the increase in the number of accesses is not buried even if the number of accesses is increased due to the worm. When the number of accesses increases due to worms at the port, it is possible to accurately detect an abnormality.

以上、本実施の形態では、監視するポートをまとめることでＰＣＡの数を減らし処理を効率化するが、そのまとめ方の方式として、ワームの発生時のアクセスの増加がポートのマージにより埋もれないようにするために、ワーム発生のシュミレーションデータを利用し、埋もれないポートのマージ単位を決定する方式について説明した。 As described above, in this embodiment the monitored ports are consolidated to reduce the number of PCAs and make processing more efficient. As the consolidation method, a scheme was described in which worm outbreak simulation data is used to determine a merge unit small enough that the increase in accesses during a worm outbreak is not buried by merging the ports.

実施の形態７．
今までの実施の形態では、宛先ポート番号別の集計を扱ってきた。
実施の形態では、宛先ポート番号の代わりに、ＩＰアドレス別の集計を行うことに適用する。 Embodiment 7.
The embodiments so far have dealt with tabulation by destination port number.
In the embodiment, the present invention is applied to aggregation by IP address instead of the destination port number.

Depending on its class, an IP network can contain many hosts; a class B network, for example, has 65,534 hosts per network address.
For example, if the senders are all the hosts of one class B network address and the number of accesses is to be monitored by PCA for each of them, the number of monitored items becomes 65,534 and the same problem arises as with port numbers.
In this embodiment, therefore, the per-port aggregation of Embodiments 1 to 6 is replaced with per-IP-address aggregation.
When aggregating by SrcIP address (source IP address), it suffices to replace the port handling of Embodiments 1 to 6 with SrcIP address handling.
As a result, PCA is performed individually for SrcIP addresses that normally communicate frequently, while SrcIP addresses with almost no communication are merged, which reduces the number of PCA runs.
Similarly, for DstIP addresses (destination IP addresses), PCA may be performed individually for DstIP addresses that normally receive much traffic, and on merged data for DstIP addresses with almost no communication.
The techniques of Embodiments 1 to 6 may also be applied to combinations of destination port number, source IP address, and destination IP address.
Furthermore, although the log file shown in FIG. 2 does not record it, when a source port number is recorded in the log file, the techniques of Embodiments 1 to 6 may be applied to the source port number.
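The substitution of IP addresses (or field combinations) for ports can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: the record layout (`src_ip`, `dst_ip`, `dst_port` keys), the function names, the `MERGED` label, and the 0.8 cumulative-share cutoff are all hypothetical and introduced only for this example.

```python
from collections import Counter

MERGED = "MERGED"  # hypothetical label for the merged (summed) bucket


def designate(steady_log, key_fields, ratio=0.8):
    """Pick the keys to analyze individually from steady-state traffic.

    Keys are taken in descending order of appearance count until their
    cumulative share of all records reaches `ratio` (an assumed cutoff).
    Every other key is later folded into the merged bucket.
    """
    counts = Counter(tuple(rec[f] for f in key_fields) for rec in steady_log)
    total = sum(counts.values())
    individual, covered = set(), 0
    for key, n in counts.most_common():
        if covered >= ratio * total:
            break
        individual.add(key)
        covered += n
    return individual


def count_per_slot(target_log, key_fields, individual):
    """Count one time slot of the detection-target log.

    Individually designated keys keep their own counter; all other keys,
    including ones never seen in the steady state, are summed into MERGED.
    """
    out = Counter()
    for rec in target_log:
        key = tuple(rec[f] for f in key_fields)
        out[key if key in individual else MERGED] += 1
    return out


def rec(src, dst="192.0.2.1", port=80):
    """Build one illustrative log record (field names are assumptions)."""
    return {"src_ip": src, "dst_ip": dst, "dst_port": port}


# Steady state: two busy senders and one nearly silent sender.
steady = [rec("10.0.0.1")] * 6 + [rec("10.0.0.2")] * 3 + [rec("10.0.0.3")]
individual = designate(steady, ["src_ip"])

# One detection-target time slot, including a previously unseen sender.
slot = [rec("10.0.0.1")] * 2 + [rec("10.0.0.3")] + [rec("10.0.0.9")] * 4
slot_counts = count_per_slot(slot, ["src_ip"], individual)
```

Passing `["dst_ip"]` or `["dst_port", "src_ip", "dst_ip"]` as `key_fields` gives the per-DstIP and combined variants mentioned above; repeating `count_per_slot` for successive time slots would yield the time series on which PCA is then run.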

As described above, this embodiment described a scheme that applies the techniques of Embodiments 1 to 6 to detecting changes in the number of accesses per SrcIP address or DstIP address.

Finally, an example hardware configuration of the unauthorized access detection device 100 described in Embodiments 1 to 7 is explained.
FIG. 13 is a diagram showing an example of the hardware resources of the unauthorized access detection device 100 described in Embodiments 1 to 7.
Note that the configuration in FIG. 13 is only one example of the hardware configuration of the unauthorized access detection device 100; the hardware configuration is not limited to that shown in FIG. 13 and may be a different configuration.

In FIG. 13, the unauthorized access detection device 100 includes a CPU 911 (Central Processing Unit; also called a processing device, arithmetic device, microprocessor, microcomputer, or processor) that executes programs.
The CPU 911 is connected via a bus 912 to, for example, a ROM (Read Only Memory) 913, a RAM (Random Access Memory) 914, a communication board 915, a display device 901, a keyboard 902, a mouse 903, and a magnetic disk device 920, and controls these hardware devices.
The CPU 911 may further be connected to an FDD 904 (Flexible Disk Drive), a compact disc device 905 (CDD), a printer device 906, and a scanner device 907. A storage device such as an optical disc device or a memory card (registered trademark) read/write device may be used instead of the magnetic disk device 920.
The RAM 914 is an example of volatile memory. The storage media of the ROM 913, the FDD 904, the CDD 905, and the magnetic disk device 920 are examples of nonvolatile memory. All of these are examples of storage devices.
The communication board 915, the keyboard 902, the mouse 903, the scanner device 907, the FDD 904, and the like are examples of input devices.
The communication board 915, the display device 901, the printer device 906, and the like are examples of output devices.

The communication board 915 is connected to, for example, a LAN (local area network), the Internet, or a WAN (wide area network).
The communication board 915 also receives the communication log data of network devices via the LAN or other network.

The magnetic disk device 920 stores an operating system 921 (OS), a window system 922, a program group 923, and a file group 924. The programs in the program group 923 are executed by the CPU 911, the operating system 921, and the window system 922.

The ROM 913 stores a BIOS (Basic Input Output System) program, and the magnetic disk device 920 stores a boot program.
When the unauthorized access detection device 100 starts up, the BIOS program in the ROM 913 and the boot program on the magnetic disk device 920 are executed, and these two programs start the operating system 921.

The program group 923 stores the programs that execute the functions described as "~ unit" in Embodiments 1 to 7. These programs are read and executed by the CPU 911.

The file group 924 stores, as items of a "~ file" or "~ database", the information, data, signal values, variable values, and parameters that represent the results of the processing described in Embodiments 1 to 7 as "determination of ~", "counting of ~", "count of ~", "comparison of ~", "update of ~", "setting of ~", "designation of ~", "selection of ~", and so on.
Each "~ file" and "~ database" is stored on a recording medium such as a disk or memory. The information, data, signal values, variable values, and parameters stored on such a medium are read into main memory or cache memory by the CPU 911 via a read/write circuit and used for CPU operations such as extraction, search, reference, comparison, computation, calculation, processing, editing, output, printing, and display.
During these CPU operations, the information, data, signal values, variable values, and parameters are temporarily held in main memory, registers, cache memory, buffer memory, and the like.
The arrows in the flowcharts described in Embodiments 1 to 7 mainly indicate the input and output of data and signals. The data and signal values are recorded on a recording medium such as the memory of the RAM 914, the flexible disk of the FDD 904, the compact disc of the CDD 905, the magnetic disk of the magnetic disk device 920, or another optical disc, MiniDisc, or DVD. Data and signals are also transmitted online via the bus 912, signal lines, cables, or other transmission media.

Anything described as a "~ unit" in Embodiments 1 to 7 may be a "~ circuit", "~ apparatus", or "~ device", and may also be a "~ step", "~ procedure", or "~ process". That is, what is described as a "~ unit" may be implemented as firmware stored in the ROM 913. Alternatively, it may be implemented in software only, in hardware only (elements, devices, boards, wiring, and the like), in a combination of software and hardware, or additionally in combination with firmware. Firmware and software are stored as programs on a recording medium such as a magnetic disk, flexible disk, optical disc, compact disc, MiniDisc, or DVD. The programs are read and executed by the CPU 911. In other words, the programs cause a computer to function as the "~ units" of Embodiments 1 to 7, or cause a computer to execute the procedures and methods of those "~ units".

As described above, the unauthorized access detection device 100 of Embodiments 1 to 7 is a computer equipped with a CPU as a processing device; memory, a magnetic disk, and the like as storage devices; a keyboard, a mouse, a communication board, and the like as input devices; and a display device, a communication board, and the like as output devices. It implements the functions described as "~ units" using these processing, storage, input, and output devices.

FIG. 1 is a diagram showing a configuration example of the unauthorized access detection device according to Embodiments 1 to 7.
FIG. 2 is a diagram showing an example of a network device log according to Embodiment 1.
FIG. 3 is a flowchart showing an operation example of the per-port aggregation unit according to Embodiment 1.
FIG. 4 is a diagram showing an example of time-series data of per-port access counts according to Embodiment 1.
FIG. 5 is a diagram showing an example of port range division according to Embodiment 1.
FIG. 6 is a flowchart showing an operation example of the port range division unit according to Embodiment 1.
FIG. 7 is a diagram showing an example of time-series data of merged access counts according to Embodiment 1.
FIG. 8 is a diagram showing an example of detecting an increase in access count at a merge-handled port according to Embodiment 1.
FIG. 9 is a diagram showing an example of port range division according to Embodiment 4.
FIG. 10 is a diagram showing an example of port range division according to Embodiment 5.
FIG. 11 is a flowchart showing an operation example in the preparation stage of the unauthorized access detection device according to Embodiments 1 to 7.
FIG. 12 is a flowchart showing an operation example in the detection stage of the unauthorized access detection device according to Embodiments 1 to 7.
FIG. 13 is a diagram showing a hardware configuration example of the unauthorized access detection device according to Embodiments 1 to 7.

Explanation of symbols

100: unauthorized access detection device; 101: data acquisition unit; 102: per-port aggregation unit; 103: port range division unit; 104: analysis unit; 105: anomaly detection unit; 106: steady-range data definition unit; 150: network device log; 151: per-port count; 152: per-port time-series data; 153: merged-port count; 154: merged-port time-series data.

Claims

1. An unauthorized access detection device comprising:
a steady-state appearance count unit that receives steady-state communication log data containing a plurality of communication identifiers and counts, for each communication identifier, the number of appearances in the steady-state communication log data;
a communication identifier designation unit that, based on the counting result of the steady-state appearance count unit, designates communication identifiers with many appearances as individual-handling communication identifiers and communication identifiers with few appearances as summation-handling communication identifiers;
a detection-target appearance count unit that receives communication log data subject to unauthorized access detection containing a plurality of communication identifiers and counts, for each communication identifier, the number of appearances in that communication log data;
an appearance count summation unit that sums, among the appearance counts obtained by the detection-target appearance count unit, the appearance counts of the communication identifiers designated as summation-handling communication identifiers; and
an unauthorized access detection analysis unit that performs unauthorized access detection analysis for the communication identifiers designated as individual-handling communication identifiers using the per-identifier appearance counts obtained by the detection-target appearance count unit, and performs unauthorized access detection analysis for the communication identifiers designated as summation-handling communication identifiers using the sum of appearance counts calculated by the appearance count summation unit.

2. The unauthorized access detection device according to claim 1, wherein
the communication identifier designation unit classifies the plurality of communication identifiers designated as summation-handling communication identifiers into two or more groups,
the appearance count summation unit sums, for each group of summation-handling communication identifiers, the appearance counts of the communication identifiers classified into the same group, and
the unauthorized access detection analysis unit performs the unauthorized access detection analysis for the communication identifiers designated as summation-handling communication identifiers for each group, using the sum of the appearance counts of the communication identifiers classified into the same group.

3. The unauthorized access detection device according to claim 2, wherein
the unauthorized access detection analysis unit is capable of processing a plurality of unauthorized access detection analyses in parallel, and
the communication identifier designation unit classifies the plurality of communication identifiers designated as summation-handling communication identifiers into a number of groups determined from the number of unauthorized access detection analyses that the unauthorized access detection analysis unit can process in parallel and the number of individual-handling communication identifiers.

4. The unauthorized access detection device according to claim 2 or 3, wherein
the communication identifier designation unit classifies the plurality of communication identifiers designated as summation-handling communication identifiers into groups in descending order of appearance count.

5. The unauthorized access detection device according to claim 2 or 3, wherein
the communication identifier designation unit randomly selects communication identifiers from among the plurality of communication identifiers designated as summation-handling communication identifiers and classifies them into groups in the order in which they are selected.

6. The unauthorized access detection device according to claim 1 or 2, wherein
the communication identifier designation unit, based on the counting result of the steady-state appearance count unit, designates communication identifiers as individual-handling communication identifiers in descending order of appearance count, and, when the total appearance count of the communication identifiers designated as individual-handling communication identifiers reaches a predetermined ratio of the total appearance count of communication identifiers in the steady-state communication log data, designates the communication identifiers not designated as individual-handling communication identifiers as summation-handling communication identifiers.

7. The unauthorized access detection device according to claim 1 or 2, wherein
the communication identifier designation unit, based on the counting result of the steady-state appearance count unit, designates communication identifiers as individual-handling communication identifiers in descending order of appearance count, and designates as summation-handling communication identifiers a communication identifier whose appearance count differs from that of the communication identifier most recently designated as an individual-handling communication identifier by a predetermined number or less, together with the communication identifiers whose appearance counts are lower than that of said communication identifier.

8. The unauthorized access detection device according to claim 2, wherein
the communication identifier designation unit determines the number of groups of summation-handling communication identifiers using simulation data that simulates the effect of unauthorized access.

9. The unauthorized access detection device according to any one of claims 1 to 8, wherein
the steady-state appearance count unit receives steady-state communication log data containing a plurality of destination port numbers as the communication identifiers and counts, for each destination port number, the number of appearances in the steady-state communication log data,
the communication identifier designation unit, based on the counting result of the steady-state appearance count unit, designates destination port numbers with many appearances as individual-handling communication identifiers and destination port numbers with few appearances as summation-handling communication identifiers, and
the detection-target appearance count unit receives communication log data subject to unauthorized access detection containing a plurality of destination port numbers as the communication identifiers and counts, for each destination port number, the number of appearances in that communication log data.

10. The unauthorized access detection device according to any one of claims 1 to 8, wherein
the steady-state appearance count unit receives steady-state communication log data containing a plurality of source addresses as the communication identifiers and counts, for each source address, the number of appearances in the steady-state communication log data,
the communication identifier designation unit, based on the counting result of the steady-state appearance count unit, designates source addresses with many appearances as individual-handling communication identifiers and source addresses with few appearances as summation-handling communication identifiers, and
the detection-target appearance count unit receives communication log data subject to unauthorized access detection containing a plurality of source addresses as the communication identifiers and counts, for each source address, the number of appearances in that communication log data.

11. The unauthorized access detection device according to any one of claims 1 to 10, wherein
the unauthorized access detection analysis unit performs, as the unauthorized access detection analysis, principal component analysis on the temporal variation in the appearance counts of communication identifiers.

12. An unauthorized access detection method comprising:
a steady-state appearance counting step in which a computer receives steady-state communication log data containing a plurality of communication identifiers and counts, for each communication identifier, the number of appearances in the steady-state communication log data;
a communication identifier designation step in which the computer, based on the counting result of the steady-state appearance counting step, designates communication identifiers with many appearances as individual-handling communication identifiers and communication identifiers with few appearances as summation-handling communication identifiers;
a detection-target appearance counting step in which the computer receives communication log data subject to unauthorized access detection containing a plurality of communication identifiers and counts, for each communication identifier, the number of appearances in that communication log data;
an appearance count summation step in which the computer sums, among the appearance counts obtained in the detection-target appearance counting step, the appearance counts of the communication identifiers designated as summation-handling communication identifiers; and
an unauthorized access detection analysis step in which the computer performs unauthorized access detection analysis for the communication identifiers designated as individual-handling communication identifiers using the per-identifier appearance counts obtained in the detection-target appearance counting step, and performs unauthorized access detection analysis for the communication identifiers designated as summation-handling communication identifiers using the sum of appearance counts calculated in the appearance count summation step.

13. A program that causes a computer to execute:
a steady-state appearance counting process that receives steady-state communication log data containing a plurality of communication identifiers and counts, for each communication identifier, the number of appearances in the steady-state communication log data;
a communication identifier designation process that, based on the counting result of the steady-state appearance counting process, designates communication identifiers with many appearances as individual-handling communication identifiers and communication identifiers with few appearances as summation-handling communication identifiers;
a detection-target appearance counting process that receives communication log data subject to unauthorized access detection containing a plurality of communication identifiers and counts, for each communication identifier, the number of appearances in that communication log data;
an appearance count summation process that sums, among the appearance counts obtained by the detection-target appearance counting process, the appearance counts of the communication identifiers designated as summation-handling communication identifiers; and
an unauthorized access detection analysis process that performs unauthorized access detection analysis for the communication identifiers designated as individual-handling communication identifiers using the per-identifier appearance counts obtained by the detection-target appearance counting process, and performs unauthorized access detection analysis for the communication identifiers designated as summation-handling communication identifiers using the sum of appearance counts calculated by the appearance count summation process.
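
The grouping of the identifiers designated for summing, which the dependent claims recite in several variants, can be illustrated with the following sketch. It is an illustration under stated assumptions rather than the claimed implementation: the function names are hypothetical, splitting into contiguous near-equal chunks is only one possible reading of classifying "in descending order" or in randomly selected order, and the rule that the group count equals the number of parallel analyses minus the number of individually handled identifiers is an assumed example of a count "determined from" those two numbers.

```python
import random


def group_count(parallel_analyses, n_individual):
    """Assumed rule: analysis slots left over after individual identifiers."""
    return max(1, parallel_analyses - n_individual)


def split_into_groups(merged_counts, n_groups, order="descending", seed=None):
    """Split the identifiers designated for summing into n_groups chunks.

    order="descending": identifiers sorted by steady-state count, high first.
    order="random": identifiers shuffled (seeded for reproducibility).
    Chunks are contiguous and near-equal in size (an assumption).
    """
    ids = sorted(merged_counts, key=lambda i: (-merged_counts[i], i))
    if order == "random":
        random.Random(seed).shuffle(ids)
    size, extra = divmod(len(ids), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)
        groups.append(ids[start:end])
        start = end
    return groups


def sum_by_group(slot_counts, groups):
    """Total appearance count per group for one time slot."""
    return [sum(slot_counts.get(i, 0) for i in grp) for grp in groups]


# Steady-state counts of low-traffic destination ports (illustrative values).
merged_counts = {135: 5, 445: 4, 1433: 3, 8080: 2, 3389: 1}

n = group_count(parallel_analyses=10, n_individual=8)
groups = split_into_groups(merged_counts, n)

# Port 80 is assumed to be individually handled, so no group includes it.
totals = sum_by_group({445: 7, 3389: 2, 80: 100}, groups)
```

Each entry of `totals` would then form one point of the per-group time series that is passed to the analysis, in place of a separate series for every low-traffic identifier.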