JP2017143578A

JP2017143578A - Path scanning for detection of anomalous subgraphs and use of DNS requests and host agents for anomaly/change detection and network situational awareness

Info

Publication number: JP2017143578A
Application number: JP2017088048A
Authority: JP
Inventors: チャールズニールジョシュア; Charles Neil Joshua; エドワードフィスクマイケル; Edward Fisk Michael; ウィリアムブラフアレクサンダー; William Brugh Alexander; リーハッシュ，ジュニアカーティス; Lee Hash Curtis Jr; バイロンストーリーカーティス; Byron Storlie Curtis; アップホフベンジャミン; Upoff Benjamin; ケントアレクサンダー; Kent Alexander
Original assignee: Los Alamos National Security LLC
Current assignee: Los Alamos National Security LLC
Priority date: 2012-03-22
Filing date: 2017-04-27
Publication date: 2017-08-17
Anticipated expiration: 2033-03-14
Also published as: AU2016234999B2; AU2017254815A1; WO2013184206A3; WO2013184206A2; AU2018203393B2; WO2013184211A2; AU2013272215A1; CN104303152A; CN104303153A; US9825979B2; EP2828753A4; AU2019216687B2; US9038180B2; AU2017254815B2; CA2868076C; WO2013184211A3; US20160277433A1; EP3522492A1; US10530799B1; JP2015511101A

Abstract

PROBLEM TO BE SOLVED: To provide a system, apparatus, computer-readable medium, and computer-implemented method for detecting anomalous behavior in a network.

SOLUTION: Historical parameters of the network are determined in order to establish normal activity levels. A plurality of paths in the network are enumerated as part of a graph representing the network. Each computing system in the network may be a node in the graph, and the sequence of connections between two computing systems may be a directed edge in the graph. A statistical model is applied to the plurality of paths in the graph on a sliding-window basis to detect anomalous behavior. Data collected by a Unified Host Collection Agent (UHCA) may also be used to detect anomalous behavior.

SELECTED DRAWING: Figure 4

Description

The present invention relates generally to the detection of network intrusions, anomalies, and policy violations, and more particularly to detecting them by path scanning for anomalous subgraphs embedded in time-evolving graphs. It further relates to the use of Domain Name Service (DNS) requests for situational awareness and anomaly/change detection in computer networks.

The United States government has rights in this invention pursuant to Contract No. DE-AC52-06NA25396 between the United States Department of Energy and Los Alamos National Security, LLC for the operation of Los Alamos National Laboratory.

This application claims priority to U.S. Provisional Patent Application No. 61/614,148, filed March 22, 2012. The content of this earlier-filed provisional application is hereby incorporated by reference in its entirety.

Modern computer hacking poses a serious threat to companies, government agencies, and other organizations. Hackers generally gain entry to a system by automated means. For example, a hacker may send a phishing email to an organization; when a user clicks the link, malware compromises the machine. This gives the hacker control of the compromised machine and thus a foothold in the network on which that machine resides.

Hackers cannot choose which machines are compromised, and therefore cannot choose where in the network they land. From the point of initial compromise, a hacker typically moves through the network looking for additional hosts to exploit. Because no single user generally has access to the entire network, the hacker must traverse multiple machines to compromise the network fully. Hackers often look for multi-user machines and use the compromised account to gain access, penetrating further into the network.

Conventional methods for detecting malicious insiders in a computer network generally do not capture "traversal" well. Traversal occurs when a hacker moves through the network, infiltrating systems and using each compromised system to compromise further hosts. Host-based detection systems that monitor individual machines are fairly mature, and intrusion detection at the firewall is well studied, but methods that examine multiple hops inside the security perimeter while simultaneously looking for anomalies remain largely unexplored. Moreover, network traffic monitoring is generally performed with elaborate systems such as network taps, router mirror ports, and router-based flow observation. This approach is expensive and cannot capture all traffic within the network.

Certain embodiments of the present invention may provide solutions to problems and needs in the art that have not yet been fully identified, appreciated, or solved by current intrusion, anomaly, and policy violation detection technologies. For example, some embodiments use scan statistics to detect locally anomalous subgraphs, using DNS requests from which communication patterns within the network can be inferred. Some embodiments can be applied to any type of graph that has time-series data on each edge. Dynamic social network analysis (e.g., Twitter®, Facebook®, email networks) may be amenable to this kind of analysis, as may other suitable graph structures such as those found in biology. Some embodiments may therefore have applications outside cyber security.

In one embodiment, a computer-implemented method includes determining historical parameters of a baseline statistical model for each "edge" of a network (i.e., each pair of communicating machines) in order to determine normal activity levels. The method also includes enumerating a plurality of paths in the network as part of a graph representing the network, where each computing system in the network may be a node in the graph and the sequence of connections between two computing systems may be a directed edge in the graph. The method further includes applying these baseline, or statistical, models to paths formed from the edges of the graph under observation on a sliding-window basis, and detecting anomalous behavior based on the applied statistical models.

In another embodiment, an apparatus includes at least one processor and memory including instructions. The instructions, when executed by the at least one processor, are configured to cause the at least one processor to determine historical parameters of a network in order to determine normal activity levels. The instructions are also configured to cause the at least one processor to enumerate a plurality of paths in the network as part of a graph representing the network, where each computing system in the network may be a node in the graph and the sequence of connections between two computing systems may be a directed edge in the graph. The instructions are further configured to cause the at least one processor to apply a statistical model to the graph on a sliding-window basis and to detect anomalous behavior based on the applied statistical model.

In yet another embodiment, a system includes memory storing computer program instructions configured to detect anomalous behavior in a network, and a plurality of processing cores configured to execute the stored computer program instructions. The processing cores are configured to determine historical parameters of the network in order to determine normal activity levels. The processing cores are also configured to enumerate a plurality of paths in the network as part of a graph representing the network, where each computing system in the network may be a node in the graph and the sequence of connections between two computing systems may be a directed edge in the graph. The processing cores are further configured to apply a statistical model to the graph on a sliding-window basis and to detect anomalous behavior based on the applied statistical model.

In still another embodiment, a computer-implemented method includes collecting, by a computing system, data from a plurality of host agents pertaining to network communications sent and received by the respective hosts in the network. The method also includes analyzing, by the computing system, the collected data to detect anomalous behavior during a predetermined time period and, when anomalous behavior is detected, providing an indication that anomalous behavior occurred during that period.

For a proper understanding of the invention, reference should be made to the accompanying drawings. These drawings depict only some embodiments of the invention and do not limit its scope.

A diagram showing a common initial stage of an attack by a hacker (FIG. 1A).

A diagram showing the second stage of an attack by a hacker (FIG. 1B).

A diagram showing the fourth stage of an attack by a hacker (FIG. 1C).

A diagram showing a system for detecting intrusions, anomalies, and policy violations, according to an embodiment of the present invention (FIG. 2).

A diagram showing an out-star.

A flowchart showing a method for detecting anomalous behavior on a network, according to an embodiment of the present invention.

A path diagram showing a path generated using only name-edges, according to an embodiment of the present invention.

A path diagram showing a path generated using only IP-edges, according to an embodiment of the present invention.

A path diagram showing a path that begins with three name-edges and ends with an IP-edge, according to an embodiment of the present invention.

A path diagram showing a path with alternating name-edges and IP-edges, according to an embodiment of the present invention.

A flowchart of a method for using a UHCA to collect data pertaining to anomalies, according to an embodiment of the present invention.

Some embodiments of the present invention examine paths through a network, where a path is a series of interconnected computing systems. In the graph, a "node" represents a computing system and an "edge" represents a sequence of connections between two computing systems. Examining paths over time has proven effective, in some embodiments, at detecting anomalous actors carrying out traversal missions. In general, a stochastic model is built for each edge in the network. A statistical test compares the parameters estimated in a given time window under consideration against the model's historical parameters. A deviation from the historical parameters that exceeds a threshold, tuned according to a user-specified alarm rate, may indicate an anomalous path.
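The per-edge test described above can be sketched as follows. This is a minimal illustration only: it assumes a Poisson model for the connection counts on an edge, which is not stated in this passage, and the names `poisson_upper_tail`, `edge_is_anomalous`, `historical_rate`, and `threshold` are hypothetical.

```python
import math


def poisson_upper_tail(count, mean):
    """Return P(X >= count) for X ~ Poisson(mean)."""
    if count <= 0:
        return 1.0
    term = math.exp(-mean)  # P(X = 0)
    cdf = term
    for i in range(1, count):
        term *= mean / i  # P(X = i) computed from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def edge_is_anomalous(window_counts, historical_rate, threshold):
    """Compare the counts seen on one edge in a time window with its history.

    window_counts   -- connection counts per time bin in the current window
    historical_rate -- assumed historical mean count per time bin for the edge
    threshold       -- p-value cut-off, tuned to the desired alarm rate
    Returns (p_value, flag).
    """
    expected = historical_rate * len(window_counts)
    p_value = poisson_upper_tail(sum(window_counts), expected)
    return p_value, p_value < threshold
```

An edge whose window total is close to its historical expectation yields a large p-value, while a burst well above the historical rate yields a p-value near zero and is flagged.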

Some embodiments detect anomalous activity in sets of edges linked together in k-paths. A k-path may be a sequence of k directed edges in a graph such that the destination of the first edge is the source of the second edge, the destination of the second edge is the source of the third edge, and so on. Data are associated with each edge; in some embodiments, these data may be the number of connections per unit time between hosts on a computer network. All k-paths (for some fixed number k) may be enumerated, and the data examined using sliding windows of time. A stochastic model may be built for each path, and the historical parameters compared with the parameters currently estimated in the time window to determine the level of anomalousness.
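The enumeration of k-paths can be sketched with a depth-first search. The function name `enumerate_k_paths` and the rule that a node may not repeat within a path are assumptions of this sketch, not details taken from the text.

```python
from collections import defaultdict


def enumerate_k_paths(edges, k):
    """Return every directed path consisting of exactly k edges.

    edges -- iterable of (source, destination) pairs
    Each path is returned as a tuple of k + 1 nodes. Nodes are not allowed
    to repeat within a path (an assumption of this sketch).
    """
    successors = defaultdict(list)
    for src, dst in edges:
        if dst not in successors[src]:  # ignore duplicate edges
            successors[src].append(dst)

    paths = []

    def extend(path):
        if len(path) == k + 1:
            paths.append(tuple(path))
            return
        # .get() avoids creating entries for nodes with no outgoing edges
        for nxt in successors.get(path[-1], ()):
            if nxt not in path:
                path.append(nxt)
                extend(path)
                path.pop()

    for start in list(successors):
        extend([start])
    return paths
```

For example, with edges a→b, b→c, c→d, and b→d, the 2-paths are (a, b, c), (a, b, d), and (b, c, d). The number of k-paths can grow quickly with k and with node degree, which is one reason a small fixed k is natural here.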

Identifying anomalies in computer networks is generally a difficult and complex problem. Anomalies often occur in very local regions of the network. Locality can be complicated in this setting because of the underlying graph structure. To identify local anomalies, scan statistics may be applied to data extracted from the edges of the graph over time. Two shapes are particularly useful for capturing locality in the graph: stars and the k-paths described above. Using paths as scan windows is novel. Both shapes are motivated by hacker behavior observed in real network attacks.

To identify local anomalies, these shapes may be enumerated over the entire graph using a set of sliding time windows. The local statistics in each window may be compared with historical behavior to capture anomalies. These local statistics may be model-based; to help illustrate the scanning procedure, two models motivated by network flow data and used by some embodiments are described herein by way of example. The data rates of large networks generally require online detection to be fast, so it is desirable for an anomaly detection system to achieve real-time analysis speeds.

Once an attacker is inside the network, detecting that attacker is generally a high priority in cyber security for nations and many organizations. Keeping all attackers out of the network is very difficult, if not impossible. Traversal within the network is very common among network attacks and is a core requirement of many large-scale missions that attackers may wish to accomplish, particularly missions carried out on behalf of nation states. Some embodiments of the present invention show promise in detecting traversal and have a tunable false-positive parameter available to the system operator. Some embodiments are also designed to run in real time, providing rapid detection when an attack occurs. Another core part of some embodiments is a set of forensic tools that allow an analyst to explore an attacker's traversal fully and to identify compromised hosts.

In addition to detecting anomalous paths, some embodiments of the present invention observe DNS requests, which are precursors to network traffic, and infer the subsequent network traffic from those requests. This inferred traffic can then be used as a reliable data source for network anomaly/change detection tools, including tools for network reconnaissance, network situational awareness, and the subgraph detection described with respect to some embodiments of the present invention. In many organizations, one or two aggregation points handle all DNS requests. As a result, the data feed is generally smaller and easier to capture than data obtained from other common network collection mechanisms such as routers or network tap aggregators. Moreover, DNS generally provides more complete coverage of connection-level traffic, since the alternative of tapping every router is prohibitively expensive and router taps are generally subject to congestion-based sampling. Even traffic within a subnet that cannot be seen by routers or taps can often be inferred from DNS requests. This can be important for anomaly detection, since it is not unusual for hackers to remain within a subnet.
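As a rough illustration of turning DNS requests into edge data, the sketch below counts requests per (client, queried host) pair in fixed time bins. The record layout `(timestamp, client, queried_host)`, the function name `inferred_edge_counts`, and the 60-second default bin are all hypothetical; the sketch also ignores DNS caching, which in practice means that not every connection is preceded by a visible request.

```python
from collections import Counter


def inferred_edge_counts(dns_records, bin_seconds=60):
    """Count DNS requests per inferred directed edge and time bin.

    dns_records -- iterable of (timestamp_seconds, client, queried_host)
                   tuples; this layout is an assumption of the sketch
    Returns a Counter keyed by (client, queried_host, time_bin).
    """
    counts = Counter()
    for timestamp, client, queried_host in dns_records:
        time_bin = int(timestamp // bin_seconds)
        counts[(client, queried_host, time_bin)] += 1
    return counts
```

The resulting per-edge, per-bin counts have the same shape as the connection counts used elsewhere in this description, so they could feed the same windowed analysis.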

For clarity, an anomalous attack scenario by a hacker that some embodiments can detect is described. FIG. 1A shows a common initial stage 100 of a hacker attack. The hacker may achieve the initial attack by compromising a machine 102 on the network using malicious software. The compromised machine 102 is connected to auxiliary machines 104 that are not on the traversal path. These machines are not necessarily clean, but they are not used for subsequent traversal in this example. One method of initially compromising a network is the phishing attack, in which an email containing a link to a malicious website is sent to a set of users on the network. When a user clicks the link, the user's computing system is compromised, giving the attacker some form of access to it.

An attacker generally cannot dictate which computing system is compromised, and the initial host is usually not the ultimate target of the attack, if there is an ultimate target at all. Instead, the hacker may wish to move to other computing systems in order to locate and exfiltrate valuable data, escalate privileges, and/or establish a broad presence in the network for later exploitation and/or for resilience in the face of defensive measures taken by network operators. From the initial host, the attacker may therefore proceed to other hosts, hopping from one to the next. FIG. 1B shows a second stage 110 of the attack. Here, a second computing system 102 has been compromised, and the compromised computing systems 102 are connected by a single edge 112. FIG. 1C shows a fourth stage 120 of the attack, in which four computing systems 102 have been compromised and are connected by a path 122.

As the attacker traverses the network, anomalous activity is created in the time series of communications along each edge traversed. That is, additional communications will generally be seen on top of the historically normal level of communication for each edge. In some embodiments, the union of these anomalous edges within some time interval is detected, and this union may describe the intrusion into the system.

FIG. 2 illustrates a computing system, or "system," 200 for detecting intrusions, anomalies, and policy violations, according to an embodiment of the present invention. System 200 includes a bus 205 or other communication mechanism for communicating information, and a processor 210 coupled to bus 205 for processing information. Processor 210 may be any type of general-purpose or special-purpose processor, including a central processing unit (CPU) or an application-specific integrated circuit (ASIC). Processor 210 may also have multiple processing cores, at least some of which may be configured to perform specific functions. Some embodiments may use a multi-core, single-machine approach known as symmetric multiprocessing (SMP). Other embodiments may be implemented across multiple machines, each of which may have multiple cores; this approach is referred to as Message Passing Interface (MPI). System 200 further includes memory 215 for storing information and instructions to be executed by processor 210. Memory 215 may comprise any combination of random access memory (RAM), read-only memory (ROM), flash memory, cache, static storage such as a magnetic or optical disk, or any other type of non-transitory computer-readable medium. In addition, system 200 includes a communication device 220, such as a wireless network interface, that provides access to a network.

Non-transitory computer-readable media may be any available media that can be accessed by processor 210 and may include both volatile and non-volatile media, removable and non-removable media, and communication media. Communication media may include computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media.

Processor 210 is further coupled via bus 205 to a display 225, such as a liquid crystal display (LCD), for displaying information to a user. A keyboard 230 and a cursor control device 235, such as a computer mouse, are further coupled to bus 205 to enable the user to interact with system 200.

In one embodiment, memory 215 stores software modules that provide functionality when executed by processor 210. The modules include an operating system 240 for system 200. The modules further include a detection module 245 configured to detect intrusions, anomalies, and policy violations. System 200 may include one or more additional functional modules 250 that provide additional functionality.

Those skilled in the art will appreciate that a "system" could be embodied as a personal computer, a server, a console, a personal digital assistant (PDA), a cell phone, any other suitable computing device, or a combination of devices. Presenting the above-described functions as being performed by a "system" is not intended to limit the scope of the present invention in any way, but to provide one example of its many embodiments. Indeed, the methods, systems, and apparatuses disclosed herein may be implemented in localized and distributed forms consistent with computing technology.

It should be noted that some of the system features described in this specification have been presented as modules in order to emphasize their implementation independence. For example, a module may be implemented as a custom very-large-scale integration (VLSI) circuit, or with off-the-shelf semiconductors such as gate arrays, logic chips, transistors, or discrete components. A module may also be implemented in programmable hardware devices such as field-programmable gate arrays, programmable array logic, programmable logic devices, graphics processing units, and the like.

A module may also be implemented, at least in part, in software for execution by various types of processors. An identified unit of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions, which may be organized as an object, procedure, or function. Nevertheless, the executable code of an identified module need not be physically located together; it may comprise disparate instructions stored in different locations which, when joined logically together, constitute the module and achieve its stated purpose. Further, a module may be stored on a medium such as a hard disk drive, flash device, RAM, tape, or any other medium used to store data.

Indeed, a module of executable code may be a single instruction or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Similarly, operational data may be identified and illustrated herein within modules, may be embodied in any suitable form, and may be organized within any suitable type of data structure. The operational data may be collected as a single data set or distributed over different locations, including different storage devices, and may exist, at least in part, merely as electronic signals on a system or network.

When a hacker enters the network, path and star anomalies may be observed. A star anomaly arises when the hacker uses a compromised computing system to connect to other computing systems it can access, generating anomalies on multiple edges emanating from the compromised host.
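The star shape can be illustrated by grouping directed edges by their source node. The function name `enumerate_out_stars` and the `min_out_degree` parameter are hypothetical choices made for this sketch.

```python
from collections import defaultdict


def enumerate_out_stars(edges, min_out_degree=2):
    """Group directed edges into out-stars keyed by their central node.

    edges          -- iterable of (source, destination) pairs
    min_out_degree -- smallest number of outgoing edges for a node to be
                      treated as the center of a star (an assumption)
    Returns {center: frozenset of (center, destination) edges}.
    """
    out_edges = defaultdict(set)
    for src, dst in edges:
        out_edges[src].add((src, dst))
    return {
        center: frozenset(star)
        for center, star in out_edges.items()
        if len(star) >= min_out_degree
    }
```

Each star can then be treated as one scan window in graph space: the edges in the star are examined together over a time window, so that simultaneous anomalies on several edges leaving the same host reinforce one another.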

A path anomaly may indicate a more subtle attack, consisting of a sequence of traversals from each host in the path to the next. A caterpillar anomaly is a mixture of stars and paths. This approach is designed to monitor a computer network in real time, and any scheme applied to computer network data at the enterprise level (20,000 or more individual Internet Protocol (IP) addresses) needs to be fast. In addition, to identify highly local anomalies, the system generally must monitor many small windows simultaneously. Some embodiments of the present invention can examine a large number of local objects in an enterprise-scale network in real time.

Windows in the Cross Product Space

It is useful to examine windows in the time × graph product space. These sets of windows may be defined as follows. Let there be a graph $G = (V, E)$ with node set $V$ and edge set $E$. For each edge $e \in E$, there is a data process $X_e(t)$ at discrete time points $t \in \{1, \ldots, T\}$. For discretized time intervals $(s, s+1, \ldots, k)$, the set of time windows on edges $e$ can be written $\Omega = \{[e, (s, s+1, \ldots, k)] : e \in E,\ 0 \le s \le k \le T\}$. The set of all subsets of windows, $\Gamma = \{\{w_1, w_2, \ldots\} : w_j \in \Omega\}$, is usually very large, and only a subset $\Gamma_x \subset \Gamma$ that incorporates locality constraints in time and in graph space is generally of interest. Attention is therefore generally restricted to sets of windows $\gamma \in \Gamma_x$. $\Gamma_x$ is usually problem-dependent. For convenience, $X(\gamma)$ denotes the data in the windows given by $\gamma$.

For any time point t and edge e, it may be assumed that X_e(t) can be described by a stochastic process with parameter function θ_e(t). The values of the parameter functions evaluated in the corresponding set of windows γ are denoted θ(γ). Finally, the likelihood of the stochastic process on γ can be written L(θ(γ) | X(γ)).

Scan Statistics for Windows in the Time × Graph Space

It is useful to know whether the data in a window were generated by a known parameter function θ̂(γ), as opposed to alternatives indicating that the parameters have changed. That is, given the observation X(γ) = x(γ), it is useful to test H_0: θ(γ) = θ̂(γ) against alternatives that can be formed by restricting the overall parameter space Θ to a subset Θ_A ⊂ Θ. The generalized likelihood ratio test (GLRT) statistic is a natural statistic to use. Let

$$\lambda_\gamma = -2 \log \frac{L(\hat\theta(\gamma) \mid x(\gamma))}{\sup_{\theta \in \Theta_A} L(\theta \mid x(\gamma))}.$$

The size of λ_γ depends on the number of parameters tested in the window, which makes it difficult to use directly. To address this, λ_γ can be normalized by converting it to a p-value p_γ.

To scan for anomalies in the (time × graph) product space, it is generally necessary to slide over all windows γ and record the scan statistic φ = min_γ p_γ. In practice, the set of p-values generally needs to be thresholded so that more than just the minimum p-value is considered. For online monitoring, a threshold on the p-values can be set to control the false discovery rate. The higher the threshold, the more anomalies are identified, but the more false positives as well. In general, the threshold should be set so that the analysts running the monitoring software are not overwhelmed. When a detection occurs, generally a set of windows (not just one) exceeds the threshold, and the union of these windows is the detected anomaly produced by the system.
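The scan step above can be illustrated with a minimal Python sketch. It assumes the per-window p-values have already been computed and stored in a dictionary; the function name `scan_windows`, the window identifiers, and the threshold value are hypothetical and chosen only for illustration.

```python
def scan_windows(p_values, threshold):
    """Return the scan statistic and the windows flagged by the threshold.

    p_values: dict mapping a window identifier to its p-value (assumed input).
    A window is flagged when its p-value falls below the threshold.
    """
    phi = min(p_values.values())  # scan statistic: minimum p-value over windows
    flagged = sorted(w for w, p in p_values.items() if p < threshold)
    return phi, flagged


# Hypothetical p-values for four windows.
p_values = {"w1": 0.40, "w2": 0.003, "w3": 0.0007, "w4": 0.12}
phi, flagged = scan_windows(p_values, threshold=0.01)
```

The union of the flagged windows, rather than the single minimizing window, would then be reported as the detected anomaly.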

Local Shapes: Stars and Directed k-Paths

The approach described above can be used for batch (retrospective) or online (prospective) processing. However, graphs are inherently combinatorial. For a fully connected graph with n nodes, the number of subgraphs is 2^{n(n−1)}. For practical applications, especially in an online setting, it is advantageous to use a restricted set of graph windows rather than this very large number of subgraphs. The windows are constructed so as to be appropriate for identifying specific anomaly shapes.

Directed k-Paths

One common intrusion example involves a hacker traversing a computer network, for which a particular kind of subgraph, the directed k-path, is especially well suited to online monitoring. A directed k-path is a subgraph of size k with diameter k. Here, size is the number of edges in the graph, and diameter is the greatest hop distance between any pair of nodes. Informally, this means that a k-path is a sequence of edges in which the destination node of the current edge is the source node of the next edge in the sequence, and so on.

The k-path has the advantage that it captures the core of many network attacks, since an attack can be described by a path through the network with additional edges of "fuzz" around the core path. This attack shape has been observed in actual attacks. In addition, the k-path is highly local, allowing small anomalies to be detected.

In some embodiments, 3-paths are used. 3-paths have the advantage of locality while being large enough to capture significant traversals. To scan all 3-paths in the network graph, the paths are first enumerated. For many graphs this is nontrivial. In a fully connected graph with n nodes, after eliminating cycles and back edges, there are n(n−1)(n−2)(n−3) 3-paths.

In practice, network graphs are generally much less connected. Nevertheless, in a 30-second time window, including only edges with nonzero activity in the window, a graph with approximately 17,000 nodes, 90,000 edges, and 300 million 3-paths was obtained in an example embodiment. While the entire set of n(n−1)(n−2)(n−3) potential 3-paths is in effect scanned, an anomaly measure is generally not computed for any path containing an edge with no activity in the current time window. Since a hacker usually needs to make at least one communication to traverse an edge, no activity on an edge indicates no traversal over that edge, and a path containing that edge is therefore not considered anomalous (in the time window of interest).

Because there are so many 3-paths, it is important to be able to enumerate paths quickly in order to maintain near-real-time response capability. An algorithm for enumerating k-paths is given below. Parallelism is achieved by distributing the edges in the for-each loop of ENUMERATE across a Message Passing Interface (MPI) based cluster. Each MPI node then recursively computes, from the edge list, all paths beginning with its assigned edge. In this example, an edge A is a list of length 2, where A[1] is the source node and A[2] is the destination node.

function ENUMERATE(E, K):
    // E = list of edges representing the graph
    // K = integer length of the paths to enumerate
    for each edge A in E:                 // A is an edge in the graph
        list P[1] = A                     // A becomes the first edge in the path
        RECURSE(E, P, 1, K)               // recursively append additional edges

function RECURSE(E, P, L, K):
    // E = list of edges representing the graph
    // P = list of edges representing the path
    // L = integer length of P
    // K = integer length of the paths to enumerate
    edge A = P[L]                         // A is the last edge in the path
    for each edge B in E:                 // B is an edge in the graph
        if A[2] == B[1] then:
            P[L+1] = B                    // B becomes the last edge in the path
            if L+1 == K:
                EMIT(P)                   // a k-path has been found
            else:
                RECURSE(E, P, L+1, K)     // recursively append additional edges
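The enumeration pseudocode above can be sketched in runnable Python as follows. This is an illustrative version, not the implementation used in the embodiments: edges are assumed to be `(source, destination)` tuples with 0-based indexing, and the sketch swaps the inner scan over all of E for a lookup table keyed by source node, which is an added optimization. Like the pseudocode, it does not filter out cycles or back edges. The function name `enumerate_paths` is hypothetical.

```python
from collections import defaultdict


def enumerate_paths(edges, k):
    """Yield every directed k-path as a tuple of k (source, destination) edges."""
    out_edges = defaultdict(list)  # index edges by their source node
    for edge in edges:
        out_edges[edge[0]].append(edge)

    def recurse(path):
        if len(path) == k:
            yield tuple(path)  # a k-path has been found
            return
        last_destination = path[-1][1]
        # Only edges that start where the current path ends can extend it.
        for nxt in out_edges.get(last_destination, []):
            yield from recurse(path + [nxt])

    for first in edges:  # this outer loop is what could be split across workers
        yield from recurse([first])


edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
two_paths = list(enumerate_paths(edges, 2))
three_paths = list(enumerate_paths(edges, 3))
```

Because the generator yields one path at a time, memory use stays proportional to the path length rather than to the number of paths.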

This algorithm uses very little memory and is easily parallelized. In some real-time simulations, 30-minute windows consisting of approximately 300 million paths could be enumerated and tested in less than 5 seconds per window on a 48-core commodity machine. This leaves room to add complexity to the models and to handle graphs larger than the already sizable graph currently analyzed, while keeping up with the real-time data stream.

Stars

As shown in the out-star 300 of FIG. 3, stars are another interesting shape for monitoring communication networks. A star is defined as the set of edges whose source is a given central node. In FIG. 3, central node 302 is connected by directed edges to outer nodes 304. These shapes are not very localized, particularly for high out-degree nodes, but they still capture star-shaped anomalies quite well. Paths can describe more subtle anomalies than star windows, but star windows generally outperform paths on large star anomalies.
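A minimal Python sketch of building out-star windows from an edge list is shown below. It assumes edges are `(source, destination)` tuples; the function name `star_windows` is hypothetical. Only nodes with at least one outgoing edge in the window produce a star here.

```python
from collections import defaultdict


def star_windows(edges):
    """Group directed edges into one out-star per central (source) node."""
    stars = defaultdict(list)
    for source, destination in edges:
        stars[source].append((source, destination))
    return dict(stars)


stars = star_windows([("a", "b"), ("a", "c"), ("b", "c")])
```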

Time Intervals

The time component consists of the same time interval for every edge in the graph window. This allows detection of anomalies that occur in the same time window on every edge in the shape. More elaborate options, such as sequential time windows or telescoping time windows, may be used to accommodate specific protocols such as Secure Shell (SSH).

Edge Data

In general, it is advantageous to model the data at the resolution of edges rather than at the resolution of shapes γ. Two models motivated by the distribution of edge data over time are described, including estimation, hypothesis testing, p-value calculation, and thresholding.

IP addresses define nodes, and communication between IP addresses defines the existence of a directed edge between those nodes in the graph. There is enormous variety among the edges in a network, and certain characteristics may indicate the presence of a human actor at the originating machine.

A switching process is often observed in computer network data. Intuitively, for many edges, this switching is caused by human presence on the network. When a user is present at a machine, the user may generate nonzero counts on the edges emanating from that machine. At many moments, however, the user may generate no nonzero count on a given edge even though present, since the user may be communicating with some other machine or not using the network at all. It is only known that, when the user is not present, zeros are observed on this edge. This presence/absence induces a switching process between purely zero-count emission and higher-activity count emission. Intuitively, counts are higher at midday than at night, but for model simplicity a homogeneous model may be used in some embodiments.

Independence of Edges in a Path

To scan for anomalous shapes, it is generally necessary to have a model describing the behavior of the data in a window under normal conditions. The number of enumerated subgraphs tends to grow exponentially with the number of nodes, and assuming independence among the edges in a shape makes it easier to scale the computations required to process graphs at line speed under reasonable memory requirements. This is because edge independence requires only a model (and stored parameters) for each edge, whereas non-independence would require a model for each shape, of which there are hundreds of millions, if not billions. Under the independence assumption, the GLRT for a path is

$$\lambda_\gamma = \sum_{e \in \gamma} \lambda_e,$$

where λ_e denotes the GLRT score of each edge in window γ.

Observed Markov Model (OMM)

The first and simplest of the two models discussed here is a two-state OMM, denoted B_t. If there was a nonzero count in the time bin, B_t = 1; otherwise B_t = 0. This model has two parameters, p_01 = P(B_t = 1 | B_{t−1} = 0) and p_10 = P(B_t = 0 | B_{t−1} = 1). Its likelihood is

$$L(p_{01}, p_{10} \mid b_1, \ldots, b_N) = (1 - p_{01})^{n_{00}}\, p_{01}^{n_{01}}\, p_{10}^{n_{10}}\, (1 - p_{10})^{n_{11}},$$

where n_ij is the number of times the consecutive pair (b_i, b_j), that is, state i followed by state j, was observed in the data. The initial state may be assumed fixed and known. This model captures burstiness, but it ignores the distribution of the nonzero counts and does not allow zeros to occur in the high state. The maximum likelihood estimates for the OMM are

$$\hat p_{01} = \frac{n_{01}}{n_{00} + n_{01}}, \qquad \hat p_{10} = \frac{n_{10}}{n_{10} + n_{11}}.$$
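The OMM estimates can be sketched in Python by counting consecutive state pairs. This sketch assumes the standard maximum likelihood estimates for a two-state Markov chain, namely the fraction of transitions out of each state that switch state. The function name `omm_mle` and the example counts are hypothetical.

```python
def omm_mle(bits):
    """Estimate (p01, p10) of a two-state observed Markov chain from 0/1 data."""
    n = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, cur in zip(bits, bits[1:]):
        n[(prev, cur)] += 1  # n_ij: number of times state i was followed by state j
    from_0 = n[(0, 0)] + n[(0, 1)]
    from_1 = n[(1, 0)] + n[(1, 1)]
    # An estimate is undefined (NaN) if its starting state never occurs.
    p01 = n[(0, 1)] / from_0 if from_0 else float("nan")
    p10 = n[(1, 0)] / from_1 if from_1 else float("nan")
    return p01, p10


# Hypothetical per-bin counts on one edge; B_t = 1 when the count is nonzero.
counts = [0, 0, 3, 1, 0, 0, 0, 7, 0, 0]
bits = [1 if c > 0 else 0 for c in counts]
p01, p10 = omm_mle(bits)
```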

Hidden Markov Model (HMM)

The HMM addresses the problems of the OMM described above. In some embodiments, a two-state HMM is used with a degenerate distribution at zero for the low state and a negative binomial emission density for the high state. Negative binomial densities do not suffer from the equidispersion property of the Poisson distribution, and there is good justification for using them to monitor anomalies in network counts. Whereas other models generally do not allow the high state to emit zeros, this model does. For example, zero counts may be interspersed with nonzero data while still clearly being part of the "active" state. Intuitively, the active state is generally thought of as "the user is present at the machine", and therefore likely to communicate, rather than "the user is communicating on this edge".

The observed counts O_t follow a "hidden" two-state HMM Q_t. The transition parameters are p_01 = P(Q_t = 1 | Q_{t−1} = 0) and p_10 = P(Q_t = 0 | Q_{t−1} = 1). The emission densities in each state are parameterized as b_0(O_t) = P(O_t | Q_t = 0) = I(O_t = 0) and b_1(O_t) = P(O_t | μ, s, Q_t = 1) = NB(O_t | μ, s), where I(·) is the indicator function and NB(· | μ, s) is the negative binomial density function with mean μ and size s. The likelihood is the sum, over all hidden state sequences, of the joint probability of the state sequence and the observed counts.

Since the HMM maximum likelihood estimates have no closed form, an expectation-maximization (EM) approach may be used. At a set of T discrete time points, counts x = [x_1, ..., x_T]' are observed, where x_t ∈ {0, 1, ...} for t = 1, ..., T. In this model, the counts are viewed as coming from one of two distributions, as governed by Z = [Z_1, ..., Z_T]', a latent two-state Markov process. Letting p_01 = Pr(Z_n = 1 | Z_{n−1} = 0) and p_10 = Pr(Z_n = 0 | Z_{n−1} = 1), the latent transition matrix is

$$A = \begin{bmatrix} 1 - p_{01} & p_{01} \\ p_{10} & 1 - p_{10} \end{bmatrix}.$$

The initial state distribution is denoted π = Pr(Z_1 = 1).

When Z_t = 0, the marginal distribution of the count at time t is degenerate at 0, i.e.,

$$\Pr(X_t = x_t \mid Z_t = 0) = I(x_t = 0),$$

where I(·) is the indicator function. When Z_t = 1, the counts are assumed to be distributed according to a negative binomial distribution with mean and size parameters given by φ = [μ, s]':

$$\Pr(X_t = x_t \mid Z_t = 1, \phi) = \frac{\Gamma(s + x_t)}{\Gamma(s)\, x_t!} \left(\frac{s}{\mu + s}\right)^{s} \left(\frac{\mu}{\mu + s}\right)^{x_t}.$$

Usefully, the joint probability distribution over both latent and observed variables separates into distinct parameter types, so it can be factored in a computationally convenient way:

$$\Pr(X = x, Z = z \mid \theta) = \Pr(z_1 \mid \pi) \left[\prod_{t=2}^{T} \Pr(z_t \mid z_{t-1}, A)\right] \prod_{t=1}^{T} \Pr(x_t \mid z_t, \phi),$$

where θ = (π, A, φ)'. Finally, the likelihood is

$$L(\theta \mid X = x) = \sum_{z} \Pr(X = x, Z = z \mid \theta).$$
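The HMM likelihood, a sum over all hidden state sequences, can be evaluated in time linear in T with the standard forward recursion. The Python sketch below assumes the two-state model described above (zero-degenerate low state, negative binomial high state in the mean/size parameterization). The function names `nb_pmf` and `hmm_likelihood` are hypothetical, and the sketch omits the rescaling that would be needed to avoid underflow on long sequences.

```python
import math


def nb_pmf(x, mu, s):
    """Negative binomial probability of count x with mean mu and size s."""
    log_p = (math.lgamma(s + x) - math.lgamma(s) - math.lgamma(x + 1)
             + s * math.log(s / (mu + s)) + x * math.log(mu / (mu + s)))
    return math.exp(log_p)


def hmm_likelihood(xs, pi, p01, p10, mu, s):
    """Likelihood of counts xs under the two-state HMM (forward recursion)."""
    def emit(x):
        # Low state emits only zeros; high state emits negative binomial counts.
        return (1.0 if x == 0 else 0.0), nb_pmf(x, mu, s)

    e0, e1 = emit(xs[0])
    a0, a1 = (1.0 - pi) * e0, pi * e1  # pi = Pr(Z_1 = 1)
    for x in xs[1:]:
        e0, e1 = emit(x)
        a0, a1 = ((a0 * (1.0 - p01) + a1 * p10) * e0,
                  (a0 * p01 + a1 * (1.0 - p10)) * e1)
    return a0 + a1


# Hypothetical parameter values, for illustration only.
likelihood = hmm_likelihood([0, 3, 0, 0, 5], pi=0.5, p01=0.1, p10=0.2, mu=2.0, s=1.0)
```

An EM fit would wrap this recursion together with a matching backward pass; only the likelihood evaluation is shown here.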

Pooling and Estimation

In practice, many edges in a network can be very sparse, so there may be few opportunities to observe high-state counts. To perform estimation, edges may be pooled according to μ_e, the average number of nonzero counts per day, averaged over a predetermined number of days. In some embodiments, two types of edges may be defined.

Edge Type I (μ_e ≥ 1) consists of those edges for which sufficient data exist to estimate an individual model. In some model runs, this was about 45% of the edges for a particular network, though the percentage may vary. Maximum likelihood estimates (MLEs) may be used as the parameters for these edges.

Edge Type II (μ_e < 1) contains the remaining edges (about 55% of the edges in a particular network), which share a common parameter set in order to "borrow" information across very sparse data. A set of edges is then extracted such that the μ_e of each is among a predetermined number of the largest μ_e values in Edge Type II. In some embodiments, this number may be 1,000, for example. The parameters for each of these edges are estimated, and the mean of these parameter vectors is taken. The common edge model for Edge Type II is parameterized by this mean vector. Taking, for example, the largest 1,000 μ_e values helps ensure that the model is not overly sensitive to low-count edges.
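The pooling rule can be sketched in Python as follows. The sketch assumes that μ_e is available per edge and that some per-edge fitting routine exists; `fit_edge` is a hypothetical callable standing in for that routine, and `pool_edges` and `n_top` are likewise illustrative names.

```python
def pool_edges(mu, fit_edge, n_top=1000):
    """Split edges into Type I / Type II and build the shared Type II parameters.

    mu: dict mapping edge -> average number of nonzero counts per day.
    fit_edge: hypothetical callable mapping edge -> list of fitted parameters.
    """
    type1 = {e for e, m in mu.items() if m >= 1}
    type2 = [e for e in mu if e not in type1]
    # Fit only the busiest Type II edges, then average their parameter vectors.
    top = sorted(type2, key=lambda e: mu[e], reverse=True)[:n_top]
    fits = [fit_edge(e) for e in top]
    shared = [sum(col) / len(fits) for col in zip(*fits)] if fits else None
    return type1, set(type2), shared


mu = {"a": 2.0, "b": 0.5, "c": 0.2, "d": 0.9}
# Stand-in fit: returns [mu_e, 1.0] so the averaging is easy to follow.
type1, type2, shared = pool_edges(mu, lambda e: [mu[e], 1.0], n_top=2)
```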

Alternative Hypotheses

To obtain a GLRT, it is generally necessary to restrict the overall parameter space so as to consider alternatives that reflect the types of hacker behavior to be detected. These alternatives may be intentionally kept general in order to capture a variety of behaviors. It is postulated that hacker behavior increases the MLEs of the parameters governing the models, because the hacker must act in addition to the normal behavior on the edge. Specifically, with reference to the OMM, hacker behavior is expected to increase the probability of transitioning from the inactive to the active state, so the null hypothesis that p_01 equals its historical MLE is tested against the alternative that p_01 is larger.

More options are available in the HMM setting. In some embodiments, three combinations of parameter changes were tested.

In each case, the null hypothesis is that the parameter, or pair of parameters, equals its historical MLE value.

p-Value Calculation and Threshold Determination

The p-value of the observed GLRT statistic λ_γ is sought. Under mild regularity conditions, the GLRT is asymptotically χ² distributed with degrees of freedom equal to the number of free parameters in Θ. However, this does not hold when the true parameters are on the boundary of Θ. When the true parameters are on the boundary, a point mass at zero is obtained in the distribution of λ_γ.

p-Values for Stars

Stars are generally the simpler of the two shapes. The number of stars in the graph is the number of nodes, so for each node v the distribution of the GLRT λ_v = Σ_e λ_e, summed over the edges e of the star around v, can be modeled. Let Λ_v have the distribution of λ_v. Λ_v can then be modeled as Λ_v = B_v X_v, where B_v ~ Bernoulli(p_v) and X_v ~ Gamma(τ_v, η_v). Since all of the λ_e in the sum may be zero, Λ_v has a point mass at zero, which B_v captures. The gamma distribution is attractive for modeling the positive part of the distribution of Λ_v because it equals a χ² distribution with ν degrees of freedom when τ_v = ν/2 and η_v = 2. The asymptotic distribution of λ_v is that of a sum of independent zero-inflated χ² random variables, so a zero-inflated gamma distribution is expected to model the distribution of λ_v fairly well. The log-likelihood of N independent, identically distributed samples λ_1, ..., λ_N is

$$\ell(p, \tau, \eta) = \sum_{i:\, \lambda_i = 0} \log(1 - p) + \sum_{i:\, \lambda_i > 0} \left[ \log p + (\tau - 1) \log \lambda_i - \frac{\lambda_i}{\eta} - \log \Gamma(\tau) - \tau \log \eta \right].$$

Direct numerical optimization may be used to estimate τ_v and η_v. For example, in some tested embodiments this was performed over 10 days of non-overlapping 30-minute windows for each star centered at node v. Denote the MLEs by (p̂_v, τ̂_v, η̂_v). For an observed λ_v, the upper p-value is computed as

$$P(\Lambda_v > \lambda_v) = \hat p_v \left(1 - F_\Gamma(\lambda_v \mid \hat\tau_v, \hat\eta_v)\right),$$

where F_Γ is the gamma cumulative distribution function (CDF).

p-Values for Paths

Unlike stars, paths are so numerous that modeling λ_γ for every path is prohibitively expensive for many systems, in terms of both computation time and memory requirements. Instead, a model may be built for each individual edge, and the edge models combined during the path likelihood calculation. For each edge e, let Λ_e have the null distribution of the GLRT score λ_e for e. Again, a zero-inflated gamma distribution may be used, now on a per-edge basis. It is emphasized again that this model is motivated by the fact that the null distribution of λ_e is asymptotically a zero-inflated χ² (with 50% mass at zero when testing one parameter).

Let Λ_e = B_e X_e, where B_e ~ Bernoulli(p_e) and X_e ~ Gamma(τ_e, η), with edge-specific shape τ_e and shared scale η. That is, there are two free parameters for each edge, p_e and τ_e, and a common scale parameter η for all edges. The MLEs p̂_e, τ̂_e and η̂ may be estimated using the λ_e values from non-overlapping 30-minute windows. The likelihood is similar to that described for stars above, but because each edge has its own τ_e and a shared scale η, an iterative scheme was developed that alternates between estimating η for all edges and then, for that fixed η, estimating the individual τ_e. Since each step of the iteration increases the likelihood, the overall procedure increases the likelihood.

Once the edge models have been fitted, path p-values can be calculated. Let Λ_p = Σ_{e ∈ path} Λ_e. The 3-path exceedance p-value is the mixture exceedance given by

$$P(\Lambda_p > \lambda_p) = \sum_{b_1=0}^{1} \sum_{b_2=0}^{1} \sum_{b_3=0}^{1} \left[\prod_{i=1}^{3} \hat p_i^{\,b_i} (1 - \hat p_i)^{1 - b_i}\right] \left(1 - F_\Gamma\!\left(\lambda_p \,\Big|\, \sum_{i=1}^{3} b_i \hat\tau_i,\ \hat\eta\right)\right),$$

where the term with b_1 = b_2 = b_3 = 0 contributes nothing, since Λ_p is then zero. This uses the fact that a sum of gamma random variables with a common scale parameter is again gamma distributed.
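The zero-inflated gamma p-values for stars and paths can be sketched in plain Python. To stay self-contained, this sketch evaluates the gamma CDF with a textbook series for the regularized lower incomplete gamma function, which is adequate for moderate arguments; a library routine would normally be preferred. The function names `gamma_cdf`, `star_p_value` and `path_p_value`, and all parameter values, are hypothetical.

```python
import math
from itertools import product


def gamma_cdf(x, shape, scale, eps=1e-12, max_iter=10000):
    """Gamma CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    z = x / scale
    a = shape
    term = total = 1.0 / shape
    for _ in range(max_iter):
        a += 1.0
        term *= z / a
        total += term
        if term < total * eps:
            break
    return min(1.0, total * math.exp(-z + shape * math.log(z) - math.lgamma(shape)))


def star_p_value(lam, p, shape, scale):
    """Upper p-value for Lambda = B * X, B ~ Bernoulli(p), X ~ Gamma(shape, scale)."""
    return p * (1.0 - gamma_cdf(lam, shape, scale))


def path_p_value(lam, edge_params, scale):
    """Mixture exceedance for a path; edge_params is a list of (p_e, tau_e)."""
    total = 0.0
    for bits in product((0, 1), repeat=len(edge_params)):
        weight, shape = 1.0, 0.0
        for b, (p, tau) in zip(bits, edge_params):
            weight *= p if b else 1.0 - p
            shape += tau if b else 0.0
        if shape > 0:  # the all-zero configuration can never exceed lam >= 0
            total += weight * (1.0 - gamma_cdf(lam, shape, scale))
    return total


# Hypothetical fitted parameters for the three edges of one path.
p_path = path_p_value(6.0, [(0.5, 0.6), (0.4, 0.5), (0.3, 0.7)], scale=2.0)
```

With shape 1 and scale 2 the gamma distribution is a χ² with 2 degrees of freedom, which gives a convenient closed-form check: `star_p_value(3.0, 0.5, 1.0, 2.0)` should equal 0.5·exp(−1.5).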

Threshold Determination

One way to determine the threshold is to simulate per-minute counts for each edge over a period in which no anomalies are introduced. For example, this may be done for 10 days. Thirty-minute windows, offset by 10 minutes, are slid over the 10 days, and the minimum p-value in each window is calculated, just as in the full scanning procedure. To achieve a given false discovery rate, such as one alarm per day, the tenth smallest p-value in the resulting list of p-values may be taken, for example. Because the windows overlap, a less conservative choice is to count the minimum p-value arising from consecutive windows on the same path as a single p-value, and to find the tenth smallest minimum p-value associated with non-consecutive windows. In this way, an alarm spanning several overlapping windows contributes only one alarm to the threshold determination, which is generally how an analyst would view a series of consecutive alarms.
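The less conservative thresholding rule can be sketched in Python as follows. The sketch assumes that the simulation has already produced, for each window in time order, the identifier of the path attaining the minimum p-value together with that p-value. The function name `threshold_from_simulation` and the example data are hypothetical.

```python
def threshold_from_simulation(window_minima, rank=10):
    """Pick a p-value threshold from simulated, anomaly-free windows.

    window_minima: time-ordered list of (path_id, min_p_value), one per window.
    Consecutive windows whose minimum falls on the same path count as one alarm.
    """
    runs = []  # one [path_id, smallest p-value] entry per run of windows
    for path_id, p in window_minima:
        if runs and runs[-1][0] == path_id:
            runs[-1][1] = min(runs[-1][1], p)  # the same alarm continues
        else:
            runs.append([path_id, p])
    ranked = sorted(p for _, p in runs)
    return ranked[rank - 1] if len(ranked) >= rank else None


minima = [("A", 0.02), ("A", 0.01), ("B", 0.05), ("A", 0.03), ("C", 0.04)]
threshold = threshold_from_simulation(minima, rank=2)
```

In this example the two consecutive windows on path A collapse into one alarm, so the second smallest value is 0.03 rather than 0.02.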

Some embodiments of the present invention are directed to detecting anomalous activity using data defined over time on the edges of an underlying graph structure. Since attacks can be highly localized, some embodiments of the present invention apply windows locally in the time × graph product space. A historical model is used to determine whether the data in such a local window behave as expected in light of past behavior. k-paths are particularly effective for detecting traversals through a network.

FIG. 4 illustrates a method for detecting anomalous behavior on a network according to an embodiment of the present invention. In some embodiments, the method of FIG. 4 may be performed at least in part by, for example, computing system 200 of FIG. 2. Historical parameters of the network are determined at step 410 to establish normal activity levels. The historical parameters may include, for example, the number of connections on an edge over various time periods. In some embodiments, the historical parameters may be established by taking two edge types into account: a first type whose member edges have sufficient data to estimate an individual model, and a second type whose member edges lack sufficient data to estimate an individual model. In certain embodiments, edges of the second type are parameterized by a mean vector to ensure that the model is not overly sensitive to low-count edges.

A plurality of paths in the network are enumerated at step 420 as part of a graph representing the network. Each computing system may be a node in the graph, and a sequence of connections between two computing systems may be a directed edge in the graph. At step 430, a statistical model is applied to the graph on a sliding-window basis to detect anomalous behavior. In some embodiments an observed Markov model (OMM) is used; in other embodiments a hidden Markov model (HMM) is used. In some embodiments, the OMM or HMM may be a two-state model (e.g., "on" indicating that a user is present and "off" indicating that the user is absent). However, the methods of some embodiments do not necessarily depend on the choice of model; various statistical models may be used in various embodiments. At step 440, data pertaining to the detected anomalous behavior is displayed to a user.

Unified Host Collection Agent (UHCA)

A host agent may be used to protect a host by running security applications such as antivirus software and firewalls. Host agents generally use a unified host collection agent (UHCA) that uploads data from the host to a server for anomaly detection. Some embodiments of the present invention, however, use the UHCA to provide data that includes network connections from the host to other machines, the processes associated with those connections, the executables associated with those processes, and so on.

便宜のため，ホストから直接データを得る代わりに，２次サーバ源からデータが収集される。いくつかの実施例は，新規イベントを生成するためにこの観測情報を考慮に入れる。サーバはホストに対して片方向通信を有してもよく，それによってサーバは多数のホストエージェントからメッセージを受信する。いくつかの実施例における双方向通信の欠如は効率を増加させる。 For convenience, instead of obtaining data directly from the host, data is collected from the secondary server source. Some embodiments take this observation information into account for generating new events. The server may have a one-way communication with the host, whereby the server receives messages from multiple host agents. The lack of bidirectional communication in some embodiments increases efficiency.

Because complete data collection is not required for effective operation in many embodiments, some embodiments use the User Datagram Protocol (UDP). These embodiments may capture as much information as possible, but anomaly detection generally still functions effectively if some of it is missed. This "lossy" collection approach allows communication to be one-way, since packet delivery is not guaranteed in the way that it is under TCP. It also permits a higher volume of data than a TCP-based approach.

In some embodiments, the UDP stream is encrypted, which protects the network data. Processing is a significant concern, and data management is difficult in large systems. Nonetheless, some embodiments can provide strong encryption and ensure privacy. In some embodiments, the lossy nature of the collection helps accommodate the additional processing that security requires.

Some embodiments use UDP and still detect packet loss by means of packet sequence numbers. Packets can be tracked on a per-machine basis using the media access control (MAC) address and the sequence number. This information may also be used independently of anomaly detection. For example, it can be used for forensics to examine data on a given host. As a further example, checksums of executable files may be kept in a list in order to determine whether a particular host has malware.
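A short Python sketch of per-machine loss tracking follows. It assumes, hypothetically, that each agent stamps its packets with an integer sequence number that increases by one and never wraps around; the class and method names are illustrative and are not taken from the specification.

```python
class LossTracker:
    """Count missing packets per agent, keyed by MAC address."""

    def __init__(self):
        self.last_seq = {}   # MAC -> highest sequence number seen so far
        self.lost = {}       # MAC -> number of sequence numbers skipped

    def observe(self, mac, seq):
        """Record one received packet and return the loss count for `mac`."""
        prev = self.last_seq.get(mac)
        if prev is not None and seq > prev + 1:
            # A gap in the sequence means the packets in between were missed.
            self.lost[mac] = self.lost.get(mac, 0) + (seq - prev - 1)
        if prev is None or seq > prev:
            self.last_seq[mac] = seq
        # Simplification: a late, reordered packet is not credited back.
        return self.lost.get(mac, 0)
```

Because the channel is one-way, the server only counts gaps; it does not ask the agent to retransmit.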

A weakness of most data collection infrastructures is limited visibility between internal nodes of the network. To improve detection of attackers, it is desirable to enhance endpoint visibility. Comprehensive endpoint visibility generally requires deploying software at the network host level. Not all network switches are capable of collecting network flow data at the subnet level. Similarly, DNS data visibility is usually affected by caching, and it requires that an adversary use hostnames rather than IP addresses when establishing connections to target hosts.

To improve endpoint visibility, some embodiments use a cross-platform software agent (hereinafter, the "agent") that runs on various operating systems, such as Windows®, Mac OS®, Linux®, and Android®. In some embodiments, the UHCA is written in Python, which makes it easy to adapt and extend to the various target operating systems. However, any desired programming language or assembly code may be used. The primary purpose of the agent is data collection, and the agent can be designed to have minimal impact on the host operating system. Testing has shown that some embodiments of the agent use only 2 to 8% of a single CPU core. The agent may collect system state and events and encode them as JavaScript® Object Notation (JSON) records called JSON-Encoded Logs (JELs). In some embodiments, every JEL includes a generation timestamp, an agent ID (e.g., a MAC address), the IP address of the agent, the operating system type, and the record type (e.g., network connection state).
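As an illustration of what such a record might look like, the following Python sketch builds one JEL as a JSON string. The key names (`timestamp`, `agent_id`, `agent_ip`, `os`, `record_type`, `data`) and the function name are assumptions; the specification lists the kinds of fields but not their names or layout.

```python
import json
import time


def make_jel(agent_mac, agent_ip, os_type, record_type, payload, now=None):
    """Encode one event as a JSON-Encoded Log (JEL) string.

    All key names below are hypothetical; only the field categories come
    from the description.
    """
    record = {
        "timestamp": time.time() if now is None else now,  # generation time
        "agent_id": agent_mac,        # e.g., the MAC address of the host
        "agent_ip": agent_ip,
        "os": os_type,
        "record_type": record_type,   # e.g., "network_connection_state"
        "data": payload,              # record-type-specific content
    }
    return json.dumps(record, sort_keys=True)
```

A call such as `make_jel("00:11:22:33:44:55", "10.0.0.5", "Linux", "network_connection_state", {"state": "ESTABLISHED"})` yields a single line of JSON that can be placed in a datagram.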

JELs may be forwarded in encrypted UDP packets to one or more central collection servers at relatively frequent intervals (e.g., every 1 to 5 minutes). In some embodiments, multiple servers may be specified in the agent configuration file, which allows the system to scale horizontally. The collection capabilities of the agent may include process start and stop information, including checksums of the starting process image; network connection event logs; a mapping between running processes and established network connections; and the current network connection state.

Polling Network State

To detect anomalous paths, some embodiments obtain a list of triples of values (time, source IP address, destination IP address) that indicate network communication between hosts. To extend such embodiments to take advantage of UHCA data, it is generally desirable for the agent to report uniform host network communication information across all of the target platforms. On Linux®, procfs (specifically /proc/tcp and /proc/udp) may be used to generate this data. The OS X® and Android® implementations may parse the output of a netstat invocation, although this is not an optimal approach. The Windows® agent may use the GetExtendedTcpTable function of the Windows® IP Helper API through Python ctypes (ctypes.windll.iphlpapi.GetExtendedTcpTable), which provides network state information similar to that of procfs and netstat.
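For the Linux case, a hedged Python sketch of the parsing step is shown below. It assumes the IPv4 table format that current kernels expose at /proc/net/tcp, where each endpoint is printed as hexadecimal `ADDRESS:PORT`, and it assumes a little-endian host (hence the byte reversal). The function names and the output dictionary keys are illustrative only.

```python
import socket

# Subset of the kernel's TCP state codes as they appear in the "st" column.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def decode_endpoint(hex_endpoint):
    """Convert e.g. '0100007F:1F90' to ('127.0.0.1', 8080).

    Assumes the address was printed on a little-endian machine.
    """
    addr_hex, port_hex = hex_endpoint.split(":")
    ip = socket.inet_ntoa(bytes.fromhex(addr_hex)[::-1])
    return ip, int(port_hex, 16)


def parse_tcp_table(text, now):
    """Parse the contents of a procfs TCP table into a list of dicts."""
    rows = []
    for line in text.splitlines()[1:]:      # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_ip, local_port = decode_endpoint(fields[1])
        remote_ip, remote_port = decode_endpoint(fields[2])
        rows.append({
            "time": now,
            "local": (local_ip, local_port),
            "remote": (remote_ip, remote_port),
            "state": TCP_STATES.get(fields[3], fields[3]),
        })
    return rows
```

An agent could read the file once per polling interval and pass its contents, with the poll time, to `parse_tcp_table`.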

In some embodiments, the data is polled every second, or at any other desired interval. Naturally, the more frequent the polling, the more data is available for analysis and the shorter-lived the connections that are likely to be captured. A drawback of polling every second is that short-lived (i.e., sub-second) connections are usually missed by the agent. This can be a problem for many detection techniques, but the focus of some embodiments is on detecting interactive traversal of the network. Even an automated traversal generally needs a time resolution of one second or more in order to maintain state on the target node.

To address the problem of missing short-lived connections, it is advantageous to use the TCP time-wait state. When a client communicates with a server over TCP, the server maintains the state of the TCP connection. When the communication ends, the server generally must keep the connection information in the TIME_WAIT state for a period of time, commonly 30 seconds or more. This long time window allows the agent to capture information about sub-second network communications that it would otherwise have missed. In post-processing, a test can determine whether there are time-wait state entries that had no corresponding established connection entry. Any such connections may be reported as short-lived connections.
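The post-processing test described above can be sketched in a few lines of Python. The snapshot layout, `(poll_time, [(local, remote, state), ...])`, and the function name are assumptions for illustration.

```python
def find_short_lived(snapshots):
    """Report connections seen only in TIME_WAIT, never as ESTABLISHED.

    `snapshots` is an iterable of (poll_time, connections), where each
    connection is a (local_endpoint, remote_endpoint, state) tuple.
    Returns a sorted list of (first_seen_time, (local, remote)).
    """
    established = set()
    time_wait_first_seen = {}
    for poll_time, connections in snapshots:
        for local, remote, state in connections:
            key = (local, remote)
            if state == "ESTABLISHED":
                established.add(key)
            elif state == "TIME_WAIT":
                # Remember only the first poll at which the entry appeared.
                time_wait_first_seen.setdefault(key, poll_time)
    return sorted(
        (seen, key)
        for key, seen in time_wait_first_seen.items()
        if key not in established
    )
```

A connection that opened and closed between two polls appears only in TIME_WAIT and is therefore returned, whereas one that was observed as established at any poll is not.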

Although some embodiments only require the list of triples, the UHCA may send as much detail about network connections as possible back to the collection server so that it is available to other applications. The data is post-processed into triples using, for example, a script for small amounts of test data or MapReduce for larger jobs. Among other fields in the network connection JEL, the record may include the source and destination ports, the connection state (established, listening, time-wait, etc.), the process ID associated with the connection, and the number of seconds for which the connection was active within a one-minute time window or any other desired time window. For anomaly detection, some embodiments may use the port information to better distinguish individual communications, and may use the count information to set edge weights by gathering statistics such as the mean and variance from the counts.
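A minimal Python sketch of this post-processing step follows. The record keys (`time`, `src_ip`, `dst_ip`, `active_seconds`) are hypothetical, and the choice of population variance as the spread statistic is an assumption; the description only says that statistics such as the mean and variance may be gathered from the counts.

```python
from collections import defaultdict
from statistics import mean, pvariance


def to_triples(records):
    """Reduce detailed connection records to (time, src_ip, dst_ip) triples."""
    return [(r["time"], r["src_ip"], r["dst_ip"]) for r in records]


def edge_count_stats(records):
    """Return {(src_ip, dst_ip): (mean, variance)} of active seconds per window."""
    counts = defaultdict(list)
    for r in records:
        counts[(r["src_ip"], r["dst_ip"])].append(r["active_seconds"])
    return {edge: (mean(v), pvariance(v)) for edge, v in counts.items()}
```

The per-edge statistics could then serve as inputs when edge weights are set, while the triples feed path enumeration.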

In testing, some embodiments incorporating the UHCA showed approximately twice the edge detection rate of some embodiments without the UHCA. For example, in one test with a total of 30 edges, the embodiment without the UHCA detected 14 of the 30 edges (46.7%), whereas the embodiment with the UHCA detected 27 of the 30 edges (90%). The paths consisted of 15 name edges and 15 IP edges. In this case, the embodiment without the UHCA has a maximum theoretical detection rate of 50%, and the embodiment with the UHCA has a maximum theoretical detection rate of 100%.

The lower portions of FIGS. 5A to 5D show subpaths of four of the five test paths generated in this experiment. For consistency, the first four edges of each path are displayed, although some paths contain more edges. In every case where edges are omitted, if the method detected the last edge shown, it went on to detect the remaining edges, and if the method failed to detect the last edge shown, it went on to miss all of the remaining edges.

In FIGS. 5A to 5D, nodes (i.e., network hosts) are drawn as circles, and edges (i.e., network communications) are drawn either as lines with a diamond-shaped endpoint pointing at the destination node (name edges) or as lines with an arrow-shaped endpoint (IP edges). The bars labeled DNS and UHCA indicate the detected length for each approach. A longer bar indicates a longer detected path. A short bar, or no bar at all, highlights a failure to detect edges in the path.

FIG. 5A is a path diagram 500 illustrating a path generated using only name edges, according to an embodiment of the present invention. The diagram shows the detection results for a path having six edges (a 6-path), all of which were generated by hostname lookups. As expected, this path was successfully detected by the non-UHCA DNS path detection approach (hereinafter, the "DNS approach"). Surprisingly, the UHCA approach missed the first two edges of the path, but it then picked up the path and detected the remaining four edges.

After the data was analyzed in detail, it was determined that one host in the path (the second hop) acts as an institutional server and routinely creates a large number of new connections. Because new-edge behavior is modeled in some embodiments, the software expected this server to create new edges. Accordingly, the path traversal through this server was scored as less anomalous and did not exceed the alarm threshold. This is an interesting result, and it justifies the use of the model of some embodiments as opposed to simply flagging all new edges (i.e., paths consisting entirely of new edges) as anomalous. Without such a model, every path through this server would raise an alarm, increasing the false alarm rate.

FIG. 5B is a path diagram 510 illustrating a path generated using only IP edges, according to an embodiment of the present invention. This path is a 7-path in which every edge is an IP edge. This experiment behaved exactly as expected. The UHCA approach detected every edge, and the DNS approach detected none. Because this kind of network traversal generates no DNS activity, the DNS approach cannot detect this kind of path at all.

FIG. 5C is a path diagram 520 illustrating a path that begins with three name edges and ends with IP edges, according to an embodiment of the present invention. This path is a 6-path in which the first three edges are name edges and the last three edges are IP edges. As expected, the DNS approach detected the first three edges but could not detect the IP edges. Also as expected, the UHCA approach detected the entire path.

Another variant of this path was tested as well, although the results are not shown for brevity. In this 5-path, the path begins with two IP edges, followed by three name edges. The UHCA approach detected the entire path, whereas the DNS approach detected only the portion of the path after it switched to name edges.

FIG. 5D is a path diagram 530 illustrating a path that alternates between name edges and IP edges, according to an embodiment of the present invention. This path is a 6-path whose edges alternate between name edges and IP edges. The expectation was that this path would be undetectable by the DNS approach and would be detected in its entirety by the UHCA. In practice, the UHCA approach detected the entire path, and the DNS approach detected the first edge of the path. Analysis of the data showed that this edge was part of an unrelated 3-path found by the DNS approach. That 3-path happened to include an edge that had been chosen for this test path.

These initial results are encouraging in that they support the hypothesis that the UHCA approach leads to improved detection of attackers. The results also confirm that the DNS approach performs close to its expected detection rate.

Data Collection Based on Anomalousness

It is not always possible to collect all of the data available on every host, particularly in large networks where the volume of data is enormous. Instead, data may be collected in proportion to the level of anomalousness on the host, as determined by the anomaly detection methods described herein. When the level of anomalousness is low, basic network connectivity (e.g., DNS lookups) and process information may be collected. At a medium level, more process accounting and services may be collected, along with more complete network behavior data (e.g., NetFlow data). At a high level, complete host behavior information, including process accounting, services, open files, and so on, may be collected along with full packet capture for network visibility. In some cases, this may be done only in local regions of the network, driven by anomaly detection. This provides higher-quality detection capability for those hosts, and it also provides high-quality forensic information to analysts responding to the anomaly.

In some embodiments, the anomaly level may be determined by the path traversal methods described herein. A path traversal through a set of nodes may be scored as only slightly anomalous according to the data currently being collected at each node. However, when the nodes in the path are behaving with a medium level of anomalousness, more comprehensive data may be collected at each host in the path. To provide better fidelity, this data may be fed back into the algorithm, which can then make a higher-quality decision about the path (e.g., a lower false positive rate and a higher true positive rate). If this new, higher-fidelity data continues to be scored as anomalous, full packet capture and process accounting may be enabled on the hosts, which provides high-quality anomaly detection data and full forensic data for use by security response personnel.

FIG. 6 is a flowchart 600 of a method for using the UHCA to collect data pertaining to anomalies, according to an embodiment of the present invention. In some embodiments, the method of FIG. 6 may be performed at least in part by, for example, computing system 200 of FIG. 2. The method begins in step 610 with periodically polling a plurality of host agents for data. In step 620, data pertaining to network communications sent and received by the respective hosts in the network is collected from the plurality of host agents. In some embodiments, the collected data may be sent from the host agents by one-way communication over UDP. The data collected for each host may include process start and stop information, including checksums of the starting process image; network connection event logs; a mapping between running processes and established network connections; and the current network connection state. The collected data may include a list of triples of values indicating network communication between hosts, where each triple may include the time at which the communication occurred, the source IP address, and the destination IP address.

In some embodiments, data may be collected in proportion to the level of anomalousness of the respective host. At a low level of anomalousness, basic network connectivity and process information may be collected, as may be derived from a baseline probabilistic approach. At a medium level of anomalousness, more process accounting and services, as well as more complete network behavior data, may be collected. At a high level of anomalousness, complete host behavior information may be collected and full packet capture may be performed.
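The three tiers can be summarized as a small policy function. The Python sketch below is illustrative only: the numeric score, the thresholds 0.5 and 0.9, and the collector labels are assumptions, since the description defines the tiers qualitatively and does not give cut-off values.

```python
def collection_plan(anomaly_score, medium=0.5, high=0.9):
    """Map a host's anomaly score in [0, 1] to the data to be collected.

    Thresholds and collector labels are hypothetical placeholders.
    """
    # Low level: basic connectivity and process information is always gathered.
    plan = ["basic_network_connectivity", "process_info"]
    if anomaly_score >= medium:
        # Medium level: add accounting, services, and fuller network data.
        plan += ["process_accounting", "services", "network_flow_data"]
    if anomaly_score >= high:
        # High level: complete host behavior plus full packet capture.
        plan += ["open_files", "full_packet_capture"]
    return plan
```

Keeping the tiers cumulative means that a host whose score rises keeps reporting everything it reported before, and gains detail on top.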

In step 630, the collected data is analyzed to detect anomalous behavior within a predetermined period of time. In step 640, the TCP time-wait state is used to detect short-lived connections, and in step 650, count weights are set using the count information. When anomalous behavior is detected, an indication that the anomalous behavior occurred during the predetermined period is provided in step 660.

The method steps performed in FIGS. 4 and 6 may be performed by a computer program product that encodes instructions for a nonlinear adaptive processor to perform at least the methods described in FIGS. 4 and 6, according to embodiments of the present invention. The computer program product may be embodied on a computer-readable medium. The computer-readable medium may be, but is not limited to, a hard disk drive, a flash device, random access memory, a tape, or any other such medium used to store data. The computer program product may include encoded instructions for controlling the nonlinear adaptive processor to implement the methods described in FIGS. 4 and 6, which may also be stored on the computer-readable medium.

The computer program product can be implemented in hardware, software, or a hybrid implementation. The computer program product may be composed of modules that are in operative communication with one another and that are designed to pass information or instructions to a display. The computer program product may be configured to operate on a general-purpose computer or on an application-specific integrated circuit (ASIC).

It will be readily understood that the components of the various embodiments of the present invention, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Accordingly, the detailed description of the embodiments of the present invention, as represented in the attached figures, is not intended to limit the scope of the invention as claimed, but is merely representative of selected embodiments of the invention.

The features, structures, and characteristics of the invention described throughout this specification may be combined in any suitable manner in one or more embodiments. For example, reference throughout this specification to "certain embodiments," "some embodiments," or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in certain embodiments," "in some embodiments," "in other embodiments," or similar language throughout this specification do not necessarily all refer to the same group of embodiments, and the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

It should be noted that reference throughout this specification to features, advantages, or similar language does not imply that all of the features and advantages that may be realized with the present invention are present in any single embodiment of the invention. Rather, language referring to the features and advantages is to be understood to mean that a specific feature, advantage, or characteristic described in connection with an embodiment is included in at least one embodiment of the present invention. Thus, discussion of the features and advantages, and similar language, throughout this specification does not necessarily refer to the same embodiment.

Furthermore, the described features, advantages, and characteristics of the invention may be combined in any suitable manner in one or more embodiments. One skilled in the art will recognize that the invention can be practiced without one or more of the specific features or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that are not present in all embodiments of the invention.

One of ordinary skill in the art will readily understand that the invention as discussed above may be practiced with steps in a different order and/or with elements in configurations different from those disclosed. Therefore, although the invention has been described based upon preferred embodiments, it will be apparent to those of skill in the art that certain modifications, variations, and alternative constructions exist that remain within the spirit and scope of the invention. In order to determine the metes and bounds of the invention, therefore, reference should be made to the appended claims.

Claims

1. A computer-implemented method, comprising:
determining, by a computing system, historical parameters of a baseline statistical model for each edge in a network in order to determine normal activity levels;
enumerating, by the computing system, a plurality of k-paths in the network as part of a graph representing the network, wherein each computing system in the network serves as a node in the graph and a sequence of connections between two computing systems forms a directed edge in the network;
applying, by the computing system, an observed Markov model (OMM) or a hidden Markov model (HMM) to the plurality of k-paths in the graph on a sliding time window basis; and
detecting, by the computing system, anomalous behavior based on the applied OMM or HMM.

2. The computer-implemented method of claim 1, further comprising displaying data pertaining to the detected anomalous behavior to a user.

3. The computer-implemented method of claim 1, wherein
the OMM or the HMM is a two-state model,
an "on" state indicates that a user is present, and
an "off" state indicates that the user is absent.

4. The computer-implemented method of claim 1, wherein the computing system determines the historical parameters taking into account at least two edge types.

5. The computer-implemented method of claim 4, wherein
a first edge type comprises element edges having sufficient data to estimate an individual model, and
a second edge type comprises element edges for which there is not sufficient data to estimate an individual model of the element edge.

6. The computer-implemented method of claim 5, wherein the second edge type is parameterized by a mean vector to ensure that the model is not overly sensitive to low-count edges.

7. The computer-implemented method of claim 1, further comprising:
collecting, by the computing system, data pertaining to anomalies from a plurality of host agents, the data pertaining to network communications sent and received by respective hosts in the network; and
analyzing the collected data to detect anomalous behavior within a predetermined period of time.

8. An apparatus, comprising:
at least one processor; and
memory storing computer program instructions, wherein the instructions, when executed by the at least one processor, are configured to cause the at least one processor to:
determine historical parameters of a baseline statistical model for each edge in a network in order to determine normal activity levels;
enumerate a plurality of k-paths in the network as part of a graph representing the network, wherein each computing system in the network serves as a node in the graph and a sequence of connections between two computing systems forms a directed edge in the network;
apply a statistical model to the plurality of k-paths in the graph on a sliding time window basis; and
detect anomalous behavior based on the applied statistical model.

9. The apparatus of claim 8, wherein the computer program instructions are further configured to cause the at least one processor to display data pertaining to the detected anomalous behavior to a user.

10. The apparatus of claim 8, wherein the statistical model is an observed Markov model (OMM) or a hidden Markov model (HMM).

11. The apparatus of claim 10, wherein
the OMM or the HMM is a two-state model,
an "on" state indicates that a user is present, and
an "off" state indicates that the user is absent.

12. The apparatus of claim 8, wherein the computer program instructions are further configured to cause the at least one processor to determine the historical parameters taking into account at least two edge types.

13. The apparatus of claim 12, wherein
a first edge type comprises element edges having sufficient data to estimate an individual model, and
a second edge type comprises element edges for which there is not sufficient data to estimate an individual model of the element edge.

14. The apparatus of claim 13, wherein the second edge type is parameterized by a mean vector to ensure that the model is not overly sensitive to low-count edges.

15. The apparatus of claim 8, wherein the computer program instructions are further configured to cause the at least one processor to:
collect data pertaining to anomalies from a plurality of host agents, the data pertaining to network communications sent and received by respective hosts in the network; and
analyze the collected data to detect anomalous behavior within a predetermined period of time.

A system comprising:
a memory storing computer program instructions configured to detect anomalous behavior in a network; and
a plurality of processing cores configured to execute the stored computer program instructions, wherein the plurality of processing cores are configured to:
determine historical parameters of a baseline statistical model for each edge of the network in order to determine normal activity levels;
enumerate a plurality of k-paths in the network as part of a graph representing the network, wherein each computing system in the network acts as a node in the graph and a sequence of connections between two computing systems forms a directed edge in the graph;
apply the statistical model to the plurality of k-paths in the graph on a sliding-window basis; and
detect anomalous behavior based on the applied statistical model.
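The sliding-window step can be sketched as below. For simplicity the example swaps in a per-edge Poisson baseline rate and a one-sided Poisson log-likelihood ratio instead of the OMM or HMM named in the dependent claims; that substitution, the threshold, and the names `poisson_excess_score` and `scan_paths` are all assumptions made for illustration.

```python
import math


def poisson_excess_score(observed, expected):
    """One-sided Poisson log-likelihood ratio; 0 unless observed > expected.

    expected must be positive.
    """
    if observed <= expected:
        return 0.0
    return observed * math.log(observed / expected) - (observed - expected)


def scan_paths(paths, edge_counts, baseline_rate, window, threshold):
    """Slide a window over time bins and score every path in every window.

    paths:         iterable of node tuples, e.g. ("a", "b", "c")
    edge_counts:   dict edge -> list of connection counts per time bin
    baseline_rate: dict edge -> historical mean count per bin (assumed > 0)
    Returns a list of (window_start, path, score) for scores above threshold.
    """
    n_bins = min(len(c) for c in edge_counts.values())
    alerts = []
    for start in range(n_bins - window + 1):
        for path in paths:
            score = 0.0
            for edge in zip(path, path[1:]):   # consecutive nodes form edges
                observed = sum(edge_counts[edge][start:start + window])
                expected = baseline_rate[edge] * window
                score += poisson_excess_score(observed, expected)
            if score > threshold:
                alerts.append((start, path, score))
    return alerts
```

The scores for different paths and windows are independent of one another, so the two outer loops could be divided among several processing cores.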

The system of claim 16, wherein the plurality of processing cores are further configured to display data pertaining to the detected anomalous behavior to a user.

The system of claim 16, wherein the statistical model is an observed Markov model (OMM) or a hidden Markov model (HMM).

The system of claim 18, wherein the OMM or the HMM is a two-state model in which an "on" state indicates that a user is present and an "off" state indicates that the user is absent.

The system of claim 16, wherein the plurality of processing cores are further configured to determine the historical parameters taking into account at least two edge types, wherein a first edge type comprises member edges with sufficient data to estimate an individual model, and a second edge type comprises member edges for which there is not sufficient data to estimate an individual model for the member edge.

The system of claim 20, wherein the second edge type is parameterized by a mean vector to ensure that the model is not overly sensitive to low-count edges.

The system of claim 16, wherein the plurality of processing cores are further configured to:
collect data pertaining to anomalies from a plurality of host agents, the data pertaining to network communications sent and received by the corresponding hosts in the network; and
analyze the collected data to detect anomalous behavior within a predetermined time period.