JP5310094B2 - Anomaly detection system, anomaly detection method and anomaly detection program

Info

Publication number: JP5310094B2
Application number: JP2009046088A
Authority: JP
Inventors: Masaya Yamagata (山形 昌也)
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2009-02-27
Filing date: 2009-02-27
Publication date: 2013-10-09
Anticipated expiration: 2029-02-27
Also published as: JP2010198579A

Abstract

PROBLEM TO BE SOLVED: To provide an anomaly detection system that efficiently detects which data series an anomaly or change has occurred in, even when the number of data series is very large.

SOLUTION: An aggregation means 71 aggregates the data series determined to belong to the same group by calculating the sum of the data values, or the sum of powers of the data values, of those data series. A statistic calculation means 72 calculates a statistic of the data values of the data series before aggregation. A group detection means 73 detects a group that includes a data series in which an anomaly or change has occurred, based on the sum calculated for each group. A data series specification means 74 identifies, based on the statistic, the data series in which the anomaly or change has occurred from among the data series belonging to the group detected by the group detection means 73.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an anomaly detection system, an anomaly detection method, and an anomaly detection program, and more particularly to an anomaly detection system, method, and program for detecting which of a plurality of data series an anomaly or change has occurred in.

Various anomaly detection systems have been proposed. For example, Patent Document 1 describes a network anomaly detection system that aggregates network traffic into flows by exploiting the characteristics of the traffic, thereby reducing the number of data series to be inspected before performing anomaly detection.

Patent Document 2 describes a system comprising a plurality of anomaly detection means, in which multiple pieces of anomaly detection information that originate from the same cause and are detected by the plurality of anomaly detection means are consolidated into a single piece of anomaly information and generated as one piece of detection information.

Non-Patent Document 1 describes a method of detecting anomalies after aggregating a plurality of data series. In this method, after an anomaly is detected, a person identifies which pre-aggregation data series caused it.

Patent Document 3 describes a network monitoring system comprising a plurality of collection devices that collect packet information and flow statistics for each predetermined type, and an aggregation device that acquires the information collected by one or more of the collection devices and aggregates it according to a predetermined method.

Patent Document 4 describes a process monitoring device that calculates statistical data and diagnoses process abnormalities.

Patent Document 5 describes an abnormal traffic detection method that calculates the top N (N ≥ 2) field values and the like of a specified field, accumulates them, calculates the similarity between newly calculated data and past data, and detects an anomaly by comparing the similarity with a threshold. Patent Document 5 also describes calculating a statistical distribution from past similarity data and using its 99% value as the threshold.

Known anomaly detection algorithms include outlier detection (see Patent Documents 6 to 9), change point detection (see Patent Documents 10 and 11), and abnormal behavior detection (see Patent Document 12).

Patent Document 1: JP 2006-246325 A (paragraph 0008, FIG. 1)
Patent Document 2: JP 2001-314037 A (paragraph 0007, FIG. 1)
Patent Document 3: JP 2007-13590 A (paragraphs 0021-0023, 0032-0043)
Patent Document 4: JP 2007-65883 A (paragraphs 0020, 0032)
Patent Document 5: JP 2008-167484 A (paragraphs 0007, 0019)
Patent Document 6: JP 2001-101154 A
Patent Document 7: JP 2003-5970 A
Patent Document 8: JP 2004-78981 A
Patent Document 9: JP 2007-18530 A
Patent Document 10: JP 2004-54370 A
Patent Document 11: JP 2004-4658 A
Patent Document 12: JP 2004-309998 A

Non-Patent Document 1: Masaya Yamagata, Yasuhiro Murata, Daisuke Inoue, Masashi Eto, Katsunari Yoshioka, Koji Nakao, "Extension of a Change Point Detection Engine Considering All-Port Access Monitoring in a Macro Analysis Environment", 2008 Symposium on Cryptography and Information Security (SCIS2008), 3C1-4, 2008

As described in Patent Document 1, there are schemes that detect anomalies after aggregating a plurality of data series. However, the technique of Patent Document 1 cannot automatically identify, when an anomaly is detected, which of the pre-aggregation data series the anomaly or change has occurred in. Moreover, when the number of flows itself increases sharply, as in a denial-of-service (DoS) attack that causes excessive access, it is difficult to reduce the number of anomaly detection targets through aggregation.

The system described in Patent Document 2 performs anomaly detection on every data series, so the amount of detection processing grows in proportion to the number of data series. In addition, that system does not consolidate anomaly information that does not originate from the same cause. Furthermore, detecting anomalies individually for each of a plurality of data series requires setting an anomaly detection threshold for each data series, and the threshold must be reset whenever the trend of the data in a series changes. Detecting anomalies per data series therefore requires individual threshold settings for every series, which imposes a heavy workload. Automatic anomaly detection by data mining is also conceivable, but because of the nature of the statistical processing, anomalies are reported at a fixed rate (for example, the top 0.1% of the anomaly score distribution). The number of detections therefore grows in proportion to the number of data series, and it becomes difficult to pick out the data series that actually need to be identified.

In the method described in Non-Patent Document 1, a person identifies which data series caused the anomaly. When there are many kinds of pre-aggregation data series, this identification can become difficult. For example, suppose one anomaly occurs per day in each data series. With 1,000 data series, 1,000 anomalies occur per day, that is, at least one anomaly every two minutes. With 100,000 data series, 100,000 anomalies occur per day, that is, at least one anomaly per second. At such a pace, manual identification of the data series is difficult. Moreover, when data series are identified manually, the accuracy and the time required vary with the operator's skill, and it is difficult to keep identifying data series according to a consistent standard.
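The arithmetic behind this example can be checked with a short sketch. It assumes, as the text does, exactly one anomaly per data series per day; the function name `seconds_between_anomalies` is hypothetical and used only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def seconds_between_anomalies(num_series: int,
                              anomalies_per_series_per_day: float = 1.0) -> float:
    """Average spacing, in seconds, between anomalies summed over all series."""
    anomalies_per_day = num_series * anomalies_per_series_per_day
    return SECONDS_PER_DAY / anomalies_per_day


# The two cases from the text: 1,000 series and 100,000 series.
for n in (1_000, 100_000):
    print(n, seconds_between_anomalies(n))
```

With 1,000 series the average spacing is 86.4 seconds (under two minutes), and with 100,000 series it is 0.864 seconds (under one second), which agrees with the figures quoted above.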

Accordingly, an object of the present invention is to provide an anomaly detection system, an anomaly detection method, and an anomaly detection program that can efficiently detect which data series an anomaly or change has occurred in, even when the number of data series is very large.

An anomaly detection system according to the present invention comprises: aggregation means for aggregating data series determined to belong to the same group by calculating the sum of the data values, or the sum of powers of the data values, of those data series; statistic calculation means for calculating a statistic of the data values of the data series before aggregation; group detection means for detecting, based on the sum calculated for each group, a group that includes a data series in which an anomaly or change has occurred; and data series specifying means for identifying, based on the statistic, the data series in which the anomaly or change has occurred from among the data series belonging to the group detected by the group detection means.

An anomaly detection method according to the present invention comprises: aggregating data series determined to belong to the same group by calculating the sum of the data values, or the sum of powers of the data values, of those data series; calculating a statistic of the data values of the data series before aggregation; detecting, based on the sum calculated for each group, a group that includes a data series in which an anomaly or change has occurred; and identifying, based on the statistic, the data series in which the anomaly or change has occurred from among the data series belonging to the detected group.

An anomaly detection program according to the present invention causes a computer to execute: an aggregation process of aggregating data series determined to belong to the same group by calculating the sum of the data values, or the sum of powers of the data values, of those data series; a statistic calculation process of calculating a statistic of the data values of the data series before aggregation; a group detection process of detecting, based on the sum calculated for each group, a group that includes a data series in which an anomaly or change has occurred; and a data series specifying process of identifying, based on the statistic, the data series in which the anomaly or change has occurred from among the data series belonging to the group detected in the group detection process.

According to the present invention, it is possible to efficiently detect which data series an anomaly or change has occurred in, even when the number of data series is very large.

FIG. 1 is a block diagram showing an example of the anomaly detection system according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing an example of the processing flow of the first embodiment.
FIG. 3 is an explanatory diagram showing an example of the number of data items generated within a fixed period W.
FIG. 4 is an explanatory diagram schematically showing outlier detection.
FIG. 5 is an explanatory diagram schematically showing change point detection.
FIG. 6 is an explanatory diagram schematically showing abnormal behavior detection.
FIG. 7 is an explanatory diagram showing an example of the data values of a plurality of data series.
FIG. 8 is an explanatory diagram showing the change in the aggregation result of the data series shown in FIG. 7.
FIG. 9 is an explanatory diagram schematically showing the calculated threshold T_k.
FIG. 10 is a block diagram showing an example of the anomaly detection system according to the second embodiment of the present invention.
FIG. 11 is a flowchart showing an example of the processing flow of the third embodiment.
FIG. 12 is a block diagram showing the minimum configuration of the present invention.
FIG. 13 is a block diagram showing another configuration example of the present invention.
FIG. 14 is a block diagram showing another configuration example of the present invention.

Embodiments of the present invention are described below with reference to the drawings.

Embodiment 1.
FIG. 1 is a block diagram showing an example of the anomaly detection system according to the first embodiment of the present invention. The anomaly detection system 10 of the present invention comprises data series aggregation means 2, statistical information calculation means 6, conversion information storage means 3, anomaly detection means 4, and data series specifying means 5. The anomaly detection system 10 receives data of a plurality of data series from an information source 1.

The information source 1 generates data of a plurality of data series. The data of each data series is numerical data, and the numerical value that the data represents is referred to as a data value. The information source 1 need not be a single device and may be realized by a plurality of devices. The data of each data series generated by the information source 1 changes over time. The total number of data series need not be just a few and may be very large, for example, 1,000 or more, or 10,000 or more.

The data of each data series may be, for example, the traffic count per unit time. The information source 1 may generate such data as a log of communication traffic. Each data series may be received by the anomaly detection system 10 as TCP (Transmission Control Protocol) or UDP (User Datagram Protocol) packets, with the individual data items of each data series stored in the packet payloads.

The individual data series may each correspond to a different port number. That is, the port number corresponding to a data series may differ from one data series to another.

The data series are grouped in advance. In the example shown in FIG. 1, data series 1 through N are defined as belonging to one group, and likewise data series S_k through E_k are defined as belonging to one group.

More specifically, the data series may correspond to TCP or UDP source port numbers (65,536 kinds) and destination port numbers (65,536 kinds). Data series may also be defined per source IP address or per destination IP address. In IPv4 (Internet Protocol version 4), there are about 4.3 billion (2^32) kinds of source or destination IP addresses.
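As a minimal sketch of such pre-grouping, the following assigns each port-number data series to a group made of a contiguous range S_k through E_k. The equal group width of 256 ports and the names `group_index` and `group_bounds` are assumptions for illustration; the text only states that the grouping is defined in advance.

```python
NUM_PORTS = 2 ** 16           # 65,536 TCP/UDP port numbers
NUM_IPV4_ADDRESSES = 2 ** 32  # 4,294,967,296 IPv4 addresses (about 4.3 billion)
GROUP_WIDTH = 256             # ASSUMPTION: equal-width groups, chosen for illustration


def group_index(port: int, width: int = GROUP_WIDTH) -> int:
    """Return the index k of the group that the data series for `port` belongs to."""
    if not 0 <= port < NUM_PORTS:
        raise ValueError(f"port out of range: {port}")
    return port // width


def group_bounds(k: int, width: int = GROUP_WIDTH) -> tuple[int, int]:
    """Return (S_k, E_k), the first and last port number of group k."""
    start = k * width
    end = min(start + width, NUM_PORTS) - 1
    return start, end


print(group_index(443), group_bounds(group_index(443)))
```

With this width, the 65,536 per-port series collapse into 256 groups, so the group-level detector inspects 256 aggregated series instead of 65,536.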

The data series aggregation means 2 aggregates, for each group, the data series defined as belonging to that group. Specifically, it calculates the sum of the data values, or the sum of powers of the data values, of the data series defined as belonging to the same group. The sum obtained by this calculation is the aggregation result for the data series. The data series aggregation means 2 performs this per-group aggregation once every fixed period, denoted by the symbol W. When calculating the sum of powers of the data values, the exponent is not limited to a natural number. For example, the exponent may be 1/2, in which case the sum of the data values raised to the power 1/2 (that is, the sum of the square roots of the data values) is calculated.

In the example shown in FIG. 1, the data series aggregation means 2 aggregates data series 1 through N and, likewise, aggregates data series S_k through E_k. The other data series are aggregated group by group in the same way.

The data series aggregation means 2 calculates the sum of the data values, or the sum of powers of the data values, of the data series in the same group to obtain a numerical aggregation result. The data are thereby consolidated, and the number of data series becomes smaller than before the summation.
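The aggregation described above can be sketched as a single function. The name `aggregate` and the sample values are hypothetical, and the sketch assumes non-negative data values so that a fractional exponent such as 1/2 stays real-valued.

```python
from typing import Iterable


def aggregate(values: Iterable[float], exponent: float = 1.0) -> float:
    """Sum of the data values raised to `exponent` (exponent=1 gives the plain sum)."""
    total = 0.0
    for v in values:
        if v < 0:
            # ASSUMPTION: data values such as traffic counts are non-negative.
            raise ValueError("negative data value")
        total += v ** exponent
    return total


group_values = [4.0, 9.0, 16.0]      # hypothetical data values of one group
print(aggregate(group_values))       # plain sum of the data values
print(aggregate(group_values, 0.5))  # sum of the square roots of the data values
```

An exponent below 1 compresses large values, so one very busy data series dominates the group total less than it would in a plain sum.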

The statistical information calculation means 6 calculates a statistic (statistical information) of the data values of the data series before they are aggregated by the data series aggregation means 2. Examples of the statistic include the mean and the variance, but the statistic calculated by the statistical information calculation means 6 is not limited to these.

Hereinafter, the statistical information calculated by the statistical information calculation means 6 is referred to as conversion information. The conversion information may include information other than statistical information (specifically, the exclusion information described in the second embodiment and elsewhere), but the first embodiment describes the case in which the conversion information includes only statistical information.

The conversion information storage means 3 is a storage device that stores conversion information. The statistical information calculation means 6 stores the calculated statistical information (conversion information) in the conversion information storage means 3. The set of conversion information stored in the conversion information storage means 3 can be regarded as a conversion information database.

In the example shown in FIG. 1, the statistical information calculation means 6 calculates statistical information 3_1a (conversion information 3_1) for data series 1 through N and stores it in the conversion information storage means 3. Likewise, it calculates statistical information 3_ka (conversion information 3_k) for data series S_k through E_k and stores it in the conversion information storage means 3. The statistical information calculation means 6 similarly calculates statistical information for the data series of each of the other groups and stores it in the conversion information storage means 3. As a result, the statistical information calculated for each individual group is stored in the conversion information storage means 3.

The anomaly detection means 4 detects a group that includes a data series in which an anomaly or change has occurred, based on the transition of the aggregation result for each group (that is, the sum of the data values, or of powers of the data values, calculated for each group). In other words, it determines for each group whether a data series in which an anomaly or change has occurred exists in that group. The anomaly detection algorithm used for this determination is defined in advance for each group and may differ from group to group. For example, the algorithm that determines whether an anomaly or the like has occurred in the group of data series 1 through N may differ from the algorithm that makes that determination for the group of data series S_k through E_k.

The outlier detection algorithms described in Patent Documents 6 to 9, the change point detection algorithms described in Patent Documents 10 and 11, or the abnormal behavior detection algorithm described in Patent Document 12 may be applied to the anomaly detection means 4. Other anomaly detection algorithms may also be applied.
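The idea of assigning a separate detection algorithm to each group can be sketched as a lookup table of detector functions applied to each group's history of aggregation results. The two detectors below are deliberately simple stand-ins chosen for illustration; they are not the algorithms of Patent Documents 6 to 12, and all names and parameter values are hypothetical.

```python
import statistics
from typing import Callable, Dict, List, Sequence

# A detector takes a group's aggregation results, oldest first, and reports
# whether the most recent result looks anomalous.
Detector = Callable[[Sequence[float]], bool]


def exceeds_mean_by_sigmas(k: float) -> Detector:
    """Stand-in outlier test: latest value is more than k std devs above the past mean."""
    def detect(history: Sequence[float]) -> bool:
        if len(history) < 3:
            return False  # not enough past results to judge
        past, latest = history[:-1], history[-1]
        mean = statistics.fmean(past)
        std_dev = statistics.pstdev(past)
        return std_dev > 0 and (latest - mean) > k * std_dev
    return detect


def jumps_by_ratio(ratio: float) -> Detector:
    """Stand-in change test: latest value is at least `ratio` times the previous one."""
    def detect(history: Sequence[float]) -> bool:
        if len(history) < 2:
            return False
        previous, latest = history[-2], history[-1]
        return previous > 0 and latest / previous >= ratio
    return detect


# ASSUMPTION: group ids 0 and 1 and their detector choices are illustrative only.
detectors: Dict[int, Detector] = {
    0: exceeds_mean_by_sigmas(3.0),
    1: jumps_by_ratio(2.0),
}


def detect_groups(histories: Dict[int, Sequence[float]]) -> List[int]:
    """Return the ids of the groups whose own detector fires on their history."""
    return [g for g, history in histories.items() if detectors[g](history)]


print(detect_groups({0: [10, 11, 9, 10, 50], 1: [100, 105, 110]}))
```

Only group 0 is reported here: its latest aggregation result is far above its past mean, while group 1 grows by less than the factor its detector requires.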

When the anomaly detection means 4 detects a group that includes a data series in which an anomaly or change has occurred, the data series specifying means 5 uses the statistical information calculated from that group to identify which data series in the group the anomaly or change has occurred in. For example, when the group of data series 1 through N is detected by the anomaly detection means 4, the data series specifying means 5 uses the statistical information 3_1a to identify, from among data series 1 through N, the data series in which the anomaly or change has occurred. The same applies when another group is detected.
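This passage does not state how the statistic is turned into a per-series decision, so the following is only a sketch under an assumed rule: a series is flagged when its latest data value exceeds the stored group mean plus c standard deviations. The rule, the value of `c`, the `GroupStats` layout, and the sample port data are all hypothetical.

```python
import statistics
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class GroupStats:
    """Statistical information kept for one group (hypothetical layout)."""
    mean: float
    std_dev: float


def compute_group_stats(values: Sequence[float]) -> GroupStats:
    """Statistics of the pre-aggregation data values of one group."""
    return GroupStats(statistics.fmean(values), statistics.pstdev(values))


def specify_series(latest: Dict[str, float], stats: GroupStats,
                   c: float = 3.0) -> List[str]:
    """Return the series whose latest value exceeds mean + c * std_dev.

    ASSUMPTION: this threshold rule is illustrative, not taken from the source.
    """
    threshold = stats.mean + c * stats.std_dev
    return [series_id for series_id, value in latest.items() if value > threshold]


# Conversion information store: group id -> statistics from earlier data.
conversion_store: Dict[int, GroupStats] = {
    0: compute_group_stats([10, 12, 11, 9, 10, 8]),
}

# Group 0 was reported by the group-level detector; inspect only its members.
latest_values = {"port 22": 11, "port 80": 9, "port 445": 95}
print(specify_series(latest_values, conversion_store[0]))
```

The per-series check runs only for groups that the group-level detector has already reported, which is where the reduction in processing comes from.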

The data series aggregation means 2, the statistical information calculation means 6, the anomaly detection means 4, and the data series specifying means 5 are realized, for example, by the CPU of a computer operating according to a program (the anomaly detection program). In this case, the CPU reads the anomaly detection program from program storage means (not shown) in which it is stored and, according to that program, operates as the data series aggregation means 2, the statistical information calculation means 6, the anomaly detection means 4, and the data series specifying means 5. Alternatively, these means may each be realized as a separate dedicated circuit.

Next, the operation is described. FIG. 2 is a flowchart showing an example of the processing flow of the first embodiment. The following description focuses on the group of data series S_k through E_k, but the system operates in the same way for the other groups.

First, the anomaly detection system 10 acquires the data of data series S_k through E_k from the information source 1 (step A1). For example, it receives the data of each data series transmitted by the information source 1. Let N_k be the number of data series from data series S_k through data series E_k.

Next, the data series aggregation means 2 aggregates the data series S_k through E_k acquired within the fixed period W (step A2). For example, it calculates the sum of the data values of data series S_k through E_k acquired within the fixed period W and takes the result as the aggregation result. Alternatively, it may calculate the sum of powers of the data values and take that as the aggregation result. As already described, the power calculation may be a square root.

For a single data series, the number of data items acquired within the fixed period W may be one, or it may be two or more. FIG. 3 is an explanatory diagram showing an example of the number of data items generated within the fixed period W. In the example shown in FIG. 3, three data items occur in data series 1 within the fixed period W, and one data item occurs in data series 2. In such a case, all the data acquired within the fixed period W are aggregated. For example, the data series aggregation means 2 calculates the sum of the data values, or the sum of powers of the data values, over the three data items of data series 1 and the one data item of data series 2 shown in FIG. 3.
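Aggregation over the fixed period W, where a series may contribute several data items, can be sketched as follows. The record layout `(timestamp, series_id, value)`, the function name, and all sample numbers are assumptions for illustration; the counts mirror the FIG. 3 example of three items in data series 1 and one item in data series 2.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

Record = Tuple[float, int, float]  # (timestamp in seconds, series id, data value)


def aggregate_by_window(records: Iterable[Record], window: float,
                        group_of: Callable[[int], int],
                        exponent: float = 1.0) -> Dict[Tuple[int, int], float]:
    """Sum value**exponent per (window index, group id).

    Every data item that falls inside a window is included, regardless of how
    many items its series produced in that window.
    """
    totals: Dict[Tuple[int, int], float] = defaultdict(float)
    for timestamp, series_id, value in records:
        key = (int(timestamp // window), group_of(series_id))
        totals[key] += value ** exponent
    return dict(totals)


# Hypothetical data: series 1 produces three items and series 2 one item in W.
records = [(1.0, 1, 5.0), (4.0, 1, 2.0), (8.0, 1, 3.0), (6.0, 2, 7.0)]
same_group = lambda series_id: 0  # both series belong to group 0 in this example
print(aggregate_by_window(records, window=10.0, group_of=same_group))
```

All four items land in window 0 of group 0, so the aggregation result for that window is their sum, 17.0.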

The statistical information calculation means 6 calculates a statistic of the data values acquired from the information source 1 and stores the calculated statistic in the conversion information storage means 3 (step A3). Examples of the calculation follow. In each of the equations below, v_i is the data value of each data item acquired as data of the respective data series. The v_i may be the data values of the data series of the group of interest acquired up to the time the statistic is calculated. Alternatively, the v_i may be the data values received after a point a predetermined period before the time the statistic is calculated; that is, old data from before that point may be ignored. α_k is the mean of the data values of the data series and may be obtained by equation (1) below. N_k is the number of data series in the group of interest (in this example, the group of data series S_k through E_k).

For example, the statistical information calculation means 6 may obtain the statistic σ_k for the group of interest by calculating equation (2) below.

Equation (2) obtains the standard deviation as the statistic σ_k.

As another example, the statistical information calculation means 6 may obtain the statistic σ_k for the group of interest by calculating Equation (3) below.

Equation (3) obtains the harmonic mean as the statistic σ_k.

As a further example, the statistical information calculation means 6 may obtain the statistic σ_k for the group of interest by calculating Equation (4) below.

Equation (4) gives, as the statistic σ_k, the exponential function value of the arithmetic mean of the data values. The base of this exponential function is e. When the statistic is obtained by Equation (4), the calculation can be prevented from becoming infeasible even if data values with many digits appear.

Three equations, Equations (2) to (4), have been shown here for calculating the statistic, but the method of calculating the statistic is not limited to these examples, and the statistical information calculation means 6 may calculate the statistic by another equation.
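The mean α_k and two of the statistics named above (standard deviation and harmonic mean) can be sketched with Python's standard `statistics` module. The function names and sample values are assumptions, and the use of the population standard deviation (`pstdev`) rather than the sample form is also an assumption, since the text names the statistic without fixing that detail.

```python
import statistics
from typing import Sequence


def group_mean(values: Sequence[float]) -> float:
    """Mean alpha_k of the data values v_i of one group."""
    return statistics.fmean(values)


def group_std(values: Sequence[float]) -> float:
    """Standard deviation used as sigma_k (population form assumed)."""
    return statistics.pstdev(values)


def group_harmonic_mean(values: Sequence[float]) -> float:
    """Harmonic mean used as sigma_k; requires non-negative data values."""
    return statistics.harmonic_mean(values)


# Hypothetical data values v_i of one group.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
alpha_k = group_mean(values)
sigma_std = group_std(values)
sigma_harmonic = group_harmonic_mean(values)
```

Restricting `values` to recent data, as the description allows, only changes which values are passed in.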

Following step A3, the abnormality detection means 4 detects a group that includes a data series in which an abnormality or change has occurred, based on the transition of the aggregation result calculated for each group of data series (step A4). For example, for the group of data series S_k to E_k, it determines whether a data series in which an abnormality or change has occurred exists in the group. It then identifies each group determined to contain such a data series.

The abnormality detection means 4 may use, for example, outlier detection as the abnormality detection algorithm and, when an aggregation result is an outlier, detect the group from which that aggregation result was obtained. FIG. 4 is an explanatory diagram schematically showing outlier detection. Outlier detection is an algorithm that detects, among the obtained data, data 91 that is statistically judged to deviate from the other data as an abnormality (outlier).
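A minimal sketch of outlier detection on the aggregation results follows. The description does not specify a particular outlier algorithm, so the z-score rule, the function name `is_outlier`, the `z_limit` parameter, and the sample totals are all assumptions made for illustration.

```python
import statistics
from typing import Sequence


def is_outlier(history: Sequence[float], latest: float, z_limit: float = 3.0) -> bool:
    """Flag `latest` when it lies more than z_limit standard deviations
    from the mean of the past aggregation results in `history`."""
    mean = statistics.fmean(history)
    spread = statistics.pstdev(history)
    if spread == 0.0:
        # A perfectly flat history: any different value deviates from it.
        return latest != mean
    return abs(latest - mean) / spread > z_limit


# Hypothetical past aggregation results of one group.
history = [100.0, 102.0, 98.0, 101.0, 99.0]
normal_case = is_outlier(history, 101.0)
abnormal_case = is_outlier(history, 160.0)
```

Only the per-group totals are examined here, so the cost of this check depends on the number of groups rather than the number of data series.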

The abnormality detection means 4 may also use, for example, change point detection as the abnormality detection algorithm and, when an aggregation result changes abruptly, detect the group from which that aggregation result was obtained. FIG. 5 is an explanatory diagram schematically showing change point detection. Change point detection is an algorithm that detects an abrupt change in data. In the example shown in FIG. 5, the trend of the data fluctuation changes at time P. The abnormality detection means 4 may detect such an abrupt change in the aggregation result.
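Change point detection on the aggregation results can be sketched as a comparison of the means of two adjacent windows. This specific rule, the function name `find_change_point`, and the `window` and `min_shift` parameters are assumptions; the description only requires that an abrupt change in the aggregation result be detected.

```python
import statistics
from typing import Optional, Sequence


def find_change_point(series: Sequence[float], window: int = 3,
                      min_shift: float = 10.0) -> Optional[int]:
    """Return the index where the mean of the next `window` values differs
    most from the mean of the previous `window` values, provided that the
    difference exceeds min_shift; return None when no such index exists."""
    best_index, best_shift = None, min_shift
    for t in range(window, len(series) - window + 1):
        before = statistics.fmean(series[t - window:t])
        after = statistics.fmean(series[t:t + window])
        shift = abs(after - before)
        if shift > best_shift:
            best_index, best_shift = t, shift
    return best_index


# Hypothetical aggregation results whose level changes at index 4.
totals = [100.0, 101.0, 99.0, 100.0, 140.0, 141.0, 139.0, 140.0]
change_at = find_change_point(totals)
```

In this sample the level of the totals moves from about 100 to about 140, and the largest difference between adjacent window means occurs at index 4.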

The abnormality detection means 4 may also use, for example, abnormal behavior detection as the abnormality detection algorithm and, when the pattern in which aggregation results appear changes, detect the group from which the aggregation results were obtained. FIG. 6 is an explanatory diagram schematically showing abnormal behavior detection. In abnormal behavior detection, the transition of data is monitored, the normal transition pattern of the data is identified, and data appearing in a pattern that does not match that transition pattern is detected as abnormal. The abnormality detection means 4 may identify the normal transition pattern of the aggregation results from past aggregation results and, on detecting a change in the aggregation results that does not match that pattern, detect the group as abnormal.

When the abnormality detection means 4 determines that there is no group in which an abnormality or change has occurred (No in step A5), the process returns to step A1.

When the abnormality detection means 4 determines that there is a group in which an abnormality or change has occurred (Yes in step A5), the data series specifying means 5 uses the statistical information of the group detected in step A4 (the group in which the abnormality or change has occurred) to determine in which data series of that group the abnormality or change has occurred (step A6). In step A6, the data series specifying means 5 calculates a threshold from the statistical information of the group detected in step A4 and determines that a data series whose data value used in the aggregation of step A2 exceeds the threshold is a data series in which an abnormality or change has occurred.

In step A6, the data series specifying means 5 may calculate the threshold as a function of the number of data series in the group in which the abnormality or change was detected, the statistic calculated for that group in step A3, and the mean of the data values of the data series of that group. If an abnormality or change has occurred in the group of data series S_k to E_k, the number of data series is N_k. Letting T_k be the threshold calculated for that group, T_k can be expressed as a function of N_k, the statistic σ_k, and the mean α_k of the data values, as in Equation (5). As such a function, the functions exemplified in Equation (6) or Equation (7) can be used, for example. As shown in Equation (6), the threshold T_k may be calculated without using N_k.

For example, suppose that the standard deviation (see Equation (2)) is calculated as the statistic σ_k and that the data series specifying means 5 calculates the threshold by Equation (6). In this case, the threshold is the mean of the data values plus one half of the deviation, and the data series specifying means 5 detects each data series in which a data value at or above that threshold has occurred.
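The threshold described for Equation (6), the mean plus one half of the standard deviation, can be sketched as follows. The function names, the population form of the standard deviation, and the data values are assumptions for illustration; the ten series numbers 3000 to 3009 follow the example used later in the description.

```python
import statistics
from typing import Dict, List, Sequence


def threshold_eq6(values: Sequence[float]) -> float:
    """T_k = alpha_k + sigma_k / 2, with sigma_k the standard deviation."""
    return statistics.fmean(values) + statistics.pstdev(values) / 2.0


def flag_series(group_values: Dict[int, float]) -> List[int]:
    """Return the series numbers whose data value is at or above T_k."""
    t_k = threshold_eq6(list(group_values.values()))
    return sorted(sid for sid, v in group_values.items() if v >= t_k)


# Hypothetical data values at the time the group was detected:
# eight series near their usual level, two series far above it.
group = {sid: 10.0 for sid in range(3000, 3010)}
group[3003] = 60.0
group[3007] = 60.0

t_k = threshold_eq6(list(group.values()))   # mean 20, deviation 20
flagged = flag_series(group)
```

Because T_k is derived from the group's own values, no per-series threshold has to be configured in advance.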

As another example, suppose that the harmonic mean (see Equation (3)) is calculated as the statistic σ_k and that the data series specifying means 5 calculates the threshold by Equation (7). In this case, the larger the mean, the smaller the difference between the threshold and the mean. Consequently, when the mean of the data values is small, a data series is unlikely to be judged abnormal even if it has a data value far from the mean; when the mean of the data values is large, a data series is likely to be judged abnormal even if its data value differs only slightly from the mean.

The abnormality detection system 10 executes steps A1 to A6 above for each predetermined group.

Next, the process of identifying a group in which an abnormality has occurred, and then identifying the data series in which the abnormality has occurred, is described using a specific example. In each graph shown in FIGS. 7(a) to 7(d), the horizontal axis represents the data series and the vertical axis represents the data value. FIG. 7 shows the data values of data series S_k to E_k. Specifically, it shows the case where data series S_k to E_k are the ten data series 3000 to 3009. FIGS. 7(a) to 7(d) show the data values of data series 3000 to 3009 at times 10:00, 10:01, 10:02, and 10:09, respectively.

FIG. 8 is an explanatory diagram showing the change in the aggregation result of the data series shown in FIG. 7, covering the period from 10:00 to 10:10. The horizontal axis in FIG. 8 represents time, and the vertical axis represents the aggregation result at each time. For example, when the data values of the ten data series shown in FIG. 7(a) are acquired at time 10:00, the data series aggregation means 2 aggregates those data values. In FIG. 8, the point corresponding to the aggregation result of the data shown in FIG. 7(a) is labeled (a). Similarly, the points corresponding to the aggregation results of the data shown in FIGS. 7(b), 7(c), and 7(d) are labeled (b), (c), and (d), respectively.

When the data series are aggregated at each time, the statistical information calculation means 6 also calculates the statistic for the group of data series 3000 to 3009 and stores it in the conversion information storage means 3.

The abnormality detection means 4 determines that the aggregation result labeled (d) in FIG. 8 differs from the other aggregation results, and determines that an abnormality has occurred in the group of data series 3000 to 3009. The data series specifying means 5 then calculates the threshold T_k using the statistic calculated for this group. FIG. 9 is an explanatory diagram schematically showing the calculated threshold T_k. The graph shown in FIG. 9 is the same as the graph shown in FIG. 7(d) and shows the data values of data series 3000 to 3009 at the time when the aggregation result changed. The threshold T_k is superimposed on this graph. Having calculated the threshold T_k shown in FIG. 9, the data series specifying means 5 determines that an abnormality has occurred in each data series whose data value at the time the aggregation result changed exceeds the threshold T_k. In this example, it determines that an abnormality has occurred in two data series.

Even when no individual data series fluctuates much, the fluctuations of the individual data series may add up and produce a large fluctuation in the aggregation result. In such a case, the data value of each data series is at or below the threshold, and the data series specifying means 5 determines that no data series has an abnormality or fluctuation.

According to this embodiment, the data series aggregation means 2 aggregates a large number of data series group by group by calculating sums, and the abnormality detection means 4 detects a group in which an abnormality or change has occurred. Rather than judging the presence of an abnormality directly for each individual data series, the data values are aggregated for each group, which reduces the number of targets to be judged. The data series specifying means 5 then detects, within the detected group, the data series in which the abnormality or change has occurred. Because aggregation reduces the number of targets to be judged, the data series in which an abnormality or change has occurred can be detected efficiently.

When a data series is detected within a group, a threshold is calculated from the statistics calculated by the statistical information calculation means 6 (α_k, σ_k, and so on), and a data series exceeding the threshold is detected. There is therefore no need to set or update a threshold manually for each data series in advance, and the data series can be detected efficiently.

A fluctuation in a particular data value may also fail to appear as a fluctuation in the aggregation result. Even if such a fluctuation is caused by an abnormality, the abnormality is minor and may be ignored unless the fluctuation is large enough to appear in the aggregation result. In the present invention, aggregation keeps minor abnormalities from being detected, so data series in which a large abnormality or change has occurred can be detected efficiently.

In general, when an abnormality or change occurs, the data value of the data series becomes extremely large. For this reason, data series whose data values differ in order of magnitude (number of digits) can be included in the same group when the groups are defined. For example, a data series whose value is normally about 10 and a data series whose value is normally about 1000 can be placed in the same group. When an abnormality occurs in the data series whose value is normally about 10, its data value becomes large even in comparison with 1000. The aggregation result itself therefore fluctuates greatly, and it can be detected that an abnormality has occurred in the group. When the fluctuation of a small-order data series is small (for example, when a value that is normally about 10 becomes about 50), the fluctuation of the aggregation result is inconspicuous because the group also contains a data series with a value of about 1000. In this case, as described above, the abnormality in the small-order data series may be ignored. In general, an abnormality causes the data value to fluctuate very greatly, so the aggregation result of the group also fluctuates greatly, and it can be determined that an abnormality has occurred in one of the data series in the group.
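The effect of mixing data series of different orders of magnitude can be checked with simple arithmetic. In the sketch below, the normal values 10 and 1000 and the minor fluctuation to 50 come from the example in the text; the abnormal value 5000 and the helper name `relative_change` are assumptions.

```python
from typing import Sequence


def relative_change(baseline: Sequence[float], current: Sequence[float]) -> float:
    """Relative change of the group sum with respect to the normal-time sum."""
    base_total = sum(baseline)
    return (sum(current) - base_total) / base_total


normal = [10.0, 1000.0]    # usual values of the two data series
minor = [50.0, 1000.0]     # small-order series rises from 10 to 50
major = [5000.0, 1000.0]   # small-order series rises far above 1000 (assumed value)

minor_change = relative_change(normal, minor)   # 40 / 1010, about 4 percent
major_change = relative_change(normal, major)   # 4990 / 1010, almost fivefold
```

The minor fluctuation moves the group sum by only a few percent, whereas the large one multiplies it several times, which is what makes the group-level check sufficient for large abnormalities.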

Because a data series of larger order is considered more likely to affect the aggregation result when an abnormality occurs, the data series specifying means 5 may determine that the abnormality or change has occurred in the data series of the largest order within the detected group. However, a group may be detected even though no individual data series is abnormal, and abnormalities may occur in a plurality of data series as shown in FIG. 9, so it is preferable to calculate the threshold and detect the data series by comparing the threshold with the data values.

Embodiment 2.
FIG. 10 is a block diagram showing an example of the abnormality detection system according to the second embodiment of the present invention. Components similar to those of the first embodiment are denoted by the same reference numerals as in FIG. 1, and their detailed description is omitted. The abnormality detection system 10 of the present invention includes the data series aggregation means 2, the statistical information calculation means 6, the conversion information storage means 3, the abnormality detection means 4, the data series specifying means 5, and the exclusion information editing means 7.

In the first embodiment, the conversion information contained only statistical information (statistics); in this embodiment, the conversion information also contains exclusion information. Exclusion information indicates the data series to be excluded from aggregation and is defined, for example, for each group. In the example shown in FIG. 10, exclusion information 3_1b is defined for the group of data series 1 to N, and exclusion information 3_kb is defined for the group of data series S_k to E_k. If a group contains no data series to be excluded, it need not have exclusion information.

The exclusion information editing means 7 edits the exclusion information of each group. The case where the exclusion information editing means 7 edits the exclusion information in response to a user operation is described here as an example, but the exclusion information may be edited in other ways. In response to a user operation, the exclusion information editing means 7 stores new exclusion information in the conversion information storage means 3. It may also update or delete exclusion information stored in the conversion information storage means 3 in response to a user operation.

In this embodiment, the data series aggregation means 2a aggregates the data series for each group, as in the first embodiment. However, the data series aggregation means 2a excludes from aggregation the data series indicated by the exclusion information of the group of interest. For example, suppose that, among data series S_k to E_k, data series S_k is designated in the exclusion information. In this case, the data series aggregation means 2a performs the aggregation using the data values of the data series other than S_k among data series S_k to E_k. In other respects it is the same as the data series aggregation means 2 of the first embodiment.

The statistical information calculation means 6a calculates the statistic for each group, as in the first embodiment. However, the statistical information calculation means 6a excludes from the calculation of the statistic the data series indicated by the exclusion information of the group of interest. For example, suppose that, among data series S_k to E_k, data series S_k is designated in the exclusion information. In this case, the statistical information calculation means 6a calculates σ_k and α_k using the data values of the data series other than S_k among data series S_k to E_k. As the number of data series N_k, it may use the value obtained by subtracting the number of data series indicated by the exclusion information (the number of excluded data series) from the predetermined N_k.
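The handling of exclusion information in aggregation and in the statistics can be sketched as below. The function name `aggregate_with_exclusions`, the returned tuple layout, the population standard deviation, and the data values are assumptions made for illustration.

```python
import statistics
from typing import Dict, Set, Tuple


def aggregate_with_exclusions(group_values: Dict[int, float],
                              excluded: Set[int]) -> Tuple[float, float, float, int]:
    """Return (sum, alpha_k, sigma_k, N_k) over the non-excluded data series.

    N_k is the predetermined number of series minus the excluded ones.
    """
    kept = [v for sid, v in group_values.items() if sid not in excluded]
    if not kept:
        raise ValueError("every data series of the group is excluded")
    n_k = len(kept)
    total = sum(kept)
    alpha_k = statistics.fmean(kept)
    sigma_k = statistics.pstdev(kept)   # standard deviation assumed as sigma_k
    return total, alpha_k, sigma_k, n_k


# Hypothetical group in which series 3002 is known to be permanently abnormal.
group = {3000: 10.0, 3001: 12.0, 3002: 500.0, 3003: 14.0}
total, alpha_k, sigma_k, n_k = aggregate_with_exclusions(group, excluded={3002})
```

With series 3002 excluded, its value of 500 affects neither the group sum nor the statistics used later for the threshold.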

The data series aggregation means 2a and the statistical information calculation means 6a are realized, for example, by a CPU that operates according to a program. In this case, the CPU may operate as the data series aggregation means 2a, the statistical information calculation means 6a, and each of the other means according to the program.

In the second embodiment as well, the processing of steps A1 to A6 may be performed as in the first embodiment.

According to the second embodiment, designating in the exclusion information the data series that do not need abnormality detection keeps those data series from being detected. Examples of such data series include a data series already known to be under constant attack and a data series from which data cannot be acquired because of a failure or the like. In such data series, abnormalities continue to appear, data is absent, or the data does not change. Data series for which the occurrence of an abnormality or the like is known in advance can thus be kept from being detected.

Embodiment 3.
The third embodiment has the same configuration as the second embodiment (see FIG. 10). However, the data series specifying means 5 also adds exclusion information. When a group is detected by the abnormality detection means 4, the data series specifying means 5 detects, within that group, the data series in which an abnormality or change has occurred. This operation is the same as in the first and second embodiments. In the third embodiment, when the data series specifying means 5 detects a data series, it stores exclusion information indicating that data series in the conversion information storage means 3 as exclusion information of the group to which the data series belongs. The other components are the same as in the second embodiment.

FIG. 11 is a flowchart showing an example of the processing flow of the third embodiment. Steps A1 to A6 are the same as in the first and second embodiments. After detecting a data series in which an abnormality or the like has occurred (after step A6), the data series specifying means 5 stores exclusion information indicating the detected data series in the conversion information storage means 3 as exclusion information of the group to which that data series belongs (step A7).

As a result, a data series detected as one in which an abnormality or the like has occurred is excluded from aggregation. When data continues to be acquired and aggregated afterwards, that data series is treated as outside the scope of the aggregation processing.

In the third embodiment, the data series specifying means 5 stores exclusion information indicating the detected data series in the conversion information storage means 3, which prevents a data series in which an abnormality has occurred from being detected repeatedly.

The data value of a detected data series fluctuates greatly. If such a data series continues to be included in the aggregation and an abnormality then occurs in another data series, the fluctuation of that other data series is buried in the aggregation result (in other words, it is hardly reflected in the aggregation result), and abnormalities in other data series become difficult to detect. In the third embodiment, the detected data series is excluded from aggregation, so this problem is avoided.
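The flow of steps A2 to A7 with automatic exclusion can be sketched as follows. The fixed `total_limit` used as the group-level check, the threshold of mean plus half the standard deviation, the function name `process_window`, and the data values are all assumptions chosen to keep the illustration short; the description leaves the group-level detection algorithm open.

```python
import statistics
from typing import Dict, List, Set


def process_window(values: Dict[int, float], excluded: Set[int],
                   total_limit: float) -> List[int]:
    """Aggregate the non-excluded series, check the group, flag series,
    and register the flagged series as exclusion information (step A7)."""
    active = {sid: v for sid, v in values.items() if sid not in excluded}
    if not active or sum(active.values()) <= total_limit:
        return []                              # step A5: no abnormal group
    vals = list(active.values())
    t_k = statistics.fmean(vals) + statistics.pstdev(vals) / 2.0
    flagged = sorted(sid for sid, v in active.items() if v > t_k)
    excluded.update(flagged)                   # step A7: exclude from now on
    return flagged


excluded: Set[int] = set()
# Hypothetical window 1: series 3002 becomes abnormal.
first = process_window({3000: 10.0, 3001: 11.0, 3002: 900.0, 3003: 9.0},
                       excluded, total_limit=100.0)
# Hypothetical window 2: 3002 is still abnormal, and 3001 now becomes abnormal.
second = process_window({3000: 10.0, 3001: 300.0, 3002: 900.0, 3003: 9.0},
                        excluded, total_limit=100.0)
```

In the second window, series 3002 is no longer reported, and the smaller abnormality in series 3001 is found because the value 900 no longer dominates the mean and deviation.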

If it is confirmed that a data series detected in step A6 is not abnormal, or that the abnormality has subsided, deleting the information on that data series from the exclusion information restores the original detection state.

The third embodiment may be configured without the exclusion information editing means 7. That is, the exclusion information need not be edited by the exclusion information editing means 7, and only the data series specifying means 5 may add exclusion information.

In each of the first to third embodiments, the statistical information calculation means 6 calculates the statistic (step A3) together with the aggregation of the data series (step A2). Alternatively, the statistical information calculation means 6 may calculate the statistic from the data values of a group, and store it in the conversion information storage means 3, when that group is detected by the abnormality detection means 4 (that is, when a group determined to include a data series in which an abnormality or change has occurred is detected). The method of calculating the statistic is the same as in the embodiments already described.

Calculating the statistic only for the groups detected by the abnormality detection means 4 in this way reduces the amount of statistical information stored in the conversion information storage means 3.

Next, the minimum configuration of the present invention is described. FIG. 12 is a block diagram showing the minimum configuration of the present invention. The abnormality detection system of the present invention includes an aggregation means 71, a statistic calculation means 72, a group detection means 73, and a data series specifying means 74.

The aggregation means 71 (for example, the data series aggregation means 2 or 2a) aggregates the data series defined as belonging to the same group by calculating the sum of the data values, or of powers of the data values, of those data series.

The statistic calculation means 72 (for example, the statistical information calculation means 6 or 6a) calculates a statistic of the data values of the data series before they are aggregated.

The group detection means 73 (for example, the abnormality detection means 4) detects a group that includes a data series in which an abnormality or change has occurred, based on the sum calculated for each group.

The data series specifying means 74 (for example, the data series specifying means 5) specifies, based on the statistic, the data series in which an abnormality or change has occurred from among the data series belonging to the group detected by the group detection means 73.

With such a configuration, the data series in which an abnormality or change has occurred can be detected efficiently even when the number of data series is very large.

FIGS. 13 and 14 are block diagrams showing other configuration examples of the present invention. As shown in FIG. 13, the above embodiments disclose a configuration that includes an exclusion information storage means 75 (for example, the conversion information storage means 3) that stores exclusion information indicating the data series to be excluded from aggregation, and in which the aggregation means 71 excludes the data series indicated by the exclusion information from the calculation of the sum of the data values or of powers of the data values. With such a configuration, designating in the exclusion information the data series that do not need abnormality detection keeps those data series from being detected.

As shown in FIG. 14, the above embodiments also disclose a configuration that includes an exclusion information editing means 76 (for example, the exclusion information editing means 7) that edits the exclusion information stored in the exclusion information storage means 75 in response to a user operation.

The above embodiments also disclose a configuration in which the data series specifying means 74 stores, in the exclusion information storage means, exclusion information indicating the data series that it has identified as a data series in which an abnormality or change has occurred. With such a configuration, the same data series can be prevented from being detected repeatedly in succession.
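The exclusion behaviour described in the preceding paragraphs can be sketched as follows. This is a hedged illustration only: representing the exclusion information as a Python set, and the names `aggregate_with_exclusion`, `excluded`, and the port labels, are assumptions made for the example.

```python
from collections import defaultdict


def aggregate_with_exclusion(values, group_of, excluded):
    """Sum data values per group, skipping series listed in `excluded`."""
    sums = defaultdict(float)
    for series_id, value in values.items():
        if series_id in excluded:
            continue  # excluded series do not contribute to any group sum
        sums[group_of[series_id]] += value
    return dict(sums)


# Hypothetical setup: a user has already excluded "port25".
group_of = {"port22": "B", "port25": "B", "port80": "A"}
values = {"port22": 65.0, "port25": 21.0, "port80": 102.0}
excluded = {"port25"}

sums = aggregate_with_exclusion(values, group_of, excluded)

# Suppose "port22" has just been identified as anomalous. Registering it
# in the exclusion information keeps it out of the next aggregation, so
# it is not reported again in the following interval.
excluded.add("port22")
sums_next = aggregate_with_exclusion(values, group_of, excluded)
```

A user-facing editor along the lines of the exclusion information editing means would, in this sketch, simply add or remove entries in `excluded`.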

The above embodiments also disclose a configuration in which the statistic calculation means 72 calculates the statistic of the data values of the data series at the time the data series determined to belong to the same group are aggregated.

The above embodiments also disclose a configuration in which, when the group detection means 73 detects a group, the statistic calculation means 72 calculates the statistic of the data values of each data series that belongs to that group and was aggregated by the aggregation means 71.
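The two timings for statistic calculation described above (at aggregation time for every series, or only after a group has been detected) can be contrasted in a short sketch. The choice of the mean as the statistic, the function names, and the sample data are all assumptions made for illustration.

```python
from statistics import mean


def series_statistic(history):
    """Hypothetical statistic: mean of the stored values of one series."""
    return mean(history)


def eager_statistics(histories):
    """Timing 1: compute the statistic for every series when aggregating."""
    return {s: series_statistic(h) for s, h in histories.items()}


def lazy_statistics(histories, group_of, detected_groups):
    """Timing 2: compute the statistic only for series that belong to a
    group already reported by group detection."""
    return {s: series_statistic(h) for s, h in histories.items()
            if group_of[s] in detected_groups}


# Hypothetical per-series value histories and group assignment.
histories = {
    "port80": [100.0, 102.0],
    "port443": [90.0, 92.0],
    "port22": [10.0, 12.0],
}
group_of = {"port80": "A", "port443": "A", "port22": "B"}

eager = eager_statistics(histories)
lazy = lazy_statistics(histories, group_of, detected_groups={"B"})
```

In this sketch the eager variant does work for all three series up front, whereas the lazy variant touches only the members of the detected group, at the cost of having to keep the pre-aggregation values available until detection.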

The above embodiments also disclose a case in which the individual data series each correspond to a different port number.

The above embodiments also disclose a case in which the individual data series are data series generated as logs relating to communication traffic.

The present invention is suitably applied to an abnormality detection system that detects which of a plurality of data series an abnormality or change has occurred in. For example, it is applicable to a system that monitors changes in communication packets and detects abnormalities per TCP or UDP source port number, destination port number, source IP address, or destination IP address. It is also applicable, for example, to a system that monitors changes and detects abnormalities in the number of accesses per URL (Uniform Resource Locator) on a large-scale Web server, or in the volume of mail per sender or recipient mail address on a large-scale mail server.

DESCRIPTION OF SYMBOLS
1 Information source
2, 2_a Data series aggregation means
3 Conversion information storage means
4 Abnormality detection means
5 Data series specifying means
6, 6_a Statistical information calculation means
7 Exclusion information editing means
10 Abnormality detection system

Claims

An abnormality detection system comprising:
aggregation means for aggregating data series determined to belong to the same group by calculating the sum of the data values, or of powers of the data values, of the data series determined to belong to the same group;
statistic calculation means for calculating a statistic of the data values of the data series before they are aggregated;
group detection means for detecting, based on the sum calculated for each group, a group that includes a data series in which an abnormality or change has occurred; and
data series specifying means for identifying, based on the statistic, the data series in which an abnormality or change has occurred from among the data series belonging to the group detected by the group detection means.

The abnormality detection system according to claim 1, further comprising exclusion information storage means for storing exclusion information, which is information indicating data series to be excluded from aggregation,
wherein the aggregation means excludes the data series indicated by the exclusion information from the calculation of the sum of data values or of powers of data values.

The abnormality detection system according to claim 2, further comprising exclusion information editing means for editing the exclusion information stored in the exclusion information storage means in response to a user operation.

The abnormality detection system according to claim 2 or 3, wherein the data series specifying means stores, in the exclusion information storage means, exclusion information indicating the data series that it has identified as a data series in which an abnormality or change has occurred.

The abnormality detection system according to item 1, wherein the statistic calculation means calculates the statistic of the data values of the data series when the data series determined to belong to the same group are aggregated.

The abnormality detection system according to claim 1, wherein, when the group detection means detects a group, the statistic calculation means calculates the statistic of the data values of each data series that belongs to that group and was aggregated by the aggregation means.

The abnormality detection system according to any one of claims 1 to 6, wherein each data series is a data series corresponding to a different port number.

The abnormality detection system according to any one of claims 1 to 7, wherein each data series is a data series generated as a log relating to communication traffic.

An abnormality detection method comprising:
aggregating data series determined to belong to the same group by calculating the sum of the data values, or of powers of the data values, of the data series determined to belong to the same group;
calculating a statistic of the data values of the data series before they are aggregated;
detecting, based on the sum calculated for each group, a group that includes a data series in which an abnormality or change has occurred; and
identifying, based on the statistic, the data series in which an abnormality or change has occurred from among the data series belonging to the detected group.

The abnormality detection method according to claim 9, wherein a data series indicated by exclusion information, which is information indicating data series to be excluded from aggregation, is excluded from the calculation of the sum of data values or of powers of data values.

An abnormality detection program for causing a computer to execute:
an aggregation process of aggregating data series determined to belong to the same group by calculating the sum of the data values, or of powers of the data values, of the data series determined to belong to the same group;
a statistic calculation process of calculating a statistic of the data values of the data series before they are aggregated;
a group detection process of detecting, based on the sum calculated for each group, a group that includes a data series in which an abnormality or change has occurred; and
a data series identification process of identifying, based on the statistic, the data series in which an abnormality or change has occurred from among the data series belonging to the group detected by the group detection process.

The abnormality detection program according to claim 11, which causes a computer comprising exclusion information storage means for storing exclusion information, which is information indicating data series to be excluded from aggregation,
to exclude, in the aggregation process, the data series indicated by the exclusion information from the calculation of the sum of data values or of powers of data values.