JP2018207345A

JP2018207345A - Calculation device and calculation method

Info

Publication number: JP2018207345A
Application number: JP2017111933A
Authority: JP
Inventors: Yumimasa Igarashi (五十嵐 弓将)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-06-06
Filing date: 2017-06-06
Publication date: 2018-12-27
Anticipated expiration: 2037-06-06
Also published as: JP6662812B2

Abstract

To perform effective measurement of communication by using limited processing resources. SOLUTION: An acquisition unit 131 of a calculation device 10 acquires flow statistical information, which is statistical information on traffic aggregated for each communication source and communication destination. A classification unit 132 classifies the flow statistical information into groups so that the underlying sessions of the traffic are the same. A calculation unit 133 calculates statistical information on traffic for each of the groups classified by the classification unit 132 on the basis of the flow statistical information. SELECTED DRAWING: Figure 2

Description

The present invention relates to a calculation device and a calculation method.

Communication measurement methods used in network monitors and LAN (Local Area Network) analyzers for observing computer networks fall broadly into two categories: methods based on packet capture and methods that measure flow statistical information.

Measurement by packet capture is a technique that copies packets, the units in which information is exchanged between the computer systems making up a computer network, and analyzes the contents of the copied packets.

Measurement of flow statistical information is a technique in which network devices such as routers that make up a computer network are equipped with a function for measuring the amount of communication passing through them, and thereby measure the amount of information flowing through the network. Here, flow statistical information refers to statistical values such as the amount of information in communications passing through a network device. Such a statistical value is also called a counter. One example of traffic statistical information is the number of bytes.

sFlow (registered trademark) (see, for example, Non-Patent Document 1) and NetFlow (see, for example, Non-Patent Document 2) are known as representative technical specifications for flow statistical information. sFlow is a traffic measurement technique based on statistical estimation that uses packet sampling and a counter for each communication line interface. NetFlow is a technique in which network devices such as routers and switches measure the number of packets and the number of bytes per flow.

Other known methods of measuring flow statistical information include: a method that arranges the packets of both the upstream and downstream flows of a session in order of observation time and uses their packet sizes as an array (see, for example, Non-Patent Document 3); a method that calculates the mean, median, and variance of the packet length and the variance of the packet arrival interval for a session (see, for example, Non-Patent Document 4); and a method that calculates the total number of packets and total number of bytes per session, the number of packets carrying specific flags, and the mean, maximum, and variance of the size of all packets (see, for example, Non-Patent Document 5).

[Non-Patent Document 1] "Traffic Monitoring using sFlow", [online], [retrieved May 25, 2017], Internet (http://www.sflow.org/sFlowOverview.pdf)
[Non-Patent Document 2] Omar Santos, "Network Security with NetFlow and IPFIX", Cisco Press, September 2015
[Non-Patent Document 3] Yuji Izumi, Kazuyuki Tanaka (和泉勇治, 田中和之), "Web Application Identification Based on Traffic Analysis", IEICE Technical Report CS2013-40 (2013-09), pp. 61-66
[Non-Patent Document 4] Takeshi Kitamura, Takayuki Shizuno, Shinya Okabe (北村強, 静野隆之, 岡部稔哉), "Application Identification Method Based on Flow Behavior Analysis", IEICE Technical Report NS2005-136 (2005-12), pp. 13-16
[Non-Patent Document 5] Liu Yingqiu, Li Wei, Li Yunchun, "Network Traffic Classification Using K-means Clustering", Second International Multisymposium on Computer and Computational Sciences, pp. 360-365

However, the conventional techniques have the problem that effective communication measurement cannot always be performed with limited processing resources. For example, measurement by packet capture requires large amounts of storage and computing resources, so it may not be feasible when the available processing resources are limited. Measurement of flow statistical information, on the other hand, may not provide effective communication measurement.

For example, sFlow described in Non-Patent Document 1 samples packets intermittently at a fixed rate. Flows with a low probability of being sampled, such as communications with very few packets or of very short duration, may therefore go undetected or be measured with error. Measurement accuracy is also affected by the method of selecting packets and by the period at which samples or counters are collected. As a result, effective communication measurement may not be possible.

Likewise, NetFlow described in Non-Patent Document 2 has network devices such as routers and switches measure the number of packets and the number of bytes per flow. Having an existing router or switch perform both its original processing, such as packet switching, and the measurement processing may therefore require extra computing resources.

The techniques described in Non-Patent Documents 3 to 5 perform measurement by referring only to header information, such as per-packet length, arrival interval, and flags, without analyzing the data payload of captured packets. However, the information they can measure is limited, and generating additional flow statistical information requires measurement by packet capture.

A calculation device according to the present invention includes: an acquisition unit that acquires flow statistical information, which is statistical information on traffic aggregated for each communication source and communication destination; a classification unit that classifies the flow statistical information into groups such that the sessions of the underlying traffic are the same within each group; and a calculation unit that calculates, based on the flow statistical information, statistical information on traffic for each group classified by the classification unit.

According to the present invention, effective communication measurement can be performed using limited processing resources.

FIG. 1 is a diagram illustrating an example of the configuration of a calculation system according to the first embodiment.
FIG. 2 is a diagram illustrating an example of the configuration of a calculation device according to the first embodiment.
FIG. 3 is a diagram for explaining a calculation method performed by the calculation device according to the first embodiment.
FIG. 4 is a flowchart illustrating the flow of processing of a router according to the first embodiment.
FIG. 5 is a flowchart illustrating the flow of processing of the calculation device according to the first embodiment.
FIG. 6 is a diagram illustrating an example of the data configuration of a primary storage unit according to the first embodiment.
FIG. 7 is a flowchart illustrating the flow of processing of the calculation device according to the first embodiment.
FIG. 8 is a flowchart illustrating the flow of processing of the calculation device according to the first embodiment.
FIG. 9 is a diagram illustrating an example of the data configuration of a secondary storage unit according to the first embodiment.
FIG. 10 is a diagram illustrating an example of a computer that executes a calculation program.

[Configuration of First Embodiment]
Embodiments of the calculation device and the calculation method according to the present application are described in detail below with reference to the drawings. The present invention is not limited to the embodiments described below. First, the configuration of the calculation system according to the first embodiment is described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of the configuration of the calculation system according to the first embodiment. As illustrated in FIG. 1, the calculation system 1 includes a calculation device 10, a client 20, a server 30, and a router 40.

Here, the router 40 generates flow statistical information based on the traffic generated by communication between the client 20 and the server 30. The flow statistical information includes the number of packets and the number of bytes. The communication for which flow statistical information is generated in the calculation system 1 is not limited to communication between a client and a server; it may be communication between clients, between servers, or between devices other than clients and servers. Likewise, the device that generates the flow statistical information is not limited to the router 40 and may be any network device.

Flow statistical information can be described as statistical information obtained by dividing the information exchanged between communicating computer systems into units called flows, based on the source, destination, protocol, port, and so on that identify the computer systems, and then measuring and calculating the communication volume for each flow. Because a flow carries attributes describing the direction in which information travels, namely the source and destination of the communication, there are normally two kinds of flow: transmission (outbound) and reception (return). In this embodiment, the unit combining these two kinds of flow is called a session. In the following description, the direction from the client 20 to the server 30 in the calculation system 1 is called upstream, and the direction from the server 30 to the client 20 is called downstream.

In this embodiment, as described above, the router 40, which is a network device, generates the flow statistical information. The router 40 may generate the flow statistical information for each communication line interface of a computer system, or for finer-grained units of communication.

Based on the flow statistical information generated by the router 40, the calculation device 10 performs statistical operations on sessions and can thereby obtain information that the flow statistical information alone does not provide, for example information other than the number of packets and the number of bytes.

Next, the configuration of the calculation device 10 is described with reference to FIG. 2. FIG. 2 is a diagram illustrating an example of the configuration of the calculation device according to the first embodiment. As illustrated in FIG. 2, the calculation device 10 includes a communication unit 11, a storage unit 12, and a control unit 13.

The communication unit 11 performs data communication with other devices via a network. For example, the communication unit 11 is a NIC (Network Interface Card). The communication unit 11 performs data communication with, for example, the router 40.

The storage unit 12 is a storage device such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), or an optical disc. The storage unit 12 may instead be a rewritable semiconductor memory such as a RAM (Random Access Memory), a flash memory, or an NVSRAM (Non Volatile Static Random Access Memory). The storage unit 12 stores the OS (Operating System) and various programs executed by the calculation device 10, as well as various information used in executing the programs. The storage unit 12 includes a primary storage unit 121 and a secondary storage unit 122.

The control unit 13 controls the calculation device 10 as a whole. The control unit 13 is, for example, an electronic circuit such as a CPU (Central Processing Unit) or an MPU (Micro Processing Unit), or an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). The control unit 13 has an internal memory for storing programs that define various processing procedures and control data, and executes each process using the internal memory. The control unit 13 functions as various processing units through the operation of various programs. For example, the control unit 13 includes an acquisition unit 131, a classification unit 132, a calculation unit 133, and a saving unit 134.

The acquisition unit 131 acquires flow statistical information, which is statistical information on traffic aggregated for each communication source and communication destination. The acquisition unit 131 acquires flow statistical information generated by a network device such as the router 40.

The classification unit 132 classifies the flow statistical information into groups such that the sessions of the underlying traffic are the same within each group. For example, the classification unit 132 extracts information that can identify a session from the flow statistical information acquired by the acquisition unit 131 and, based on the extracted information, classifies the flow statistical information by the session of the underlying traffic. Information that can identify a session includes the source and destination contained in the flow statistical information and the time at which the flow statistical information was generated.

For example, the classification unit 132 classifies first flow statistical information and second flow statistical information into the same group when the source and destination contained in the first flow statistical information are identical, in either order, to the destination and source contained in the second flow statistical information, and both the first and the second flow statistical information are based on traffic that occurred within a predetermined period.
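The grouping condition described above can be sketched as follows. This is only an illustration under stated assumptions, not the embodiment's implementation: the record fields (`src`, `dst`, `timestamp`), the function name `classify_into_sessions`, and the choice to measure the predetermined period from the first record of each group are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical flow statistics record; the field names are assumptions."""
    src: str          # source address
    dst: str          # destination address
    timestamp: float  # time the underlying traffic occurred, in seconds


def classify_into_sessions(records, window):
    """Group records whose endpoints match in either direction and whose
    timestamps lie within `window` seconds of the group's first record."""
    groups = []        # each group is a list of FlowRecord
    latest_group = {}  # endpoint pair -> index of its most recent group
    for rec in sorted(records, key=lambda r: r.timestamp):
        key = frozenset((rec.src, rec.dst))  # direction-independent key
        idx = latest_group.get(key)
        if idx is not None and rec.timestamp - groups[idx][0].timestamp <= window:
            groups[idx].append(rec)
        else:
            # New endpoint pair, or the window has expired: start a new group.
            latest_group[key] = len(groups)
            groups.append([rec])
    return groups
```

Using a `frozenset` of the two endpoints makes an upstream record and the matching downstream record share one key, which is one simple way to express "identical in either order".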

The calculation unit 133 calculates, based on the flow statistical information, statistical information on traffic for each group classified by the classification unit 132. For example, when the acquisition unit 131 acquires the number of packets and the number of bytes as flow statistical information, the calculation unit 133 calculates, as statistical information for each group, the average packet size, the maximum, minimum, and standard deviation of the average packet size, the time average of the number of bytes, and information indicating whether packets were transmitted or received at each time. The calculation unit 133 can also calculate, as statistical information, a co-occurrence matrix based on the presence or absence of transmitted and received packets at each time.

The saving unit 134 saves the flow statistical information acquired by the acquisition unit 131, the results calculated by the calculation unit 133, and so on in the primary storage unit 121 or the secondary storage unit 122. In the following description, the flow statistical information acquired by the acquisition unit 131 is called input information, and the statistical information calculated by the calculation unit 133 is called session statistical information.

The input information and the calculation of the session statistical information are now described in detail with reference to FIG. 3. FIG. 3 is a diagram for explaining a calculation method performed by the calculation device according to the first embodiment.

The acquisition unit 131 acquires, as input information, a session identifier INP_1 that uniquely identifies a session, the generation time INP_2 of the input information, the number of upstream packets INP_3, the number of downstream packets INP_4, the number of upstream bytes INP_5, the number of downstream bytes INP_6, and the elapsed time INP_7 since the session was established.

The session identifier INP_1 is a value that uniquely identifies one session established between a pair of communicating computer systems. The acquisition unit 131 can generate a session identifier by, for example, applying bit operations, hash operations, or the like to the addresses identifying the computer systems, the protocol number, the ports, the time, and so on.
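One way such an identifier could be derived is sketched below. The text only says that bit or hash operations are applied to values such as the address, protocol number, port, and time; the use of SHA-256, the field order, the `|` separator, and the function name `make_session_id` are assumptions made for this illustration. Sorting the two endpoints so that both directions yield the same identifier is likewise an assumption.

```python
import hashlib


def make_session_id(addr_a, port_a, addr_b, port_b, protocol, start_time):
    """Return a hex digest identifying one session (hypothetical scheme)."""
    # Sort the endpoints so the upstream and downstream flows of the same
    # session produce an identical identifier.
    ends = sorted([(addr_a, port_a), (addr_b, port_b)])
    material = "|".join(
        [f"{addr}:{port}" for addr, port in ends]
        + [str(protocol), str(start_time)]
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Both directions of the same TCP (protocol 6) session, made-up addresses.
upstream_id = make_session_id("192.0.2.1", 50000, "198.51.100.7", 443, 6, 1496700000)
downstream_id = make_session_id("198.51.100.7", 443, "192.0.2.1", 50000, 6, 1496700000)
```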

The generation time INP_2 is the time at which the input information acquired by the acquisition unit 131 was generated. The number of upstream packets INP_3 is the cumulative number of packets in the upstream direction (an integer of 0 or greater) from the start time of the session identified by the session identifier INP_1 to the generation time INP_2. The number of downstream packets INP_4 is the cumulative number of packets in the downstream direction (an integer of 0 or greater) over the same interval. The number of upstream bytes INP_5 is the cumulative number of bytes in the upstream direction (an integer of 0 or greater) over the same interval. The number of downstream bytes INP_6 is the cumulative number of bytes in the downstream direction (an integer of 0 or greater) over the same interval. The elapsed time INP_7 is the time elapsed from the start time of the session to the generation time INP_2. For more accurate calculation, the generation time INP_2 and the elapsed time INP_7 preferably have a resolution of milliseconds or microseconds.

The acquisition unit 131 acquires input information generated at fixed intervals dT from the session start time. For example, as illustrated in FIG. 3, the acquisition unit 131 first acquires the input information INP(T_1) generated at time T_1, and then acquires the input information INP(T_2) generated at time T_2, when the time dT has elapsed since T_1. The acquisition unit 131 continues to acquire input information in sequence until it acquires the input information generated at time T_n, the session end time. The input information at generation time T_k is written INP(T_k). INP(T_k) comprises INP_1(T_k), INP_2(T_k), INP_3(T_k), INP_4(T_k), INP_5(T_k), INP_6(T_k), and INP_7(T_k). The interval dT at which the input information is generated is preferably constant but may vary.
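For reference, the seven input items could be held in a record like the one below. The class name, field names, and the helper `is_consistent` are assumptions introduced here; only the meanings of INP_1 to INP_7 come from the description. The helper relies on the counters being cumulative, so they should never decrease between consecutive samples of one session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputInfo:
    """One sample INP(T_k); field names are hypothetical."""
    session_id: str      # INP_1: session identifier
    generated_at: float  # INP_2: generation time, in seconds
    up_packets: int      # INP_3: cumulative upstream packets
    down_packets: int    # INP_4: cumulative downstream packets
    up_bytes: int        # INP_5: cumulative upstream bytes
    down_bytes: int      # INP_6: cumulative downstream bytes
    elapsed: float       # INP_7: seconds since the session was established


def is_consistent(prev, cur):
    """True if `cur` can follow `prev` in the same session: same identifier,
    later generation time, and no cumulative value going backwards."""
    return (
        prev.session_id == cur.session_id
        and cur.generated_at > prev.generated_at
        and cur.up_packets >= prev.up_packets
        and cur.down_packets >= prev.down_packets
        and cur.up_bytes >= prev.up_bytes
        and cur.down_bytes >= prev.down_bytes
        and cur.elapsed >= prev.elapsed
    )
```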

Next, the calculation unit 133 calculates two variables as session statistical information: the upstream average packet size AVE_1 and the downstream average packet size AVE_2 for each session. Based on INP(T_k) and the input information INP(T_{k-1}) at the immediately preceding time T_{k-1}, the calculation unit 133 calculates the upstream average packet size AVE_1(T_k) and the downstream average packet size AVE_2(T_k) at time T_k as in equations (1) and (2), respectively.
AVE_1(T_k) = {INP_5(T_k) - INP_5(T_{k-1})} ÷ {INP_3(T_k) - INP_3(T_{k-1})} ... (1)
AVE_2(T_k) = {INP_6(T_k) - INP_6(T_{k-1})} ÷ {INP_4(T_k) - INP_4(T_{k-1})} ... (2)

In this way, the calculation unit 133 can calculate the average packet size by dividing the difference in the number of bytes between time T_{k-1} and time T_k by the difference in the number of packets over the same interval. The average packet size at generation time T_k is written AVE(T_k). AVE(T_k) comprises AVE_1(T_k) and AVE_2(T_k).

Each average packet size is the average number of bytes per packet for the packets that flowed between time T_{k-1} and time T_k. When INP(T_{k-1}) does not exist (for example, when k = 1, that is, when INP(T_k) is the first input information generated after the session starts), the calculation unit 133 performs the calculation with INP_3(T_{k-1}), INP_4(T_{k-1}), INP_5(T_{k-1}), and INP_6(T_{k-1}) set to 0. The saving unit 134 keeps INP(T_k) in the primary storage unit 121 until AVE_1(T_{k+1}) is calculated, and may discard INP(T_k) after AVE_1(T_{k+1}) has been calculated.
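The average packet size calculation, including the rule for the first sample, can be illustrated with the short sketch below. The function name and parameter names are assumptions. Returning 0.0 when no packet arrived in the interval is also an assumption of this sketch, since the description does not state how a zero packet difference is handled.

```python
def average_packet_size(cur_bytes, cur_packets, prev_bytes=0, prev_packets=0):
    """Average bytes per packet between two consecutive samples.

    All arguments are cumulative counters for one direction. The previous
    counters default to 0, matching the rule for the first sample (k = 1).
    """
    d_packets = cur_packets - prev_packets
    d_bytes = cur_bytes - prev_bytes
    if d_packets <= 0:
        # Assumption: no packets in the interval gives an average of 0.
        return 0.0
    return d_bytes / d_packets


# Upstream counters at T_1 and T_2 (made-up values).
ave_t1 = average_packet_size(cur_bytes=3000, cur_packets=4)
ave_t2 = average_packet_size(4200, 6, prev_bytes=3000, prev_packets=4)
```

Because only the current and the immediately preceding sample are needed, older samples do not have to be retained.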

Furthermore, using AVE(T_k), the calculation unit 133 calculates the upstream maximum average packet size AVE_MAX_1(T_k), the downstream maximum average packet size AVE_MAX_2(T_k), the upstream minimum average packet size AVE_MIN_1(T_k), the downstream minimum average packet size AVE_MIN_2(T_k), the standard deviation of the upstream average packet size AVE_SD_1(T_k), and the standard deviation of the downstream average packet size AVE_SD_2(T_k) as in equations (3) to (8).
AVE_MAX_1(T_k) = maximum of {AVE_1(T_1), AVE_1(T_2), ..., AVE_1(T_k)} ... (3)
AVE_MAX_2(T_k) = maximum of {AVE_2(T_1), AVE_2(T_2), ..., AVE_2(T_k)} ... (4)
AVE_MIN_1(T_k) = minimum of {AVE_1(T_1), AVE_1(T_2), ..., AVE_1(T_k)} ... (5)
AVE_MIN_2(T_k) = minimum of {AVE_2(T_1), AVE_2(T_2), ..., AVE_2(T_k)} ... (6)
AVE_SD_1(T_k) = standard deviation of {AVE_1(T_1), AVE_1(T_2), ..., AVE_1(T_k)} ... (7)
AVE_SD_2(T_k) = standard deviation of {AVE_2(T_1), AVE_2(T_2), ..., AVE_2(T_k)} ... (8)
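Equations (3) to (8) apply the same three summaries to each direction, which the following sketch expresses with one helper. The function name is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption: the description does not say whether the population or sample form is intended.

```python
import statistics


def summarize_averages(averages):
    """Return (maximum, minimum, standard deviation) of the average packet
    sizes observed so far, i.e. AVE(T_1) through AVE(T_k) for one direction."""
    if not averages:
        raise ValueError("at least one average packet size is required")
    return (
        max(averages),
        min(averages),
        statistics.pstdev(averages),  # population form: an assumption
    )


# Upstream averages for T_1..T_3 (made-up values).
ave_max, ave_min, ave_sd = summarize_averages([750.0, 600.0, 900.0])
```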

The calculation unit 133 also calculates the time average of the number of bytes that flowed in each of the upstream and downstream directions from the start to the end of the session, that is, the upstream flow rate FRATE_1 and the downstream flow rate FRATE_2, as in equations (9) and (10), respectively. Here, T_n is the time at which the session ended. As equations (9) and (10) show, the calculation unit 133 can calculate the flow rates from the input information at the end of the session alone, that is, from INP(T_n).
FRATE_1 = INP_5(T_n) ÷ INP_7(T_n) ... (9)
FRATE_2 = INP_6(T_n) ÷ INP_7(T_n) ... (10)
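As a small worked illustration of the flow rate, the sketch below divides the cumulative byte count for one direction at the end of the session by the elapsed time INP_7(T_n). The function name, the assumption that the elapsed time is in seconds, and the decision to reject a non-positive elapsed time are all choices made for this example.

```python
def flow_rate(total_bytes, elapsed_seconds):
    """Bytes per second over the whole session, from the final sample only."""
    if elapsed_seconds <= 0:
        # Assumption: a session with no measurable duration is rejected.
        raise ValueError("elapsed time must be positive")
    return total_bytes / elapsed_seconds


# 120,000 bytes sent in one direction over a 60-second session (made-up values).
rate = flow_rate(120000, 60.0)
```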

The calculation unit 133 calculates a packet co-occurrence matrix. A co-occurrence matrix is a matrix that expresses relative relationships between pixels or the appearance patterns of words, and has commonly been used in image recognition, language processing, and similar fields. In this embodiment, the calculation unit 133 calculates the packet co-occurrence matrix as follows. First, for a given session, the calculation unit 133 indicates with a binary value of 0 or 1 whether one or more packets flowed between time T_{k-1} and time T_k. This binary value is written BOOL_1(T_k) for the upstream direction and BOOL_2(T_k) for the downstream direction.

Here, when INP_3(T_k) or INP_5(T_k) is greater than INP_3(T_{k-1}) or INP_5(T_{k-1}), respectively, the calculation unit 133 regards one or more packets as having flowed in the corresponding direction (uplink for INP_3, downlink for INP_5) between time T_{k-1} and time T_k, and sets the binary value to 1. This calculation can be expressed as equations (11) and (12).
BOOL_1(T_k) = 1 if INP_3(T_k) - INP_3(T_{k-1}) > 0, 0 if INP_3(T_k) - INP_3(T_{k-1}) = 0 (11)
BOOL_2(T_k) = 1 if INP_5(T_k) - INP_5(T_{k-1}) > 0, 0 if INP_5(T_k) - INP_5(T_{k-1}) = 0 (12)
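
The rule in equations (11) and (12) can be sketched as follows. This is an illustration only: the function names are hypothetical, the counter value before T_1 is assumed to be 0, and a decreasing counter (which the equations do not cover) is mapped to 0 here.

```python
def packet_flag(count_now, count_prev):
    """Equations (11)/(12): 1 if the cumulative packet counter grew, else 0.

    Assumption: a counter that decreased is treated like an unchanged one.
    """
    return 1 if count_now - count_prev > 0 else 0


def flag_sequence(cumulative_counts, initial=0):
    """Build BOOL_1 or BOOL_2 from cumulative packet counts at T_1..T_n."""
    flags = []
    prev = initial  # assumed counter value before T_1
    for now in cumulative_counts:
        flags.append(packet_flag(now, prev))
        prev = now
    return flags
```

For example, cumulative uplink packet counts of 10, 10, 25, 25 give the sequence {1,0,1,0}.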

The binary values can also be calculated from the average packet size, as in equations (13) and (14).
BOOL_1(T_k)' = 1 if AVE_1(T_k) > 0, 0 if AVE_1(T_k) = 0 (13)
BOOL_2(T_k)' = 1 if AVE_2(T_k) > 0, 0 if AVE_2(T_k) = 0 (14)

The calculation unit 133 denotes the times at which input information for a session was generated, in order from immediately after the start of the session (1st) to its end (n-th), as {T_1, ..., T_n}. By determining the binary value for each of T_1 through T_n, it obtains, for each of the uplink and downlink directions, a sequence of n digits consisting of 0s and 1s.

If the sequence that the calculation unit 133 obtains for uplink packets is BOOL_1 and the sequence obtained for downlink packets is BOOL_2, then BOOL_1 and BOOL_2 each have a length of n digits, for example BOOL_1 = {1,0,...,1}. BOOL_1 and BOOL_2 represent the pattern of packet transmission and non-transmission in a session. For example, for a session that sends packets continuously from start to end, BOOL_1 and BOOL_2 are runs of consecutive 1s, such as {1,1,1,1,1,...,1}. When packets are sent intermittently at intervals longer than the period at which input information is generated, BOOL_1 and BOOL_2 may be alternations of 1s and 0s, for example {1,0,1,0,1,...,0} or {1,0,0,1,0,0,...,1}.

When information is sent only immediately after the start of a session and the session ends shortly afterwards, the sequence becomes short, and a sequence such as {1,0} may be generated. Thus BOOL_1 and BOOL_2 express the packet transmission pattern of a session, but their length is variable because it depends on the duration of the session, and that length is difficult to predict.

The calculation unit 133 generates co-occurrence matrices from BOOL_1 and BOOL_2 obtained by the above procedure. Since BOOL_1 and BOOL_2 are one-dimensional sequences of 0s and 1s, any two consecutive digits in a sequence can form only four combinations: {00}, {01}, {10}, and {11}. Starting from the head of BOOL_1 and BOOL_2, the calculation unit 133 takes each pair of adjacent digits in turn and counts how many times each combination appears, so a sequence of n digits yields n-1 pairs. As an example, consider how the calculation unit 133 generates the co-occurrence matrices when input information has been generated ten times, that is, when n = 10. When BOOL_1 = {1,1,1,1,1,1,1,1,1,1}, the calculation unit 133 generates the co-occurrence matrix MATRIX_1 as {00}=0, {01}=0, {10}=0, {11}=9. When BOOL_2 = {1,0,1,0,1,0,1,0,1,0}, the calculation unit 133 generates the co-occurrence matrix MATRIX_2 as {00}=0, {01}=4, {10}=5, {11}=0.

Further, when the session is short, for example when n = 4 and BOOL_1 = {1,0,1,1}, the calculation unit 133 generates the co-occurrence matrix MATRIX_1 as {00}=0, {01}=1, {10}=1, {11}=1. By calculating the uplink and downlink co-occurrence matrices for a session in this way, the calculation unit 133 obtains a total of 8 variables, since each co-occurrence matrix has 4 variables.
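
The counting of adjacent pairs described above can be sketched as follows. The function name `cooccurrence` and the list ordering [{00}, {01}, {10}, {11}] are choices made for this illustration, not something the embodiment prescribes.

```python
def cooccurrence(bits):
    """Count adjacent pairs in a 0/1 sequence.

    Returns the counts in the order [{00}, {01}, {10}, {11}].
    A sequence of n digits contributes n-1 overlapping pairs.
    """
    counts = {"00": 0, "01": 0, "10": 0, "11": 0}
    for a, b in zip(bits, bits[1:]):
        counts[f"{a}{b}"] += 1
    return [counts["00"], counts["01"], counts["10"], counts["11"]]
```

Applied to the three sequences in the text, this gives 0,0,0,9 for ten consecutive 1s, 0,4,5,0 for the alternating sequence, and 0,1,1,1 for {1,0,1,1}. Whatever the session length, the result is a fixed set of four counts per direction.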

[Processing of First Embodiment]
The processing flow of the calculation system 1 will be described with reference to FIGS. 4 to 9. FIG. 4 is a flowchart illustrating the processing flow of the router according to the first embodiment. FIGS. 5, 7, and 8 are flowcharts illustrating the processing flow of the calculation device according to the first embodiment. FIG. 6 is a diagram illustrating an example of the data configuration of the primary storage unit according to the first embodiment. FIG. 9 is a diagram illustrating an example of the data configuration of the secondary storage unit according to the first embodiment.

As shown in FIG. 4, the router 40 waits until a fixed time has elapsed (step S11, No), and when the fixed time has elapsed (step S11, Yes), generates input information (step S12).

As shown in FIG. 5, the acquisition unit 131 reads the input information INP(T_k) at time T_k from the network device, that is, the router 40 (step S21). The classification unit 132 determines whether the input information INP(T_k) acquired by the acquisition unit 131 belongs to a new session (step S22).

Here, when the input information INP(T_{k-1}) is not stored in the primary storage unit 121, the classification unit 132 determines that the input information INP(T_k) acquired by the acquisition unit 131 belongs to a new session (step S22, Yes). In this case, the storage unit 134 stores the input information INP(T_k) in the primary storage unit 121 (step S23). As shown in FIG. 6, the primary storage unit 121 stores input information. FIG. 6 shows that, for the input information INP(T_k) stored by the storage unit 134, the session identifier INP_1(T_k) is "xyz001", the generation time INP_2(T_k) is "20:40", the uplink packet count INP_3(T_k) is "10", the uplink byte count INP_4(T_k) is "80", the downlink packet count INP_5(T_k) is "400", the downlink byte count INP_6(T_k) is "10000", and the elapsed time INP_7(T_k) is "2". In this case, the storage unit 134 also stores statistical information in the primary storage unit 121. Because this is the first statistical information of the session, the storage unit 134 sets each value of the statistical information in the primary storage unit 121 to 0.

On the other hand, when the input information INP(T_{k-1}) is stored in the primary storage unit 121, the classification unit 132 determines that the input information INP(T_k) acquired by the acquisition unit 131 does not belong to a new session (step S22, No). That is, the classification unit 132 classifies the input information INP(T_{k-1}) and the input information INP(T_k) into the same group. In this case, the storage unit 134 stores the input information INP(T_k) in the primary storage unit 121. The acquisition unit 131 also reads the input information and statistical information at time T_{k-1} from the primary storage unit 121 (step S24). The calculation unit 133 then calculates each item of statistical information (step S25).

In this way, the acquisition unit 131 can acquire, in time order, the flow statistical information corresponding to each of a series of times at fixed intervals. The classification unit 132 classifies the flow statistical information each time the acquisition unit 131 acquires it. The calculation unit 133 calculates the statistical information on the traffic of each group each time the classification unit 132 performs classification.

As shown in FIG. 7, in step S25 of FIG. 5, the calculation unit 133 first calculates the average packet size at time T_k based on the input information at time T_k and the input information at time T_{k-1} (step S251). Next, the calculation unit 133 calculates the maximum value, minimum value, and standard deviation of the average packet size at time T_k based on the average packet size at time T_k and the average packet size at time T_{k-1} (step S252). Next, the calculation unit 133 calculates the co-occurrence matrix at time T_k based on the input information at time T_k and the input information at time T_{k-1} (step S253). The storage unit 134 then stores each item of statistical information calculated by the calculation unit 133 in the primary storage unit 121.

Here, when performing calculations on the average packet size, the calculation unit 133 does not need all of the input information from the start to the end of the session. By using only the input information and statistical information at time T_{k-1} and time T_k, it can perform the calculations as in equations (15) to (18).
AVE_MAX_1(T_k) = the larger of {AVE_1(T_{k-1}), AVE_1(T_k)} (15)
AVE_MAX_2(T_k) = the larger of {AVE_2(T_{k-1}), AVE_2(T_k)} (16)
AVE_MIN_1(T_k) = the smaller of {AVE_1(T_{k-1}), AVE_1(T_k)} (17)
AVE_MIN_2(T_k) = the smaller of {AVE_2(T_{k-1}), AVE_2(T_k)} (18)
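
The update can be sketched as below. Note the interpretation involved: the equations as written compare the average at T_{k-1} with the average at T_k, whereas this sketch assumes that the value carried over from T_{k-1} is the stored running maximum or minimum, so that the result covers the whole session. It also assumes that the first average initializes both extrema (rather than the initial value 0). The function name `update_extrema` is hypothetical.

```python
def update_extrema(stored, ave_now):
    """Update (max, min) of the average packet size for one direction.

    stored: (max, min) kept from time T_{k-1}, or None at the first update.
    ave_now: average packet size at time T_k.
    Only the previous stored pair and the current value are needed.
    """
    if stored is None:
        # Assumption: the first observed average seeds both extrema.
        return ave_now, ave_now
    prev_max, prev_min = stored
    return max(prev_max, ave_now), min(prev_min, ave_now)
```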

Consequently, in step S254, the storage unit 134 only needs to ensure that the statistical information at time T_k calculated by the calculation unit 133 is the only statistical information held in the primary storage unit 121. That is, the storage unit 134 may delete the statistical information at time T_{k-1} and then store the statistical information at time T_k, or it may overwrite the statistical information at time T_{k-1} with the statistical information at time T_k. By discarding the statistical information of the previous time in this way, the primary storage unit 121 needs to hold the input information and statistical information of only one time, which reduces the required storage capacity.

Here, given a variable X = {x(1), x(2), ..., x(k), ..., x(n-1), x(n)}, the variance sig(k) at the k-th value, that is, the square of the standard deviation, is expressed by the recurrence formula shown in equation (19). Using equation (19), the calculation unit 133 can calculate AVE_SD_1(T_k) and AVE_SD_2(T_k) from AVE_SD_1(T_{k-1}) and AVE_SD_2(T_{k-1}). Here, u(k) is the mean of the values up to x(k).
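
Equation (19) itself does not appear in this text, so the sketch below uses a standard recurrence for the population variance, sig(k) = ((k-1)/k) × (sig(k-1) + (x(k) - u(k-1))² / k), which may differ in form from equation (19). It assumes that the sample count k and the previous mean u(k-1) are kept alongside the previous variance; the function names are hypothetical.

```python
import math


def update_variance(k, mean_prev, var_prev, x_k):
    """Advance (mean, population variance) from k-1 samples to k samples.

    k: index of the new sample (1-based).
    mean_prev, var_prev: u(k-1) and sig(k-1); ignored when k == 1.
    x_k: the new sample, e.g. the average packet size at time T_k.
    """
    if k == 1:
        return float(x_k), 0.0
    delta = x_k - mean_prev
    mean_k = mean_prev + delta / k
    var_k = (k - 1) / k * (var_prev + delta * delta / k)
    return mean_k, var_k


def standard_deviation(variance):
    """The standard deviation is the square root of the variance."""
    return math.sqrt(variance)
```

Because each step uses only the previous mean, the previous variance, and the new value, no earlier samples need to be retained.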

The calculation unit 133 can also calculate the co-occurrence matrix at time T_k based on the input information and statistical information at time T_{k-1}. First, the calculation unit 133 calculates BOOL_1(T_k)' and BOOL_2(T_k)' from AVE_1(T_k) and AVE_2(T_k) using equations (13) and (14). If BOOL_1(T_{k-1}) and BOOL_2(T_{k-1}) are stored in the primary storage unit 121, the calculation unit 133 can concatenate BOOL_1(T_{k-1}) with BOOL_1(T_k)', or BOOL_2(T_{k-1}) with BOOL_2(T_k)', to find out which of {00}, {01}, {10}, and {11} is generated, and can thereby calculate BOOL_1 and BOOL_2.
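
One way to read this step is as an incremental update of the four pair counts, sketched below. The sketch assumes that the stored state is the previous binary value plus the current counts in the order [{00}, {01}, {10}, {11}]; the names `PAIR_INDEX` and `update_matrix` are hypothetical.

```python
# Position of each (previous, current) pair in the count list.
PAIR_INDEX = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 3}


def update_matrix(matrix, bool_prev, ave_now):
    """Update the pair counts for one direction at time T_k.

    matrix: counts [{00}, {01}, {10}, {11}] up to time T_{k-1}.
    bool_prev: binary value at T_{k-1}, or None at the first time step.
    ave_now: average packet size at T_k (equations (13)/(14) give the bit).
    Returns (new_matrix, bool_now); the input list is not modified.
    """
    bool_now = 1 if ave_now > 0 else 0
    updated = list(matrix)
    if bool_prev is not None:
        updated[PAIR_INDEX[(bool_prev, bool_now)]] += 1
    return updated, bool_now
```

Feeding average packet sizes 100, 0, 50, 60 in turn produces the bits 1, 0, 1, 1 and the counts 0,1,1,1, the same result as counting pairs over the complete sequence.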

In this way, when statistical information has already been calculated for the group into which the classification unit 132 has classified the data, the calculation unit 133 calculates the statistical information on the traffic of each group based on that already calculated statistical information and the flow statistical information acquired by the acquisition unit 131.

Here, the calculation unit 133 determines whether the session has ended (step S26). When it determines that the session has ended (step S26, Yes), the calculation unit 133 calculates the per-session statistical information (step S27), substitutes k+1 for k (step S28), and proceeds to the processing for the next time. The per-session statistical information is, for example, the flow rate. When it determines that the session has not ended (step S26, No), the calculation unit 133 substitutes k+1 for k (step S28) and proceeds to the processing for the next time.

Here, the calculation unit 133 can determine whether the session has ended according to whether the router 40 has generated input information for time T_{k+1}. That is, if INP(T_{k+1}) is input information having the same session identifier as INP_1(T_k), the calculation unit 133 determines that the session has not ended.

Furthermore, the calculation unit 133 may determine whether the session has ended by referring to a flag included in the packet header. For example, the calculation unit 133 can check whether the FIN flag, which indicates the end of transmission in the TCP (Transmission Control Protocol) header, is ON or OFF, and determine that the session has ended if the FIN flag is ON. This method can be realized by adding the FIN flag to the input information.
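
A minimal sketch of the FIN check follows. It assumes that the input information carries the TCP flags as a single byte, such as a cumulative OR of the flags seen in the flow; the FIN bit is 0x01 in the TCP header. The function name `session_ended` is hypothetical.

```python
TCP_FIN = 0x01  # FIN bit in the TCP header flags field


def session_ended(tcp_flags):
    """Return True if the FIN bit is set in the given TCP flags byte."""
    return bool(tcp_flags & TCP_FIN)
```

For example, a flags value of 0x11 (ACK and FIN) is treated as the end of the session, while 0x10 (ACK only) is not.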

As shown in FIG. 8, in step S27 of FIG. 5, the calculation unit 133 first calculates the flow rate of the session based on the input information at time T_k (step S271). The storage unit 134 then stores the statistical information held in the primary storage unit 121, together with the flow rate, in the secondary storage unit 122, and deletes the input information and statistical information from the primary storage unit 121 (step S272).

As shown in FIG. 9, the secondary storage unit 122 stores input information and statistical information. FIG. 9 shows that, for the input information INP(T_n) stored in the secondary storage unit 122, the session identifier INP_1(T_n) is "abc123", the generation time INP_2(T_n) is "20:29", the uplink packet count INP_3(T_n) is "20", the uplink byte count INP_4(T_n) is "120", the downlink packet count INP_5(T_n) is "650", the downlink byte count INP_6(T_n) is "30000", and the elapsed time INP_7(T_n) is "5". FIG. 9 also shows that, for the statistical information stored in the secondary storage unit 122, the uplink average packet size AVE_1(T_n) is "30", the downlink average packet size AVE_2(T_n) is "200", the maximum uplink average packet size AVE_MAX_1(T_n) is "60", the maximum downlink average packet size AVE_MAX_2(T_n) is "500", the minimum uplink average packet size AVE_MIN_1(T_n) is "2", the minimum downlink average packet size AVE_MIN_2(T_n) is "50", the standard deviation of the uplink average packet size AVE_SD_1(T_n) is "30", the standard deviation of the downlink average packet size AVE_SD_2(T_n) is "300", the uplink co-occurrence matrix MATRIX_1(T_n) is "0,0,0,9", the downlink co-occurrence matrix MATRIX_2(T_n) is "0,2,2,5", the uplink flow rate FRATE_1(T_n) is "19", and the downlink flow rate FRATE_2(T_n) is "1300". In addition, for example, the storage unit 134 may create a row whose session identifier INP_1(T_n) is "xyz001" below the row whose session identifier INP_1(T_n) is "abc123", and store the input information and statistical information there.

[Example]
An example based on the first embodiment will be described. In this example, the router 40 collects flow statistical information using a method called NetFlow. The router 40 may use a method other than NetFlow as long as that method can collect the information required as input information. For example, the router 40 can also generate input information from the information specified in "Body of reply to OFPMP_FLOW request" of OpenFlow (Reference 1: OpenFlow Switch Specification (URL: https://www.opennetworking.org/images/stories/downloads/sdn-resources/onf-specifications/openflow/openflow-spec-v1.3.2.pdf)).

NetFlow is one of the mechanisms in standard use on the Internet by which network devices such as routers and switches are equipped with a function for generating flow statistical information and send that information to remote collectors and analyzers. As shown in FIG. 1, the router 40 (NetFlow-Enabled Router) is the network device that generates flow statistical information on the communication passing through it, and the calculation device 10 (NetFlow Collector) is the device that receives the Flow Records, which are the flow statistical information, and collects and analyzes them.

Here, in NetFlow Version 5, the following information is defined in the Flow Record Format.
1. srcaddr Source IP address
2. dstaddr Destination IP address
3. nexthop IP address of next hop router
4. input SNMP index of input interface
5. output SNMP index of output interface
6. dPkts Packets in the flow
7. dOctets Total number of Layer 3 bytes in the packets of the flow
8. First SysUptime at start of flow
9. Last SysUptime at the time the last packet of the flow was received
10. srcport TCP/UDP source port number or equivalent
11. dstport TCP/UDP destination port number or equivalent
12. pad1 Unused (zero) bytes
13. tcp_flags Cumulative OR of TCP flags
14. prot IP protocol type (for example, TCP = 6; UDP = 17)
15. tos IP type of service (ToS)
16. src_as Autonomous system number of the source, either origin or peer
17. dst_as Autonomous system number of the destination, either origin or peer
18. src_mask Source address prefix mask bits
19. dst_mask Destination address prefix mask bits
20. pad2 Unused (zero) bytes
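
The twenty fields above can be decoded from a binary record roughly as follows. This sketch assumes the commonly documented NetFlow Version 5 layout of 48 bytes per record in network byte order (4-byte addresses, counters, and timestamps; 2-byte interface indexes, ports, AS numbers, and final padding; 1-byte flags, protocol, ToS, and masks). Parsing of the export packet header is omitted, and the names `V5_RECORD`, `FlowRecord`, and `parse_record` are hypothetical.

```python
import struct
from collections import namedtuple

# One field per list item, in the order 1..20 (assumed 48-byte layout).
V5_RECORD = struct.Struct("!IIIHHIIIIHHBBBBHHBBH")

FlowRecord = namedtuple("FlowRecord", [
    "srcaddr", "dstaddr", "nexthop", "input", "output",
    "dPkts", "dOctets", "First", "Last", "srcport", "dstport",
    "pad1", "tcp_flags", "prot", "tos", "src_as", "dst_as",
    "src_mask", "dst_mask", "pad2",
])


def parse_record(data):
    """Decode one flow record from exactly V5_RECORD.size bytes."""
    return FlowRecord(*V5_RECORD.unpack(data))
```

With a record decoded this way, `rec.dPkts` and `rec.dOctets` supply the packet and byte counts, and `rec.Last - rec.First` supplies the time difference between fields 9 and 8.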

The acquisition unit 131 of the calculation device 10 can generate input information from the Flow Record Format information as follows. The acquisition unit 131 can generate INP_1 by performing a bit operation or hash calculation with fields 1, 2, 8, 10, 11, and 14 of the Flow Record Format as input. The acquisition unit 131 may generate INP_2 by referring to the time at which the router 40 generated or transmitted the Flow Record, or the calculation device 10 may use the time at which it received the Flow Record as INP_2. The acquisition unit 131 generates INP_3 and INP_5 by referring to field 6 of the Flow Record Format, and generates INP_4 and INP_6 by referring to field 7. The acquisition unit 131 generates INP_7 by calculating the time difference between fields 9 and 8 of the Flow Record Format.

Here, INP_3 and INP_5, and INP_4 and INP_6, form uplink and downlink pairs. Because a NetFlow Flow Record contains information on a flow in one direction only, the acquisition unit 131 needs to find the Flow Record that forms the other half of the pair. In the uplink and downlink Flow Records, fields 1 and 2, and fields 10 and 11, of the Flow Record Format are swapped. That is, the acquisition unit 131 only needs to find a pair of flows whose source and destination are interchanged. The acquisition unit 131 then generates the INP corresponding to one session from the two paired Flow Records.
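
The pairing can be sketched by deriving a key that is the same for both directions. Sorting the two endpoints before hashing is a choice made for this illustration, as are SHA-256 and the 16-character truncation; the flow start time (field 8) is left out here because it may differ between the two directions. The names `session_key` and `pair_flows` are hypothetical, and records are assumed to be dictionaries keyed by field name.

```python
import hashlib


def session_key(srcaddr, dstaddr, srcport, dstport, prot):
    """Return a key that is identical for a flow and its reverse flow."""
    # Sorting the endpoints removes the direction from the key.
    endpoints = sorted([(srcaddr, srcport), (dstaddr, dstport)])
    text = f"{endpoints[0]}|{endpoints[1]}|{prot}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def pair_flows(records):
    """Group one-directional flow records by direction-independent key."""
    pairs = {}
    for rec in records:
        key = session_key(rec["srcaddr"], rec["dstaddr"],
                          rec["srcport"], rec["dstport"], rec["prot"])
        pairs.setdefault(key, []).append(rec)
    return pairs
```

Each group that contains two records corresponds to one session, from which the uplink and downlink counts can be taken.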

At a given time T, the router 40 generates and transmits Flow Records for all of the flows, that is, sessions, passing through it. Because the trigger for transmitting Flow Records can be configured on the router 40, Flow Records can be transmitted at fixed time intervals, for example every 10 seconds. The calculation device 10 receives the Flow Records and generates and updates the session statistical information by repeating the calculations according to the procedure of the embodiment.

When the client 20 starts communicating with the server 30, the router 40 records that communication as Flow Records. Two Flow Records are generated, one for the uplink direction and one for the downlink direction. In this example, the calculation system 1 performs processing in the following sequence.

(When a session starts and while it continues)
1. At the Flow Record transmission trigger, the router 40 generates Flow Records and sends them to the calculation device 10.
2. The calculation device 10 receives the Flow Records and records the reception time.
3. The calculation device 10 generates input information according to the procedure of the embodiment.
4. The calculation device 10 generates statistical information according to the procedure of the embodiment.
5. The calculation device 10 waits for the input information of the next time.
6. At the Flow Record transmission trigger, the router 40 generates Flow Records and sends them to the calculation device 10.
7. The calculation device 10 receives the Flow Records and records the reception time.

(When a session ends)
8. The client 20 or the server 30 ends the communication.
9. The router 40 detects the end of the communication, transmits the Flow Record to the calculation device 10, and then deletes the Flow Record of the ended communication.
10. The calculation device 10 receives the Flow Record of step 9, records the reception time, and generates input information and statistical information.
11. The calculation device 10 waits for the input information of the next time.
12. Because the Flow Record has been deleted, the router 40 transmits nothing for the corresponding communication (or transmits an empty Flow Record).
13. The calculation device 10 determines that the session has ended and calculates the flow rate.
14. The calculation device 10 writes the input information and statistical information to a secondary storage device or the like, and ends the calculation of statistical information for the session identified by INP_1.
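
Steps 1 to 14 can be summarized as a per-cycle update on the collector side, sketched below under simplifying assumptions: the input information is a dictionary with the keys "INP_4", "INP_6", and "INP_7", a session is considered ended when no record for it arrives in a cycle (steps 12 and 13), and the other statistics are omitted. The function name `process_cycle` and the data layout are hypothetical.

```python
def process_cycle(primary, secondary, cycle_inputs):
    """Apply one export cycle of input information.

    primary: dict mapping session id -> latest input information.
    secondary: list that receives the records of finished sessions.
    cycle_inputs: dict of input information received in this cycle.
    """
    # Sessions that sent nothing in this cycle are treated as ended.
    for sid in [s for s in primary if s not in cycle_inputs]:
        inp = primary.pop(sid)
        elapsed = inp["INP_7"]
        record = dict(inp)
        # Flow rates from the last input information only (step 13).
        record["FRATE_1"] = inp["INP_4"] / elapsed if elapsed else 0.0
        record["FRATE_2"] = inp["INP_6"] / elapsed if elapsed else 0.0
        secondary.append(record)  # step 14
    # Sessions that reported are stored or overwritten (steps 2 to 4).
    primary.update(cycle_inputs)
```

Only the most recent input information per session stays in `primary`, which mirrors the single-time-step storage used in the embodiment.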

[Effects of First Embodiment]
The acquisition unit 131 acquires flow statistical information, which is statistical information on traffic aggregated for each communication source and communication destination. The classification unit 132 classifies the flow statistical information into groups such that the sessions of the underlying traffic are the same. The calculation unit 133 calculates, based on the flow statistical information, statistical information on the traffic of each group classified by the classification unit 132.

In this way, the calculation device 10 of the present embodiment measures communication between computer systems without using packet capture. Consequently, the present embodiment does not need to duplicate and store capture data, and no secondary storage device for capture data is required.

Furthermore, because packets are not duplicated in the present embodiment, no problems arise concerning the secrecy of the communication content carried in the payload or the protection of privacy. In addition, for encrypted communication, which is used to protect the secrecy of communication content, the present embodiment can measure the communication without reading the content of the payload.

In conventional measurement of flow statistical information, the statistical information that could be measured was limited to two kinds: the number of packets and the number of bytes. In contrast, the calculation device 10 of the present embodiment generates eight kinds of statistical information by calculation (number of packets, number of bytes, average packet size, the maximum, minimum, and standard deviation of the average packet size, flow rate, and co-occurrence matrix). Therefore, if the number of values that one variable can take is M, the amount of information obtained with the conventional technique is M^2, whereas the present embodiment can obtain an amount of information of M^8.

Furthermore, in the present embodiment, the information necessary for the calculation need not be held over the entire period from the start to the end of a session. That is, each piece of statistical information other than the number of packets and the number of bytes can be calculated merely by temporarily storing one cycle of the periodically acquired flow statistical information. In this way, the calculation device 10 of the present embodiment can measure communication effectively using limited processing resources.

The classification unit 132 can classify first flow statistical information and second flow statistical information into the same group when the transmission source and the transmission destination included in the first flow statistical information are identical to the transmission destination and the transmission source, respectively, included in the second flow statistical information, and both the first flow statistical information and the second flow statistical information are based on traffic generated within a predetermined period. In this way, the calculation device 10 classifies the flow statistical information into groups corresponding to sessions. As a result, according to the present embodiment, statistical information can be calculated on a per-session basis.
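The classification rule above can be illustrated with a minimal Python sketch. The record layout (`FlowRecord`), the function names, and the use of a single timestamp per record are hypothetical and are not taken from the embodiment. The sketch also places same-direction records of the same endpoint pair in one group, which is an assumption beyond the reverse-direction condition stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical flow statistical record (field names are assumptions)."""
    src: str
    dst: str
    time: float        # observation time in seconds
    packet_count: int
    byte_count: int


def session_key(rec):
    """Direction-independent key: A->B and B->A yield the same key."""
    return tuple(sorted((rec.src, rec.dst)))


def classify(records, period):
    """Group records that share an endpoint pair (in either direction) and
    lie within `period` seconds of the latest record already in the group."""
    groups = []        # each group is a list of FlowRecord
    latest = {}        # endpoint pair -> index of its most recent group
    for rec in sorted(records, key=lambda r: r.time):
        key = session_key(rec)
        idx = latest.get(key)
        if idx is not None and rec.time - groups[idx][-1].time <= period:
            groups[idx].append(rec)
        else:
            # Either a new endpoint pair or the period has elapsed:
            # start a new group (treated here as a new session).
            groups.append([rec])
            latest[key] = len(groups) - 1
    return groups
```

For example, with a period of 10 seconds, records A->B at time 0 and B->A at time 1 fall into one group, while a further A->B record at time 100 starts a new group.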

The acquisition unit 131 can acquire at least the number of packets and the number of bytes as the flow statistical information. In this case, the calculation unit 133 can calculate, as the statistical information, for each group, the average packet size, the maximum of the average packet size, the minimum of the average packet size, the standard deviation of the average packet size, the time average of the number of bytes, and information indicating the presence or absence of transmitted and received packets at each time. In this way, the calculation device 10 can generate six types of statistical information from the number of packets and the number of bytes.
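As a rough illustration of how six quantities can be derived from packet and byte counts alone, the following Python sketch processes one group. The exact definitions are assumptions made for this sketch: the per-interval average size is taken as bytes divided by packets, the group average as total bytes divided by total packets, the standard deviation as the population standard deviation of the per-interval averages, and the flow rate as total bytes divided by the observed duration. The function and key names are hypothetical.

```python
import statistics


def session_statistics(samples, interval):
    """samples: list of (packet_count, byte_count), one per measurement
    interval of `interval` seconds, all belonging to one group."""
    # Per-interval average packet size; intervals without packets are skipped.
    sizes = [b / p for p, b in samples if p > 0]
    if not sizes:
        raise ValueError("the group contains no packets")
    total_packets = sum(p for p, _ in samples)
    total_bytes = sum(b for _, b in samples)
    return {
        "mean_size": total_bytes / total_packets,
        "max_size": max(sizes),
        "min_size": min(sizes),
        "std_size": statistics.pstdev(sizes),
        # Time average of the number of bytes over the observed intervals.
        "flow_rate": total_bytes / (interval * len(samples)),
        # 1 if any packet was seen in the interval, otherwise 0.
        "presence": [1 if p > 0 else 0 for p, _ in samples],
    }
```

Under these assumptions, the samples `[(10, 1000), (0, 0), (5, 1000)]` give per-interval averages of 100 and 200 bytes and a presence sequence of `[1, 0, 1]`.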

The acquisition unit 131 can acquire, in time order, the flow statistical information corresponding to each of a series of times at fixed intervals. In this case, the classification unit 132 can classify the flow statistical information each time the flow statistical information is acquired by the acquisition unit 131. In addition, the calculation unit 133 can calculate the statistical information on traffic for each group each time classification is performed by the classification unit 132. This allows the calculation device 10 to proceed with the calculation sequentially as the flow statistical information is generated.

When statistical information has already been calculated for a group classified by the classification unit 132, the calculation unit 133 can calculate the statistical information on traffic for that group based on the already calculated statistical information and the flow statistical information acquired by the acquisition unit 131. This eliminates the need to hold all the flow statistical information of a session, so the storage capacity used can be reduced.
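One way to update statistics from previously calculated values plus a single new record is sketched below in Python. It uses Welford's online algorithm for the running mean and variance, which is a technique chosen for this illustration and is not stated in the embodiment. The class and field names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Per-group state kept between measurement cycles (names are assumptions)."""
    n: int = 0                      # number of intervals that contained packets
    mean: float = 0.0               # running mean of per-interval average size
    m2: float = 0.0                 # running sum of squared deviations
    max_size: float = -math.inf
    min_size: float = math.inf
    total_bytes: int = 0

    def update(self, packet_count, byte_count):
        """Fold one cycle of flow statistics into the stored state."""
        self.total_bytes += byte_count
        if packet_count == 0:
            return                  # nothing to add to the size statistics
        size = byte_count / packet_count
        self.n += 1
        delta = size - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (size - self.mean)
        self.max_size = max(self.max_size, size)
        self.min_size = min(self.min_size, size)

    @property
    def std(self):
        """Population standard deviation of the per-interval average sizes."""
        return math.sqrt(self.m2 / self.n) if self.n else 0.0
```

Only the few numbers in `RunningStats` are retained per group, so the individual records of earlier cycles can be discarded after each update.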

The calculation unit 133 can calculate, as the statistical information, a co-occurrence matrix based on the presence or absence of transmitted and received packets at each time. This makes it possible to analyze the appearance patterns of consecutive packets.
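A minimal Python sketch of such a co-occurrence matrix follows. It assumes a two-state encoding (0 for no packet, 1 for at least one packet in the interval) and counts pairs of adjacent intervals; the embodiment may define the states and the pairing differently. The function name is hypothetical.

```python
def cooccurrence_matrix(presence):
    """Count adjacent (previous, current) presence pairs.

    presence: sequence of 0/1 values, one per measurement interval.
    Returns a 2x2 matrix m where m[i][j] is the number of times state i
    was immediately followed by state j.
    """
    matrix = [[0, 0], [0, 0]]
    for prev, cur in zip(presence, presence[1:]):
        matrix[prev][cur] += 1
    return matrix


print(cooccurrence_matrix([1, 1, 0, 1, 0, 0]))  # prints [[1, 1], [2, 1]]
```

Because each update needs only the previous presence value and the current one, the matrix can also be maintained incrementally, one cycle at a time.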

[System configuration, etc.]
The components of each illustrated device are functional and conceptual, and need not be physically configured as illustrated. That is, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. Furthermore, all or any part of the processing functions performed in each device may be realized by a CPU and a program analyzed and executed by the CPU, or may be realized as hardware using wired logic.

Among the processes described in the present embodiment, all or part of the processes described as being performed automatically can be performed manually, and all or part of the processes described as being performed manually can be performed automatically by a known method. In addition, the processing procedures, control procedures, specific names, and information including various data and parameters shown in the above description and the drawings can be changed arbitrarily unless otherwise specified.

[Program]
In one embodiment, the calculation device 10 can be implemented by installing, on a desired computer, a calculation program that executes the above calculation of statistical information, provided as packaged software or online software. For example, by causing an information processing device to execute the calculation program, the information processing device can be made to function as the calculation device 10. The information processing device referred to here includes desktop and notebook personal computers. The category of information processing device also includes mobile communication terminals such as smartphones, mobile phones, and PHS (Personal Handyphone System) terminals, as well as slate terminals such as PDAs (Personal Digital Assistants).

The calculation device 10 can also be implemented as a calculation server device that treats a terminal device used by a user as a client and provides the client with a service related to the above calculation of statistical information. For example, the calculation server device is implemented as a server device that provides a calculation service that takes flow statistical information as input and outputs session statistical information. In this case, the calculation server device may be implemented as a Web server, or as a cloud that provides the service related to the above calculation of statistical information by outsourcing.

FIG. 10 is a diagram illustrating an example of a computer that executes the calculation program. The computer 1000 includes, for example, a memory 1010 and a CPU 1020. The computer 1000 also includes a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These units are connected by a bus 1080.

The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120. The video adapter 1060 is connected to, for example, a display 1130.

The hard disk drive 1090 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. That is, the program that defines each process of the calculation device 10 is implemented as the program module 1093, in which computer-executable code is described. The program module 1093 is stored in, for example, the hard disk drive 1090. For example, a program module 1093 for executing processing equivalent to the functional configuration of the calculation device 10 is stored in the hard disk drive 1090. Note that the hard disk drive 1090 may be replaced by an SSD.

The setting data used in the processing of the above-described embodiment is stored as the program data 1094 in, for example, the memory 1010 or the hard disk drive 1090. The CPU 1020 reads the program module 1093 and the program data 1094 stored in the memory 1010 or the hard disk drive 1090 into the RAM 1012 as necessary and executes them.

Note that the program module 1093 and the program data 1094 are not limited to being stored in the hard disk drive 1090; they may be stored in, for example, a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program module 1093 and the program data 1094 may be stored in another computer connected via a network (a LAN, a WAN (Wide Area Network), or the like). The program module 1093 and the program data 1094 may then be read by the CPU 1020 from the other computer via the network interface 1070.

[Description of symbols]
1 Calculation system
10 Calculation device
11 Communication unit
12 Storage unit
13 Control unit
20 Client
30 Server
40 Router
121 Primary storage unit
122 Secondary storage unit
131 Acquisition unit
132 Classification unit
133 Calculation unit
134 Saving unit

Claims

1. A calculation device comprising:
an acquisition unit that acquires flow statistical information, which is statistical information on traffic aggregated for each communication source and communication destination;
a classification unit that classifies the flow statistical information into groups such that the underlying sessions of the traffic are the same; and
a calculation unit that calculates, based on the flow statistical information, statistical information on traffic for each of the groups classified by the classification unit.

2. The calculation device according to claim 1, wherein the classification unit classifies first flow statistical information and second flow statistical information into the same group when the transmission source and the transmission destination included in the first flow statistical information are identical to the transmission destination and the transmission source, respectively, included in the second flow statistical information, and both the first flow statistical information and the second flow statistical information are based on traffic generated within a predetermined period.

3. The calculation device according to claim 1, wherein
the acquisition unit acquires at least the number of packets and the number of bytes as the flow statistical information, and
the calculation unit calculates, as the statistical information, for each group, the average packet size, the maximum of the average packet size, the minimum of the average packet size, the standard deviation of the average packet size, the time average of the number of bytes, and information indicating the presence or absence of transmitted and received packets at each time.

4. The calculation device according to claim 1, wherein
the acquisition unit acquires, in time order, the flow statistical information corresponding to each of a series of times at fixed intervals,
the classification unit classifies the flow statistical information each time the flow statistical information is acquired by the acquisition unit, and
the calculation unit calculates the statistical information on traffic for each group each time classification is performed by the classification unit.

5. The calculation device according to claim 4, wherein, when statistical information has already been calculated for a group classified by the classification unit, the calculation unit calculates the statistical information on traffic for that group based on the already calculated statistical information and the flow statistical information acquired by the acquisition unit.

6. The calculation device according to claim 1, wherein the calculation unit calculates, as the statistical information, a co-occurrence matrix based on the presence or absence of transmitted and received packets at each time.

7. A calculation method executed by a calculation device, the method comprising:
an acquisition step of acquiring flow statistical information, which is statistical information on traffic aggregated for each communication source and communication destination;
a classification step of classifying the flow statistical information into groups such that the underlying sessions of the traffic are the same; and
a calculation step of calculating, based on the flow statistical information, statistical information on traffic for each of the groups classified in the classification step.