KR102028093B1

KR102028093B1 - Method of detecting abnormal behavior on the network and apparatus using the same

Info

Publication number: KR102028093B1
Application number: KR1020170139078A
Authority: KR
Inventors: 윤정한; 이우묘; 김신규
Original assignee: 한국전자통신연구원
Priority date: 2017-10-25
Filing date: 2017-10-25
Publication date: 2019-10-02
Also published as: KR20190046018A

Abstract

Disclosed are a method of detecting abnormal behavior on a network and an apparatus using the same. The abnormal behavior detection method according to the present invention includes: extracting a log for learning based on traffic collected while the network is normal; training, based on the log, a local (LOCAL) detection model that detects network abnormal behavior in consideration of packets and a global (GLOBAL) detection model that detects network abnormal behavior in consideration of traffic flow; and simultaneously performing local detection and global detection of the network abnormal behavior using the local detection model and the global detection model.

Description

Anomaly detection method for network and device using same {METHOD OF DETECTING ABNORMAL BEHAVIOR ON THE NETWORK AND APPARATUS USING THE SAME}

The present invention relates to technology for detecting abnormal behavior on a network, and more particularly, to technology for detecting whether abnormal traffic occurs that violates the traffic characteristics observed while the network is normal.

Some techniques for detecting abnormal behavior on a network learn, during ordinary operation, how specific values change, such as the threshold of traffic volume and the range of packet time-to-live (TTL) values, and then detect abnormal behavior based on the learned information. Such techniques are often based on general knowledge of networks and cyberattacks, so they reflect the normal properties of generic network traffic rather than the unique characteristics of the target network. Therefore, they cannot effectively reflect the network-specific characteristics of each site where they are applied.

Another form of abnormal behavior detection learns traffic characteristics per protocol and service, and detects abnormal behavior while distinguishing each protocol, service, and server-client pair. Some of these techniques classify services based on information and knowledge about the target protocol and service, or based on the behavior in a specific section of the communication (for example, the moment a session is established). However, if information about the analysis target is insufficient, or if the specific section that distinguishes behaviors cannot be monitored, such techniques cannot detect abnormal behavior while the communication is in progress.

There is also technology that learns traffic characteristics per protocol and service and detects abnormal behavior while distinguishing each protocol, service, and server-client pair. This technology collects a certain amount of traffic for each communication section (session, server-client section, etc.), then learns and extracts the characteristics of each communication section and distinguishes them from one another to detect abnormal behavior and abnormal communication. It has the advantage of distinguishing fine details such as the service in use or the device manufacturer. However, because it can be applied only after enough traffic has been collected for each communication, it cannot detect abnormal behavior at the moment the behavior occurs. Abnormal behavior is therefore always detected late, which makes the technology unsuitable for immediate on-site response and difficult to apply to an abnormal behavior detection system.

Korean Patent Application Publication No. 10-2012-0074041, published on July 5, 2012 (Title: SCADA system and security management method thereof)

An object of the present invention is to provide network traffic monitoring technology for detecting cyber threats, cyberattacks, security incidents, and the like that affect devices used in a specific network.

Another object of the present invention is to detect abnormal traffic using features learned while the network is normal, even when operating information or protocol specification information for the target network and devices is difficult to obtain or difficult to use.

A further object of the present invention is to provide a method of finding the time and target of abnormal behavior while monitoring network traffic, and to propose a way of using the result for additional behavior analysis of the monitored devices.

To achieve the above objects, an abnormal behavior detection method according to the present invention includes: extracting a log for learning based on traffic collected while the network is normal; training, based on the log, a local (LOCAL) detection model that detects network abnormal behavior in consideration of packets and a global (GLOBAL) detection model that detects network abnormal behavior in consideration of traffic flow; and simultaneously performing local detection and global detection of the network abnormal behavior using the local detection model and the global detection model.

Here, the training may include: training the local detection model based on features extracted from the packets; and training the global detection model based on a pattern matching (PATTERN MATCHING) scheme.

Here, training the local detection model may include: classifying the log by communication section of the network; and extracting, based on a plurality of packets belonging to any one communication section, a plurality of first feature points mapped to an N-dimensional space, and clustering the plurality of first feature points based on Lloyd's (LLOYD) algorithm to generate one or more first clusters.

Here, generating the one or more first clusters may calculate cluster centers for the one or more first clusters and repeat the clustering until the difference in distance between the cluster centers calculated for each communication section becomes less than a preset error distance.

Here, generating the one or more first clusters may include: analyzing the bytes (BYTE) of each of the plurality of packets to generate a packet histogram for each packet; inputting the packet histogram as input data to stacked autoencoders (STACKED AUTOENCODERS) based on a greedy layer-wise (GREEDY LAYER) learning scheme; and training the stacked autoencoders and extracting the plurality of first feature points based on a feature layer of the stacked autoencoders.

Here, training the local detection model may further include: classifying the plurality of packets based on inter-arrival time (INTERARRIVAL TIME) to generate one or more transactions (TRANSACTION); and converting at least one packet included in the one or more transactions into a type corresponding to the first cluster to generate one or more packet-type sequences, and clustering the one or more packet-type sequences to generate one or more second clusters.
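The transaction and packet-type-sequence step can be illustrated with a minimal sketch. It assumes that timestamps are sorted, that a new transaction starts whenever the inter-arrival time exceeds a threshold, and that each packet already carries the index of its first cluster. The function names, the `gap` parameter, and the sample data are hypothetical and do not come from the patent text.

```python
from typing import List, Sequence, Tuple


def split_into_transactions(timestamps: Sequence[float], gap: float) -> List[List[int]]:
    """Group packet indices into transactions.

    Assumption: a new transaction starts whenever the inter-arrival time
    between consecutive packets exceeds `gap` (a hypothetical threshold).
    """
    transactions: List[List[int]] = []
    for i, t in enumerate(timestamps):
        if i == 0 or t - timestamps[i - 1] > gap:
            transactions.append([])
        transactions[-1].append(i)
    return transactions


def to_type_sequences(transactions: List[List[int]],
                      packet_types: Sequence[int]) -> List[Tuple[int, ...]]:
    """Replace each packet by the index of its first cluster (its packet type)."""
    return [tuple(packet_types[i] for i in tx) for tx in transactions]


# Hypothetical capture: arrival times in seconds and first-cluster indices.
timestamps = [0.00, 0.01, 0.02, 1.00, 1.01, 2.50]
packet_types = [0, 1, 0, 0, 1, 2]

transactions = split_into_transactions(timestamps, gap=0.5)
sequences = to_type_sequences(transactions, packet_types)
print(sequences)
```

The resulting sequences are what a subsequent clustering step would group into second clusters; how that clustering compares sequences is not specified in this passage and is left out of the sketch.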

Here, training the local detection model may further include: analyzing the bytes of each of the plurality of packets to generate byte frequency data containing the number of occurrences of each byte value; classifying the 256 frequency values contained in the byte frequency data into buckets (BUCKET) so that the number of buckets is smaller than the number of frequency values, and summing the frequency values assigned to each bucket to generate bucket frequency data; and clustering the bucket frequency data to generate one or more third clusters for any one packet.

Here, generating the bucket frequency data may classify those frequency values, among the 256 frequency values, that are used to distinguish packets into separate buckets.
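The reduction from byte frequency data to bucket frequency data can be sketched as follows. This is an illustration under assumptions: buckets are taken to be runs of `width` consecutive byte values, and each byte value considered useful for distinguishing packets is given a dedicated bucket of its own. The bucket width, the choice of distinguishing values, and all function names are hypothetical.

```python
from typing import List, Tuple


def byte_frequency(payload: bytes) -> List[int]:
    """Count how often each of the 256 byte values occurs in a packet."""
    freq = [0] * 256
    for b in payload:
        freq[b] += 1
    return freq


def make_bucket_map(width: int, distinct: List[int]) -> Tuple[List[int], int]:
    """Map each byte value to a bucket index.

    Assumption: regular buckets cover `width` consecutive byte values, and
    every value listed in `distinct` is moved to its own extra bucket.
    """
    n_base = (256 + width - 1) // width
    bucket_of = [value // width for value in range(256)]
    for k, value in enumerate(distinct):
        bucket_of[value] = n_base + k
    return bucket_of, n_base + len(distinct)


def bucket_frequency(freq: List[int], bucket_of: List[int], n_buckets: int) -> List[int]:
    """Sum the byte frequencies that fall into each bucket."""
    buckets = [0] * n_buckets
    for value, count in enumerate(freq):
        buckets[bucket_of[value]] += count
    return buckets


payload = bytes([0x00, 0x00, 0x01, 0x02, 0xFF, 0xC8])
bucket_of, n_buckets = make_bucket_map(width=32, distinct=[0x00, 0xFF])
buckets = bucket_frequency(byte_frequency(payload), bucket_of, n_buckets)
print(n_buckets, buckets)
```

With a width of 32 and two dedicated values, the 256 frequency values collapse into 10 bucket frequencies, which would then serve as the input vector for the clustering that produces the third clusters.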

Here, training the local detection model may further include: classifying the plurality of packets based on inter-arrival time to generate one or more transactions; combining the transmission direction of the packets assigned to the one or more transactions with the one or more third clusters to generate packet frequency data that reveals the byte frequency of the transactions in each communication section; and detecting a normal pattern for learning based on the packet frequency data and training the local detection model in accordance with the normal pattern.

Here, training the global detection model may include: generating, based on the log, a first histogram of the number of transmissions per communication path of the network for every preset first unit time; generating a second histogram of the number of transmissions per communication path of the network for every preset second unit time, which is shorter than the preset first unit time; and matching the first histograms with the second histograms in consideration of similarity, and calculating the mean and variance of the vector distance between the two matched histograms to generate a chi-square (CHI-SQUARE) distribution corresponding to the network.
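The histogram construction and matching can be sketched as follows. The sketch rests on assumptions not stated in the text: log entries are `(timestamp, source, destination)` tuples, histograms are normalized to relative frequencies so that windows of different length are comparable, similarity is measured by Euclidean distance, and each second histogram is matched to its nearest first histogram. All names and the sample log are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def path_histograms(log, unit, paths):
    """One transmission-count vector per time window of `unit` seconds."""
    windows = {}
    for t, src, dst in log:
        windows.setdefault(int(t // unit), Counter())[(src, dst)] += 1
    return [[windows[w][p] for p in paths] for w in sorted(windows)]


def normalize(hist):
    """Convert counts to relative frequencies (assumption, see lead-in)."""
    total = sum(hist)
    return [x / total for x in hist] if total else [0.0] * len(hist)


def distance(a, b):
    """Euclidean distance between two normalized histograms."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(normalize(a), normalize(b))))


def matched_distances(first, second):
    """Distance from each second histogram to its most similar first histogram."""
    return [min(distance(s, f) for f in first) for s in second]


paths = [("HMI", "SRV"), ("SRV", "PLC")]
log = [(0.5, "HMI", "SRV"), (1.5, "SRV", "PLC"), (2.5, "HMI", "SRV"),
       (3.5, "SRV", "PLC"), (4.5, "HMI", "SRV"), (5.5, "HMI", "SRV"),
       (6.5, "SRV", "PLC"), (7.5, "HMI", "SRV")]

first = path_histograms(log, unit=4, paths=paths)   # first (longer) unit time
second = path_histograms(log, unit=2, paths=paths)  # second (shorter) unit time
dists = matched_distances(first, second)
mean_d, var_d = mean(dists), variance(dists)
print(first, second, dists, mean_d, var_d)
```

The mean and variance computed at the end are the two statistics from which the text derives the chi-square distribution for the network.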

Here, the generating may classify the log into a plurality of log groups in consideration of time and of the plurality of communication paths of the network, and generate the chi-square distribution for each of the plurality of log groups.

Here, the performing may determine that abnormal traffic has occurred in the network when any one of an individual packet and an individual transaction transmitted through the network does not correspond to any one of the one or more first clusters, the one or more third clusters, and the normal pattern.

Here, the performing may determine an individual transaction transmitted through the network to be an abnormal transaction when the individual transaction is not included in the one or more second clusters.

Here, the performing may include: generating a detection-target histogram over the second unit time in order to monitor the network, and finding, among the plurality of first histograms, the first histogram that is most similar to the detection-target histogram; and determining that abnormal traffic has occurred in the network when the vector distance between that first histogram and the detection-target histogram does not fall within the 99% confidence interval of the chi-square distribution.
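The 99% decision rule can be sketched as follows. The text does not state how the mean and variance of the vector distance parameterize the chi-square distribution, so this sketch makes three loudly labeled assumptions: the distance is modeled as a scaled chi-square variable fitted by moment matching, the interval is treated as one-sided (only large distances are abnormal), and the quantile is computed with the Wilson-Hilferty approximation rather than an exact inverse CDF. The function names and numbers are hypothetical.

```python
from statistics import NormalDist


def chi_square_quantile(p: float, dof: float) -> float:
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


def distance_threshold(mean_d: float, var_d: float, p: float = 0.99) -> float:
    """Upper bound on the distance under the fitted distribution.

    Assumption: distance ~ scale * chi-square(dof), with scale and dof chosen
    so that the mean (scale * dof) and variance (2 * scale**2 * dof) match.
    """
    if mean_d <= 0 or var_d <= 0:
        raise ValueError("mean and variance of the distance must be positive")
    scale = var_d / (2.0 * mean_d)
    dof = 2.0 * mean_d ** 2 / var_d
    return scale * chi_square_quantile(p, dof)


def is_abnormal(d: float, mean_d: float, var_d: float) -> bool:
    """Flag a detection-target histogram whose distance exceeds the bound."""
    return d > distance_threshold(mean_d, var_d)


# Hypothetical training statistics: mean 1.0 and variance 0.2 give dof = 10.
threshold = distance_threshold(1.0, 0.2)
print(threshold, is_abnormal(0.9, 1.0, 0.2), is_abnormal(3.0, 1.0, 0.2))
```

A deployment with access to a statistics library would normally use an exact chi-square quantile; the approximation is used here only to keep the sketch free of external dependencies.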

In addition, an apparatus for detecting abnormal behavior on a network according to an embodiment of the present invention includes: a processor that extracts a log for learning based on traffic collected while the network is normal, trains, based on the log, a local (LOCAL) detection model that detects network abnormal behavior in consideration of packets and a global (GLOBAL) detection model that detects network abnormal behavior in consideration of traffic flow, and simultaneously performs local detection and global detection of the network abnormal behavior using the local detection model and the global detection model; and a memory that stores at least one of the log, the local detection model, and the global detection model.

According to the present invention, a detection engine can be configured without specific information about the monitored network and system, and the engine can be used to understand the complex characteristics of the monitored target.

In addition, the present invention can operate the detection engine on its own, without continuously receiving attack signature updates from outside.

In addition, the present invention can effectively monitor sites that are operated as closed networks for security reasons or where remote updates are impossible.

In addition, because the present invention can simultaneously monitor and analyze changes in traffic volume and in the transmitted content, it can detect abnormal behavior in a more fine-grained and effective manner.

In addition, the present invention can obtain information such as IP network state and device state through global detection and local detection, and analyze composite state information by matching it with state information from other domains, such as the input/output information of a controller.

In addition, by using packet counts and traffic volumes measured against different criteria in the global detection algorithm, the present invention can perform fine-grained detection without adding further algorithms.

FIG. 1 is a diagram illustrating an abnormal behavior detection system for a network according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating an abnormal behavior detection method for a network according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an example of a packet histogram using the bytes of a packet according to the present invention.
FIGS. 4 and 5 are diagrams illustrating an example of a greedy layer-wise learning scheme and a clustering process using the same according to the present invention.
FIGS. 6 and 7 are diagrams illustrating an example of a transaction classification process according to the present invention.
FIGS. 8 to 10 are diagrams illustrating an example of a process of learning transactions as sequences of clusters according to the present invention.
FIG. 11 is a diagram illustrating an example of a local detection process according to the present invention.
FIGS. 12 and 13 are diagrams illustrating an example of a process of generating the byte frequency data (Byte frequency) shown in FIG. 11.
FIGS. 14 and 15 are diagrams illustrating an example of a process of generating the bucket frequency data (Bucket frequency) shown in FIG. 11.
FIG. 16 is a diagram illustrating an example of a third cluster (Bucket frequency cluster) according to the present invention.
FIG. 17 is a diagram illustrating an example of a process of generating the packet frequency data (Packet frequency) shown in FIG. 11.
FIGS. 18 and 19 are diagrams illustrating an example of the communication paths of a network and a histogram of the number of transmissions per communication path according to the present invention.
FIG. 20 is a diagram illustrating an example of a process of collecting the histograms shown in FIG. 19 in chronological order for pattern matching according to the present invention.
FIG. 21 is a diagram illustrating an example of the concept of subdividing the analyzed traffic flow according to the present invention.
FIG. 22 is a flowchart illustrating a method of integrating global detection and local detection according to an embodiment of the present invention.
FIGS. 23 and 24 are diagrams illustrating the configuration and functions of an abnormal behavior detection apparatus for a network according to an embodiment of the present invention.
FIG. 25 is a block diagram illustrating an abnormal behavior detection apparatus for a network according to another embodiment of the present invention.

The present invention is described in detail below with reference to the accompanying drawings. Repeated descriptions, and detailed descriptions of well-known functions and configurations that could unnecessarily obscure the gist of the present invention, are omitted. The embodiments of the present invention are provided to describe the present invention more completely to those having ordinary skill in the art. Accordingly, the shapes and sizes of elements in the drawings may be exaggerated for clarity.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating an abnormal behavior detection system for a network according to an embodiment of the present invention.

Referring to FIG. 1, the abnormal behavior detection system for a network according to an embodiment of the present invention can detect abnormal behavior in control traffic by performing global detection and local detection through the abnormal behavior detection apparatus 110.

Here, in global detection, the abnormal behavior detection apparatus 110 can determine the state of the entire network by applying the learning results or clustering results of the global detection model, and local detection can then be performed according to the state of the entire network to detect which kinds of packets or transactions occur.

For example, global detection can determine the state of the IP networks at the control system operation center and at the control system field site. Local detection is then applied to the communication sections of individual devices, so that the packets and transactions that may be transmitted in the current state of the IP network are identified and abnormal communication is detected.

Here, the abnormal behavior detection apparatus 110 according to the present invention can restrict the forms of traffic that may be transmitted by using the relationships among the sections subject to local detection.

For example, for communication sections that are linked to each other, such as HMI-main server-Historian and HMI-main server-PLC shown in FIG. 1, the apparatus can learn whether the main server delivers a transaction of type A' to the Historian after the HMI has sent a transaction of type A to the main server, and can thereby detect any abnormality. In this case, the types of transactions that may be transmitted in a specific situation can also be restricted by additionally using the state information of the entire network and of parts of the network obtained through global detection.

In addition, the abnormal behavior detection apparatus 110 according to an embodiment of the present invention can also analyze information provided by other sensing or detection devices, so that the monitored area is not limited to network traffic and abnormal behavior detection can take relationships across a wider area into account. That is, relationship monitoring can be extended by combining information provided by other detection devices with the results of global detection and local detection.

For example, in the HMI-PLC-field device section shown in FIG. 1, the HMI and the PLC are connected through an IP network and exchange network traffic, so they can be detected and monitored through the abnormal behavior detection apparatus 110. The PLC and the field device, by contrast, communicate through digital and analog signals, so abnormal behavior between them can be detected through a controller input/output abnormal behavior detection apparatus 120 that is separate from the abnormal behavior detection apparatus 110 according to the present invention.

In this case, the abnormal behavior detection apparatus 110 can track how the state information of the PLC changes through global detection and local detection, and deliver that information to the controller input/output abnormal behavior detection apparatus 120. The controller input/output abnormal behavior detection apparatus 120 can then use it as a reference for monitoring whether the input/output information changes appropriately as the state information of the PLC changes. Because the state changes between the HMI and the PLC and the state changes between the PLC and the field device must proceed in step with each other, monitoring this property makes the approach suitable for monitoring the entire control system.

FIG. 2 is a flowchart illustrating an abnormal behavior detection method for a network according to an embodiment of the present invention.

Referring to FIG. 2, the abnormal behavior detection method for a network according to an embodiment of the present invention extracts a log for learning based on traffic collected while the network is normal (S210).

That is, the abnormal behavior detection method according to the present invention learns the usual characteristics of network traffic while the network operates normally and, on that basis, can detect abnormal or anomalous traffic caused by cyberattacks, insider mistakes, device malfunctions, and the like. For this purpose, a log of the traffic generated while the network is normal can be extracted and used.

In addition, the abnormal behavior detection method for a network according to an embodiment of the present invention trains, based on the log, a local (LOCAL) detection model that detects network abnormal behavior in consideration of packets and a global (GLOBAL) detection model that detects network abnormal behavior in consideration of traffic flow (S220).

The present invention can use both local detection and global detection. Each detection method may consist of a method of learning the usual characteristics of network traffic and a method of detecting abnormal traffic with those characteristics as the reference.

Here, the local detection model can be trained based on features extracted from packets. For example, the local detection model can be trained by extracting the features or patterns of packets through unsupervised learning (UNSUPERVISED LEARNING), and the model can then be used to classify the packets that make up the network traffic into normal packets and abnormal packets effectively.

In the present invention, the local detection model can be trained in two ways.

First, if each unit of traffic (or each transaction or piece of session information) in the network is regarded as a point in a D-dimensional space, the n points located in that space can be clustered into k clusters, and the local detection model can be trained using the centers of the k clusters thus generated and a learnable parameter theta (θ).

For example, a feature space Z is first obtained from the data space X, which is the set of points in the D-dimensional space, through a nonlinear mapping (f_θ: X → Z), and clustering is then performed on the points of the feature space Z. In general, the dimension of the feature space Z can be set smaller than the dimension of the data space X. The centers of the k clusters and the learnable parameter theta (θ) can then be learned through iterative training.

Hereinafter, the first method of training the local detection model is described in detail with reference to FIGS. 3 to 10.

본 발명에 따르면, 로컬 탐지 모델을 학습시키기 위해 먼저, 로그를 네트워크에 상응하는 통신구간별로 분류할 수 있다. 이와 같이 로그를 통신구간별로 분류하여 사용함으로써 로컬 탐지 모델의 학습도 통신구간 별로 따로 수행할 수 있다.According to the present invention, in order to learn the local detection model, first, the log may be classified by communication section corresponding to the network. As described above, the logs are classified and used for each communication section, so that the learning of the local detection model may be separately performed for each communication section.

Thereafter, a plurality of first feature points mapped into an N-dimensional space may be extracted from the plurality of packets belonging to any one communication section, and the first feature points may be clustered using Lloyd's algorithm to generate one or more first clusters.

For example, when a communication section contains 100 packets, 100 first feature points corresponding to those packets may be mapped into the N-dimensional space. The 100 first feature points may then be classified into k first clusters by Lloyd's algorithm, which is one of the k-means clustering techniques.

Accordingly, a packet detected while the network is normal can be expected to belong to one of the first clusters.

In this case, a cluster center may be calculated for each of the one or more first clusters, and clustering may be repeated until the difference in distance between the cluster centers calculated for each communication section falls below a preset error distance.

For example, Lloyd's algorithm may first choose the centers of the first clusters at random and then set each center to the mean of the data assigned to it. This may be repeated until the difference from the center calculated for the previous communication section becomes smaller than the preset error distance.
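The iteration described above can be illustrated with a minimal sketch of Lloyd's algorithm. The function name `lloyd_kmeans` and the parameters `tol`, `max_iter`, and `seed` are hypothetical, and the sketch assumes the common stopping rule of comparing each center with its position in the previous iteration against the preset error distance; the text does not give the exact procedure used in the invention.

```python
import math
import random


def lloyd_kmeans(points, k, tol=1e-6, max_iter=100, seed=0):
    """Cluster points (lists of numbers) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Random initial centers, drawn from the data itself (an assumption).
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for i, group in enumerate(groups):
            if group:
                new_centers.append([sum(col) / len(group) for col in zip(*group)])
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        # Stop once no center moved farther than the preset error distance.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers


feature_points = [[0, 0], [0, 1], [10, 10], [10, 11]]
print(sorted(lloyd_kmeans(feature_points, k=2)))
```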

In this case, to extract the plurality of first feature points, the bytes of each of the plurality of packets may first be analyzed to generate a packet histogram for each packet.

For example, assuming that the bytes up to the 100th byte of a packet are used as features, a packet histogram 320 such as that shown in FIG. 3 may be generated. Although the packet size 310 of the packet is 153 bytes, the packet histogram 320 has 100 values because only the bytes up to the 100th are extracted to generate it.
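As a rough illustration of this fixed-length feature, the sketch below keeps the first 100 byte values of a packet. The name `prefix_features` is hypothetical, and padding packets shorter than 100 bytes with zeros is an assumption, since the text does not say how short packets are handled.

```python
def prefix_features(packet, length=100):
    """Return the first `length` byte values of a packet as a list of integers."""
    head = packet[:length]
    # Zero padding for short packets is an assumption, not stated in the text.
    return list(head) + [0] * (length - len(head))


packet = bytes(range(153))  # stand-in for a 153-byte packet
features = prefix_features(packet)
print(len(features))  # 100 values, regardless of the packet size
```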

Thereafter, the packet histograms may be fed as input data to stacked autoencoders that are trained with a greedy layer-wise learning scheme.

The stacked autoencoders may then be trained, and the plurality of first feature points may be extracted from the feature layer of the stacked autoencoders.

For example, a plurality of packet histograms may be input to stacked autoencoders such as those shown in FIG. 4, and greedy layer-wise training may then be performed. When training is complete, the first feature points to be used for clustering may be obtained through the encoder layers and the feature layers 410 and 420, excluding the decoder layers, as shown in FIG. 5. The clustering result shown in FIG. 5 confirms that the first feature points are classified into three first clusters.

According to another embodiment of the present invention, the local detection model may also be trained by classifying the traffic into transaction units.

In this case, one or more transactions may be generated by classifying the plurality of packets based on their interarrival time.

That is, the interarrival time between packets may be measured, and a bundle of packets transmitted within a preset time of one another may be classified as a transaction.

For example, referring to FIG. 6, assume that the preset time for classifying packets into transactions is T1 and that T1 < T2 < T3. The interarrival times from packet 611 through packet 613 shown in FIG. 6 are all within T1, but the interarrival time between packet 613 and packet 621 is T2, so packets 611 through 613 may be grouped into one transaction. Likewise, the interarrival times from packet 621 through packet 622 are all within T1, but the interarrival time between packet 622 and packet 631 is T3, so packets 621 through 622 may be grouped into one transaction.
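The grouping in this example can be sketched as follows. The function name `split_into_transactions` and the millisecond timestamps are hypothetical; `max_gap` plays the role of the preset time T1.

```python
def split_into_transactions(timestamps, max_gap):
    """Group packet indices so that consecutive packets in a group arrive
    within max_gap of each other (timestamps must be in ascending order)."""
    transactions = []
    current = []
    previous = None
    for index, t in enumerate(timestamps):
        # A gap larger than the preset time closes the current transaction.
        if previous is not None and t - previous > max_gap:
            transactions.append(current)
            current = []
        current.append(index)
        previous = t
    if current:
        transactions.append(current)
    return transactions


# Arrival times in milliseconds; gaps of 80 ms and 390 ms exceed T1 = 30 ms.
arrivals = [0, 10, 20, 100, 110, 500]
print(split_into_transactions(arrivals, max_gap=30))
```

With these inputs the first three packets, the next two, and the last one form three separate transactions, mirroring the grouping of packets 611 to 613 and 621 to 622 in FIG. 6.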

In addition, as shown in FIG. 6, when the interarrival time between packets is equal to or greater than a preset time, the packets may be divided into session units, which are larger than transaction units.

Referring to FIG. 7, which shows packets exchanged in communication between control devices, the interarrival times between the packets within each of transactions 710 and 720 are very short, whereas transaction 710 and transaction 720 are separated by at least a certain interval.

In addition, because a control device performs only predetermined tasks, the types of transactions 710 and 720 can be distinguished to some extent even in a graph of packet size and transmission direction such as FIG. 7.

In this case, each packet included in the one or more transactions may be converted into the type corresponding to its first cluster to generate one or more packet-type sequences, and the packet-type sequences may be clustered to generate one or more second clusters. That is, the form that transactions take while the network is normal can be learned from the packet-type sequence features of the second clusters.

For example, assume that three first clusters such as those shown in FIG. 8 have been generated from the packets of a communication section. Transactions may then be generated by classifying the packets based on interarrival time, as shown in FIG. 9. When one transaction 910 is plotted, as shown in FIG. 10, the packets in transaction 910 are seen to match the three first clusters of FIG. 8 in various ways. Arranging these packets in chronological order therefore yields a packet-type sequence 920 such as that shown in FIG. 10.
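A packet-type sequence of this kind can be sketched by labeling each packet of a transaction with the index of its nearest first-cluster center. The names `packet_type` and `packet_type_sequence`, the two-dimensional feature points, and the nearest-center rule are assumptions made for illustration.

```python
import math


def packet_type(feature, centers):
    """Index of the cluster center closest to a packet's feature point."""
    return min(range(len(centers)), key=lambda i: math.dist(feature, centers[i]))


def packet_type_sequence(transaction_features, centers):
    """Convert a transaction (feature points in arrival order) into packet types."""
    return [packet_type(f, centers) for f in transaction_features]


first_cluster_centers = [[0, 0], [5, 5], [10, 0]]      # three first clusters
transaction = [[0.2, 0.1], [9, 1], [5, 4], [0, 1]]     # packets in time order
print(packet_type_sequence(transaction, first_cluster_centers))
```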

In this way, not only packets but also the transactions that appear in a communication section can be classified into defined types, which makes it possible to detect transactions that do not normally occur.

In addition, in the present invention, the local detection model may be trained per IP-IP section, as shown in FIG. 11, so that a separate learning model is generated for each IP-IP section. The learning target may also be changed to a unit other than the IP-IP section.

Hereinafter, the second method for training the local detection model in the present invention is described in detail with reference to FIGS. 11 to 17.

First, based on the log, the bytes of each of the packets belonging to the communication section between two IPs may be analyzed to generate byte frequency data containing the number of occurrences of each byte value.

One byte can take 256 values, from 0x00 to 0xFF. The byte frequency data may therefore be data indicating how many times each value from 0x00 to 0xFF appears among all the bytes of one packet.

For example, assume that a packet such as that shown in FIG. 12 exists. The byte values 1210 of the bytes remaining after the header of the packet is excluded may be obtained and tallied as in the graph of FIG. 13 to show the number of occurrences of each value.

When a packet is represented as byte frequency data in this way, it can always be expressed as 256 integers even if its length changes, so changes in packet length need not be considered in learning and detection.
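A minimal sketch of byte frequency data is shown below. The function name `byte_frequency` is hypothetical, and the input is assumed to be the packet bytes that remain after the header has been removed.

```python
def byte_frequency(payload):
    """Count how many times each byte value 0x00..0xFF occurs in a payload."""
    counts = [0] * 256
    for value in payload:
        counts[value] += 1
    return counts


short = byte_frequency(bytes([0x00, 0xAA, 0xAA]))
long_ = byte_frequency(bytes([0x01] * 500))
# Both packets are described by 256 integers, whatever their length.
print(len(short), len(long_), short[0xAA], long_[0x01])
```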

Note that a large difference in value between packets does not necessarily mean a large difference between normal and abnormal. If byte values are used directly as the comparison target for detection, however, the detection model may erroneously learn that the larger the difference in byte value, the more abnormal the packet. For example, when 0x00 is expected, receiving 0x01 and receiving 0xFF are equally abnormal, but a simple comparison of byte values may judge 0xFF to be the more erroneous state. When byte frequency data is used as in the present invention, byte values are not compared directly; only whether a specific byte value appears is compared, which resolves this problem.

In this case, the 256 frequency values in the byte frequency data may be classified into buckets to generate a plurality of buckets fewer in number than the frequency values, and the frequency values assigned to each bucket may be summed to generate bucket frequency data.

For example, as shown in FIG. 14, the frequency values for 0x00 through 0xFF in the byte frequency data may each be assigned to one of the buckets B_1 through B_n.

The byte frequency data may be used as-is to improve the accuracy of learning and detection, but in real-time detection, comparing 256 integers per packet may degrade real-time performance. Therefore, to improve real-time performance while preserving as much accuracy as possible, the 256 frequency values may be gathered into n buckets, and the frequency values assigned to each bucket may be summed, as shown in FIG. 15.

The result of summing the frequency values assigned to each bucket, as shown in FIG. 15, may correspond to the bucket frequency data.
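The summation into buckets can be sketched as follows. The name `bucket_frequency` and the fixed-width mapping `bucket_of` (64 consecutive byte values per bucket) are assumptions chosen only to keep the example short; the invention assigns byte values to buckets by entropy maximization learning.

```python
def bucket_frequency(byte_counts, bucket_of, n_buckets):
    """Sum the 256 byte frequencies into n_buckets totals.

    bucket_of[v] is the bucket index assigned to byte value v.
    """
    totals = [0] * n_buckets
    for value, count in enumerate(byte_counts):
        totals[bucket_of[value]] += count
    return totals


# Hypothetical assignment: four buckets of 64 consecutive byte values each.
bucket_of = [value // 64 for value in range(256)]

byte_counts = [0] * 256
for value in bytes([0x00, 0x01, 0x40, 0xFF, 0xFF]):
    byte_counts[value] += 1

print(bucket_frequency(byte_counts, bucket_of, n_buckets=4))
```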

When the 256 frequency values are assigned to the buckets, they should be assigned so that the entropy of the system for detecting abnormal network behavior is maximized. For example, if all 256 frequency values are assigned to a single bucket, the entropy of the system is minimal no matter how many packets there are. Conversely, if a separate bucket is created for each of the 256 frequency values, for a total of 256 buckets, the entropy of the system is maximal.

Accordingly, when the number of buckets is set to some n in view of system performance, as shown in FIG. 14, assigning the frequency values so that entropy is maximized may require that the frequency values used to distinguish packets be placed in different buckets.

For example, if the byte values of packet 1 are 0x00, 0xAA, 0xAA and those of packet 2 are 0x01, 0xAA, 0xAA, then 0x00 and 0x01 may need to be assigned to different buckets to maximize entropy. This process of maximizing the entropy of the system is called entropy maximization learning, and the 256 frequency values may be assigned to buckets by performing entropy maximization learning on network traffic in the normal state.
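The two-packet example can be checked numerically with the sketch below. The text does not define the system entropy precisely, so the sketch assumes it is the Shannon entropy of the distinct bucket vectors that the packets map to; the names `bucket_vector` and `assignment_entropy` and both bucket assignments are hypothetical. The sketch only scores a given assignment and does not perform the entropy maximization learning itself.

```python
import math
from collections import Counter


def bucket_vector(payload, bucket_of, n_buckets):
    """Bucket frequency data of one payload, as a hashable tuple."""
    totals = [0] * n_buckets
    for value in payload:
        totals[bucket_of[value]] += 1
    return tuple(totals)


def assignment_entropy(payloads, bucket_of, n_buckets):
    """Shannon entropy (bits) of the bucket vectors produced by the payloads."""
    seen = Counter(bucket_vector(p, bucket_of, n_buckets) for p in payloads)
    total = sum(seen.values())
    return -sum((c / total) * math.log2(c / total) for c in seen.values())


packets = [bytes([0x00, 0xAA, 0xAA]), bytes([0x01, 0xAA, 0xAA])]

# 0x00 and 0x01 share a bucket: both packets look identical.
merged = [0 if value < 0x80 else 1 for value in range(256)]
# 0x00 and 0x01 fall into different buckets: the packets stay distinguishable.
split = [value % 2 for value in range(256)]

print(assignment_entropy(packets, merged, 2))
print(assignment_entropy(packets, split, 2))
```

Under this assumed definition, the assignment that separates 0x00 from 0x01 yields the higher entropy, which matches the reasoning in the example.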

Once the buckets have been generated and the 256 frequency values have been assigned, each packet may be represented by bucket frequency data rather than byte frequency data.

In this case, the bucket frequency data may be clustered to generate one or more third clusters for any given packet.

For example, k-means clustering may be applied to the bucket frequency data shown in FIG. 15 to generate k third clusters, as shown in FIG. 16.

In this way, the information on the many packets in a transaction is reduced to k third clusters (P_1 to P_k), and this simplification allows learning and detection to be performed more efficiently.

In this case, one or more transactions may be generated by classifying the packets based on interarrival time, and the transmission direction of each packet assigned to a transaction may be combined with the one or more third clusters to generate packet frequency data, which shows the byte frequency of the transactions in each communication section.

The packet frequency data may indicate which packets, and how many of each, a transaction contains. It can therefore be expressed by combining the transmission direction of each packet in the transaction with the third cluster to which that packet belongs.

The packets in a transaction can be classified into two kinds according to transmission direction. For example, referring to FIGS. 11 and 17, the packets in one transaction may be classified into packets 1110 and 1710 transmitted from IP1 to IP2 and packets 1120 and 1720 transmitted from IP2 to IP1.

By representing one transaction as 2n integers in this way, the packet-level characteristics within the transaction are preserved while the speed of learning and detection is improved.
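One way to picture the 2n integers is as a count for every combination of transmission direction and cluster. In the sketch below, the name `packet_frequency`, the encoding of direction as 0 (IP1 to IP2) or 1 (IP2 to IP1), and the reading of n as the number of third clusters are assumptions.

```python
def packet_frequency(transaction, n_clusters):
    """Count packets per (direction, cluster) pair in one transaction.

    transaction is a list of (direction, cluster) pairs, where direction is
    0 for IP1 -> IP2 and 1 for IP2 -> IP1, and cluster is a third-cluster index.
    """
    counts = [0] * (2 * n_clusters)
    for direction, cluster in transaction:
        counts[direction * n_clusters + cluster] += 1
    return counts


# Four packets: two of cluster 0 sent forward, one each of clusters 2 and 1 sent back.
transaction = [(0, 0), (1, 2), (0, 0), (1, 1)]
print(packet_frequency(transaction, n_clusters=3))
```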

A normal pattern for learning may then be detected from the packet frequency data, and the local detection model may be trained according to the normal pattern.

In addition, the global detection model may be trained using a pattern matching scheme. For example, among data preprocessing schemes, a scheme called pattern matching or template matching may be applied to the set of histograms obtained from the transmission information of the entire network traffic to determine abnormal states on the control network.

Hereinafter, a method of training the global detection model in the present invention is described in detail with reference to FIGS. 18 to 21.

First, as preprocessing for learning, a first histogram of the number of transmissions per communication path of the network may be generated from the log for every preset first unit time.

For example, after the network is represented as a directed graph as shown in FIG. 18, a histogram may be generated, as shown in FIG. 19, whose x-axis lists every possible edge of the directed graph and whose y-axis gives the number of packet transmissions on each edge during the preset first unit time.

A histogram such as that shown in FIG. 19 expresses the state of the entire network compactly. That is, if one histogram is generated per first unit time, the n first histograms generated over n unit times, as shown in FIG. 20, make it possible to track how the network state changes over time.

For example, a tuple such as (Unix timestamp, histogram of transmission count per edge) may be generated every unit time.
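Such tuples could be built from a log as in the sketch below. The function name `edge_histograms` and the log record layout (timestamp, source, destination) are hypothetical, and unit times without any traffic are simply omitted here.

```python
def edge_histograms(log, edges, unit):
    """Build (window_start, per-edge transmission counts) tuples from a log.

    log is an iterable of (timestamp, source, destination) records, edges is
    the ordered list of directed edges, and unit is the unit time in seconds.
    """
    index = {edge: i for i, edge in enumerate(edges)}
    per_window = {}
    for timestamp, source, destination in log:
        start = int(timestamp // unit) * unit
        histogram = per_window.setdefault(start, [0] * len(edges))
        histogram[index[(source, destination)]] += 1
    return sorted(per_window.items())


edges = [("A", "B"), ("B", "A"), ("A", "C")]
log = [(0.0, "A", "B"), (0.5, "A", "B"), (0.9, "B", "A"), (1.2, "A", "C")]
for window_start, histogram in edge_histograms(log, edges, unit=1):
    print(window_start, histogram)
```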

That is, when the first histograms generated per unit time are collected in chronological order as shown in FIG. 20, changes in the first histogram represent changes in traffic flow. In preprocessing, therefore, the input traffic may be divided by unit time and converted into first-histogram information.

In this case, a second histogram of the number of transmissions per communication path of the network may be generated for every preset second unit time, which is shorter than the preset first unit time.

The first and second histograms may then be matched in consideration of their similarity, and the mean and variance of the vector distance between the matched histograms may be calculated to generate a chi-square distribution for the network.

That is, the mean and variance may be obtained of how much the second histograms generated during the second unit time, which is shorter than the first unit time, differ from the most similar of the previously generated first histograms.

For example, if the first histogram generated once per second is defined as one d-dimensional vector, generating first histograms for n seconds produces n d-dimensional vectors. The set of first histograms so generated may be defined as Χ = {X(1), ..., X(n)} and assumed to be the normal pattern. If the set of w second histograms generated over w seconds (e.g., 10 seconds), where w is much smaller than n, is denoted Ψ = {Y(1), ..., Y(w)}, the vector distance from Ψ to the most similar pattern in the normal pattern may be measured as in [Equation 1].

[Equation 1]

If raw packet counts are used, changes on a few edges with high transmission counts may dominate changes on the other edges. The logarithm of the transmission count may therefore be used. Because the logarithm is undefined when the transmission count is 0, 1 is added to every transmission count, and the vector distance from Ψ to the most similar pattern may be calculated as in [Equation 2].

[Equation 2]

The mean and variance of the differences obtained by applying [Equation 2] to the set of first histograms (n vectors) may constitute the learning model. The difference information calculated during learning has been shown to follow a chi-square distribution, and it may subsequently be used to determine whether traffic is abnormal.
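The idea of a log-scaled distance to the most similar normal histogram, summarized by its mean and variance, can be illustrated with the following sketch. It is not a reproduction of [Equation 1] or [Equation 2], which are not shown in this text: the Euclidean norm, the matching of single histograms rather than w-second windows, and the names `log_distance`, `nearest_distance`, and `fit_distance_model` are all assumptions.

```python
import math
from statistics import mean, pvariance


def log_distance(x, y):
    """Euclidean distance between two histograms after a log(1 + count) transform."""
    return math.sqrt(sum((math.log(1 + a) - math.log(1 + b)) ** 2
                         for a, b in zip(x, y)))


def nearest_distance(target, references):
    """Distance from a histogram to the most similar reference histogram."""
    return min(log_distance(target, r) for r in references)


def fit_distance_model(references, samples):
    """Mean and variance of the nearest distances observed on normal traffic."""
    distances = [nearest_distance(s, references) for s in samples]
    return mean(distances), pvariance(distances)


normal = [[120, 3, 0], [118, 4, 0], [5, 90, 2]]   # first histograms
observed = [[119, 3, 0], [6, 88, 2]]              # second histograms
print(fit_distance_model(normal, observed))
```

Adding 1 before taking the logarithm keeps edges with zero transmissions well defined, as the text describes.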

In this case, the log may be classified into a plurality of log groups in consideration of time and the plurality of communication paths of the network, and a chi-square distribution may be generated for each log group.

For example, applying [Equation 2] to the whole of the w time window and all network edges widens the range covered, but when an attack occupies only a few time points and edges within that range, it may be difficult to distinguish.

To prevent this, in the present invention, the edges and time in the learning section may be grouped as shown in FIG. 21 to classify the log into a plurality of log groups 2110 to 2160, and dist may be calculated for each of the log groups 2110 to 2160 to train the global detection model. Because each monitored target then becomes a small group, the attack detection rate improves and the section in which an attack occurred can be located more precisely.

In addition, choosing the grouping criteria at random has the advantage of preventing an attacker from evading detection through a precise attack that takes the grouping characteristics into account.
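A random grouping of edges could look like the sketch below. The name `random_edge_groups` and the round-robin split after shuffling are assumptions, and the sketch groups only edges; the grouping of time intervals described in the text is left out for brevity.

```python
import random


def random_edge_groups(edges, n_groups, seed=None):
    """Randomly partition the edges into n_groups groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    # Deal the shuffled edges out round-robin so every edge lands in one group.
    return [shuffled[i::n_groups] for i in range(n_groups)]


edges = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "A"), ("B", "C")]
for group in random_edge_groups(edges, n_groups=2, seed=1):
    print(group)
```

Leaving `seed` unset makes the partition differ from run to run, which is the property the text relies on to keep an attacker from predicting the groups.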

In addition, the method of detecting abnormal behavior on a network according to an embodiment of the present invention simultaneously performs local detection and global detection of abnormal network behavior using the local detection model and the global detection model (S230).

Using global detection and local detection together allows them to complement each other for more accurate monitoring.

For example, even when the packets or transactions being transmitted do not look abnormal to local detection, global detection can determine whether the timing or volume of the transmission is abnormal.

As another example, an attack in which only transmitted values are changed, or only packet contents are replaced, without any change in transmission volume cannot be detected by global detection but can be detected by local detection.

In addition, the results of global detection and local detection may be combined to detect abnormal behavior in consideration of the relationships between devices.

In this case, when an individual packet or individual transaction transmitted over the network does not correspond to any of the one or more first clusters, the one or more third clusters, or the normal pattern, it may be determined that abnormal traffic has occurred in the network.

For example, when a monitored packet is not included in any of the one or more first clusters, it may be determined to be an abnormal packet.
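The text does not state how membership in a cluster is decided. The sketch below assumes, purely for illustration, that each cluster has a center and a radius (for example, the largest distance seen during training) and that a packet belongs to a cluster when its feature point lies within that radius; the name `is_abnormal` and the `radii` parameter are hypothetical.

```python
import math


def is_abnormal(feature, centers, radii):
    """True when the feature point lies outside every cluster's radius."""
    return all(math.dist(feature, center) > radius
               for center, radius in zip(centers, radii))


centers = [[0, 0], [10, 10]]
radii = [1.0, 2.0]  # hypothetical per-cluster radii learned from normal traffic
print(is_abnormal([0.5, 0], centers, radii))   # inside the first cluster
print(is_abnormal([5, 5], centers, radii))     # outside both clusters
```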

As another example, when a monitored packet is not included in any of the one or more third clusters, it may be determined to be an abnormal packet of a kind not seen before.

As yet another example, when the packet frequency data of a monitored transaction does not fit the normal pattern, the transaction may be determined to be an abnormal transaction.

In this case, when an individual transaction transmitted over the network is not included in any of the one or more second clusters, it may be determined to be an abnormal transaction.

For example, when a monitored transaction is not included in any of the one or more second clusters, it may be determined to be an abnormal transaction.

Because transaction-level detection already requires determining in advance which cluster each packet belongs to, additionally performing packet-level detection may not constitute overhead.

To monitor the network, a detection-target histogram may be generated for each second unit time, and the first histogram most similar to the detection-target histogram may be found among the plurality of first histograms.

When the vector distance between that first histogram and the detection-target histogram falls outside the 99% confidence interval of the chi-square distribution, it may be determined that abnormal traffic has occurred in the network.

In this case, global detection may be applied to both the packet count per unit time and the traffic volume (data length) per unit time to detect attacks.

For example, most buffer overflow attacks append malicious data to the end of a packet, in which case the number of packets does not change but the packet length does. For an attack of this kind, which sends longer data without changing the number of packets, the traffic volume must be checked to detect the attack.

As another example, when a large number of short packets is sent, as in a SYN flooding attack, the total data size does not change much but the number of packets does. For an abnormal pattern of this kind, which sends only short packets, monitoring must be based on changes in the number of packets to detect the attack.

As yet another example, in DNP3 over TCP the packet size is kept from exceeding a certain size, but the ratio of traffic volume to packet count may change as a result of a vulnerability attack, a change in the size of the transmitted data, and the like. Therefore, the characteristics of a network service can be monitored by taking the ratio of traffic volume to packet count as input and detecting changes in the average packet length.
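
As a minimal sketch of the three quantities discussed in these examples, the following Python function computes the packet count, the traffic volume, and the average packet length for each unit time. The record layout (timestamp, length) and the name `window_features` are assumptions made for illustration.

```python
from collections import defaultdict


def window_features(packets, unit_seconds):
    """Aggregate (timestamp, length) records per unit time.

    Returns {window index: {"packets", "bytes", "avg_len"}}.
    """
    counts = defaultdict(int)
    volume = defaultdict(int)
    for timestamp, length in packets:
        window = int(timestamp // unit_seconds)
        counts[window] += 1
        volume[window] += length
    return {
        w: {
            "packets": counts[w],               # sensitive to SYN-flood-like bursts
            "bytes": volume[w],                 # sensitive to lengthened payloads
            "avg_len": volume[w] / counts[w],   # ratio of volume to packet count
        }
        for w in sorted(counts)
    }
```

Each of the three series could then be fed separately to the global detection described above.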

In addition, as shown in FIG. 22, the present invention may perform global detection (S2210), determine whether an interval of abnormal traffic has been detected (S2215), and, if such an interval has been detected, perform local detection on that interval (S2220) to find the edge over which the abnormal traffic was actually transmitted.

For example, global detection can reveal that traffic flows differ from usual on some edges during a particular time window, but it may be difficult to determine exactly which edge caused the problem. In that case, using the local detection method to analyze the data transmission history of the edges on which abnormal traffic flow was detected during that time window makes it possible to pinpoint the edge that transmitted the abnormal data.

Because local detection must operate on individual packets, it may require higher-performance monitoring equipment than global detection. Therefore, under normal conditions only the logs needed to run all detection algorithms are retained, and global detection first locates the abnormal interval, which secures both the real-time behavior of the detection algorithms and improved detection performance.
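
The two-stage flow of FIG. 22 could be sketched as below. The two detector callables and the shape of the logged records are hypothetical placeholders; the sketch only shows that the costly per-packet analysis runs on the intervals that global detection has flagged.

```python
def two_stage_detect(windows, global_is_abnormal, local_find_edges):
    """windows: {window id: list of logged packet records}.

    Returns {window id: edges reported by local detection} for flagged windows.
    """
    findings = {}
    for window_id, records in windows.items():
        # S2210 / S2215: inexpensive flow-level check on the whole window.
        if not global_is_abnormal(window_id):
            continue
        # S2220: packet-level analysis only for the flagged window.
        findings[window_id] = local_find_edges(records)
    return findings
```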

In addition, although not shown in FIG. 2, the method of detecting abnormal behavior on a network according to an embodiment of the present invention may store various information generated during the above-described detection process in a separate storage module.

With this method of detecting abnormal behavior on a network, a detection engine can be built without specific information about the monitored network and systems, and the engine can also be used to understand the complex characteristics of the monitored target.

In addition, sites that are operated as closed networks for security reasons or that cannot be updated remotely can also be monitored effectively, and because changes in traffic volume and in the transmitted content can be monitored and analyzed at the same time, abnormal behavior can be detected in a more fine-grained and effective manner.

FIGS. 23 and 24 are diagrams showing the configuration and functions of an apparatus for detecting abnormal behavior on a network according to an embodiment of the present invention.

FIG. 23 shows the configuration of the apparatus for detecting abnormal behavior on a network according to an embodiment of the present invention, and the functions of each module constituting the detection apparatus are as shown in FIG. 24.

First, referring to FIG. 23, the apparatus for detecting abnormal behavior on a network according to an embodiment of the present invention is a system that continuously monitors control system network traffic to learn normal behavior and detects abnormal behavior that deviates from it.

The detection apparatus shown in FIG. 23 can simultaneously perform local detection, which learns the form of packets transmitted between individual control devices and detects abnormal behavior, and global detection, which learns changes in the traffic flow of the entire network and detects abnormal behavior.

Here, local detection learns the form of packets transmitted between control devices and, on that basis, can detect fine-grained attacks or abnormal system operation, such as abnormal data, delivery of malicious commands, and vulnerability attacks.

Global detection, by contrast, does not consider the form of individual packets; it aims to learn the communication correlations between control devices across the entire network and to detect data flows whose pattern differs from usual.

By simultaneously performing local detection and global detection, which have different characteristics, changes in both packet form and overall traffic flow can be detected.

For example, in a payload modification attack, in which an attacker alters payload data to deliver a malicious command, the traffic flow is unchanged, so the attack is difficult to detect with global detection but can be detected by local detection, which monitors packet form. Conversely, if a main server and a redundant server are connected to a particular control device, the redundant server must not control the device while the main server is operating. A violation of this rule is difficult to detect with local detection, but global detection can detect it because it detects changes in traffic flow.

In this way, the detection apparatus shown in FIG. 23 can perform the local detection and global detection functions simultaneously. For each detection function, there may be a learner that learns normal patterns from everyday traffic and a detector that detects abnormal traffic based on the learned model. The detection apparatus may also present the detected information to the user and store it for later review.

FIG. 25 is a block diagram showing an apparatus for detecting abnormal behavior on a network according to another embodiment of the present invention.

Referring to FIG. 25, the apparatus for detecting abnormal behavior on a network according to another embodiment of the present invention includes a communication unit 2510, a processor 2520, and a memory 2530.

The communication unit 2510 may transmit and receive the information required to detect abnormal behavior on the network. In particular, the communication unit 2510 according to an embodiment of the present invention may transmit abnormal traffic information to a user.

The processor 2520 extracts a log for learning based on traffic collected when the network is normal.

That is, the apparatus for detecting abnormal behavior on a network according to the present invention learns the usual characteristics of network traffic while the network is operating normally and, on that basis, can detect abnormal traffic caused by cyberattacks, insider mistakes, device malfunctions, and the like. To this end, a log of the traffic generated while the network is normal can be extracted and used.

In addition, based on the log, the processor 2520 trains a local (LOCAL) detection model, which detects network abnormal behavior by considering packets, and a global (GLOBAL) detection model, which detects network abnormal behavior by considering traffic flow.

For example, for a data space X, which is a set of points in D dimensions, a feature space Z may first be obtained through a nonlinear mapping (f_θ: X → Z), and clustering may be performed on the points of the feature space Z. In general, the dimension of the feature space Z may be set smaller than that of the data space X. Thereafter, the centers of the K clusters and the learnable parameter θ can be learned through iterative training.
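
The clustering step on the feature space Z might look like the following sketch, which assumes that the points of Z have already been computed and uses plain Lloyd (k-means) iterations with a stopping tolerance. The learned mapping f_θ is not modeled, and the function name, the tolerance, and the caller-supplied initial centers are assumptions made for illustration.

```python
import math


def lloyd(points, centers, tol=1e-6, max_iter=100):
    """Iterate assignment and center update until the centers stop moving."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers
```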

Thereafter, the stacked autoencoder may be trained, and a plurality of first feature points may be extracted based on the feature layer of the stacked autoencoder.

In this case, the transmission time difference between packets may be measured, and a group of packets transmitted within a preset time of one another may be classified as a transaction.
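
A minimal sketch of this grouping, assuming that a new transaction starts whenever the gap to the previous packet exceeds a preset value (`max_gap`, a hypothetical parameter name), is shown below. Packets are represented only by their timestamps.

```python
def split_transactions(timestamps, max_gap):
    """Group packet timestamps into transactions by inter-arrival time."""
    transactions = []
    current = []
    for ts in sorted(timestamps):
        # A gap larger than max_gap closes the current transaction.
        if current and ts - current[-1] > max_gap:
            transactions.append(current)
            current = []
        current.append(ts)
    if current:
        transactions.append(current)
    return transactions
```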

In addition, in the present invention, the local detection model may be trained per IP-IP section, so that a separate learned model is generated for each IP-IP section. The unit of learning may also be changed to something other than the IP-IP pair.

A single byte can take 256 values, from 0x00 to 0xFF. Accordingly, the byte frequency data may correspond to data indicating how many times each value from 0x00 to 0xFF appears among all the bytes of one packet.
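
As a small illustration, byte frequency data for one packet can be computed as a vector of 256 counts. The function name is an assumption made for this sketch.

```python
def byte_frequency(payload):
    """Count how often each byte value 0x00..0xFF occurs in one packet."""
    freq = [0] * 256
    for value in payload:   # iterating over bytes yields integers 0..255
        freq[value] += 1
    return freq
```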

The byte frequency data may be used as is to improve the accuracy of learning and detection, but in real-time detection, comparing 256 integers per packet may degrade real-time performance. Therefore, to improve real-time detection performance while preserving as much accuracy as possible, the 256 frequency values may be gathered into n buckets, and the frequency values assigned to each bucket may be summed.

The result of summing the frequency values assigned to each of the plurality of buckets may correspond to the bucket frequency data.
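
The reduction from 256 frequency values to n bucket totals might be sketched as follows. The assignment of byte values to buckets is passed in as a table; the contiguous-range assignment built by the hypothetical helper `contiguous_assignment` is only a placeholder and does not implement the entropy-based assignment of the embodiment.

```python
def contiguous_assignment(n_buckets):
    """Placeholder table: byte value v goes to bucket v * n_buckets // 256."""
    return [v * n_buckets // 256 for v in range(256)]


def bucket_frequency(byte_freq, assignment, n_buckets):
    """Sum the 256 byte frequencies into n_buckets totals."""
    buckets = [0] * n_buckets
    for value, count in enumerate(byte_freq):
        buckets[assignment[value]] += count
    return buckets
```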

Therefore, when the number of buckets is set to some number n in view of system performance, assigning the frequency values so that entropy is maximized may require that those frequency values, among the 256, that are used to distinguish packets be placed in different buckets.

The bucket frequency data may then be clustered to generate one or more third clusters for any one packet.

In this way, information about the many packets included in a transaction is expressed simply as one or more third clusters, which simplifies the information and allows learning and detection to be performed more efficiently.

In this case, the plurality of packets may be classified based on transmission time differences to generate one or more transactions, and the transmission direction of each packet assigned to the one or more transactions may be combined with the one or more third clusters to generate packet frequency data that reveals the byte frequency of transactions for each communication section.

The packet frequency data may indicate which packets, and how many of each, are contained in a given transaction. Accordingly, it can be expressed by combining the transmission direction of each individual packet in the transaction with the third cluster to which that packet belongs.

The packets included in an individual transaction can be classified into two kinds according to transmission direction. For example, the packets in one transaction corresponding to the communication section between IP1 and IP2 may be classified into packets transmitted from IP1 to IP2 and packets transmitted from IP2 to IP1.

By representing one transaction as 2n integers in this way, the packet-level characteristics of the transaction can be expressed while the speed of learning and detection is improved.
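
One possible reading of the 2n-integer representation is sketched below, assuming that n denotes the number of third clusters and that each packet is described by a (direction, cluster id) pair. The encoding of direction as 0 or 1 and the function name are assumptions.

```python
def transaction_vector(packets, n_clusters):
    """Encode a transaction as 2 * n_clusters packet counts.

    packets: iterable of (direction, cluster_id), where direction 0 means
    IP1 -> IP2 and direction 1 means IP2 -> IP1.
    """
    vector = [0] * (2 * n_clusters)
    for direction, cluster_id in packets:
        vector[direction * n_clusters + cluster_id] += 1
    return vector
```

Two transactions can then be compared by comparing 2n integers rather than every byte of every packet.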

In addition, the global detection model may be trained based on a pattern matching (PATTERN MATCHING) scheme. For example, an abnormal state on the control network may be determined by applying a method called pattern matching, or template matching (TEMPLATE MATCHING), to the set of histograms obtained during data preprocessing from the transmission information of the entire network traffic.

First, as a preprocessing step for learning, a first histogram of the number of transmissions per communication path of the network may be generated for each preset first unit time based on the log.

The histograms can compactly represent the state of the entire network. That is, if one histogram is generated per first unit time, the n first histograms produced over n unit times make it possible to track how the network state changes over time.
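
The preprocessing step could be sketched as follows, assuming each log entry is a (timestamp, source, destination) tuple and that a communication path is an ordered (source, destination) pair. These representations and the function name are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def path_histograms(log, unit_seconds):
    """Build one transmission-count histogram per unit time.

    Returns (paths, histograms): paths is the sorted list of communication
    paths, and histograms maps each window index to a list of counts
    aligned with paths.
    """
    per_window = defaultdict(Counter)
    for timestamp, src, dst in log:
        per_window[int(timestamp // unit_seconds)][(src, dst)] += 1
    paths = sorted({p for counts in per_window.values() for p in counts})
    histograms = {
        w: [per_window[w][p] for p in paths] for w in sorted(per_window)
    }
    return paths, histograms
```

Because every histogram uses the same path ordering, the histograms of different unit times can be compared directly as vectors.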

In this case, the log may be classified into a plurality of log groups in consideration of time and the plurality of communication paths of the network, and a chi-square distribution may be generated for each log group. Because each monitored target then becomes a small group, the attack detection rate improves and the interval in which an attack occurred can be located more precisely.

In addition, the processor 2520 simultaneously performs local detection and global detection of network abnormal behavior by using the local detection model and the global detection model.

In this case, if the vector distance between any one first histogram and the detection target histogram falls outside the 99% confidence interval of the chi-square distribution, it may be determined that abnormal traffic has occurred in the network.

In addition, the present invention may perform global detection, determine whether an interval of abnormal traffic has been detected, and, if such an interval has been detected, perform local detection on that interval to find the edge over which the abnormal traffic was actually transmitted.

The memory 2530 stores at least one of the log, the local detection model, and the global detection model.

In addition, as described above, the memory 2530 stores various information generated by the apparatus for detecting abnormal behavior on a network according to an embodiment of the present invention.

Depending on the embodiment, the memory 2530 may be configured independently of the abnormal behavior detection apparatus and support the function of detecting abnormal behavior on the network. In this case, the memory 2530 may operate as separate mass storage and may include a control function for performing its operations.

In one implementation, the memory 2530 is a computer-readable medium. In one implementation, the memory 2530 may be a volatile memory unit, and in another implementation, the memory 2530 may be a nonvolatile memory unit. In one implementation, the storage device is a computer-readable medium. In various different implementations, the storage device may include, for example, a hard disk device, an optical disk device, or some other mass storage device.

As described above, the method of detecting abnormal behavior on a network and the apparatus using the same according to the present invention are not limited to the configurations and methods of the embodiments described above; rather, all or part of the embodiments may be selectively combined so that various modifications can be made.

110: abnormal behavior detection apparatus
120: apparatus for detecting abnormal input/output behavior of a control device
310: packet size
320: packet histogram
410, 420: feature layer
610, 620, 710, 720, 910: transaction
611, 612, 613, 621, 622, 631: packet
920: packet type sequence
1110, 1710: packet transmitted from IP1 to IP2
1120, 1720: packet transmitted from IP2 to IP1
1210: value per byte
2010, 2020: first histogram
2120 to 2160: log group
2510: communication unit
2520: processor
2530: memory

Claims

A method of detecting abnormal behavior on a network, the method comprising:
extracting a log for learning based on traffic collected when the network is normal;
training, based on the log, a local (LOCAL) detection model that detects network abnormal behavior by considering packets and a global (GLOBAL) detection model that detects network abnormal behavior by considering traffic flow; and
simultaneously performing local detection and global detection of the network abnormal behavior by using the local detection model and the global detection model,
wherein the training comprises training the global detection model based on a pattern matching (PATTERN MATCHING) scheme, and
wherein training the global detection model comprises:
generating, based on the log, a first histogram of the number of transmissions per communication path of the network for each preset first unit time;
generating a second histogram of the number of transmissions per communication path of the network for each preset second unit time that is shorter than the preset first unit time; and
matching the first histogram and the second histogram in consideration of similarity, and calculating the mean and variance of the vector distances between the two matched histograms to generate a chi-square (CHI-SQUARE) distribution corresponding to the network.

The method according to claim 1,
wherein the training comprises training the local detection model based on features extracted from the packets.

The method according to claim 2,
wherein training the local detection model comprises:
classifying the log by communication section of the network; and
extracting a plurality of first feature points mapped to an N-dimensional space based on a plurality of packets corresponding to any one communication section, and clustering the plurality of first feature points based on the Lloyd algorithm to generate one or more first clusters.

The method according to claim 3,
wherein generating the one or more first clusters comprises calculating cluster centers for the one or more first clusters and repeating the clustering until the difference in distance between the cluster centers calculated for each communication section is less than a preset error distance.

The method according to claim 3,
wherein generating the one or more first clusters comprises:
analyzing the bytes (BYTE) of each of the plurality of packets to generate a packet histogram for each of the plurality of packets;
feeding the packet histogram as input data to a stacked autoencoder based on a greedy layer-wise training scheme; and
training the stacked autoencoder, and extracting the plurality of first feature points based on the feature layer of the stacked autoencoder.

The method according to claim 3,
wherein training the local detection model further comprises:
classifying the plurality of packets based on inter-arrival time (INTERARRIVAL TIME) to generate one or more transactions (TRANSACTION); and
generating one or more packet type sequences by converting one or more packets included in the one or more transactions into types corresponding to the first clusters, and clustering the one or more packet type sequences to generate one or more second clusters.

The method according to claim 3,
wherein training the local detection model comprises:
analyzing the bytes of each of the plurality of packets to generate byte frequency data containing the number of occurrences of each byte value;
classifying the 256 frequency values included in the byte frequency data into a plurality of buckets, fewer in number than the frequency values, and summing the frequency values assigned to each of the plurality of buckets to generate bucket frequency data; and
clustering the bucket frequency data to generate one or more third clusters for any one packet.

The method according to claim 7,
wherein generating the bucket frequency data comprises classifying into different buckets those frequency values, among the 256 frequency values, that are used to distinguish packets.

The method according to claim 7,
wherein training the local detection model comprises:
generating one or more transactions by classifying the plurality of packets based on transmission time differences;
combining the transmission direction of each packet assigned to the one or more transactions with the one or more third clusters to generate packet frequency data that reveals the byte frequency of transactions for each communication section; and
detecting a normal pattern for learning based on the packet frequency data, and training the local detection model to correspond to the normal pattern.

(Deleted)

The method according to claim 1,
wherein the generating step comprises classifying the log into a plurality of log groups in consideration of time and a plurality of communication paths of the network, and generating the chi-square distribution for each of the plurality of log groups.

The method according to claim 9,
wherein the performing comprises determining that abnormal traffic has occurred in the network if either an individual packet or an individual transaction transmitted over the network does not correspond to any of the one or more first clusters, the one or more third clusters, and the normal pattern.

The method according to claim 6,
wherein the performing comprises determining an individual transaction transmitted over the network to be an abnormal transaction if the individual transaction is not included in the one or more second clusters.

The method according to claim 1,
wherein the performing comprises:
generating a detection target histogram for the second unit time in order to monitor the network, and identifying, among a plurality of first histograms, the first histogram having the highest similarity to the detection target histogram; and
determining that abnormal traffic has occurred in the network if the vector distance between that first histogram and the detection target histogram falls outside the 99% confidence interval of the chi-square distribution.

An apparatus for detecting abnormal behavior on a network, the apparatus comprising:
a processor configured to extract a log for learning based on traffic collected when the network is normal, to train, based on the log, a local (LOCAL) detection model that detects network abnormal behavior by considering packets and a global (GLOBAL) detection model that detects network abnormal behavior by considering traffic flow, and to simultaneously perform local detection and global detection of the network abnormal behavior by using the local detection model and the global detection model; and
a memory storing at least one of the log, the local detection model, and the global detection model,
wherein the processor trains the global detection model based on a pattern matching (PATTERN MATCHING) scheme; generates, based on the log, a first histogram of the number of transmissions per communication path of the network for each preset first unit time; generates a second histogram of the number of transmissions per communication path of the network for each preset second unit time that is shorter than the preset first unit time; matches the first histogram and the second histogram in consideration of similarity; and calculates the mean and variance of the vector distances between the two matched histograms to generate a chi-square (CHI-SQUARE) distribution corresponding to the network.

The apparatus according to claim 15,
wherein the processor trains the local detection model based on features extracted from the packets.

The apparatus according to claim 16,
wherein the processor classifies the log by communication section of the network, extracts a plurality of first feature points mapped to a preset N-dimensional space based on a plurality of packets corresponding to any one communication section, and clusters the plurality of first feature points based on the Lloyd algorithm to generate one or more first clusters.

The apparatus according to claim 17,
wherein the processor calculates cluster centers for the one or more first clusters and repeats the clustering until the difference in distance between the cluster centers calculated for each communication section is less than a preset error distance.

The apparatus according to claim 17,
wherein the processor determines an individual packet transmitted over the network to be an abnormal packet if the individual packet is not included in the one or more first clusters.

The apparatus according to claim 17,
wherein the processor generates a detection target histogram for the second unit time in order to monitor the network, identifies, among the plurality of first histograms, the first histogram having the highest similarity to the detection target histogram, and determines that abnormal traffic has occurred in the network if the vector distance between that first histogram and the detection target histogram falls outside the 99% confidence interval of the chi-square distribution.