KR100976052B1

KR100976052B1 - Method for packet image conversion in SVM intrusion detection

Info

Publication number: KR100976052B1
Application number: KR1020080005223A
Authority: KR
Inventors: 이극
Original assignee: 한남대학교 산학협력단
Priority date: 2008-01-17
Filing date: 2008-01-17
Publication date: 2010-08-17
Also published as: KR20100004123A

Abstract

The present invention relates to a packet image conversion method for SVM intrusion detection. The method comprises: extracting a feature information region from data used for intrusion detection; generating packet images by dividing each byte value of the feature information region by 255 so that each byte takes a value between 0 and 1; and collecting the per-byte packet images to form an overall packet image pattern.

The feature information region is the header information of each piece of data, and its size is the first 60 bytes of each piece of data. The packet image pattern consists of 60 packet images.

Network, firewall, virus, SVM, unknown, prediction, packet, packet image, conversion

Description

Method for packet image conversion in SVM intrusion detection

The present invention relates to a packet image conversion method for SVM intrusion detection.

A firewall for network security was created to provide isolation from external networks. A firewall blocks intrusion into a host or network at the source. It is installed at the entrance of the network and keeps external packets from flooding into the internal network. Viewed as a choke point, the firewall sits on the boundary between the external and internal networks and handles all packets passing from inside to outside or from outside to inside. As long as all communication between inside and outside must pass through the firewall, the firewall serves as a suitable choke point.

Applying appropriate rules at this choke point to handle passing packets makes the internal network that much safer. From the standpoint of vulnerabilities, the firewall isolates the internal network from the outside and guards access to the internal network itself, so vulnerabilities inside the internal network are shielded for the time being. Because traffic must first pass through the firewall to reach the internal network, vulnerabilities within the internal network remain out of sight as long as the firewall is not breached. Now that bugs are announced more often than before and patching them all at once is difficult, the firewall is well suited to addressing these problems.

A castle with no firewall is comparable to a village with no surrounding wall. A problem arises if there is any path from the internal network to the external network other than the one where the firewall is installed. For example, offering a service such as PPP creates a route into the internal network that bypasses the firewall. When a firewall is installed, access through any other path must be made impossible.

Meanwhile, an intrusion detection system is a system that detects abnormal use, misuse, and abuse of a computer system in real time. Misuse here covers a wide range, from serious acts such as stealing corporate or personal confidential data to minor ones such as abusing a personal mail system. Intrusion detection refers to the series of actions of monitoring for such illegal acts and responding to them in order to protect system resources and information. Attacks generally fall into three types: reconnaissance, exploits, and denial of service (DoS).

Reconnaissance includes pings, DNS (Domain Name System) zone transfers, e-mail reconnaissance, TCP (Transmission Control Protocol) or UDP (User Datagram Protocol) port scans, and indexing of public web servers to find CGI flaws. Exploits use hidden features or bugs to gain access to a system. A denial-of-service attack is one in which the intruder overloads network links, overloads the CPU, or fills up the disk so that the system is paralyzed and cannot provide any service. The intruder's aim in this case is to interrupt the service of system resources rather than to obtain personal information.

Because a firewall alone cannot cope with illegal behavior by internal users (such as leaking confidential information) or with external hacking, an intrusion detection system needs to track hacker intrusion patterns and monitor harmful information so that all internal and external information flows can be blocked in real time. To this end, existing intrusion detection systems are rule-based: they collect network traffic data through sensors and compare it against intrusion rules to determine whether an attack is present.

However, as network users and networks proliferate, the form of cyber attacks is changing. Illegal acts in cyberspace are becoming increasingly complex and are evolving intelligently. While cyber threats grow more diverse and intelligent, existing intrusion detection systems can detect and defend against known viruses and hacks but cannot protect the network and systems at all against unknown attacks until the zero-day period has passed. Protecting networks against these increasingly diverse and intelligent Internet threats requires information security technology that manages and predicts unknown threats.

To overcome the limits of existing rule-based network traffic analysis and monitoring and to improve performance in misuse detection and anomaly detection, the present applicant has proposed a technique for detecting and predicting intrusions using SVMs (Support Vector Machines). The object of the present invention is to propose the packet image conversion technique required for such SVM intrusion detection.

The present invention includes: extracting a feature information region from data used for intrusion detection; generating packet images by dividing each byte value of the feature information region by 255 so that each byte takes a value between 0 and 1; and collecting the per-byte packet images to form an overall packet image pattern.

The present invention can effectively convert the packet images used in SVM-based network intrusion detection.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. Note that, in assigning reference numerals to the components in each drawing, the same components are given the same numerals wherever possible, even when they appear in different drawings.

FIG. 1 is a schematic diagram of an SVM intrusion detection system.

The SVM intrusion detection system takes raw packets, generates packet images using the packet image conversion technique, normalizes the generated packet images, and performs real-time network intrusion detection through SVM learning.

Intrusion detection systems to date have performed rule-based network traffic analysis and monitoring. Snort, a representative rule-based system, is a packet sniffer, packet logger, and network IDS. Snort consists of a sniffer, preprocessors, a detection engine, and alert/logging components; the preprocessors, detection engine, and alert/logging components are implemented as plug-ins. The intrusion detection methods of such existing systems can detect and defend against viruses and hacking, but they cannot protect systems and networks against unknown or variant attacks.

Packet image conversion is the technique needed to detect and predict intrusions using SVMs (Support Vector Machines), so as to overcome the limits of existing rule-based network traffic analysis and monitoring and to improve performance in misuse detection and anomaly detection. That is, the present invention proposes a technique for converting packets into the packet images needed for SVM-based intrusion detection and prediction.

The SVM (Support Vector Machine) is a kernel-based learning machine developed in 1995 by Vladimir Vapnik and his team at AT&T Bell Laboratories. It performs optimal classification grounded in statistical learning theory and thereby shows excellent generalization performance. Whereas most traditional pattern recognition techniques are based on empirical risk minimization, which optimizes performance on the training data, the SVM is based on structural risk minimization, which minimizes the probability of misclassifying data drawn from a fixed but unknown probability distribution. In threat prediction using an SVM, data are classified onto either side of a high-dimensional plane according to their characteristics.

An SVM is trained on a converted training data set, and the result is a decision function. This decision function is in fact a decision plane in a high-dimensional space. SVM training depends on the training data set, the kernel function used internally for training, and the tuning parameter C. The SVM kernel function maps high-dimensional, nonlinear inputs into a form that can be handled linearly, which speeds up SVM computation.

The original SVM was conceived as finding the hyperplane w·x + b = 0 that separates two linearly separable classes with maximum margin. That is, as shown in FIG. 2, the images are normalized and SVM learning is performed; once learning is complete and the support vectors (SVs) have been generated, the similarity margin between these reference support vectors and a test vector is found.
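As a minimal sketch of what the hyperplane w·x + b = 0 means for classification, the snippet below evaluates w·x + b for a point and reads off which side of the hyperplane it lies on. The function name `decision_value` and the two-dimensional weights are hypothetical values chosen only for illustration; they do not come from the patent.

```python
from typing import Sequence


def decision_value(w: Sequence[float], x: Sequence[float], b: float) -> float:
    """Return w·x + b; its sign tells which side of the hyperplane x lies on."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same dimension")
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Hypothetical hyperplane x1 + x2 - 1 = 0 (illustrative values only).
w, b = [1.0, 1.0], -1.0
print(decision_value(w, [1.0, 1.0], b))  # positive: one side of the plane
print(decision_value(w, [0.0, 0.0], b))  # negative: the other side
print(decision_value(w, [0.5, 0.5], b))  # zero: exactly on the plane
```

In a trained SVM, w and b would be determined by the support vectors rather than chosen by hand.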

When classifying data onto either side of a high-dimensional plane as described above, the present invention applies a packet image conversion technique to clearly distinguish the characteristics of the data carried in a packet. Because a packet generally carries several kinds of information, such as time, length, and raw data, an image conversion suited to that information is performed.

An SVM intrusion detection system generates packet images from previously known training data using the packet image conversion technique, normalizes the generated packet images, and produces support vectors through SVM learning. It then compares data arriving over the network in real time against the support vectors for similarity and makes a negative or positive determination, thereby detecting intrusions.

FIG. 3 is a block diagram of the SVM intrusion detection system.

In the SVM intrusion detection system, training data and network data are extracted from a network connection information database that holds network connection records, and the feature extraction unit (10) selects only the necessary feature information from them. That is, the feature extraction unit (10) extracts the feature information (for example, header information) needed for intrusion detection.

For data arriving over a network, attacks such as SYN flooding, Land, and Teardrop can generally be identified from header information within the first 60 bytes of the packet. The feature extraction unit therefore extracts from the data the information region used for intrusion detection (for example, header information within 60 bytes). The extracted feature information is input to the SVM conversion unit as training packets and network packets, respectively. In the embodiments below, the size of the feature information required for intrusion detection is taken to be 60 bytes of header information by way of example, but the size can of course be chosen as needed.
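The extraction step can be sketched as a simple slice over the raw packet bytes. The function name `extract_feature_region` is hypothetical, and zero-padding packets shorter than the region size is an assumption made here so that every region has the same length; the description does not say how short packets are handled.

```python
FEATURE_REGION_SIZE = 60  # bytes; the description uses 60 as an example size


def extract_feature_region(raw_packet: bytes, size: int = FEATURE_REGION_SIZE) -> bytes:
    """Return the first `size` bytes of a raw packet as the feature region."""
    region = raw_packet[:size]
    # Assumption (not stated in the source): pad short packets with zero bytes
    # so that every feature region has exactly `size` bytes.
    return region.ljust(size, b"\x00")


packet = bytes(range(100))            # stand-in for a captured packet
region = extract_feature_region(packet)
print(len(region))                    # 60
```

Because the size is a parameter, a different region size can be selected as the description allows.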

The training data are already-known virus data and provide the information needed for SVM learning. The feature extraction unit extracts the header information of the training data and passes it to the SVM conversion unit as a training packet. As noted above, attacks by virus data can be identified from header information within the first 60 bytes of the packet. The header within 60 bytes of a packet of known virus data is therefore extracted and used as a training packet.

The network data are the data arriving over the network that are subject to intrusion detection. The network data are captured by using a packet capture library such as libpcap on a network interface card (NIC) to capture packets passing through the network. The feature extraction unit (10) extracts the header information of the network data and passes it to the SVM conversion unit as a network packet. Because examining the header makes it possible to determine whether a packet is normal or an abnormal, virus-infected packet, the header is extracted from the network data and passed to the SVM conversion unit as a network packet.

The training packets and network packets whose features have been extracted as above are converted by the SVM conversion unit (20) into packet images that conform to the standard input form of the SVM learning machine, namely training packet images and network packet images.

The SVM conversion unit (20) converts packets into packet images in the SVM standard form: it converts the input training packets and network packets into packet images in the SVM standard form and outputs them as training packet images and network packet images, respectively.
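The description does not name a concrete file format for the "SVM standard form". As one possible illustration, the sketch below writes a packet image pattern as a line in the sparse text format used by LIBSVM (`label index:value ...`, with 1-based indices). The choice of LIBSVM, the helper name `to_libsvm_line`, and the label convention (+1 for abnormal, -1 for normal) are all assumptions, not details from the patent.

```python
from typing import Sequence


def to_libsvm_line(label: int, pattern: Sequence[float]) -> str:
    """Format one packet image pattern as 'label 1:v1 2:v2 ...'."""
    features = " ".join(f"{i}:{v:.6g}" for i, v in enumerate(pattern, start=1))
    return f"{label:+d} {features}"


# Three illustrative values; a real pattern would hold 60 of them.
print(to_libsvm_line(+1, [0.0, 0.5, 1.0]))
print(to_libsvm_line(-1, [0.2, 0.4, 0.6]))
```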

As described above, the SVM conversion unit (20) takes the raw packets, that is, the feature-extracted training packets and network packets, and generates packet images (training packet images and network packet images) from them using the packet image conversion technique. The packet image conversion method that generates these packet images is as follows.

Attacks such as SYN flooding, Land, and Teardrop can be identified from header information within the first 60 bytes of a packet. To distinguish these attacks, the 60 bytes are split so that each byte represents one image. One byte holds a value from 0 to 255, and dividing that value by 255 yields a value between 0 and 1. In other words, one byte of information with a value between 0 and 255 is divided by 255 to produce a packet image with a value between 0 and 1. This packet image conversion method is described in detail with reference to FIGS. 4 and 5.
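The per-byte conversion is a single division. The sketch below shows it for a few byte values; the function name `byte_to_packet_image` is hypothetical, and the range check is added here only for safety.

```python
def byte_to_packet_image(value: int) -> float:
    """Scale one byte value (0..255) to a packet image value in [0, 1]."""
    if not 0 <= value <= 255:
        raise ValueError("a byte value must be in the range 0..255")
    return value / 255


for v in (0, 51, 255):
    print(v, "->", byte_to_packet_image(v))
```

The smallest byte value maps to 0, the largest to 1, and every other value falls in between.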

The SVM conversion unit (20) applies the packet image conversion technique above to the training packets and network packets to generate training packet images and network packet images, respectively. That is, it performs the conversion on each byte of a training packet consisting of 60 bytes of header information and outputs 60 training packet images (hereinafter, the training packet image pattern); likewise, it performs the conversion on each byte of a network packet consisting of 60 bytes of header information and outputs 60 network packet images (hereinafter, the network packet image pattern).

The SVM classification unit (30) consists of an SVM learning module (31) and a classification module (32). It compares the similarity between the training packet image pattern and the network packet image pattern and classifies network data arriving in real time as normal or abnormal packets. That is, the SVM classification unit uses the training packet image pattern to generate reference support vectors (SVs) through SVM learning, generates a test vector for the network packet image pattern, and compares the similarity between the support vectors and the test vector. It thereby performs intrusion detection on network packet images arriving in real time and classifies each as a normal or abnormal packet.

The SVM learning module (31) normalizes the generated training packet image pattern and then uses it for SVM learning. When SVM learning is complete, support vectors (SVs) as shown in FIG. 2 are generated. A test vector for the network packet image pattern is also generated.

The classification module (32) compares the similarity between the generated support vectors (SVs) and the test vector to perform real-time network intrusion detection, distinguishing normal from abnormal on the basis of that similarity. Because packets are processed as a sequence of packet images forming a pattern, a packet that does not match an attack packet image pattern exactly, or in which only part of an attack pattern is detected, is still assigned to the most similar pattern.

FIG. 4 is a flowchart showing the process of converting collected packets into packet images in the SVM intrusion detection system according to an embodiment of the present invention, and FIG. 5 shows an example of packet image conversion.

The description of FIG. 4 below explains the process of converting a packet into a packet image. The same process applies both to converting a training packet into a training packet image and to converting a network packet into a network packet image.

First, the feature information region is extracted from the data on which SVM learning is to be performed (S41). The feature information region is the part of the data that reveals the characteristics used for intrusion detection, for example the packet holding the first 60 bytes of each piece of data. The size of the feature information region is therefore 60 bytes.

The packet consisting of the 60 bytes of header information extracted from the data then goes through the process of conversion into packet images (S42, S43, S44, S45) so that it takes the SVM standard form.

As in the packet image conversion example of FIG. 5, a one-byte value between 0 and 255 is divided by 255 to make a packet image with a value between 0 and 1, and this conversion is repeated for each of the 60 bytes that make up the feature information, producing 60 packet images in total.

In detail, the first byte of the 60-byte feature information region is read (S42), and the value read is divided by 255 to produce a packet image (S43). Because a byte consists of 8 bits and holds a value between 0 and 255, dividing the byte value by 255 normalizes it to a value between 0 and 1, which becomes the packet image.

Next, it is determined whether the position of the byte just read exceeds the size of the feature region (S44). Since the feature region size is 60 bytes in this example, the check is whether the byte read lies beyond the 60th byte position. If it does not, the next byte is read (S45) and divided by 255 to produce the second packet image (S43).

The process is thus repeated up to the 60th byte, the size of the feature region, producing 60 packet images in total. When the position of the byte read exceeds the 60th byte, the process ends and the 60 packet images are collected into a packet image pattern consisting of 60 packet images, as shown at S47 (S46).
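The loop of steps S42 to S46 can be sketched as follows. The function name `packet_to_image_pattern` is hypothetical, and the mapping of code lines to step labels is a loose reading of the flowchart description rather than a reproduction of FIG. 4; rejecting regions shorter than the region size is an assumption.

```python
from typing import List


def packet_to_image_pattern(feature_region: bytes, region_size: int = 60) -> List[float]:
    """Convert a feature region into a pattern of per-byte packet images."""
    if len(feature_region) < region_size:
        # Assumption: regions shorter than region_size are rejected.
        raise ValueError("feature region is shorter than the region size")
    pattern: List[float] = []
    position = 1                                # S42: start at the first byte
    while position <= region_size:              # S44: stop once past the region size
        value = feature_region[position - 1]    # read the byte at this position
        pattern.append(value / 255)             # S43: byte value / 255 -> packet image
        position += 1                           # S45: move on to the next byte
    return pattern                              # S46: the collected packet image pattern


pattern = packet_to_image_pattern(bytes(range(60)))
print(len(pattern), pattern[0], pattern[51])
```

An equivalent one-line form would be a list comprehension over the first `region_size` bytes; the explicit loop is used here only to keep the step labels visible.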

FIG. 6 is a graph showing test results for intrusion detection with an SVM applied to packet images. Packets for extracting training data were collected with a network monitoring tool such as TCPDUMP, which uses the libpcap library. In this experiment, 20 normal packets and 20 abnormal packets from Teardrop attacks were collected to train the SVM. From these, 20 normal packet images and 20 abnormal packet images were produced and used for training. The trained SVM unknown-intrusion detection tool was then tested on collected variants of the Teardrop attack. To obtain the SVM standard form used as training data, the normal and abnormal packets were each converted into packet images: the bytes of each packet were taken one at a time, 60 in sequence (60 bytes), and converted by the method proposed in the present invention to form a packet image pattern.

In FIG. 6, the Y axis represents similarity and the X axis marks the zero baseline. A point that coincides with the threshold lies on the zero baseline, a point whose similarity exceeds the threshold lies above zero, and a point whose similarity is below the threshold lies below zero. In the SVM result, the points above zero are therefore abnormal packets and the points below zero are normal packets. In this test there were 15 abnormal packets and 14 normal packets. As the results show, all of the variant packets were detected.
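The reading of FIG. 6 amounts to comparing each similarity value with the threshold. The sketch below illustrates that rule; the function name `classify`, the label strings, and the sample scores are hypothetical and are not the measured values from the experiment.

```python
def classify(similarity: float, threshold: float = 0.0) -> str:
    """Label a packet by where its similarity falls relative to the threshold."""
    offset = similarity - threshold
    if offset > 0:
        return "abnormal"      # plotted above the zero baseline
    if offset < 0:
        return "normal"        # plotted below the zero baseline
    return "on-threshold"      # coincides with the threshold


# Hypothetical similarity scores, for illustration only.
for score in (0.8, -0.3, 0.0):
    print(score, classify(score))
```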

Although specific embodiments have been described above, various modifications can be made without departing from the scope of the present invention. The scope of the patent is therefore not to be determined by the embodiments described, and extends not only to the claims but also to their equivalents.

FIG. 2 is a diagram showing the similarity margin between the support vectors and a test vector in an SVM.

FIG. 4 is a flowchart showing the packet image conversion process according to an embodiment of the present invention.

FIG. 5 is a diagram showing an example of packet image conversion according to an embodiment of the present invention.

FIG. 6 is a graph of test results for the SVM intrusion detection system according to an embodiment of the present invention.

Claims

1. A packet image conversion method for SVM intrusion detection, comprising:

extracting header information of a data packet used for intrusion detection;

generating packet images by dividing each byte value of the header information by 255 so that each byte takes a value between 0 and 1; and

collecting the per-byte packet images to form an overall packet image pattern.

2. (Deleted)

3. The method of claim 1, wherein the size of the header information is the first 60 bytes of each data packet.

4. The method of claim 3, wherein the packet image pattern comprises 60 packet images.