WO2022270678A1 - Network intrusion detection system using determination delay for packets - Google Patents

Network intrusion detection system using determination delay for packets

Info

Publication number
WO2022270678A1
WO2022270678A1 PCT/KR2021/011914 KR2021011914W WO2022270678A1 WO 2022270678 A1 WO2022270678 A1 WO 2022270678A1 KR 2021011914 W KR2021011914 W KR 2021011914W WO 2022270678 A1 WO2022270678 A1 WO 2022270678A1
Authority
WO
WIPO (PCT)
Prior art keywords
packet
class
intrusion
classifier
packets
Prior art date
Application number
PCT/KR2021/011914
Other languages
French (fr)
Korean (ko)
Inventor
박우길
김태훈
Original Assignee
영남대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 영남대학교 산학협력단
Publication of WO2022270678A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/32 Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols

Definitions

  • the present invention relates to an intrusion detection system for detecting whether an intrusion occurs in a network from the outside.
  • ML-NIDS machine-learning-based network intrusion detection system
  • An existing ML-NIDS detects whether an intrusion has occurred only after the session has terminated. Since the ML-NIDS can identify an intrusion only after it has occurred, there is a time difference between the actual intrusion and its detection. Because the ML-NIDS cannot detect network intrusion in real time, this severely limits how quickly and safely the network can be protected. Therefore, a technique is required that allows an ML-NIDS to detect an intrusion in real time when it occurs.
  • the conventional ML-NIDS uses a method of using statistical characteristics of the entire session as a feature and a method of using some packets within a session as a feature in order to more quickly identify network intrusion.
  • the method using the statistical characteristics of the entire session as a feature creates a session feature necessary for detection after the session is terminated, it is still unable to detect an intrusion at the time of intrusion. That is, it is impossible to eliminate the delay for intrusion detection.
  • the method using some packets within a session as a feature uses only the first few packets rather than the entire session, so it is possible to detect an attack with high accuracy.
  • such a packet data-based method enables intrusion detection when only a certain number of packets are received without waiting until the session is terminated.
  • the second method can detect an attack more quickly than the conventional method, but it is also impossible to prevent a delay in determining whether or not an intrusion occurs. The reason for this is that the second method cannot immediately detect an intrusion at the time it occurs, and can only be detected when a certain number of packets are received.
  • the simplest method for quickly detecting an intrusion without delay is to determine whether or not an intrusion occurs for all received packets.
  • However, this method may result in very poor intrusion detection performance, because attack traffic intruding into the network may initially be indistinguishable from normal traffic. In some cases the attack traffic resembles normal traffic at the beginning of the session and diverges from it only as the actual intrusion attempt proceeds. Therefore, when an attack is identified from the first packet of a session, attack traffic is, depending on the attack type, often misclassified as normal traffic at the beginning of the session, so intrusion detection performance may be very low.
  • One of the simplest methods to solve this problem is to detect whether or not an attack is present for every packet belonging to a session that has already been determined to be normal.
  • This method can detect the occurrence of an attack after a certain number of packets are transmitted in a session initially classified as normal, so it can solve the problem of misclassification in the early stage and detect intrusion in real time.
  • the method of detecting all packets belonging to a normal session has a great limitation in applying it to actual ML-NIDS because the proportion of normal traffic in the total traffic is very high and the amount of traffic is large.
  • Traffic volume has been growing explosively in recent years, and the network bandwidth that a NIDS must process has accordingly increased from several Gbps to hundreds of Gbps. Examining the packets of all normal sessions would therefore require much higher processing performance than currently developed NIDS offer, which is very difficult to achieve technically.
  • The conventional NIDS thus suffers from three problems at once: poor detection performance due to the large amount of traffic to be processed, low intrusion detection accuracy due to confusing attack cases with normal cases, and the inability to detect intrusion in real time.
  • Accordingly, the present applicant has researched and developed an intrusion detection system that can detect intrusion in real time and prevent packets from being misclassified, by reducing the amount of traffic to be processed and by deferring judgment on attack traffic that is initially difficult to distinguish from normal traffic.
  • the present invention as a network intrusion detection system, is intended to provide a system capable of detecting intrusion in real time while increasing the accuracy of detection when an intrusion into a network occurs from the outside.
  • To this end, the present invention provides an intrusion detection system for detecting intrusion into a network composed of a plurality of packets, comprising: a classifier that uses an artificial neural network to classify a packet that is an intrusion detection target into either a normal class or an attack class according to whether an external intrusion has occurred; and a determination unit that learns cases in which the classification result of the classifier for the packet does not match the actual intrusion status and determines the packet to be in either a judgment class or a judgment pending class. If the determination unit determines that the packet is in the judgment class, the classifier classifies the packet into either the normal class or the attack class according to whether it is an intrusion; if the packet is in the judgment pending class, the classifier holds classification of the packet until the packet received next has been classified. The system thereby detects intrusion into the network in real time by determining intrusion for each individual packet.
  • The system may further comprise a processing unit for processing the packet according to the classification result of the classifier, wherein the processing unit forwards the packet when the classifier classifies it into the normal class and may discard the packet when it is classified into the attack class.
  • the classifier includes: a feature generator for generating a feature for the packet as an input to the artificial neural network; and a classification unit classifying the packet into one of the normal class and the attack class according to the output of the artificial neural network for the feature.
  • The artificial neural network provided in the classifier may be a recurrent neural network formed by sequentially connecting a plurality of units. A unit is provided for each packet, and adjacent units are connected so that the output of one unit affects the input and output of the next.
  • When the packet is classified into the judgment pending class by the determination unit, the input of the unit (unit n) matching the packet includes the feature of the packet and the output of the previous unit (unit n-1), so that the characteristics of the previous packet may be reflected.
  • the feature generator may independently create the feature for each received packet based on data of a predetermined size including a header among data of the packet.
  • the determination unit includes: a decision-sustaining learning model for learning the packet corresponding to the case where the classification result of the classifier and actual intrusion do not match; and a determination module that compares the reference data with data of the packet and classifies whether the packet belongs to the determination class or the determination pending class.
  • the decision-withholding learning model includes an extraction module for classifying arbitrary learning data composed of the packets with respect to the classifier and extracting misclassified data in which a classification result of the classifier and actual intrusion do not match; a learning module that designates the misclassified data as the decision-deferred class and performs single classification and learning on the misclassified data for the decision-deferred class; and a single classification module for performing single classification learned in the learning module on the learning data.
  • the execution speed of the intrusion detection system can be increased by reducing the amount of traffic to be processed.
  • the present invention can reduce the amount of traffic to be processed by detecting normal traffic, which accounts for most of the total network traffic, as normal traffic at an early stage and not detecting intrusion in traffic classified as normal.
  • a feature in which the characteristics of the corresponding packet are reflected is used for each packet.
  • The byte values of the entire packet are not used. Only some of the byte values are used, with the source IP (SIP), destination IP (DIP), ID, and source port removed, so the size of the feature can be reduced.
  • FIG. 1 shows a configuration diagram of an intrusion detection system according to an embodiment of the present invention.
  • FIG. 2 shows a process in which an intrusion detection system according to an embodiment of the present invention and a conventional classifier determine that a session including a plurality of packets is an attack session.
  • FIG. 3 shows a configuration diagram of a classifier according to an embodiment of the present invention.
  • FIG. 4 shows a process of constructing a decision-withholding learning model according to an embodiment of the present invention.
  • FIG. 5 shows a process of classifying whether or not an intrusion has occurred in an intrusion detection system according to an embodiment of the present invention.
  • FIG. 1 shows a configuration diagram of an intrusion detection system 1 according to an embodiment of the present invention.
  • the intrusion detection system 1 is a system that detects intrusion into a network composed of a plurality of packets, and can detect intrusion into the network in real time by determining intrusion for each individual packet.
  • the intrusion detection system 1 may include a determination unit 10 , a classifier 30 and a processing unit 50 .
  • The determination unit 10 may learn cases in which the classification result of the classifier 30 for a packet does not match the actual intrusion status, and determine the packet to be in either the judgment class or the judgment pending class. That is, by learning in advance, through machine learning, the sessions that are likely to be misclassified when an initially received packet is judged, the determination unit 10 can suspend the determination until the packet can be judged accurately.
  • the determination unit 10 may include a judgment-deferred learning model 110 and a determination module 130 .
  • the decision-sustaining learning model 110 may learn a packet corresponding to a case where the classification result of the classifier 30 and actual intrusion do not match.
  • the decision-withholding learning model 110 may include an extraction module 1101 , a learning module 1103 and a single classification module 1105 .
  • the extraction module 1101 may classify arbitrary learning data composed of packets with respect to the classifier 30 to extract misclassified data in which the classification result of the classifier 30 and actual intrusion do not match.
  • the learning module 1103 may designate the misclassified data as a decision-deferred class, and perform single classification training on the misclassified data for the decision-deferred class. At this time, the learning module 1103 may extract misclassified data using a one-class classifier and classify it into a judgment deferred class.
  • the single classification module 1105 may select packets having traffic similar to the misclassified data traffic by performing the single classification learned in the learning module 1103 on the learning data.
  • a description of the decision-sustaining learning model 110 will be described later in detail with reference to FIGS. 4 and 5 .
  • the decision module 130 may compare the reference data and the data of the packet to classify whether the packet belongs to the decision class or the decision pending class. In the decision module 130, when the corresponding packet belongs to the decision pending class, classification may be performed multiple times in the artificial neural network 330 of the classifier 30 thereafter.
  • the classifier 30 may classify a packet, which is an intrusion detection target, into one of a normal class and an attack class according to whether an external intrusion occurs using an artificial neural network.
  • When the determination unit 10 determines that the packet is in the judgment class, the classifier 30 classifies the packet into either the normal class or the attack class depending on whether it is an intrusion; if the packet is in the judgment pending class, classification of the packet can be withheld until the packet received next has been classified.
  • The classifier 30 may include a feature generator 310 , an artificial neural network 330 and a classification unit 350 .
  • the feature generator 310 may generate features for packets as an input to the artificial neural network 330 .
  • the feature generator 310 may independently create a feature for each received packet based on data of a predetermined size (n bytes) including a header among packet data.
  • n which is a value for determining the overall size of the feature, may be set differently according to training data, and may be set to an optimal value based on the training data.
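  • As a rough illustration of the feature generation described above, the following Python sketch builds a fixed-size feature from the first n bytes of a packet and drops the ID, source IP, destination IP, and source port bytes. The function name make_feature, the raw IPv4 layout with no link-layer header, the byte offsets, the zero padding, and the scaling to [0, 1] are all assumptions made for illustration; the specification does not fix them.

```python
from typing import List

# Assumed IPv4 offsets: identification (bytes 4-5), source IP (12-15), destination IP (16-19).
DROPPED_FIXED = set(range(4, 6)) | set(range(12, 20))


def make_feature(packet: bytes, n: int = 64) -> List[float]:
    """Build a feature of length n - 12 from the first n bytes of a raw IPv4 packet."""
    head = packet[:n]
    # IPv4 header length in bytes, taken from the low nibble of the first byte.
    ihl = (head[0] & 0x0F) * 4 if head else 20
    # The TCP/UDP source port occupies the first two bytes after the IP header.
    dropped = DROPPED_FIXED | {ihl, ihl + 1}
    kept = [b for i, b in enumerate(head) if i not in dropped]
    size = n - 12  # fixed output length: n minus the 12 removed bytes
    kept = kept[:size] + [0] * max(0, size - len(kept))  # truncate or zero-pad
    return [b / 255.0 for b in kept]  # scale byte values to [0, 1]
```

  • Each packet is processed independently here, which matches the statement that the feature is created per received packet; n would be tuned on the training data.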
  • the artificial neural network 330 may be a recurrent neural network formed by sequentially connecting a plurality of units 3301 .
  • the unit 3301 may be provided for each packet, and a plurality of units 3301 may be connected back and forth to affect the input/output of the next unit 3301.
  • an input of a unit (unit n) matching the corresponding packet may include a feature of the corresponding packet and output information of the previous unit (unit n-1).
  • the classification unit 350 may classify the packet into either a normal class or an attack class according to the output of the artificial neural network 330 for the feature.
  • the artificial neural network 330 may include a DNN in an output portion, and an object determined to be a normal class or an attack class in the classification unit 350 may be an output that has passed through the unit 3301 classification and DNN.
  • the processing unit 50 may process packets using information stored as whitelist for sessions classified as normal in the past intrusion determination and information stored as blacklist for sessions classified as intrusion.
  • The processing unit 50 searches the whitelist and the blacklist for the session to which a received packet belongs, determines whether the received packet is a normal packet or an attack packet, and processes it accordingly. If the received packet belongs to a session classified as normal, the processing unit 50 forwards the packet; if it belongs to a session classified as an attack, the processing unit 50 may discard the packet. Once a packet has been clearly determined to be normal or an attack, no further intrusion determination needs to be performed.
  • The amount of traffic to be processed by the intrusion detection system 1 can be reduced because no additional determination is performed on normal packets received thereafter.
  • If the packet is a malicious attack packet, attack traffic such as a DDoS attack may surge instantaneously. Since such traffic is discarded immediately, the amount of traffic to be processed is reduced in this case as well.
  • the processing unit 50 may process packets according to a classification result of the packets classified by the classifier 30 .
  • The processing unit 50 processes a packet differently according to whether the classifier 30 has classified it into the normal class or the attack class. If the packet is classified into the normal class, the processing unit 50 forwards it; if the packet is classified into the attack class, the processing unit 50 may discard it. Even if the first packet of a specific session is not classified and the judgment is suspended, the class is eventually determined for a later packet, so the processing unit 50 can process the packet according to the classification result of the classification unit 350.
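  • The behavior of the processing unit can be sketched in Python as follows. The class name ProcessingUnit, the five-tuple session key, the verdict strings, and the "pending" return value for a packet whose judgment is suspended are hypothetical choices for illustration; the specification only states that whitelisted sessions are forwarded and blacklisted sessions are discarded.

```python
from typing import Optional, Set, Tuple

# Hypothetical session key: (source IP, source port, destination IP, destination port, protocol).
SessionKey = Tuple[str, int, str, int, str]


class ProcessingUnit:
    def __init__(self) -> None:
        self.whitelist: Set[SessionKey] = set()  # sessions already judged normal
        self.blacklist: Set[SessionKey] = set()  # sessions already judged attack

    def handle(self, key: SessionKey, verdict: Optional[str] = None) -> str:
        """Return the action for a packet: 'forward', 'discard', or 'pending'."""
        # A session that was judged earlier is handled without a new determination.
        if key in self.whitelist:
            return "forward"
        if key in self.blacklist:
            return "discard"
        # Otherwise apply the classifier verdict, if one was reached for this packet.
        if verdict == "normal":
            self.whitelist.add(key)
            return "forward"
        if verdict == "attack":
            self.blacklist.add(key)
            return "discard"
        return "pending"  # judgment suspended; wait for the next packet
```

  • Checking the lists before the classifier verdict reflects the point that sessions with a clear earlier determination are not examined again, which is where the reduction in processed traffic comes from.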
  • FIG. 2 shows a comparison between the intrusion detection system 1 according to an embodiment of the present invention and a conventional classifier determining a session including a plurality of packets as an attack session.
  • Figure 2 assumes a case of a specific session composed of 7 packets and classified as an attack session.
  • The first, the 'session feature-based classifier', can detect intrusion only after the corresponding session has terminated. As shown in FIG. 2, it makes no judgment on packets 1 through 6 and can determine 'attack' only at the last packet, packet 7, after the session has terminated. Therefore, the session feature-based classifier has a problem in that the amount of traffic to be processed is large and intrusion cannot be detected in real time.
  • The second, the 'cumulative packet (packets 1 to t) feature-based classifier', determines that packets 1 through 3 are normal and that packets 4 and later are attacks. This means that the cumulative packet classifier judged packets 1 to 3 incorrectly and packets 4 and later correctly. If a session that was first determined to be normal is repeatedly determined to be an attack later, the initial determination of normal cannot be trusted, and it is not known which result is correct.
  • The third, the intrusion detection system 1 according to an embodiment of the present invention, creates and learns a feature for each partial session using the packets sequentially received for each session, and determines intrusion based on this.
  • judgment is suspended for packets for which accurate judgment is impossible, and judgment is performed as normal or attack only when accurate judgment is possible.
  • packets 1 to 3 were judged to be undeterminable in the intrusion detection system 1 and were withheld, and packet 4 was determined to be an attack.
  • the present invention can reduce misclassification of the judgment results and increase reliability, compared to the cumulative packet feature-based classifier, by suspending the judgment instead of making an arbitrary unreliable judgment until an accurate judgment is possible.
  • Since packet 4 is clearly determined to be an attack, packet 4 is discarded and packets 5 to 7 are not judged thereafter, so the amount of traffic to be processed can be reduced compared to the session feature-based classifier.
  • the order of packets determined as an attack may be different according to the determination method of each classifier. As shown in FIG. 2, it can be confirmed that the present invention detects the occurrence of an attack the fastest, and the accuracy is also high.
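  • The comparison in FIG. 2 can be replayed with a small Python sketch. The per-packet output lists below are transcribed from the description of the 7-packet attack session, with "pending" standing for a packet on which no judgment is made; the function name first_verdict is an illustrative assumption.

```python
from typing import List, Optional, Tuple


def first_verdict(outputs: List[str]) -> Tuple[Optional[int], Optional[str]]:
    """Return (packet number, verdict) of the first non-pending output, 1-indexed."""
    for number, out in enumerate(outputs, start=1):
        if out != "pending":
            return number, out
    return None, None  # the session ended without any judgment


# Per-packet outputs for the 7-packet attack session described for FIG. 2.
session_based = ["pending"] * 6 + ["attack"]   # judged only at the last packet
cumulative = ["normal"] * 3 + ["attack"] * 4   # early judgments are wrong
proposed = ["pending"] * 3 + ["attack"] * 4    # early judgments are deferred

for name, outputs in [("session", session_based),
                      ("cumulative", cumulative),
                      ("proposed", proposed)]:
    print(name, first_verdict(outputs))
```

  • Under these inputs the cumulative classifier commits first but to the wrong class, while the deferring system commits at packet 4 to the correct class and never has to examine packets 5 to 7.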
  • FIG. 3 shows a configuration diagram of a classifier 30 according to an embodiment of the present invention.
  • the classifier 30 determines whether each packet is a normal packet or an attack packet and classifies the corresponding packet into either a normal class or an attack class.
  • The classifier 30 is characterized in that, if the determination unit 10 finds that the corresponding packet cannot be judged accurately, the classifier 30 suspends the determination of that packet and determines the next packet.
  • The judgment on a packet can be suspended and carried over to the next packet because the classifier 30 of the present invention uses the attack-related information contained in the previous packets. When a feature is created for each individual packet of a session, it is difficult to detect the characteristics of an attack spread over several packets if only independent features unrelated to the previous packets are used. Therefore, even though intrusion is determined for each packet independently, the classifier should be designed so that the attack characteristics found in the previous packets are used in determining the next packet.
  • To this end, the embodiment of the present invention uses the output of the unit 3301 applied to the (k-1)-th packet, together with the k-th feature data, as input to the k-th unit 3301 to determine the k-th packet. That is, determining a specific packet (the k-th packet) uses the output information for the immediately preceding packet (the (k-1)-th packet). In the same way, information on the (k-2)-th packet is used for the (k-1)-th packet, and information on the (k-3)-th packet is used for the (k-2)-th packet, so the determination of the k-th packet can reflect the characteristics of the session accumulated over all packets up to the (k-1)-th.
  • the artificial neural network 330 provided in the classifier 30 of the present invention may be a recurrent neural network formed by sequentially connecting a plurality of units 3301 .
  • Units 3301 may be provided as many as the number of packets included in the session, or may be more than that.
  • Each unit 3301 corresponds to a packet, and the first unit 3301 (unit 1) may determine whether or not the first packet is an intrusion. The determination for a packet is centered on its unit 3301: the feature of the packet is supplied to the unit 3301 as input, and the normal or attack state of the packet is determined from its output.
  • As an embodiment of the present invention, the classifier 30 of FIG. 3 has a structure consisting of an LSTM with a total of N units 3301 and a DNN added to the output part of each unit.
  • N may be set to a value showing optimal performance based on the learning data.
  • the t-th unit 3301 of the LSTM is used to classify the t-th packet, and the last N-th unit 3301 can be used to classify not only the N-th packet but all packets after the N-th.
  • The feature x_N generated for each packet in the feature generator 310 may be input to the corresponding unit N, which outputs h_N and c_N.
  • Since a DNN may be included in the output part of the classifier 30, h_N may pass through the DNN and finally be output as o_N.
  • For example, x_1 is input to unit 1, and h_1 and c_1 are output as a result.
  • h_1 passes through the DNN and o_1 is finally output. o_1 is the classification result for the corresponding packet and may indicate whether it is a normal packet or an attack packet. If o_1 indicates a normal or attack packet, the classification unit 350 classifies the packet into that class, and the processing unit 50 may forward or discard it.
  • If the first packet cannot be classified, the classifier 30 may suspend the determination of that packet and attempt to classify the second packet.
  • In this case, x_2, the feature of packet 2, and h_1 and c_1, the outputs of unit 1, may be used as inputs of unit 2. That is, when the classification result for the (t-1)-th packet is 'not classifiable', c_{t-1} and h_{t-1} are stored in advance and used to perform classification when the next, t-th packet is received.
  • c_{t-1} and h_{t-1} partially include the characteristics of packets 1 through t-1. Therefore, a feature reflecting the characteristics of all packets received so far can be created without storing information on every packet.
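  • A minimal Python sketch of this state carry-over follows. It uses the standard LSTM cell equations and keeps only the pair (h, c) per session between packets. The function names, the parameter layout, and the use of a single shared parameter set for every step are assumptions for illustration; the embodiment trains a separate unit t for each packet position and feeds h through a DNN, both of which are omitted here.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Gate = Tuple[Matrix, Matrix, Vector]  # (W for the input x, U for the previous h, bias b)


def _sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict[str, Gate]) -> Tuple[Vector, Vector]:
    """One standard LSTM cell step; params holds the gates 'i', 'f', 'o', 'g'."""
    def gate(name: str, fn) -> Vector:
        W, U, b = params[name]
        return [fn(sum(w * xi for w, xi in zip(W[r], x))
                   + sum(u * hi for u, hi in zip(U[r], h_prev)) + b[r])
                for r in range(len(b))]

    i, f, o = gate("i", _sigmoid), gate("f", _sigmoid), gate("o", _sigmoid)
    g = gate("g", math.tanh)
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def step_session(states: Dict[str, Tuple[Vector, Vector]], session_id: str,
                 x: Vector, params: Dict[str, Gate], hidden: int) -> Vector:
    """Feed one packet feature, reusing the (h, c) stored for a pending session."""
    zeros = [0.0] * hidden
    h_prev, c_prev = states.get(session_id, (zeros, zeros))
    h, c = lstm_step(x, h_prev, c_prev, params)
    states[session_id] = (h, c)  # only the state is kept, not the earlier packets
    return h
```

  • Storing (h, c) instead of the packets themselves keeps the per-session memory constant regardless of how many packets have been deferred.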
  • a feature (F_t) of the t-th packet in a specific session may be defined as follows.
  • The decision-withholding learning model 110 is a component for preventing attack traffic that resembles normal traffic from being classified as normal when the initial packets of a session are received. To this end, the decision-withholding learning model 110 trains the classifier 30 using session traffic in which the initial attack traffic is misclassified as normal. Through this, the decision-withholding learning model 110 can determine whether to perform an attack determination on the corresponding packet immediately or to defer the determination to the next packet.
  • The decision-withholding learning model 110 trains the classifier 30 using the dataset D_t composed of the t-th packets of several sessions, and classifies arbitrary learning data with the classifier 30.
  • For the misclassified data M = {M_t | t = 1, 2, ..., N}, M can be learned using a one-class classifier.
  • a case of using Deep-SVDD as a one-class classifier is exemplified. Deep-SVDD is easy to control misclassification probability because it can determine which class a packet belongs to by mapping a feature domain to an optimal domain through deep learning.
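  • For reference, the decision rule of a Deep SVDD model in its usual formulation can be written as below, applied here under the assumption that the one-class model for step t is trained on the misclassified data so that points inside the learned hypersphere belong to the deferred class. The symbols \phi_t, W_t, c_t and R_t (mapping network, its weights, sphere center and radius) are illustrative and do not appear in the specification.

```latex
s_t(x) = \lVert \phi_t(x; W_t) - c_t \rVert^2,
\qquad
\text{class}(x) =
\begin{cases}
\text{decision-deferred} & \text{if } s_t(x) \le R_t^2, \\
\text{decision} & \text{otherwise.}
\end{cases}
```

  • Under this reading, R_t acts as the knob that trades off how many packets are deferred against how many risky packets are judged immediately, which is one way to interpret the remark that the misclassification probability is easy to control.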
  • First, the entire training dataset T_1 and the dataset D_1, consisting of only the first packets of all sessions belonging to T_1, are generated.
  • Unit 1 of the LSTM of the classifier 30 is trained using T_1 and D_1.
  • The training dataset T_1 is classified by LSTM unit 1, and the misclassified samples are collected as the misclassified data M_1. Regardless of the actual class of its members, M_1 is collectively designated as the judgment deferred class.
  • Deep SVDD_1, a one-class classifier, is trained using M_1.
  • D_1, which was used to train unit 1, is then classified by Deep SVDD_1.
  • A dataset T_2 composed of the sessions classified as pending judgment may then be newly created.
  • D_2 is created from the new training dataset T_2, and unit 2 of the LSTM is trained using it.
  • From the second process onward, the data used for training, such as D_2, includes not only the second packet feature x_2 but also h_1 and c_1, the outputs of LSTM unit 1.
  • As in the first process, the training dataset T_2 is classified by LSTM unit 2, the misclassified data M_2 is obtained, M_2 is designated as the decision-deferred class, and Deep SVDD_2 is trained using M_2.
  • D_2, which was used to train unit 2, is classified by Deep SVDD_2, and a dataset T_3 composed of the sessions classified as pending judgment is then constructed. Repeating this process builds the decision-withholding learning model 110.
  • In general, D_t is generated for all sessions belonging to T_t, and unit t of the LSTM is trained using it.
  • D_t used for training may include h_{t-1} and c_{t-1}, the outputs of LSTM unit t-1, as well as the feature x_t of the corresponding packet.
  • The misclassified data M_t is then obtained, M_t is designated as the decision-deferred class, and Deep SVDD_t is trained using M_t.
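  • The iterative construction described above can be outlined in Python as follows. This is a simplified sketch: fit_unit and fit_one_class are hypothetical callables standing in for training LSTM unit t and Deep SVDD t, each session is reduced to a list of per-packet features, and the recurrent outputs h and c that the embodiment adds to D_t are left out.

```python
from typing import Any, Callable, List, Sequence, Tuple

Predictor = Callable[[Any], int]   # feature -> 0 (normal) or 1 (attack)
Detector = Callable[[Any], bool]   # feature -> True if judgment should be deferred


def build_deferral_models(
    sessions: Sequence[Sequence[Any]],
    labels: Sequence[int],
    n_units: int,
    fit_unit: Callable[[List[Any], List[int]], Predictor],
    fit_one_class: Callable[[List[Any]], Detector],
) -> Tuple[List[Predictor], List[Detector]]:
    active = list(range(len(sessions)))  # T_1: every training session
    units: List[Predictor] = []
    detectors: List[Detector] = []
    for t in range(n_units):
        # Keep only sessions that actually contain a packet at position t.
        active = [s for s in active if len(sessions[s]) > t]
        if not active:
            break
        d_t = [sessions[s][t] for s in active]  # D_t: t-th packet of each session in T_t
        y_t = [labels[s] for s in active]
        unit = fit_unit(d_t, y_t)               # train unit t
        m_t = [x for x, y in zip(d_t, y_t) if unit(x) != y]  # M_t: misclassified data
        detector = fit_one_class(m_t)           # one-class model for the deferred class
        units.append(unit)
        detectors.append(detector)
        # T_{t+1}: sessions whose t-th packet the one-class model marks as deferred.
        active = [s for s, x in zip(active, d_t) if detector(x)]
    return units, detectors
```

  • Because only deferred sessions are carried into the next round, each later unit is trained on a progressively smaller and harder subset of the data.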
  • FIG. 5 shows a process of classifying whether an intrusion occurs in the intrusion detection system 1 according to an embodiment of the present invention.
  • the received packet is searched in the Whitelist or Blacklist to see if there is a normal or attack session to which the packet belongs.
  • The next step, in both cases, is to create a feature x_t for the packet. Classification is performed in unit t with x_t, c_{t-1}, and h_{t-1} as input, and classification is also performed with x_t, c_{t-1}, and h_{t-1} as the input of Deep SVDD_t.
  • the decision-deferred class is a single classification
  • If the packet can be classified, the corresponding packet or session information is added to the Whitelist or the Blacklist depending on whether an intrusion has occurred. If it is a normal packet, it is added to the Whitelist and the packet is forwarded.

Abstract

The present invention relates to an intrusion detection system for detecting whether or not there is an intrusion into a network comprising a plurality of packets, and comprises: a classifier that, by using an artificial neural network, classifies, into any one of a normal class and an attack class, according to whether or not there is an external intrusion, the packets that are targets for detecting whether or not there is an intrusion; and a determination unit that determines the packets as any one of a determination class and a determination hold class by learning a case in which the result of the classification of the classifier for the packets does not match whether or not there is an actual intrusion, wherein the classifier classifies the packets into any one of the normal class and the attack class, according to whether or not there is an intrusion, if the determination unit determines the packets as the determination class, and, if the packets are the determination hold class, holds the classification of the packets until packets received after the packets are classified. Therefore, it is possible to improve intrusion detection performance by reducing the possibility of misclassification, and to detect, in real time, whether or not there is an intrusion.

Description

Network intrusion detection system using determination delay for packets
The present invention relates to an intrusion detection system that detects whether an intrusion into a network from outside has occurred.
Existing machine-learning-based network intrusion detection systems (ML-NIDS) detect an intrusion only after the session has ended. Because an ML-NIDS can identify an intrusion only after it has occurred, there is a time lag between the moment of the actual intrusion and the moment the ML-NIDS detects it. Since the ML-NIDS cannot identify network intrusions in real time, its ability to protect the network quickly and safely is severely limited. A technique is therefore required that allows an ML-NIDS to detect an intrusion in real time when it occurs.
To identify network intrusions more quickly, conventional ML-NIDS have used two methods: using statistical characteristics of the entire session as features, and using some of the packets within a session as features. The first method generates the session features needed for detection only after the session has ended, so it still cannot detect an intrusion at the moment it occurs. In other words, it cannot eliminate the detection delay. The second method, by contrast, uses only the first few packets rather than the whole session and can still detect attacks with high accuracy. Such a packet-data-based method also enables detection once a certain number of packets has been received, without waiting for the session to end. Although the second method detects attacks faster than the first, it too cannot avoid a delay in the intrusion determination, because it cannot detect an intrusion at the moment it occurs; detection becomes possible only after a certain number of packets has been received.
Meanwhile, the simplest way to detect an intrusion quickly and without delay is to make an intrusion determination for every received packet. However, this approach can yield very poor detection performance, because attack traffic intruding into a network may initially be indistinguishable from ordinary normal traffic. In some cases the attack traffic resembles normal traffic early in the session and diverges from it only later, as the actual intrusion attempt proceeds. Consequently, if attacks are identified starting from the first packet of a session, attack traffic of certain attack types is often misclassified as normal traffic early in the session, and detection performance can degrade badly.
One of the simplest remedies is to check every packet belonging to a session already determined to be normal, packet by packet, for an attack. Because this method can detect an attack that begins after a certain number of packets has been transmitted in a session initially classified as normal, it resolves the early misclassification problem and can detect intrusions in real time. However, normal traffic makes up a very large share of total traffic, so inspecting every packet of every normal session involves a very large traffic volume, which is a major constraint on applying the method to a real ML-NIDS. Moreover, traffic volumes keep growing rapidly, and the network bandwidth an NIDS must handle is growing accordingly, from several Gbps to hundreds of Gbps. Inspecting the packets of all normal sessions would therefore require far higher processing performance than currently developed NIDS provide, which is technically very difficult to achieve.
Conventional NIDS therefore suffer simultaneously from three problems: poor detection performance due to the large volume of traffic to be processed, low detection accuracy due to misclassification between attack and normal cases, and the inability to detect intrusions in real time. The present applicant has accordingly researched and developed an intrusion detection system that reduces the amount of traffic to be processed, prevents misclassification by holding the determination on attack traffic that is initially hard to distinguish from normal traffic, and is capable of real-time detection.
[Prior Art Documents]
[Patent Documents]
Korean Patent Registration No. 10-2083028
Korean Patent Publication No. 10-2013-0006750
Korean Patent Registration No. 10-1139913
The present invention aims to provide a network intrusion detection system that, when an intrusion into a network occurs from outside, can detect the intrusion in real time while improving detection accuracy.
The problems to be solved by the present invention are not limited to those mentioned above. Other problems and advantages of the present invention will be understood more clearly from the description below.
To achieve the above object, the present invention provides an intrusion detection system for detecting whether there is an intrusion into a network comprising a plurality of packets, the system comprising: a classifier that uses an artificial neural network to classify a packet that is a target of intrusion detection into either a normal class or an attack class, according to whether there is an external intrusion; and a determination unit that learns cases in which the classifier's classification result for a packet does not match the actual intrusion status and determines the packet as either a determination class or a determination hold class, wherein the classifier, when the determination unit determines the packet as the determination class, classifies the packet into either the normal class or the attack class according to whether there is an intrusion, and, when the packet is in the determination hold class, holds the classification of the packet until the packet received next is classified, so that intrusion into the network is detected in real time by making an intrusion determination for each individual packet.
Preferably, the system further comprises a processing unit that processes the packet according to the classifier's classification result for the packet, wherein the processing unit forwards the packet when the classifier classifies it into the normal class and discards the packet when the classifier classifies it into the attack class.
Preferably, the classifier comprises: a feature generation unit that generates a feature for the packet as an input to the artificial neural network; and a classification unit that classifies the packet into either the normal class or the attack class according to the output of the artificial neural network for the feature.
Preferably, the artificial neural network provided in the classifier is a recurrent neural network formed by sequentially connecting a plurality of units, one unit being provided for each packet, and the units are connected in sequence so that each unit can affect the input and output of its neighbors.
Preferably, when the determination unit determines the packet as the determination hold class, the input of the unit matched to the packet (unit n) in the classifier includes the feature of the packet and the output of the previous unit (unit n-1), so that the characteristics of the previous packet are reflected.
Preferably, the feature generation unit generates the feature independently for each received packet, based on a fixed-size portion of the packet data that includes the header.
Preferably, the determination unit comprises: a determination hold learning model that learns the packets for which the classifier's classification result does not match the actual intrusion status; and a decision module that compares reference data with the data of the packet and classifies whether the packet belongs to the determination class or the determination hold class.
Preferably, the determination hold learning model comprises: an extraction module that runs arbitrary training data composed of packets through the classifier and extracts misclassified data, for which the classifier's classification result does not match the actual intrusion status; a learning module that designates the misclassified data as the determination hold class and trains a one-class classification on the misclassified data for the determination hold class; and a one-class classification module that applies the one-class classification trained in the learning module to the training data.
When it is initially unclear whether traffic is normal traffic or attack traffic, the present invention holds detection until accurate classification becomes possible. This prevents normal traffic from being misclassified as attack traffic and attack traffic from being misclassified as normal traffic, enabling highly accurate intrusion detection in real time.
According to the present invention, the intrusion detection system can run faster because the amount of traffic to be processed is reduced. Normal traffic, which accounts for most network traffic, is identified as normal at an early stage, and traffic classified as normal is not checked for intrusion again, which reduces the traffic to be processed. In addition, an intrusion determination is made for each packet, and once a packet following held packets is classified as normal or attack, no determination is made for subsequent packets, which further reduces the traffic to be processed.
When applying the classifier, the present invention uses a per-packet feature that reflects the characteristics of that packet. The feature is generated not from the byte values of the entire packet but from only a portion of the bytes, with the source IP (SIP), destination IP (DIP), ID, source port, and similar fields removed, so the size of the feature can be reduced.
FIG. 1 is a configuration diagram of an intrusion detection system according to an embodiment of the present invention.
FIG. 2 shows how an intrusion detection system according to an embodiment of the present invention and conventional classifiers determine that a session including a plurality of packets is an attack session.
FIG. 3 is a configuration diagram of a classifier according to an embodiment of the present invention.
FIG. 4 shows the process of building a determination hold learning model according to an embodiment of the present invention.
FIG. 5 shows the process of classifying whether there is an intrusion in an intrusion detection system according to an embodiment of the present invention.
Hereinafter, the present invention is described in detail with reference to the accompanying drawings. The present invention is not, however, restricted or limited by the exemplary embodiments. The same reference numerals in the drawings denote members that perform substantially the same function.
The objects and effects of the present invention will be naturally understood or made clearer by the following description, and are not limited to what is described below. In describing the present invention, detailed descriptions of related known technology are omitted where they would unnecessarily obscure the gist of the invention.
FIG. 1 is a configuration diagram of an intrusion detection system 1 according to an embodiment of the present invention. The intrusion detection system 1 detects whether there is an intrusion into a network comprising a plurality of packets, and can detect intrusion in real time by making an intrusion determination for each individual packet. The intrusion detection system 1 may include a determination unit 10, a classifier 30, and a processing unit 50.
The determination unit 10 may learn cases in which the classification result of the classifier 30 for a packet does not match the actual intrusion status, and determine the packet as either a determination class or a determination hold class. That is, the determination unit 10 learns in advance, through machine learning, the sessions that are likely to be misclassified when an intrusion determination is made on the initially received packets, so that the determination can be held until the packet can be judged accurately.
The determination unit 10 may include a determination hold learning model 110 and a decision module 130. The determination hold learning model 110 may learn the packets for which the classification result of the classifier 30 does not match the actual intrusion status.
The determination hold learning model 110 may include an extraction module 1101, a learning module 1103, and a one-class classification module 1105. The extraction module 1101 may run arbitrary training data composed of packets through the classifier 30 and extract the misclassified data, for which the classification result of the classifier 30 does not match the actual intrusion status. The learning module 1103 may designate the misclassified data as the determination hold class and train a one-class classification on the misclassified data for that class; here the learning module 1103 may use a one-class classifier to assign the misclassified data to the determination hold class. The one-class classification module 1105 may apply the one-class classification trained in the learning module 1103 to the training data and select packets whose traffic is similar to the misclassified traffic. The determination hold learning model 110 is described in more detail below with reference to FIGS. 4 and 5.
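The three modules described above can be illustrated with a minimal Python sketch. The patent does not specify the one-class classifier, the feature representation, or any function names, so everything here is an assumption made for illustration: features are plain numeric tuples, `toy_classifier` stands in for the classifier 30, and `CentroidOneClass` is a deliberately simple centroid-plus-radius stand-in for whatever one-class model an implementation would actually use.

```python
import math
from typing import Callable, List, Sequence, Tuple

Feature = Sequence[float]
Sample = Tuple[Feature, str]  # (feature vector, true label: "normal" or "attack")


def extract_misclassified(samples: List[Sample],
                          classify: Callable[[Feature], str]) -> List[Feature]:
    """Extraction step: keep the features whose predicted label differs from the true label."""
    return [x for x, label in samples if classify(x) != label]


class CentroidOneClass:
    """Hypothetical one-class model: a point is a member if it lies within
    the largest training distance from the training centroid."""

    def fit(self, data: List[Feature]) -> "CentroidOneClass":
        dim = len(data[0])
        self.center = [sum(x[i] for x in data) / len(data) for i in range(dim)]
        self.radius = max(math.dist(x, self.center) for x in data)
        return self

    def is_member(self, x: Feature) -> bool:
        return math.dist(x, self.center) <= self.radius


def toy_classifier(x: Feature) -> str:
    """Stand-in for classifier 30; a single threshold on the first feature."""
    return "attack" if x[0] > 0.5 else "normal"


training = [
    ((0.10, 0.10), "normal"),
    ((0.90, 0.90), "attack"),
    ((0.40, 0.80), "attack"),  # toy_classifier says "normal": misclassified
    ((0.45, 0.90), "attack"),  # toy_classifier says "normal": misclassified
]

# Learning step: fit the one-class model on the misclassified samples only.
hold_model = CentroidOneClass().fit(extract_misclassified(training, toy_classifier))


def determine(x: Feature) -> str:
    """Decision step: packets resembling past misclassifications are held."""
    return "hold" if hold_model.is_member(x) else "determine"
```

In this sketch a new packet whose feature falls near the previously misclassified samples is assigned to the determination hold class, and everything else goes straight to classification.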
The decision module 130 may compare reference data with the data of a packet and classify whether the packet belongs to the determination class or the determination hold class. When the decision module 130 finds that the packet belongs to the determination hold class, classification may subsequently be performed multiple times in the artificial neural network 330 of the classifier 30.
The classifier 30 may use an artificial neural network to classify a packet that is a target of intrusion detection into either a normal class or an attack class, according to whether there is an external intrusion. When the determination unit 10 determines the packet as the determination class, the classifier 30 classifies the packet into either the normal class or the attack class; when the packet is in the determination hold class, the classifier 30 may hold the classification of the packet until the packet received next is classified.
The classifier 30 may include a feature generation unit 310, an artificial neural network 330, and a classification unit 350. The feature generation unit 310 may generate a feature for a packet as an input to the artificial neural network 330. It may generate the feature independently for each received packet, based on a fixed-size portion (n bytes) of the packet data that includes the header. To avoid learning that depends on a specific session, the feature generation unit 310 excludes the values corresponding to the source IP (SIP), destination IP (DIP), and IP ID from the generated feature and, for protocols that carry port numbers such as TCP and UDP, may also exclude the source port (or the destination port for a packet in the reverse direction). The value n, which determines the overall size of the feature, may be set differently depending on the training data and may be tuned to the best value for that data.
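The feature generation described above can be sketched as follows. This is an illustrative sketch under several assumptions that are not stated in the source: the packet bytes start at an IPv4 header (no link-layer header), excluded fields are zeroed rather than removed so that the feature keeps a fixed length of n bytes, and the function name `make_feature` and the default n = 64 are hypothetical. The field offsets are the standard IPv4 and TCP/UDP header offsets.

```python
def make_feature(packet: bytes, n: int = 64, reverse: bool = False) -> bytes:
    """Build a fixed-size per-packet feature from the first n bytes.

    Assumes `packet` starts at the IPv4 header. Session-specific fields
    are zeroed (an assumption; the source only says they are excluded).
    """
    buf = bytearray(packet[:n].ljust(n, b"\x00"))  # truncate or zero-pad to n bytes
    ihl = (packet[0] & 0x0F) * 4                   # IPv4 header length in bytes
    protocol = packet[9]

    masked = [
        (4, 6),    # IP identification (IP ID)
        (12, 16),  # source IP address (SIP)
        (16, 20),  # destination IP address (DIP)
    ]
    if protocol in (6, 17):  # TCP or UDP carry port numbers
        # Source port for forward packets, destination port for reverse packets.
        port_offset = ihl + 2 if reverse else ihl
        masked.append((port_offset, port_offset + 2))

    for start, end in masked:
        for i in range(start, min(end, n)):
            buf[i] = 0
    return bytes(buf)


# A hand-built IPv4/TCP packet: 10.0.0.1:50000 -> 10.0.0.2:80, IP ID 0x1234.
ip_header = bytes([0x45, 0, 0, 40, 0x12, 0x34, 0, 0, 64, 6, 0, 0,
                   10, 0, 0, 1, 10, 0, 0, 2])
tcp_header = bytes([0xC3, 0x50, 0x00, 0x50]) + bytes(16)
forward_feature = make_feature(ip_header + tcp_header, n=32)
reverse_feature = make_feature(ip_header + tcp_header, n=32, reverse=True)
```

Zeroing keeps every feature the same length, which is convenient for a neural network input; an implementation could equally drop the bytes and use a shorter feature.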
The artificial neural network 330 may be a recurrent neural network formed by sequentially connecting a plurality of units 3301. A unit 3301 may be provided for each packet, and the units 3301 are connected in sequence so that each unit can affect the input and output of the next unit 3301. When the determination unit 10 determines a packet as the determination hold class, the input of the unit matched to that packet (unit n) may include the feature of the packet and the output information of the previous unit (unit n-1).
The classification unit 350 may classify the packet into either the normal class or the attack class according to the output of the artificial neural network 330 for the feature. The artificial neural network 330 may include a DNN at its output stage, and what the classification unit 350 judges to be the normal class or the attack class may be the output that has passed through the units 3301 and the DNN.
The processing unit 50 may process packets using information stored in a whitelist for sessions classified as normal in past intrusion determinations and information stored in a blacklist for sessions classified as intrusions. The processing unit 50 may look up the session to which a received packet belongs in the whitelist or blacklist and process the packet as a normal packet or an attack packet accordingly. If the received packet belongs to a session classified as normal, the processing unit 50 forwards it; if it belongs to a session classified as an attack, the processing unit 50 may discard it. Once a packet has been clearly determined to be normal or attack, no further intrusion determination need be performed. When the packet is normal, subsequent packets of the same normal session are not judged again, which reduces the amount of traffic the intrusion detection system 1 must process. When the packet is a malicious attack packet, attack traffic may surge momentarily, as in a DDoS attack, but because that traffic is discarded immediately, the amount of traffic to be processed is reduced in this case as well.
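The whitelist and blacklist lookup can be sketched as below. The source does not say how a session is identified or how the lists are stored, so the 5-tuple `SessionKey`, the in-memory sets, and the class and method names are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional, Set


class SessionKey(NamedTuple):
    """Hypothetical session identifier (a 5-tuple is assumed here)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


class SessionLists:
    """Remembers past verdicts so later packets of a session skip classification."""

    def __init__(self) -> None:
        self.whitelist: Set[SessionKey] = set()
        self.blacklist: Set[SessionKey] = set()

    def record(self, key: SessionKey, verdict: str) -> None:
        """Store a final classifier verdict ("normal" or "attack") for a session."""
        if verdict == "normal":
            self.whitelist.add(key)
        elif verdict == "attack":
            self.blacklist.add(key)
        else:
            raise ValueError(f"unexpected verdict: {verdict!r}")

    def lookup(self, key: SessionKey) -> Optional[str]:
        """Return "drop", "forward", or None if the session still needs classification."""
        if key in self.blacklist:
            return "drop"
        if key in self.whitelist:
            return "forward"
        return None


lists = SessionLists()
web = SessionKey("10.0.0.1", "10.0.0.2", 50000, 80, 6)
scan = SessionKey("10.0.0.9", "10.0.0.2", 40000, 22, 6)
before = lists.lookup(web)      # unknown session: must go through the classifier
lists.record(web, "normal")
lists.record(scan, "attack")
```

A set lookup per packet is cheap compared with running the neural network, which is the point of the traffic reduction described above.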
The processing unit 50 may process a packet according to the classification result of the classifier 30, handling the packet differently depending on whether it belongs to the normal class or the attack class. If the packet is classified into the normal class, the processing unit 50 forwards it; if the packet is classified into the attack class, the processing unit 50 may discard it. Even if the early packets of a session are not classified and the determination is held, a later packet is eventually classified, so the processing unit 50 can process the packets according to the classification result of the classification unit 350.
FIG. 2 compares how the intrusion detection system 1 according to an embodiment of the present invention and conventional classifiers determine that a session including a plurality of packets is an attack session. FIG. 2 assumes a specific session that consists of seven packets and is classified as an attack session. The 'session-feature-based classifier' can detect an intrusion only after the session has ended. As shown in FIG. 2, it makes no determination for packets 1 through 6 and can determine 'attack' only after the session ends with the last packet, packet 7. The session-feature-based classifier therefore has to process a large amount of traffic and cannot detect intrusions in real time.
The second classifier, the 'cumulative-packet (packets 1 to t) feature-based classifier', determines packets 1 through 3 to be normal and determines attack from packet 4 onward. This means the cumulative-packet classifier detected packets 1 through 3 incorrectly and detected correctly only from packet 4. If sessions keep being judged normal at first and attack later, the early 'normal' determinations cannot be trusted, and it is impossible to know which result is correct.
By contrast, the third system, the intrusion detection system according to an embodiment of the present invention, generates and learns a feature for each partial session from the packets received sequentially in each session, and determines intrusion on that basis. The embodiment holds the determination for packets that cannot be judged accurately and issues a normal or attack determination only when an accurate determination is possible. Referring to FIG. 2, the intrusion detection system 1 held the determination for packets 1 through 3 because it judged them undeterminable, and determined attack at packet 4. In other words, by holding the determination until an accurate one is possible instead of making arbitrary, unreliable determinations, the present invention reduces misclassification and increases reliability compared with the cumulative-packet feature-based classifier. Furthermore, once packet 4 is clearly determined to be an attack, packet 4 is discarded and no determination is performed for packets 5 through 7, so the amount of traffic to be processed is smaller than with the session-feature-based classifier.
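The seven-packet scenario of FIG. 2 can be traced with a short sketch of the hold-then-decide loop. This is a simplified illustration rather than the patented implementation: `determine` and `classify` are hypothetical callables standing in for the determination unit 10 and the classifier 30, packets are represented by their sequence numbers, and the recurrent state carried across held packets is omitted.

```python
from typing import Callable, List, Optional, Sequence


def process_session(packets: Sequence[int],
                    determine: Callable[[int], str],
                    classify: Callable[[int], str]) -> List[str]:
    """Return one outcome per packet: "hold", "normal", "attack", or "skipped"."""
    outcomes: List[str] = []
    verdict: Optional[str] = None
    for packet in packets:
        if verdict is not None:
            # The session already has a verdict; later packets are not classified.
            outcomes.append("skipped")
        elif determine(packet) == "hold":
            # Not yet reliably classifiable; wait for the next packet.
            outcomes.append("hold")
        else:
            verdict = classify(packet)
            outcomes.append(verdict)
    return outcomes


# FIG. 2 scenario: packets 1 to 3 are held, packet 4 is classified as attack.
outcomes = process_session(
    packets=range(1, 8),
    determine=lambda p: "hold" if p < 4 else "determine",
    classify=lambda p: "attack",
)
```

Here `outcomes` lists three holds, one attack verdict, and three skipped packets, matching the behavior described for packets 1 through 7.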
이렇듯, 동일한 공격 세션에 대해서도 각 분류기의 판단 방법에 따라 공격이라고 판단하는 패킷의 순서가 상이할 수 있다. 도 2와 같이 본 발명이 가장 빠르게 공격 발생을 탐지하였으며, 정확도 또한 높음을 확인할 수 있다. As such, even for the same attack session, the order of packets determined as an attack may be different according to the determination method of each classifier. As shown in FIG. 2, it can be confirmed that the present invention detects the occurrence of an attack the fastest, and the accuracy is also high.
도 3은 본 발명의 실시예에 따른 분류기(30)의 구성도를 나타낸다.3 shows a configuration diagram of a classifier 30 according to an embodiment of the present invention.
분류기(30)는 각 패킷이 정상 패킷인지 공격 패킷인지를 판단하여 해당 패킷을 정상 클래스 또는 공격 클래스 중 어느 하나로 분류할 수 있다. 특히, 분류기(30)는 판단부(10)에서 해당 패킷에 대한 정확한 판단이 불가능한 경우에는 해당 패킷에 대한 판단을 보류하고 다음 패킷을 판단하는 것을 특징으로 한다. 다만 이때 이전 패킷에 대한 판단을 보류하고 다음 패킷에 대한 침입 여부를 판단할 수 있는 이유는, 본 발명의 분류기(30)가 이전 패킷들에서 공격에 대한 정보를 이용하기 때문이다. 다시 말하면, 세션에 대해서 개별 패킷별로 피쳐를 생성할 때, 이전 패킷과 무관한 독립적인 개별 피쳐만을 생성하여 활용하면 여러 패킷에 걸쳐 이루어진 공격의 특성을 탐지하기 어렵다. 그러므로, 패킷에 대한 침입 여부 판단은 독립적으로 수행되더라도 이전 패킷에 가해진 공격 특성은 다음 패킷 판단에 이용되도록 설계되어야 한다.The classifier 30 determines whether each packet is a normal packet or an attack packet and classifies the corresponding packet into either a normal class or an attack class. In particular, the classifier 30 is characterized in that if the determination unit 10 cannot accurately determine the corresponding packet, the classification unit 30 suspends the determination of the corresponding packet and determines the next packet. However, at this time, the reason why the judgment on the previous packet can be suspended and whether the next packet is intruded is determined because the classifier 30 of the present invention uses information about the attack in the previous packets. In other words, when a feature is created for each individual packet for a session, it is difficult to detect the characteristics of an attack made over several packets if only an independent individual feature unrelated to the previous packet is created and used. Therefore, even if the determination of whether a packet is intruded is independently performed, the attack characteristic applied to the previous packet should be designed to be used in determining the next packet.
Meanwhile, one way to generate features is to build a partial session feature from a fixed number of packets received per session. However, this approach consumes a large amount of memory and requires substantial processing power, so it is close to impossible to implement in practice. The reason is that, when k packets have been received so far for a given session, generating features over those k packets in real time requires storing all k packets and re-reading all of them every time a new packet arrives.
Accordingly, in an embodiment of the present invention, the output of the unit 3301 applied to the (k-1)-th packet and the k-th feature data are used as the input of the k-th unit 3301 in order to determine the k-th packet. That is, in determining a specific packet (the k-th packet), the output information for the immediately preceding packet (the (k-1)-th packet) is used. Because this is chained, with the (k-1)-th packet using information from the (k-2)-th packet, the (k-2)-th packet using information from the (k-3)-th packet, and so on, the determination for the k-th packet can incorporate the accumulated characteristics of the session, drawing on information from the 1st through the (k-1)-th packets.
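The chaining idea can be illustrated with a minimal Python sketch. It is not the patent's recurrent unit: a running mean stands in for the unit 3301, and the names `State`, `step`, and `summarize` are hypothetical. The point is only that a fixed-size state carried from packet k-1 to packet k summarizes packets 1 through k-1 without storing them.

```python
from typing import List, Tuple

# Hypothetical fixed-size state: (running mean of features, packets seen).
State = Tuple[float, float]


def step(state: State, x: float) -> State:
    """Fold one packet feature into the state (stand-in for one recurrent unit)."""
    mean, count = state
    count += 1.0
    mean += (x - mean) / count  # incremental mean update, no packet history needed
    return (mean, count)


def summarize(features: List[float]) -> State:
    """Process a session packet by packet, keeping only the latest state."""
    state: State = (0.0, 0.0)  # initial state before any packet arrives
    for x in features:
        state = step(state, x)
    return state
```

Processing a new packet costs one call to `step` regardless of how many packets came before, in contrast to the approach that re-reads all k stored packets.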
To this end, the artificial neural network 330 provided in the classifier 30 of the present invention may be a recurrent neural network formed by sequentially connecting a plurality of units 3301. The number of units 3301 may equal the number of packets in the session, or may be greater. Each unit 3301 corresponds to a packet; for example, the intrusion determination for packet 1 may be performed in unit 1, and in general the determination for packet n in unit n. The intrusion determination for a packet is centered on the unit 3301: the feature of the packet is supplied to the unit 3301 as input, and whether the packet is normal or an attack can be determined from the output.
The classifier 30 of FIG. 3 shows, as an embodiment of the present invention, a structure consisting of an LSTM with a total of N units 3301 and a DNN added to the output of each unit. N may be set to the value that gives the best performance on the training data. The t-th unit 3301 of the LSTM is used to classify the t-th packet, and the last, N-th unit 3301 may be used to classify not only the N-th packet but also every packet after it.
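The mapping from packet position to unit described above amounts to clamping the index at N. A minimal Python sketch, with the hypothetical function name `unit_index`:

```python
def unit_index(t: int, n_units: int) -> int:
    """Return the 1-based unit that classifies the t-th packet of a session.

    Packets 1..N use their own unit; every packet after the N-th reuses unit N.
    """
    if t < 1 or n_units < 1:
        raise ValueError("t and n_units must be positive")
    return min(t, n_units)
```

For example, with N = 5 the third packet is handled by unit 3 and the ninth packet by unit 5.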
In the classifier 30, the feature x_N generated for each packet by the feature generation unit 310 is input to the corresponding unit N, which outputs h_N and c_N. As described above, a DNN may be included at the output of the classifier 30, so h_N passes through the DNN to produce the final output o_N.
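The data flow through one unit can be sketched in plain Python. The patent does not state which LSTM variant is used, so the sketch assumes the standard LSTM gate equations, and it reduces the DNN to a single sigmoid output neuron. The names `lstm_unit`, `dense_head`, `zero_params`, and the parameter keys are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(w: Matrix, v: Sequence[float], b: Sequence[float]) -> Vector:
    """Compute w @ v + b with plain lists."""
    return [sum(wk * vk for wk, vk in zip(row, v)) + bk for row, bk in zip(w, b)]


def lstm_unit(x: Vector, h_prev: Vector, c_prev: Vector,
              p: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One unit: takes x_t, h_{t-1}, c_{t-1} and returns h_t, c_t."""
    z = list(x) + list(h_prev)  # concatenated input [x_t, h_{t-1}]
    i = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]    # input gate
    f = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]    # forget gate
    o = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]    # output gate
    g = [math.tanh(a) for a in affine(p["Wg"], z, p["bg"])]  # candidate cell
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def dense_head(h: Vector, w: Vector, b: float) -> float:
    """Stand-in for the DNN: maps h_t to an attack score o_t in (0, 1)."""
    return sigmoid(sum(wk * hk for wk, hk in zip(w, h)) + b)


def zero_params(n_in: int, n_hidden: int) -> Dict[str, list]:
    """All-zero parameters, useful only for checking shapes and data flow."""
    p: Dict[str, list] = {}
    for gate in "ifog":
        p["W" + gate] = [[0.0] * (n_in + n_hidden) for _ in range(n_hidden)]
        p["b" + gate] = [0.0] * n_hidden
    return p
```

In a trained system the parameters would come from the training procedure, and o_t would be compared against a threshold to choose the normal or attack class.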
Referring to FIG. 3 in more detail, x_1 is input to unit 1, which outputs h_1 and c_1. h_1 passes through the DNN to produce the final output o_1, which is the classification result for the packet and indicates whether it is a normal packet or an attack packet. If o_1 classifies the packet as normal or attack, the classification unit 350 assigns the packet to the corresponding class, and the processing unit 50 forwards or discards the packet accordingly.
On the other hand, if the packet cannot be classified, the classifier 30 defers the determination for that packet and attempts to classify the second packet. In this case, x_2, the feature of packet 2, and h_1 and c_1, the outputs of unit 1, are used as the inputs of unit 2. That is, when the classification result for the (t-1)-th packet is "unclassifiable", c_{t-1} and h_{t-1} are stored in advance so that classification can be performed when the next, t-th packet is received. c_{t-1} and h_{t-1} partially encode the characteristics of packets 1 through t-1. Therefore, a feature reflecting the characteristics of all previously received packets can be generated without storing information about every packet. The feature (F_t) for the t-th packet in a given session may be defined as follows.
Figure PCTKR2021011914-appb-I000001
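The equation itself survives only as an image reference. Based solely on the surrounding description, in which unit t receives x_t together with the stored h_{t-1} and c_{t-1}, and new sessions start from c = 0 and h = 0, one reading consistent with the text is the following. This is an assumption, not a transcription of the original figure.

```latex
F_t = \left( x_t,\; h_{t-1},\; c_{t-1} \right), \qquad h_0 = c_0 = 0
```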
FIG. 4 shows the process of building the deferred-decision learning model 110 according to an embodiment of the present invention. The deferred-decision learning model 110 is designed to prevent attack traffic that resembles normal traffic from being classified as normal when the first packets of a session are received. To this end, the deferred-decision learning model 110 trains the classifier 30 using session traffic in which the initial attack traffic is misclassified as normal. Through this, the deferred-decision learning model 110 can decide whether the attack determination for a packet should be made immediately or deferred to the next packet.
The deferred-decision learning model 110 trains the classifier 30 using a dataset D_t composed of the t-th packets of multiple sessions, classifies arbitrary training data with that classifier 30, and builds a dataset M_t consisting only of the misclassified data. For the dataset M = {M_t | t = 1, 2, …, N}, M can be learned using a one-class classifier. The embodiment of the present invention illustrates the use of Deep SVDD as the one-class classifier. Deep SVDD uses deep learning to map the feature domain to an optimal domain and can thereby determine whether a packet belongs to the learned class, which makes it convenient for controlling the misclassification probability.
The deferred-decision learning model 110 is built as follows. In the first step, the full training dataset T_1 is prepared, and a dataset D_1 consisting only of the first packet of every session in T_1 is generated. Unit 1 of the LSTM of the classifier 30 is trained using T_1 and D_1. After training, the training dataset T_1 is classified by LSTM unit 1, and the misclassified data are collected as M_1. All of M_1 is assigned to the deferred-decision class regardless of its actual class. Deep SVDD_1, a one-class classifier, is trained using M_1. Because Deep SVDD is a one-class classifier, it can be trained on a dataset consisting of the single deferred-decision class. D_1, on which unit 1 was trained, is then classified by Deep SVDD_1. The sessions classified as deferred form a new dataset T_2.
Using the training dataset T_2, D_2 is generated in the same way as before and used to train unit 2 of the LSTM. Note that, from the second step onward, D_2 includes not only the second packet feature x_2 but also h_1 and c_1, the outputs of LSTM unit 1. The rest follows the first step: the training dataset T_2 is classified by LSTM unit 2 to obtain the misclassified data M_2, M_2 is assigned to the deferred-decision class, and Deep SVDD_2 is trained using M_2. D_2, on which unit 2 was trained, is classified by Deep SVDD_2, and the sessions classified as deferred form the dataset T_3. The deferred-decision learning model 110 can be built through this process.
Generalizing, the procedure can be defined as follows. Using the training dataset T_t (t > 1), D_t is generated for all sessions in T_t and used to train unit t of the LSTM. Here D_t includes not only the feature x_t of the packet but also h_{t-1} and c_{t-1}, the outputs of LSTM unit t-1. The training dataset T_t is classified by LSTM unit t to obtain the misclassified data M_t. M_t is assigned to the deferred-decision class, and Deep SVDD_t is trained using M_t. The D_t used for training is classified by Deep SVDD_t, and the sessions classified as deferred form the dataset T_{t+1}. This process may be repeated until M_t is empty or until t = N. If a session has packets beyond the N-th, they may all be included in D_N; that is, at the N-th step D_N may absorb D_{N+1}, D_{N+2}, and so on.
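The iterative procedure above can be sketched in Python as a loop over t. This is a simplified illustration under stated assumptions, not the patent's implementation: the LSTM unit and Deep SVDD are replaced by caller-supplied stand-ins (`fit_unit`, `fit_one_class`), each packet is reduced to a single float feature, the recurrent inputs h_{t-1} and c_{t-1} are omitted, and the folding of later packets into D_N is left out. All names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Session = Tuple[List[float], int]        # (per-packet features, true label)
Predict = Callable[[float], int]         # stand-in for a trained LSTM unit
IsDeferred = Callable[[float], bool]     # stand-in for a trained Deep SVDD_t


def build_deferral_model(
    sessions: Dict[str, Session],
    n_units: int,
    fit_unit: Callable[[List[Tuple[float, int]]], Predict],
    fit_one_class: Callable[[List[float]], IsDeferred],
) -> Tuple[List[Predict], List[IsDeferred]]:
    active = dict(sessions)              # T_1: all training sessions
    units: List[Predict] = []
    detectors: List[IsDeferred] = []
    for t in range(1, n_units + 1):
        # D_t: the t-th packet of every session still in T_t.
        d_t = {sid: (pkts[t - 1], y)
               for sid, (pkts, y) in active.items() if len(pkts) >= t}
        if not d_t:
            break
        predict = fit_unit(list(d_t.values()))
        units.append(predict)
        # M_t: samples that unit t misclassifies, relabeled as "deferred".
        m_t = [x for x, y in d_t.values() if predict(x) != y]
        if not m_t:
            detectors.append(lambda x: False)  # nothing to defer at this step
            break
        is_deferred = fit_one_class(m_t)
        detectors.append(is_deferred)
        # T_{t+1}: sessions whose t-th packet falls in the deferred class.
        active = {sid: active[sid]
                  for sid, (x, _) in d_t.items() if is_deferred(x)}
    return units, detectors
```

The loop stops when M_t is empty or t reaches N, matching the stopping rule in the text. The returned lists pair each unit t with the one-class detector trained on its misclassifications.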
FIG. 5 shows the process of classifying intrusion in the intrusion detection system 1 according to an embodiment of the present invention.
First, for a received packet, the Whitelist and the Blacklist are searched to see whether a known normal or attack session contains the packet. FIG. 5 shows the Whitelist being searched first, but the order is not restricted and the Blacklist may be checked first. If the session to which the packet belongs is found in the Whitelist, the packet is forwarded; if it is not found there, the Blacklist is searched, and if the session is found there, the packet is discarded. If the packet is found in neither the Whitelist nor the Blacklist, the session table is searched for the packet. If the packet is not found in the session table, c = 0, h = 0, and t = 1 are set as the conditions for the classifier 30. If the packet is found in the session table, the values c_{t-1}, h_{t-1}, and t are read from the session information.
In both cases, the next step is to generate the feature x_t for the packet, classify it in unit t with x_t, c_{t-1}, and h_{t-1} as input, and also perform classification with x_t, c_{t-1}, and h_{t-1} as the input of Deep SVDD_t. If Deep SVDD_t, whose single class is the deferred-decision class, finds the packet classifiable, the packet or its session information is added to the Whitelist or the Blacklist depending on whether an intrusion has occurred: a normal packet is added to the Whitelist and forwarded, and an attack packet is added to the Blacklist and discarded. If Deep SVDD_t finds the packet unclassifiable, c_t, h_t, and t are updated according to the packet's position in the session and stored, or stored as they are, and the packet is forwarded.
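The per-packet flow of FIG. 5 can be sketched as a small Python class. This is an illustration under assumptions, not the claimed system: `classify` and `is_deferred` are hypothetical caller-supplied stand-ins for unit t and Deep SVDD_t, the state tuple stands for (c, h), and sessions are keyed by a plain string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple

State = Tuple[float, float]  # hypothetical stand-in for (c, h)


@dataclass
class Detector:
    # (unit index, x_t, previous state) -> (label, new state)
    classify: Callable[[int, float, State], Tuple[str, State]]
    # (unit index, x_t, previous state) -> True if the decision must be deferred
    is_deferred: Callable[[int, float, State], bool]
    n_units: int
    whitelist: Set[str] = field(default_factory=set)
    blacklist: Set[str] = field(default_factory=set)
    sessions: Dict[str, Tuple[int, State]] = field(default_factory=dict)

    def handle(self, session_id: str, x: float) -> str:
        """Return "forward" or "drop" for one received packet."""
        if session_id in self.whitelist:
            return "forward"
        if session_id in self.blacklist:
            return "drop"
        # Unknown session: start from c = 0, h = 0, t = 1.
        t, state = self.sessions.get(session_id, (1, (0.0, 0.0)))
        unit = min(t, self.n_units)  # packets past N reuse unit N
        label, new_state = self.classify(unit, x, state)
        if self.is_deferred(unit, x, state):
            # Decision deferred: remember c_t, h_t, t and let the packet through.
            self.sessions[session_id] = (t + 1, new_state)
            return "forward"
        self.sessions.pop(session_id, None)
        if label == "attack":
            self.blacklist.add(session_id)
            return "drop"
        self.whitelist.add(session_id)
        return "forward"
```

Once a session lands in either list, later packets are handled by a set lookup alone, so the neural network runs only on the first few packets of each session.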
Although the present invention has been described in detail through representative embodiments, those of ordinary skill in the art to which the present invention pertains will understand that various modifications can be made to the embodiments described above without departing from the scope of the present invention. Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined by the claims below and by all changes or modifications derived from the claims and their equivalents.

Claims (8)

  1. An intrusion detection system for detecting intrusion into a network composed of a plurality of packets, comprising:
    a classifier that uses an artificial neural network to classify the packet subject to intrusion detection into either a normal class or an attack class according to whether an external intrusion has occurred; and
    a determination unit that learns cases in which the classifier's classification result for the packet does not match the actual intrusion status, and determines the packet to be in either a decision class or a deferred-decision class,
    wherein the classifier,
    when the determination unit determines the packet to be in the decision class, classifies the packet into either the normal class or the attack class according to whether an intrusion has occurred, and, when the packet is in the deferred-decision class, defers classification of the packet until the packet received next is classified,
    whereby intrusion into the network can be detected in real time by determining intrusion for each individual packet.
  2. The intrusion detection system of claim 1,
    further comprising a processing unit that processes the packet according to the classification result of the classifier for the packet,
    wherein the processing unit
    forwards the packet when the classifier has classified the packet into the normal class, and discards the packet when the packet has been classified into the attack class.
  3. The intrusion detection system of claim 1,
    wherein the classifier comprises:
    a feature generation unit that generates a feature for the packet as an input to the artificial neural network; and
    a classification unit that classifies the packet into either the normal class or the attack class according to the output of the artificial neural network for the feature.
  4. The intrusion detection system of claim 3,
    wherein the artificial neural network provided in the classifier
    is a recurrent neural network formed by sequentially connecting a plurality of units, a unit being provided for each packet, and each unit being connected to its preceding and following units so as to affect their inputs and outputs.
  5. The intrusion detection system of claim 4,
    wherein the classifier,
    when the determination unit has classified the packet into the deferred-decision class,
    provides, as the input of the unit (unit n) matched to the packet, the feature of the packet and the output of the previous unit (unit n-1),
    thereby reflecting the characteristics of previous packets.
  6. The intrusion detection system of claim 3,
    wherein the feature generation unit
    generates the feature independently for each received packet, based on data of a fixed size, including the header, taken from the data of the packet.
  7. The intrusion detection system of claim 1,
    wherein the determination unit comprises:
    a deferred-decision learning model that learns the packets for which the classification result of the classifier does not match the actual intrusion status; and
    a decision module that compares the reference data with the data of the packet and classifies whether the packet belongs to the decision class or to the deferred-decision class.
  8. The intrusion detection system of claim 7,
    wherein the deferred-decision learning model comprises:
    an extraction module that classifies arbitrary training data composed of the packets with the classifier and extracts misclassified data for which the classification result of the classifier does not match the actual intrusion status;
    a learning module that assigns the misclassified data to the deferred-decision class and performs one-class training on the misclassified data for the deferred-decision class; and
    a one-class classification module that applies the one-class classification learned by the learning module to the training data.
PCT/KR2021/011914 2021-06-25 2021-09-03 Network intrusion detection system using determination delay for packets WO2022270678A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020210083071A KR102354467B1 (en) 2021-06-25 2021-06-25 Network intrusion detection system using deferred decision for packet
KR10-2021-0083071 2021-06-25

Publications (1)

Publication Number Publication Date
WO2022270678A1 true WO2022270678A1 (en) 2022-12-29

Family

ID=80049738

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2021/011914 WO2022270678A1 (en) 2021-06-25 2021-09-03 Network intrusion detection system using determination delay for packets

Country Status (2)

Country Link
KR (1) KR102354467B1 (en)
WO (1) WO2022270678A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114500102B (en) * 2022-03-09 2024-02-13 绍兴文理学院 Sampling-based edge computing architecture Internet of things intrusion detection system and method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101256671B1 (en) * 2006-06-16 2013-04-19 주식회사 케이티 Methofd for testing detection performance of intrusion detection system and the media thereof
KR101553264B1 (en) * 2014-12-11 2015-09-15 한국과학기술정보연구원 System and method for preventing network intrusion
KR20190081408A (en) * 2017-12-29 2019-07-09 이화여자대학교 산학협력단 System and method for detecting network intrusion, computer readable medium for performing the method
KR102014044B1 (en) * 2019-02-18 2019-10-21 한국남동발전 주식회사 Intrusion prevention system and method capable of blocking l2 packet
KR102083028B1 (en) * 2019-02-19 2020-02-28 유재선 System for detecting network intrusion

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101139913B1 (en) 2009-11-25 2012-04-30 한국 한의학 연구원 Method of pattern classification with indecision
KR20130006750A (en) 2011-06-20 2013-01-18 한국전자통신연구원 Method for identifying a denial of service attack and apparatus for the same

Also Published As

Publication number Publication date
KR102354467B1 (en) 2022-01-24

Similar Documents

Publication Publication Date Title
CN112953924B (en) Network abnormal flow detection method, system, storage medium, terminal and application
KR102135024B1 (en) Method and apparatus for identifying category of cyber attack aiming iot devices
CN110311829B (en) Network traffic classification method based on machine learning acceleration
CN110784481B (en) DDoS detection method and system based on neural network in SDN network
CN1943210B (en) Source/destination operating system type-based IDS virtualization
CN110149266B (en) Junk mail identification method and device
US20040107361A1 (en) System for high speed network intrusion detection
CN110808971A (en) Deep embedding-based unknown malicious traffic active detection system and method
WO2022270678A1 (en) Network intrusion detection system using determination delay for packets
US11811800B2 (en) Traffic feature information extraction device, traffic feature information extraction method, and traffic feature information extraction program
CN111698260A (en) DNS hijacking detection method and system based on message analysis
CN111523588B (en) Method for classifying APT attack malicious software traffic based on improved LSTM
WO2022191596A1 (en) Device and method for automatically detecting abnormal behavior of network packet on basis of auto-profiling
Kim et al. Real-time network intrusion detection using deferred decision and hybrid classifier
WO2018110997A1 (en) Method and apparatus for generating network intrusion detection rule
Choi et al. Implementation and design of a zero-day intrusion detection and response system for responding to network security blind spots
KR101017536B1 (en) Network message processing using pattern matching
WO2021080043A1 (en) Somatic mutation detection device and method, having reduced sequencing platform-specific errors
WO2022107925A1 (en) Deep learning object detection processing device
CN112948578A (en) DGA domain name open set classification method, device, electronic equipment and medium
KR20220150545A (en) Network attack detection system and network attack detection method
CN114930329A (en) Method for training module and method for preventing capture of AI module
CN114915444B (en) DDoS attack detection method and device based on graph neural network
KR102483797B1 (en) Method for analyzing cause of network packet attack using XAI, apparatus and computer program for performing the method
WO2023121148A1 (en) Apparatus and method for adversarial feature selection considering attack function of vehicle can

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21947268

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE