WO2022145838A1

WO2022145838A1 - System and method for learning and detecting abnormal behavior by using regression security check

Info

Publication number: WO2022145838A1
Application number: PCT/KR2021/019355
Authority: WO
Inventors: 이시영
Original assignee: 엑사비스 주식회사
Priority date: 2020-12-28
Filing date: 2021-12-20
Publication date: 2022-07-07

Abstract

Disclosed are a method and system for learning and detecting abnormal behavior by using a regression security check. The method comprises: a step for performing, by the system, a packet storage process for selectively storing at least some of a plurality of packets passing through a network; a zero-day penetration determination step for examining, by the system, storage packets stored through the packet storage process and determining whether there is zero-day penetration corresponding to a new security rule that is to be applied to the network; and a step for generating, by the system, normalization data on the basis of penetration behaviors that occur due to the zero-day penetration, wherein a behavior pattern determination model for determining a behavior pattern corresponding to the zero-day penetration is learned on the basis of the generated normalization data.

Description

Abnormal behavior learning and detection system and method using regression security inspection

The present invention relates to a system and method for learning and detecting abnormal behavior using regression security inspection. More particularly, it relates to generating learning data so that abnormal behavior caused by zero-day penetration, identified through regression security inspection against security threats whose detection rules have been updated, can be learned by deep learning, and to a technical idea for building and using a deep learning model that judges abnormal behavior based on that data.

Existing network control and management equipment has relied on packet information of TCP (Transmission Control Protocol)/UDP (User Datagram Protocol) or IP (Internet Protocol), and efforts have been made on that basis to achieve DDoS (Distributed Denial of Service) prevention and the like. However, the packet-based approach ignores information about the communication relationships of higher-level applications and depends only on the information contained in each separate packet, which is a temporary unit of information delivery. Because of this limitation, such equipment is provided as single systems for independent goals, such as a router for packet routing, a dedicated system for preventing DDoS attacks, or a DPI (Deep Packet Inspection) system for traffic control. Among them, the DPI system detects the well-known port number used by a specific application or client (e.g., a P2P client) and the signature of the payload, and controls the detected packets. By detecting such a signature, it identifies which client, that is, which application, is currently generating and/or transmitting packets in the network, and performs appropriate network control according to a predetermined policy. By setting security rules that include information about the signature in the security equipment, security equipment such as DPI defends against the threat.

On the other hand, zero-day penetration (attack) is, in a narrow sense, a technical threat that attacks a vulnerability in computer software. It refers to an attack made during the period (a security blind spot in time) before a patch or detection rule for the vulnerability is available.

Since these vulnerabilities and security blind spots in time exist, by definition, before the security rules have been patched, software is left defenseless against attack once they become known to a malicious attacker, which makes them an important issue for software security.

However, if these vulnerabilities and security gaps in time are not discovered by developers or are not reported by users, an attack vector that uses the vulnerability (e.g., malicious code such as a virus or worm) may appear before a patch for the vulnerability is available. There is then a risk of being helplessly exposed to attacks using the vulnerability until the patch is properly made.

Therefore, when zero-day infiltration has been performed, being able to quickly determine whether or not zero-day infiltration has occurred may be very important in minimizing damage caused by zero-day infiltration.

* Prior art literature

- Patent literature

Korean Patent Application (Application No. 10-2008-0126888, "Network Control System and Network Control Method")

Korean Patent Application (Application No. 10-2011-0019891, "Network inspection system and its providing method")

The present invention has been devised to solve these problems. The technical problem to be achieved is to provide, in a network security system that inspects packets in real time and selectively passes only packets found to have no problem, a technical idea that can determine that a specific threat has penetrated by detecting learned abnormal behavior, even when the threat arrives during the zero-day period and no rule set for detecting it has yet been provided.

In addition, it is possible to inspect all or most of the rulesets through very efficient packet storage, and based on this, it is to provide a technical idea that can change the automatically applied security rulesets according to the latest security threats.

In addition, it is to provide a technical idea that can significantly reduce the number of packets to be stored in order to record the network and support high-speed packet search.

In addition, while significantly reducing the number of packets stored for recording the network, there is little difference in performance in inspecting the network, enabling network recording for a long time in the past. It is an object to provide a method and system capable of not only inspecting current network packets in real time, but also retrospectively inspecting past networks in a short time.

In addition, whenever the detection rule (security rule) for a new threat is updated, a security check (regression security check) can be performed on past traffic. Through this, the intrusion actions carried out as a zero-day attack can be expressed as standardized data (e.g., a directed graph), so the generation and labeling (also called annotation) of learning data that defines the characteristics of the intrusion behavior can be performed automatically.

In addition, it is possible to collect related behavior information over a long period of time after a successful penetration through the regression security check, which has the effect of improving the accuracy of abnormal behavior detection.

In addition, by defining the intrusion behavior with this standardized data, the characteristics of a zero-day attack can be effectively used for training a deep learning model, which has the effect of building a behavior pattern detection module that detects zero-day attack patterns with high accuracy.

In addition, there is an effect of increasing security by detecting network patterns suspected of attacks through the built-up behavior pattern detection module separately from the detection rules through packet inspection.

According to one aspect of the present invention, a method for learning and detecting abnormal behavior using a regression security check, for solving the above technical problem, comprises: a step in which a system performs a packet storage process for selectively storing at least some of a plurality of packets passing through a network; a zero-day penetration determination step in which the system examines the stored packets stored through the packet storage process to determine whether there has been a zero-day penetration corresponding to a new security rule to be applied to the network; and a step in which the system generates the infiltration behaviors caused by the zero-day penetration as standardized data. Based on the generated standardized data, a behavior pattern determination model capable of determining a behavior pattern corresponding to the zero-day penetration is learned.

The step of the system performing a packet storage process for selectively storing at least some of a plurality of packets passing through a network may include the system performing a packet storage process that stores, among the session establishment packets forming a session from the plurality of packets, only the N (N is a natural number) preceding packets at the initial stage of the session.

In the step of generating, by the system, the intrusion behaviors caused by the zero-day penetration as standardized data, the system may generate the standardized data as a directed graph in which at least one host used for the infiltration behaviors caused by the zero-day penetration is set as a node, and a communication action that occurs from each node to another node is set as an edge.

The nodes included in the directed graph may include information on IP and a time point at which at least one host corresponding to each node performs a specific action.

The edges included in the directed graph may include information on a protocol or performance action corresponding to a communication action corresponding to each edge.
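As an illustration of the directed graph described above, the following is a minimal sketch rather than the claimed implementation. The class and field names (`HostNode`, `CommEdge`, `BehaviorGraph`, `first_seen`, `action`) and the sample addresses are assumptions introduced only for this example; the specification states only that nodes carry IP and time information and that edges carry protocol or action information.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HostNode:
    """A host used in the intrusion behavior (hypothetical field names)."""
    ip: str
    first_seen: float  # time at which the host first performed an action


@dataclass(frozen=True)
class CommEdge:
    """A communication action directed from one host to another."""
    src_ip: str
    dst_ip: str
    protocol: str   # e.g. "HTTP", "SMB"
    action: str     # e.g. "exploit", "exfiltrate"
    timestamp: float


@dataclass
class BehaviorGraph:
    nodes: dict = field(default_factory=dict)  # ip -> HostNode
    edges: list = field(default_factory=list)  # CommEdge, in insertion order

    def add_action(self, src_ip, dst_ip, protocol, action, timestamp):
        # Register each endpoint the first time it appears.
        for ip in (src_ip, dst_ip):
            if ip not in self.nodes:
                self.nodes[ip] = HostNode(ip, timestamp)
        self.edges.append(CommEdge(src_ip, dst_ip, protocol, action, timestamp))

    def successors(self, ip):
        """Hosts toward which `ip` initiated a communication action."""
        return sorted({e.dst_ip for e in self.edges if e.src_ip == ip})


graph = BehaviorGraph()
graph.add_action("203.0.113.5", "10.0.0.7", "HTTP", "exploit", 100.0)
graph.add_action("10.0.0.7", "10.0.0.9", "SMB", "lateral_move", 160.0)
graph.add_action("10.0.0.7", "203.0.113.5", "HTTPS", "exfiltrate", 220.0)
```

Keeping edges in a plain list preserves the order in which actions were recorded, which is convenient when the graph is later consumed as a time series.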

The behavior pattern determination model may be constructed by training on learning data that includes the standardized data, using an RNN or LSTM.
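The specification does not state how the standardized data is presented to the RNN or LSTM. One possible preparation step, shown here purely as an assumption, is to flatten the recorded communication actions into a time-ordered sequence of integer ids and pad it to a fixed length. The function names `encode_events` and `pad`, the tuple layout, and the `protocol:action` token format are hypothetical; model training itself is outside the scope of this sketch.

```python
def encode_events(events, vocab=None):
    """Map (timestamp, src, dst, protocol, action) tuples to integer ids in time order."""
    vocab = {} if vocab is None else vocab
    sequence = []
    for _, _, _, protocol, action in sorted(events, key=lambda e: e[0]):
        token = f"{protocol}:{action}"
        # Ids start at 1 so that 0 stays free for padding.
        if token not in vocab:
            vocab[token] = len(vocab) + 1
        sequence.append(vocab[token])
    return sequence, vocab


def pad(sequence, length, pad_id=0):
    """Right-pad or truncate a sequence to exactly `length` items."""
    return (sequence + [pad_id] * length)[:length]


events = [
    (220.0, "10.0.0.7", "203.0.113.5", "HTTPS", "exfiltrate"),
    (100.0, "203.0.113.5", "10.0.0.7", "HTTP", "exploit"),
    (160.0, "10.0.0.7", "10.0.0.9", "SMB", "lateral_move"),
]
seq, vocab = encode_events(events)
print(pad(seq, 5))  # [1, 2, 3, 0, 0]
```

Passing the same `vocab` back in for later sequences keeps the ids stable between training data and data observed in real time.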

The abnormal behavior learning and detection method using the regression security check may further include the steps of: generating, after the behavior pattern determination model has been generated, real-time standardized data for packets that are about to pass through the network; and determining whether there is abnormal behavior by applying the behavior pattern determination model to the generated real-time standardized data.

The above method may be implemented by software installed in the data processing device.

According to another aspect of the present invention, an abnormal behavior learning and detection system using a regression security check is provided. The system includes: a packet storage module that performs a packet storage process for selectively storing at least some of a plurality of packets passing through a network; an intrusion prevention module that, when a new security rule to be applied to the network is updated, inspects the stored packets stored through the packet storage process to determine whether there has been a zero-day penetration corresponding to the new security rule; a data generation module that generates the intrusion actions caused by the zero-day penetration as standardized data; and a control module that learns the generated standardized data to generate a behavior pattern determination model for determining a behavior pattern corresponding to the zero-day penetration.

According to another aspect, the system includes a storage device in which a program is stored and a processor for running the program. The program driven by the processor selectively stores at least some of a plurality of packets passing through a network; when a new security rule to be applied to the network is updated, determines whether there has been a zero-day penetration corresponding to the new security rule by examining the stored packets stored through the packet storage process; generates the infiltration behaviors caused by the zero-day penetration as standardized data; and learns the generated standardized data to generate a behavior pattern determination model for determining a behavior pattern corresponding to the zero-day penetration.

The program driven by the processor may generate the standardized data as a directed graph in which at least one host used for the intrusion actions caused by the zero-day penetration is set as a node, and the communication action that occurs from each node to another node is set as an edge.

The program driven by the processor may generate real-time standardized data for packets that are about to pass through the network after the behavior pattern determination model has been generated, and may determine whether there is abnormal behavior by applying the behavior pattern determination model to the generated real-time standardized data.

According to the technical idea of the present invention, in a network security system that inspects packets in real time and selectively passes only packets found to have no problem, learning the zero-day penetration detected through regression security inspection and the behavior patterns that follow it makes it possible to quickly find, during the zero-day period, the penetration of a threat for which no detection rule set has been provided. This has the effect of minimizing the damage caused by the penetration of such a threat.

In addition, it is possible to effectively inspect all or most of the rulesets, and based on this, there is an effect that the automatically applied security rulesets can be changed according to the latest security threats.

In addition, flow information and flow-based session information can be generated while inspecting packets at high speed, so that only a certain number of initial preceding packets of a session need to be inspected. As a result, packet inspection can be performed in real time even in a high-speed network environment.

In addition, it is possible to significantly reduce the number of packets required to record the network, and to support high-speed packet search based on session information and flow information.

In addition, since the number of packets required to record the network is reduced, network recording is possible for a long time even in the same physical environment.

In addition, since the network can be recorded in this way, it is possible not only to perform packet inspection in real time but also to verify whether there has been a network attack in the past.

In order to more fully understand the drawings cited in the Detailed Description, a brief description of each drawing is provided.

FIG. 1 is a diagram showing a schematic configuration of a system according to an embodiment of the present invention.

FIG. 2 is a diagram schematically illustrating a network security method according to an embodiment of the present invention.

FIG. 3 is a diagram for explaining a session, a flow, and a packet for a system providing method according to an embodiment of the present invention.

FIG. 4 is a diagram for explaining the concept of performing a packet search according to a system providing method according to an embodiment of the present invention.

FIG. 5 is a view for explaining the effect of the system providing method according to an embodiment of the present invention.

FIG. 6 is a diagram for explaining a plurality of packet storage modes through a system providing method according to an embodiment of the present invention.

FIG. 7 is a diagram for explaining a concept capable of effectively examining past network attacks according to an embodiment of the present invention.

FIG. 8 is a diagram illustrating a schematic physical configuration of a system according to an embodiment of the present invention.

FIG. 9 is a diagram illustrating the concept of a standardized data model for defining an attack behavior according to an embodiment of the present invention.

FIG. 10 is a diagram illustrating an embodiment of standardized data of a zero-day attack behavior according to an embodiment of the present invention.

Since the present invention can apply various transformations and can have various embodiments, specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present invention to specific embodiments, and should be understood to include all modifications, equivalents, and substitutes included in the spirit and scope of the present invention. In describing the present invention, if it is determined that a detailed description of a related known technology may obscure the gist of the present invention, the detailed description thereof will be omitted.

Terms such as first, second, etc. may be used to describe various elements, but the elements should not be limited by the terms. The above terms are used only for the purpose of distinguishing one component from another.

The terms used in the present application are only used to describe specific embodiments, and are not intended to limit the present invention. The singular expression includes the plural expression unless the context clearly dictates otherwise.

In this specification, terms such as 'include' or 'have' are intended to designate that the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification exist, and should be understood not to preclude in advance the possibility of the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In addition, in the present specification, when any one component 'transmits' data to another component, this means that the component may transmit the data to the other component directly or through at least one further component. Conversely, when one component 'directly transmits' data to another component, it means that the data is transmitted from the component to the other component without passing through any further component.

Hereinafter, the present invention will be described in detail focusing on embodiments of the present invention with reference to the accompanying drawings. Like reference numerals in each figure indicate like elements.

Referring to FIG. 1 , a system 100 according to an embodiment of the present invention includes an intrusion prevention module 110 , a packet storage module 120 , and a control module 130 . The system 100 may further include a DB 140 and/or a packet search module 150 . In addition, the system 100 may further include a packet extraction module 160 and a data generation module 170 .

According to another embodiment, the system 100 may include a packet storage module 120 and a control module 130 . In this case, the system 100 may be used to apply the technical idea of the present invention to a previously built IPS (e.g., the intrusion prevention module 110 ).

On the other hand, in order to implement the technical idea of the present invention, the packet extraction module 160 may be further provided. The packet extraction module 160 may receive a plurality of packets from a network. The packet extraction module 160 may be installed at a predetermined location on the network, collect packets moving through the network, and distribute the packets to the intrusion prevention module 110 and the packet storage module 120 . Of course, the distributed packets may be the same. According to an example, the packet extraction module 160 may be implemented by including a device for tapping packets from a network.

The packet extraction module 160 may, for example, be located at the front end and/or rear end of a gateway existing on a predetermined local area network (LAN) to inspect the network according to the technical idea of the present invention. Then, the system 100 can control the network/traffic according to the inspection result. Controlling network/traffic may refer to artificial actions such as adjusting bandwidth and transmission speed or blocking transmission for each predetermined session, flow, and/or packet. The packet extraction module 160 may be implemented as, for example, a predetermined NIC (Network Interface Card), but is not limited thereto.

The packet extraction module 160 may transmit the received packet to both the intrusion prevention module 110 and the packet storage module 120 .

As will be described later, the data generation module 170 may generate behaviors, including a series of attack behavior patterns caused by a network attack such as a zero-day attack, as predetermined standardized data. According to an example, the standardized data may be a directed graph, but is not limited thereto. Various forms of standardized data may be defined, provided that they express a series of actions occurring in time series (e.g., information transmission, storage, distribution, deletion, etc.) and are suitable as input data for a deep learning model capable of extracting characteristic behaviors from this expression.

The operation and/or function of the data generating module 170 will be described later.

The intrusion prevention module 110 may perform a packet inspection in real time according to an applied security rule set, and selectively pass a packet according to the inspection result. That is, only packets having no problem as a result of the packet inspection can be selectively passed.

The applied security ruleset may be a part of the full set of security rules known at the time of inspection, that is, stored in the system 100 and applicable (the full rule set). A security ruleset includes a plurality of security rules, and each security rule may include a rule that serves as a criterion for packet inspection. For example, a security rule may be formed by combining at least one of conditions such as: a specific bitstream pattern exists in a packet, a certain value or text exists at a certain location in the packet, a packet is received on a certain port, and/or a packet's source or destination is a certain value. It goes without saying that, whenever a new security threat occurs, the type or characteristics of the new security threat and the corresponding security rule may be updated in the system 100 .
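To make the idea of a security rule as a combination of conditions concrete, the sketch below models a rule whose optional conditions must all hold for a packet to match. It is an illustration under assumed names (`Packet`, `SecurityRule`, `offset_value`, `inspect`) and does not reproduce the rule format of any particular IPS product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int
    payload: bytes


@dataclass(frozen=True)
class SecurityRule:
    """Every condition that is set must hold; unset conditions are ignored."""
    rule_id: str
    pattern: Optional[bytes] = None       # byte pattern anywhere in the payload
    offset_value: Optional[tuple] = None  # (offset, expected bytes at that offset)
    dst_port: Optional[int] = None
    src_ip: Optional[str] = None

    def matches(self, pkt):
        if self.pattern is not None and self.pattern not in pkt.payload:
            return False
        if self.offset_value is not None:
            offset, expected = self.offset_value
            if pkt.payload[offset:offset + len(expected)] != expected:
                return False
        if self.dst_port is not None and pkt.dst_port != self.dst_port:
            return False
        if self.src_ip is not None and pkt.src_ip != self.src_ip:
            return False
        return True


def inspect(pkt, ruleset):
    """Return the ids of all rules in the ruleset that the packet matches."""
    return [rule.rule_id for rule in ruleset if rule.matches(pkt)]


ruleset = [
    SecurityRule("R1", pattern=b"cmd.exe", dst_port=80),
    SecurityRule("R2", offset_value=(0, b"GET")),
]
pkt = Packet("198.51.100.2", "10.0.0.7", 80, b"GET /cmd.exe")
print(inspect(pkt, ruleset))  # ['R1', 'R2']
```

Note that in this sketch a rule with no conditions set matches every packet, so a real rule loader would presumably reject such a rule.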

On the other hand, the full rule set is generally known to include more than 20,000 security rules, and the number inevitably increases whenever a new security threat occurs. And in fact, it is almost impossible to inspect all the full rule sets in real time, no matter how high-performance the intrusion prevention module 110 or IPS is. In reality, although there is a difference in degree, the applied security ruleset is generally set to one tenth of the full ruleset. That is, in the conventional network security systems, only a part of the known full rule set is applied as the applied security rule set to perform real-time inspection, and the remaining security rule sets are not inspected.

Therefore, in the present specification, the security rules of the full rule set that remain outside the applied security rule set are referred to as an operational security gap. The operational security gap is an unavoidable security threat, but efforts have been needed to minimize it. One such effort has been for a security expert to study the trend of security threats at a given time and change the applied security rule set at regular intervals. However, hiring a security expert is costly, and even for an excellent security expert it was practically impossible to know exactly what security threats actually exist in the target network and to optimize the applied security ruleset accordingly.

This is because it is difficult to know what security threats actually exist beyond those covered by the security ruleset inspected in real time through the IPS. To know this, all packets would have to be stored separately and then inspected with either the full rule set or most of the security rules of the full rule set. However, as described above, it is very expensive to store all packets passing through the network, and it takes a very long time to perform packet inspection with the full ruleset after storing all these packets. Checking the inspection result only after such a long time and then changing the applied security rule set may be inefficient.

This problem can be solved by a technical idea that significantly reduces the number of packets to be stored while yielding the same inspection result as actually storing and inspecting all packets. By providing such a technical idea, the present invention can significantly reduce the amount of stored packets and still obtain high-quality inspection results. The stored packets can then be inspected quickly with the full rule set or with the security rules corresponding to the operational security gap (e.g., the full rule set excluding the applied security ruleset), and the applied security ruleset can be changed adaptively based on the result, effectively enhancing the security of the network. This can be achieved by a technical idea for selecting only the initial N (N is a natural number) packets of a specific session very quickly and a technical idea for selectively storing the initial N packets of the selected session. These technical ideas will be described later.

On the other hand, when a new security threat occurs, it cannot be prevented no matter how well the applied security rule set is optimized. That is, when a new attack occurs, a security rule (detection rule) corresponding to the new attack is inevitably generated only after a certain period of time. This interval, during which the new security threat has no corresponding rule, is defined as a 'security gap in time' in this specification, and an attack occurring in this security gap in time is called a zero-day attack.

In addition, a technical idea for minimizing such a security gap in time can also be made possible by storing only efficiently selected packets as described above and searching for the stored packets at high speed. These technical ideas will also be described later.

The packet storage module 120 may selectively extract at least some of the plurality of packets received from the packet extraction module 160 .

Then, the control module 130 may perform a packet inspection on the storage packet stored by the packet storage module 120 . The packet inspection of the stored packet may be a process of performing packet inspection by setting at least one inspection rule other than the applied security rule set. Of course, it goes without saying that packet inspection may be performed on the full rule set at the time of packet inspection according to an implementation example. Also, the applied security ruleset may be applied when checking the stored packet. Alternatively, the packet inspection may be performed only on the remaining security rules excluding the applied security rule set among the full rule sets.

The packet inspection of these stored packets may take place a predetermined time after the packet inspection performed by the intrusion prevention module 110 . Therefore, the full rule set at the time of the packet inspection performed by the intrusion prevention module 110 and the full rule set at the time of the packet inspection performed by the control module 130 on the stored packets may differ. Accordingly, in one embodiment, when the full rule set at the time of the inspection of the stored packets contains a new security rule that was absent from the full rule set at the time of the inspection by the intrusion prevention module 110 , the control module 130 may change the applied security rule set so that the new security rule is set as an inspection rule for the stored packets.
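The retrospective inspection described in this paragraph can be sketched as follows: only the rules that were added to the full rule set after the real-time inspection are applied to the stored packets. This is a simplified illustration; the function name `regression_check`, the dictionary-based packet records, and the use of a single byte signature per rule are all assumptions made for the example.

```python
def regression_check(stored_packets, rules_then, rules_now):
    """Inspect stored packets with rules added after the real-time inspection.

    rules_then / rules_now map a rule id to a byte signature (an assumed,
    simplified rule format). Returns (rule_id, packet) pairs for every hit.
    """
    # Only rules absent from the earlier full rule set count as new.
    added = {rid: sig for rid, sig in rules_now.items() if rid not in rules_then}
    hits = []
    for pkt in stored_packets:
        for rule_id, signature in added.items():
            if signature in pkt["payload"]:
                hits.append((rule_id, pkt))
    return hits


rules_then = {"R1": b"old-worm"}
rules_now = {"R1": b"old-worm", "R2": b"new-exploit"}
stored = [
    {"ts": 100.0, "src": "203.0.113.5", "dst": "10.0.0.7", "payload": b"..new-exploit.."},
    {"ts": 130.0, "src": "10.0.0.8", "dst": "10.0.0.9", "payload": b"hello"},
]
hits = regression_check(stored, rules_then, rules_now)
```

A hit indicates that traffic matching the new rule passed through the network before the rule existed, which is the situation the specification calls a zero-day penetration.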

As a result, the control module 130 performs packet inspection on the stored packets by setting, as an inspection rule, at least one security rule not included in the applied security rule set, and the currently applied security rule set can be changed according to the result of the packet inspection. Hereinafter, for convenience of description, a case in which the control module 130 inspects the stored packets using the full rule set will be described as an example.

When the control module 130 changes the applied security ruleset, this may mean that at least one security rule differs between the applied security ruleset before the change and the applied security ruleset after the change. Therefore, it goes without saying that security rules included in the applied security ruleset before the change may still be included after the change.

Also, when the control module 130 changes the applied security rule set, it may be a process of replacing the security rules included in the applied security rule set while maintaining the number of security rules included in the applied security rule set.

However, depending on the implementation, the control module 130 may also change the number of security rules to be included in the applied security rule set. For example, the control module 130 may perform a packet inspection on the storage packet at regular intervals, and as a result, if it is determined that the security threat is increasing compared to the existing one, the number of security rules may be increased. Of course, on the contrary, the number of security rules to be included in the applied security rule set may be reduced according to the packet inspection result of the stored packet.

The control module 130 may cause the intrusion prevention module 110 to perform packet inspection using the changed applied security rule set as its inspection rules. For example, when the intrusion prevention module 110 stores the applied security rule set itself, information on the changed applied security rule set may be transmitted to the intrusion prevention module 110 . Alternatively, the control module 130 may store information on the changed applied security rule set in a predetermined storage location, and the intrusion prevention module 110 may perform packet inspection using the applied security rule set stored in that location as its inspection rules. The control module 130 may also change the applied security rule set in various other ways.

The DB 140 may mean an information storage means in which information necessary to implement the technical idea of the present invention is stored. Needless to say, the DB 140 may store information on the storage packet stored by the packet storage module 120 as described above, flow information, session information, etc. to be described later. In addition, information on the applied security ruleset may be stored. It is sufficient if the DB 140 is implemented as a storage means in which other information necessary to implement the technical idea of the present invention is stored. In addition, the DB 140 does not need to be implemented with only one physical storage device, but may be implemented with a plurality of physical storage devices. Also, according to an embodiment, the DB 140 may be implemented as a physical device separate from the system 100 , and the system 100 may access the DB 140 through a network.

The packet search module 150 may perform a function of searching for packets stored in the DB 140 . In this case, as will be described later, a high-speed drill-down search can be made possible. Such a high-speed packet search has the effect of reducing the security gap in time.

Referring to FIG. 2 , in the network security method according to an embodiment of the present invention, the system 100 may selectively store input traffic in an efficient manner ( S100 ). In this selective storage, a technical idea of detecting a session and extracting only the initial N preceding packets of the session may be used. Also, as will be described later, a technical idea of using flow information to quickly extract the initial N preceding packets of a session may be provided. The selective storage of such input traffic may be performed by the packet storage module 120 .
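The selective storage step (S100) can be illustrated with a small store that keeps only the first N packets of each session and discards the rest. This is a sketch under stated assumptions: the class name `FirstNPacketStore`, the `offer` method, and the direction-independent session key built from the 5-tuple are hypothetical choices, and the specification's own flow-based selection method is described elsewhere.

```python
from collections import defaultdict


def session_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Direction-independent key so both directions of a session share one entry."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (proto,) + (a + b if a <= b else b + a)


class FirstNPacketStore:
    """Keeps only the first n packets observed for each session."""

    def __init__(self, n):
        self.n = n
        self.stored = defaultdict(list)

    def offer(self, src_ip, src_port, dst_ip, dst_port, proto, payload):
        """Store the packet if its session has fewer than n stored; report whether it was kept."""
        packets = self.stored[session_key(src_ip, src_port, dst_ip, dst_port, proto)]
        if len(packets) >= self.n:
            return False
        packets.append(payload)
        return True


store = FirstNPacketStore(n=2)
results = [
    store.offer("10.0.0.7", 50000, "198.51.100.2", 80, "TCP", b"p1"),
    store.offer("198.51.100.2", 80, "10.0.0.7", 50000, "TCP", b"p2"),
    store.offer("10.0.0.7", 50000, "198.51.100.2", 80, "TCP", b"p3"),
]
print(results)  # [True, True, False]
```

Because the per-session check is a dictionary lookup and a length comparison, the decision to keep or drop a packet stays cheap regardless of how long the session runs.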

Meanwhile, an intrusion prevention process may be performed in real time by the system 100 (S100-1). As described above, such intrusion prevention can be achieved by selectively passing packets through real-time packet inspection according to the applied security ruleset. The system 100 may select some security rule sets selected by a predetermined method among the full rule sets as the applied security rule sets in the intrusion prevention process, and apply them as inspection rules of the intrusion prevention process. This initial applied security rule set may be made arbitrarily or by a predetermined security officer.

Then, the system 100 may perform a packet inspection on the stored packet (S110). The full rule set may be applied to the packet inspection for the stored packet as described above, but is not limited thereto. At least one security rule not included in the applied security rule set currently applied to the intrusion prevention process may be used as an inspection rule in packet inspection for the stored packet.

The packet inspection for the stored packets may be performed in a predetermined cycle (eg, daily, weekly, etc.). The shorter this cycle is, the more quickly operational security gaps can be responded to. In addition, in order to shorten the cycle, efficient packet storage may be required so that the quality of packet inspection is maintained while the amount of stored packets remains small.

Then, the system 100 may change the applied security rule set according to the inspection result (S120). For example, when the packet inspection process (S110) on the stored packets detects a packet that was not detected by the currently applied security rule set, the applied security rule set may be changed to include a security rule capable of detecting that packet.
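Steps S110 and S120 can be illustrated with a minimal sketch. It assumes, purely for illustration, that a security rule is a byte signature matched as a substring of a stored packet payload; the function name `rules_to_promote` and the data layout are hypothetical and are not taken from the specification.

```python
def rules_to_promote(stored_payloads, full_rules, applied_ids):
    """Return ids of rules outside the applied set that match a stored packet.

    stored_payloads: list of bytes (payloads of the stored packets)
    full_rules: dict mapping rule id -> bytes signature (assumed rule format)
    applied_ids: set of rule ids currently used by the intrusion prevention process
    """
    promote = []
    for rule_id, signature in full_rules.items():
        if rule_id in applied_ids:
            continue  # already enforced in real time, nothing new to learn
        if any(signature in payload for payload in stored_payloads):
            promote.append(rule_id)  # S120: candidate for the applied rule set
    return promote
```

A rule returned here detected traffic that the real-time process let through, so it is a candidate for inclusion in the applied security rule set.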

An example of a method for changing the security rules is a least recently used (LRU) method. That is, the security rule that has gone unused for the longest time (ie, has not been used to detect a packet to be filtered) may be excluded from the applied security rule set, and a new security rule may be included in its place. However, a person of ordinary skill in the technical field of the present invention can easily infer that the specific method of changing the applied security rule set may vary.
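The LRU replacement described above can be sketched as follows. This is only one possible realization under the assumption that the applied rule set has a fixed capacity; the class name `AppliedRuleSet` and its methods are hypothetical.

```python
from collections import OrderedDict


class AppliedRuleSet:
    """Fixed-capacity applied rule set with least-recently-used replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._rules = OrderedDict()  # rule id -> rule body, least recently used first

    def mark_used(self, rule_id):
        # A rule that just detected a packet becomes the most recently used.
        if rule_id in self._rules:
            self._rules.move_to_end(rule_id)

    def add(self, rule_id, rule):
        """Add a rule and return the id of the evicted rule, or None."""
        evicted = None
        if rule_id in self._rules:
            self._rules.move_to_end(rule_id)
        elif len(self._rules) >= self.capacity:
            # Drop the rule that has gone unused for the longest time.
            evicted, _ = self._rules.popitem(last=False)
        self._rules[rule_id] = rule
        return evicted

    def rule_ids(self):
        return list(self._rules)
```

Calling `mark_used` whenever a rule detects a packet keeps frequently matching rules in the set, so only rules that no longer match real traffic are displaced by new ones.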

Meanwhile, the system 100 may adaptively adjust the number of security rules to be included in the applied security rule set and/or the packet inspection period for the stored packets. Such adjustment may be adaptively adjusted according to the strength of the security threat (eg, the number of detected attacks or packets, etc.) according to the result of the packet inspection process S110 for the stored packet.

For example, it may be desirable to adjust the number of security rules included in the applied security rule set within a range that does not seriously degrade network performance. In addition, the adjustment of the packet inspection period for the stored packet can be adjusted to be short when it is determined that the strength of the security threat is strong, and can be adjusted to be long in the opposite case.
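The adjustment of the inspection period can be sketched as below. The text only states that the period becomes shorter when the threat is strong and longer otherwise; the halving and doubling policy, the threshold, and the bounds are illustrative assumptions.

```python
def next_inspection_period(current_hours, detections, threshold=10,
                           min_hours=1.0, max_hours=168.0):
    """Return the next inspection period in hours.

    detections: number of attacks or packets detected in the last inspection.
    threshold, min_hours and max_hours are hypothetical tuning parameters.
    """
    if detections >= threshold:
        proposed = current_hours / 2  # strong threat: inspect more often
    else:
        proposed = current_hours * 2  # weak threat: inspect less often
    # Keep the period inside the allowed range.
    return max(min_hours, min(max_hours, proposed))
```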

When the applied security ruleset is changed, the intrusion prevention process may be performed according to the changed applied security ruleset.

As a result, according to the technical idea of the present invention, an adaptive security policy can be implemented according to the security threats actually occurring in the network while minimizing the operational security gap.

Meanwhile, the concept of selectively storing packets by the packet storage module 120 will be described as follows. According to the technical idea of the present invention, the packet storage module 120 can extract and store only N preceding packets of each session at high speed. To this end, a technical idea of extracting the initial N preceding packets of a session using a flow is provided.

The packet storage module 120 may include a flow creation module 121 and a session creation module 122 .

The flow generation module 121 may generate a plurality of flows based on the packets received by the packet extraction module 160. The packet extraction module 160 may sequentially output packets to the flow generation module 121, and the flow generation module 121 may then generate a flow. Creating a flow may mean generating flow information, as will be described later. Depending on the embodiment, the flow generation module 121 may optionally extract the packets included in the flow and store them in the DB 140. The flow generation module 121 may store all packets corresponding to a predetermined flow, but as will be described later, depending on the implementation, only the initial few packets of the session including the flow may finally be stored in the DB 140.

Of course, the flow generation module 121 may temporarily store a flow and all packets included in the flow in the DB 140, and the session creation module 122 may selectively keep only some of the stored packets and delete the rest.

As used herein, a flow means a set of IP packets that are continuously delivered within a limited time. Thus, an IP flow can be defined as a flow of IP packets that are continuously delivered within a limited time and that are specified by an application's address pair (sender address, sender port number, receiver address, receiver port number), a host pair (sender network address, receiver network address), an AS number pair (sender AS number, receiver AS number), or the like. Since the concept of such a flow and a method of forming the flow are disclosed in detail in the above-mentioned prior art document, a detailed description thereof will be omitted herein. In addition, with respect to the concept of a flow and a method of generating a flow in the present specification, the technical ideas and descriptions disclosed in the aforementioned prior art documents are included as references in the present specification, and may be treated as being included in the description of the present specification.

A 5-tuple may be used as an example of the packet attributes for generating a flow. That is, the flow generation module 121 may receive packets on the network as input and generate a flow, which is a continuous set of packets, or may extract some of the packets forming the flow. As a condition for generating a flow or detecting a packet of a flow, the attributes of the packets (eg, the 5-tuple (Source Address, Destination Address, Source Port, Destination Port, Protocol)) are compared: if no packet having the same attribute value (eg, 5-tuple value) exists, a new flow is created, and if a packet having the same value exists, the flow information of the corresponding flow can be updated.

A contiguous set of packets does not necessarily mean physically consecutive packets, but may mean packets that have the same attributes and arrive within a limited time.

The flow information includes the 5-tuple information of a packet, and may include a flow size (Flow Size), a duration (Duration), that is, a start time (S.T) and an end time (E.T) of the flow, a packet count (Packet Count, P.C), an average packet size (Average Packet Size), an average rate (Average Rate), a flag (eg, a special signal of a protocol, such as SYN or FIN), and the like. The flow information may be output to the DB 140 and stored. The flow generation module 121 may store, in the DB 140, the flow information on a predetermined flow and the packets included in the flow so that they correspond to each other. This process may be defined as the flow generation module 121 generating a flow. For example, the flow information and the packets included in the flow may be stored so as to be physically continuous, or may be stored in various forms that can be easily searched even though they are physically separated, such as a table or a link.
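The flow generation rule and the flow information fields described above can be sketched as follows. The type names, the in-memory representation, and the idle `timeout` used to model "a limited time" are illustrative assumptions; the specification does not prescribe them.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: str


@dataclass
class Packet:
    key: FiveTuple
    timestamp: float
    size: int
    flags: frozenset = frozenset()  # eg {"SYN"} or {"FIN"}


@dataclass
class FlowInfo:
    key: FiveTuple
    start_time: float  # S.T
    end_time: float    # E.T
    packet_count: int = 0  # P.C
    flow_size: int = 0
    flags: set = field(default_factory=set)
    packets: list = field(default_factory=list)

    @property
    def average_packet_size(self):
        return self.flow_size / self.packet_count


def build_flows(packets, timeout=60.0):
    """Group packets into flows by 5-tuple, starting a new flow after an idle gap."""
    active = {}  # 5-tuple -> most recent FlowInfo with that 5-tuple
    flows = []   # every flow, in creation order
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        flow = active.get(pkt.key)
        if flow is None or pkt.timestamp - flow.end_time > timeout:
            # No flow with the same 5-tuple within the time limit: create one.
            flow = FlowInfo(pkt.key, pkt.timestamp, pkt.timestamp)
            active[pkt.key] = flow
            flows.append(flow)
        # A flow with the same 5-tuple exists: update its flow information.
        flow.end_time = pkt.timestamp
        flow.packet_count += 1
        flow.flow_size += pkt.size
        flow.flags |= pkt.flags
        flow.packets.append(pkt)
    return flows
```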

Some of the packets stored in this way may be deleted based on session information generated by the session creation module 122. That is, packets other than the initial N preceding packets of the session may be deleted. Therefore, depending on the implementation, only flow information may be stored for a specific flow, and the packets corresponding to that flow may not be stored.

When a plurality of flows are stored in a storage device (eg, the DB 140) by the flow creation module 121, that is, when a plurality of flows are generated, the session creation module 122 may create a session based on the information about the flows. Creating a session may mean extracting the flows forming the same session from among the plurality of generated flows, generating session information including identification information for the extracted flows, and storing it in the DB 140. It may also include storing the N preceding packets among the packets included in the session so that they correspond to the session information. The process of storing the preceding packets to correspond to the session information may refer to a process of deleting, from among the packets already stored by the flow generating module 121, the packets other than the preceding packets. Alternatively, the session information and the preceding packets may be stored separately. In this case, the preceding packets may be stored in duplicate.

The concept of creating a session by the session creation module 122 will be described with reference to FIG. 3 .

FIG. 3 is a diagram for explaining a session, a flow, and a packet for a method for providing a network inspection system according to an embodiment of the present invention.

Referring to FIG. 3 , when a session S is formed between predetermined devices, the session S may consist of at least one flow F. In addition, each of the at least one flow may be composed of at least one packet (P).

According to the technical idea of the present invention, the system 100 may collect packets passing through a point on a predetermined network. This may be performed by the packet extraction module 160 .

In addition, the system 100 may generate a flow based on packet attributes (eg, 5-tuple, etc.) of the collected packets. The flow generation method is as described above. Generation of such a flow may be performed by the flow generation module 121 . Each flow may consist of only one packet or may consist of a plurality of packets. Also, a flow size may be different for each flow.

When the flow is generated in this way, the session creation module 122 may create a session. In addition, the session creation module 122 may selectively store some or all of the plurality of packets in the storage device or the DB 140 based on the generated session.

To this end, the system 100 may provide at least one packet storage mode.

The packet storage modes provided according to the technical idea of the present invention may include at least a mode for storing only the initial N packets of a session. According to an embodiment, a mode for storing all or a part (eg, N) of the packets forming a session only for a predetermined type of session may be provided. According to an implementation example, a mode for storing all packets included in a session (all sessions or a predetermined type of session) may be provided. In each mode, the system 100 stores packets on a session basis according to the session information generated by the session creation module 122, rather than storing packets randomly. Examples of such packet storage modes will be described later with reference to FIG. 6.

To create a session, the session creation module 122 may check the flow information stored in the DB 140. Flows included in the same session may have common characteristics. Accordingly, the session creation module 122 may search for flows having the common characteristic among the flows stored in the DB 140. In addition, it is possible to determine the temporal order of the flows based on the flow information (eg, the S.T and E.T information included in the flow information). The session creation module 122 may identify the first flow and the last flow of the session based on the flag information included in the flow information of each session formation flow.

Accordingly, the session creation module 122 may extract at least one flow included in a specific session, that is, a session formation flow. The session formation flow may be one flow or may include a plurality of flows.
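Session creation from stored flows can be sketched as follows. The sketch assumes that the "common characteristic" is a direction-independent endpoint pair, that a FIN flag marks the end of a session, and that flows are plain in-memory records; all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Flow:
    flow_id: int
    endpoints: frozenset  # {(addr, port), (addr, port)}, independent of direction
    start_time: float     # S.T
    end_time: float       # E.T
    flags: set            # eg {"SYN", "FIN"}
    packets: list         # (timestamp, payload) tuples


@dataclass
class Session:
    flow_ids: list = field(default_factory=list)  # indexes of session formation flows
    preceding_packets: list = field(default_factory=list)
    closed_at: Optional[float] = None  # end time of the latest flow carrying FIN


def build_sessions(flows, n):
    """Group flows into sessions and keep only the first n packets of each session."""
    sessions = []
    latest = {}  # endpoints -> most recent session for that endpoint pair
    for flow in sorted(flows, key=lambda f: f.start_time):
        session = latest.get(flow.endpoints)
        finished = (session is not None and session.closed_at is not None
                    and flow.start_time > session.closed_at)
        if session is None or finished:
            session = Session()
            sessions.append(session)
            latest[flow.endpoints] = session
        session.flow_ids.append(flow.flow_id)
        session.preceding_packets.extend(flow.packets)
        if "FIN" in flow.flags:
            if session.closed_at is None or flow.end_time > session.closed_at:
                session.closed_at = flow.end_time
    for session in sessions:
        # Order packets in time and delete everything after the first n.
        session.preceding_packets.sort(key=lambda p: p[0])
        del session.preceding_packets[n:]
    return sessions
```

Because both directions of a connection share the same endpoint pair, the forward and reverse flows land in one session, and the N preceding packets are taken across all flows of that session rather than per flow.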

As such, the system 100 according to the technical idea of the present invention does not generate only flows but creates a session based on the generated flows, because the important characteristics of a session can all be understood from the initial N (N is a natural number) preceding packets of the session. Therefore, compared to the prior art of storing and inspecting (eg, DPI) all collected packets, or storing and inspecting a certain number of preceding packets for each flow, the characteristics of a given application or other desired information can be checked by inspecting a smaller number of packets. In general, it is known that inspecting only the first five or fewer packets of a session yields no significant difference in inspection quality compared to inspecting all packets included in the session.

Of course, as described above, at least one packet storage mode is provided according to network characteristics or security strength, and the session creation module 122 may store packets according to the currently set mode among the at least one packet storage mode.

In addition, according to the technical idea of the present invention, the amount of packets to be inspected in the packet inspection process for stored packets can be reduced. Storage space can therefore also be saved.

In addition, when a flow is generated from packets and a session is created using the generated flows, as in the technical idea of the present invention, high-speed packet search is possible even when a specific service user searches for a packet. That is, the system 100 may store all collected packets rather than only the initial N preceding packets of the session. A high-speed drill-down search is then possible by searching for a session, searching the found session for the flow corresponding to the desired packet, and then searching for the packet based on the found flow. This is because, when only flows are created, a packet may in the worst case be found only after searching as many times as there are flows, whereas when sessions are created, the session is searched first and then only the flows and packets belonging to that session need to be searched. Of course, even when only the initial N preceding packets are stored, this effect still exists. Also, a service user who wants to search for a packet may know the session information but may not know the flow information. Accordingly, when a session is created as in the technical idea of the present invention, efficient and high-speed packet searching is possible in the network recording service.

According to an embodiment, the session creation module 122 included in the system 100 may store M packets, that is, storage packets, which are more than N preceding packets, among the collected packets. Even in this case, the system 100 may perform packet inspection only on the preceding packet. In addition, by storing M stored packets, it is possible to increase the possibility that a desired packet is found not only for packet inspection but also for packet search. M may be adaptively set according to the type of service, the request of the service user, or the type of application in which the session is used.

Referring back to FIG. 1 , the session generating module 122 may create a session based on a plurality of flows generated by the flow generating module 121 . That is, session information can be generated.

The session information may include at least an index (identification information) of at least one flow included in the session, that is, each of the session forming flows. In addition, various pieces of information indicating characteristics of the session may be further included in the session information.

As described above, high-speed packet searching may be possible through the generation of the session information, and only the initial N preceding packets of the session may be specified through the generation of the session.

A conceptual structure in which packets of the present invention are stored will be described with reference to FIG. 4 as follows.

FIG. 4 is a diagram for explaining the concept of performing a packet search according to a method for providing a network inspection system according to an embodiment of the present invention.

Referring to FIG. 4 , the session creation module 122 may create a predetermined session as described above. As shown in FIG. 4 , the session information generated through the session creation may include at least identification information of a session formation flow included in the session.

In addition, the session information may further include information on the 5-tuple of the session, a start time (S.T) and an end time (E.T), a packet count (P.C), a session size (S.S), and the like.

The packet search module 150 included in the system 100 may first search for a session corresponding to the packet search request in response to a packet search request received from a service user's terminal (not shown). Of course, the packet search request may include at least one piece of information included in the session information. For example, a sender address, a receiver address, and time information may be included in the packet search request.

Then, the packet search module 150 may search for a flow corresponding to the packet search request by searching for flow information of each of the session forming flows included in the session information. And when a flow corresponding to the packet search request is found, the packet search module 150 can easily search for a packet corresponding to the packet search request from the DB 140 . Of course, depending on the implementation, when the system 100 stores only the preceding packet, the packet corresponding to the packet search request may not exist. Also, when all packets are stored, it may be guaranteed that the packet corresponding to the packet search request is searched.
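The search order of session, then flow, then packet can be sketched as follows. The dictionary layout of the session information, the flow information, and the stored packets is a hypothetical simplification, and the search request is assumed to carry a sender address, a receiver address, and a time range.

```python
def drill_down_search(sessions, flows, packets_by_flow, src, dst, t_from, t_to):
    """Find stored packets by narrowing the search session -> flow -> packet.

    sessions: list of dicts with 'src', 'dst', 'start', 'end', 'flow_ids'
    flows: dict mapping flow id -> dict with 'start' and 'end'
    packets_by_flow: dict mapping flow id -> list of (timestamp, payload)
    """
    def overlaps(item):
        # True when the item's lifetime intersects the requested time range.
        return item["start"] <= t_to and item["end"] >= t_from

    hits = []
    for session in sessions:
        if {session["src"], session["dst"]} != {src, dst} or not overlaps(session):
            continue  # step 1: discard sessions that cannot contain the packet
        for flow_id in session["flow_ids"]:
            if not overlaps(flows[flow_id]):
                continue  # step 2: discard flows outside the time range
            # step 3: only now look at individual packets
            hits.extend(p for p in packets_by_flow.get(flow_id, [])
                        if t_from <= p[0] <= t_to)
    return hits
```

Only the flows listed in a matching session are examined, so the number of flow comparisons is bounded by the size of that session rather than by the total number of flows.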

As a result, the technical idea of the present invention has the effect of enabling a high-speed drill-down search in the order of session, flow, and packet when a packet is searched after flows are created from packets and sessions are created from the flows.

Such a high-speed search has the effect of shortening the temporal security gap. That is, it is very important to be able to respond quickly to a target network or target system that has been attacked during the temporal security gap, and for this purpose the search for packets relating to that gap must be performed quickly.

Meanwhile, for a packet obtained as a result of such a search, the control module 130 may extract, through deep packet inspection (DPI), protocol information and related metadata of the packet (a URL, or information contained in other packet attributes such as the 5-tuple).

By using this extracted information, a series of actions made through a zero-day attack can be defined as a standardized data model. When a series of actions made by a zero-day attack (actions made by the packets used in the zero-day attack) is expressed as predetermined standardized data in this way, a predetermined deep learning model can be trained on the data so that the deep learning model learns the features or characteristics of the actions made in a zero-day attack.

Then, in addition to detecting network threats by packet inspection, when a specific action is later performed, it can be determined that the action is highly likely to be caused by a network threat, thereby greatly improving network security.

In addition, according to the technical idea of the present invention as described above, since only the preceding N packets are stored, packets covering a relatively long period can be stored, and when a new security rule (detection rule) is updated, the new security rule can be used to check whether the stored packets were used in a zero-day attack.

Through this inspection, the packets used in the zero-day attack and the actions performed in the network (or network equipment) attacked by these packets can be identified and expressed as (generated into) a standardized data model that is easy for the deep learning model to learn. In this case, there is an effect that training data for deep learning is generated with labels attached automatically.

As such, the method of learning the behavior pattern of a zero-day attack through a deep learning model and using it for network security will be described in detail later.

Referring back to FIG. 1 , the control module 130 may perform a packet inspection on the packets stored by the session creation module 122 . According to an example, the session creation module 122 may store only the preceding packets for each session in the storage device or the DB 140, and in this case, may perform packet inspection on the preceding packets of each session. There may be various methods for performing packet inspection, for example, a conventional Deep Packet Inspection (DPI) or the like may be used.

In addition, the data generation module 170 may generate, as a standardized data model, an action pattern of a series of actions caused by a network threat (including a zero-day attack) and identified through packet inspection. To this end, the control module 130 may extract from a packet, through packet inspection, the information necessary to generate the standardized data, for example, a source address, a destination address, a protocol, and the like, and transmit it to the data generation module 170.

According to an example, when a security rule capable of detecting a zero-day attack is updated, the control module 130 may extract packets used for the zero-day attack through packet inspection by using the security rule.

Then, the data generation module 170 may detect the actions (eg, transmission, deletion, change of information, etc.) of the device or devices that have been subjected to the zero-day attack (eg, a host existing in the attacked network), and may generate the detected actions as a standardized data model, that is, standardized data.

Of course, not all actions of the devices subjected to the zero-day attack may be generated by the zero-day attack, and there may be cases in which a normal operation is performed. Accordingly, the data generation module 170 may generate standardized data for an action of at least one device that has been attacked and a series of actions of devices connected to the device. And among them, both behaviors resulting from a zero-day attack and normal behaviors may be included.

The data generation module 170 may generate standardized data not only for the device subjected to the zero-day attack but also for a plurality of devices and a plurality of actions made through their communication, and among these data, may label the standardized data corresponding to the actions performed after the zero-day attack was received as corresponding to the zero-day attack.
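The automatic labeling described above can be sketched as follows. The record layout and the rule "an action of an attacked device at or after the time it received the zero-day packet is labeled 1" are illustrative assumptions; the function name `label_records` is hypothetical.

```python
def label_records(records, attack_times):
    """Attach a label to each standardized record.

    records: list of dicts with 'device', 'time' and 'action'
    attack_times: dict mapping device -> time at which the retrospective
        inspection found that the device received a zero-day packet
    Returns a list of (record, label) pairs; label 1 marks post-attack behavior.
    """
    labeled = []
    for rec in records:
        attacked_at = attack_times.get(rec["device"])
        is_post_attack = attacked_at is not None and rec["time"] >= attacked_at
        labeled.append((rec, 1 if is_post_attack else 0))
    return labeled
```

As the surrounding text notes, behavior labeled 1 may still include normal operations of the attacked device, so the labels mark candidates for the model to separate rather than confirmed malicious actions.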

Then, the control module 130 may learn from the standardized data labeled as having received the zero-day attack and from the standardized data not so labeled, and may thereby generate a behavior pattern determination model that can distinguish the characteristic behavior patterns of the device or devices that have been subjected to the zero-day attack.

This behavior pattern determination model may be a deep learning model trained to receive and learn a plurality of labeled standardized data as input data to determine characteristic behavior patterns of only devices that have been subjected to a zero-day attack.

As such a deep learning model, a model such as an RNN or LSTM that can learn sequential data from the past may be used, and the normalized data (eg, a directed graph) may be used as the input data of such a deep learning model.
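One way to picture a directed graph as normalized data is sketched below. It assumes that each observed action is reduced to a (source address, destination address, protocol) triple, as extracted by packet inspection; the nested dictionary representation is a hypothetical choice, and how the graph is encoded for an RNN or LSTM is left open.

```python
from collections import defaultdict


def to_directed_graph(events):
    """Build a directed graph from (src_addr, dst_addr, protocol) events.

    Returns a nested dict: source -> destination -> protocol -> occurrence count.
    """
    graph = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for src, dst, protocol in events:
        graph[src][dst][protocol] += 1  # one directed edge src -> dst per event
    # Convert to plain dicts so the result is easy to store or compare.
    return {src: {dst: dict(protos) for dst, protos in dsts.items()}
            for src, dsts in graph.items()}
```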

After the behavior pattern determination model generated by the control module 130 has been trained using the learning data generated by the data generation module 170, when a device that receives a predetermined packet exhibits a predetermined behavior pattern, the model can determine whether that behavior pattern has the characteristics of a zero-day attack.

To this end, the data generation module 170 may generate standardized data corresponding to action patterns generated by packets passing through the network in real time, and input the generated standardized data into the action pattern determination model. Then, the behavior pattern determination model may determine whether the input standardized data is a behavior pattern corresponding to a zero-day attack.

As a result, according to the technical idea of the present invention, after a security rule corresponding to a zero-day attack becomes known and is updated in the security device, not only can the security measure of checking through packet inspection whether malicious code corresponding to the zero-day attack exists be applied, but the unique behavior patterns corresponding to the zero-day attack can also be examined separately.

In general, even if a specific malicious code is known, all malicious attack actions performed by that malicious code may not be known, and new types of attacks, such as a second or third zero-day attack, may occur. There may be no countermeasure until security rules corresponding to such malicious code are developed.

However, when the behavior patterns of multiple malicious codes are learned according to the technical idea of the present invention, even if a new zero-day attack occurs for which no inspection rule, that is, no detection rule, exists, and packet inspection itself cannot prevent the attack, it is possible to know from the behavior pattern whether a behavior pattern similar to that of other known malicious attacks has occurred, so there is an effect of greatly enhancing security.

Meanwhile, it goes without saying that the packet inspection result of the control module 130 may be stored in the DB 140 . In addition, since packet inspection can be performed only on the preceding packet by the control module 130 , the packet inspection for the session may be completed in real time before the session is terminated.

In addition, according to the technical idea of the present invention as described above, when only a preset number of preceding packets are stored for each session, current network packets can be inspected in real time, and past packets (that is, pre-stored preceding packets) can also be inspected. In other words, there is an effect that the past state of the network can be inspected retrospectively.

Meanwhile, according to the technical idea of the present invention, as described above, the system 100 may be used for a network recording service. For network recording, all collected packets had to be stored in the prior art, but according to the technical idea of the present invention, a session is formed and only the initial N preceding packets of the session are stored, so that the amount of stored packets can be significantly reduced while important information is still stored. Of course, M storage packets may be stored according to the needs of the service. Even in this case, there is an effect of saving storage compared to collecting and storing all packets.

Also, according to the technical idea of the present invention, the system 100 may store only packets corresponding to a predetermined type of session. For example, the system 100 may perform network recording only for a predetermined session, such as an HTTP or TCP session.

As such, the function of performing network recording only for a predetermined session may be performed by the flow generating module 121 or by the session generating module 122. For example, the flow generating module 121 may generate flows only for packets corresponding to a predetermined session among the packets collected by the packet extraction module 160. Alternatively, after the flow generating module 121 generates flows for all packets, the session generating module 122 may delete from the DB 140 the flows that do not correspond to a predetermined session among the generated flows.

Whether a session corresponds to the predetermined session may be determined based on the port information of its packets. That is, a port number may be bound to each type of session, and whether a packet or a flow corresponds to a predetermined session may be determined based on the port number.
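Port-based selection of the sessions to record can be sketched as follows. The binding table uses the well-known ports 80 (HTTP) and 443 (HTTPS) only as an example; the table, the function names, and the default selection are hypothetical.

```python
# Example binding of port numbers to session types (assumed for illustration).
PORT_BINDINGS = {80: "HTTP", 443: "HTTPS"}


def session_type(src_port, dst_port, bindings=PORT_BINDINGS):
    """Return the session type bound to either port, or None if unbound."""
    for port in (dst_port, src_port):  # the server port may be on either side
        if port in bindings:
            return bindings[port]
    return None


def should_record(src_port, dst_port, wanted=("HTTP",)):
    """True when the packet or flow belongs to a session type selected for recording."""
    return session_type(src_port, dst_port) in wanted
```

The same check can be applied per packet in the packet extraction module 160, per flow in the flow generating module 121, or per session in the session generating module 122, depending on where the filtering is placed.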

According to an embodiment, the packet extraction module 160 may transmit only packets corresponding to a predetermined session to the flow generation module 121 .

In any case, the system 100 may perform network recording only for a predetermined session.

As a result, according to the technical idea of the present invention, the absolute amount of stored packets can be reduced compared to the conventional network recording, and network recording can be performed only for a desired session.

This may be conceptually shown in FIG. 5 .

FIG. 5 is a diagram for explaining an effect of a method for providing a network inspection system according to an embodiment of the present invention.

Referring to FIG. 5 , a horizontal axis of a rectangle conceptually represents a session size, and a vertical axis conceptually represents sessions. Accordingly, the rectangle 10 shown in FIG. 5 may mean the amount of stored packets when all the collected packets are stored.

The system 100 according to the technical idea of the present invention does not store all packets included in a specific session, but only the N preceding packets (or M storage packets), which has the effect of reducing the amount of packets stored for each session.

In addition, the system 100 according to the technical idea of the present invention does not store packets for all sessions but can store packets only for sessions of a predetermined type, so that for some sessions (corresponding to D) packet storage may not be performed at all.

As described above, according to the technical idea of the present invention, high-speed packet search is possible by selectively storing only packets meaningful for packet inspection while reducing the absolute amount of packets to be stored. At the same time, as described above, there is an effect that a high-speed packet search is possible through the drill-down search in the order of session information and flow information.

FIG. 6 is a diagram for explaining a plurality of packet storage modes through a method for providing a network inspection system according to an embodiment of the present invention.

Referring to FIG. 6, the horizontal axis of the rectangle conceptually represents the session size, and the vertical axis conceptually represents the sessions. Accordingly, the rectangle 10 shown in FIG. 6 may mean the amount of stored packets when all the collected packets are stored, and the shaded area 20 may represent the amount of packets actually stored by the session creation module 122.

First, FIG. 6A shows a case in which packets are not stored and are only inspected in real time according to the technical idea of the present invention. In this case, the same function as the conventional DPI can be performed. However, even in this case, according to the technical idea of the present invention, there is an effect that a session is created and only the preceding packets of the created session are inspected at high speed.

FIG. 6B conceptually illustrates a case in which all packets are stored only for sessions of a predetermined type. FIG. 6C conceptually illustrates a case of storing the initial N preceding packets for all sessions.

FIG. 6D conceptually illustrates a case of storing the initial N preceding packets only for sessions of a predetermined type. Also, FIG. 6E conceptually illustrates a case of storing all packets for all sessions.

As such, the system 100 provides at least one packet storage mode as shown in FIG. 6, and can adaptively store packets according to the mode set for the current network. Of course, the mode may be adaptively selected according to network characteristics or the required security strength.
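The storage modes of FIG. 6 can be sketched as a simple selection function. The sketch reads FIG. 6B as the mode that keeps every packet of sessions of a predetermined type; the enum names and the function `packets_to_store` are hypothetical.

```python
from enum import Enum


class StorageMode(Enum):
    NONE = "6A"           # no storage, real-time inspection only
    ALL_TYPED = "6B"      # all packets, sessions of a predetermined type only
    FIRST_N_ALL = "6C"    # first N packets, all sessions
    FIRST_N_TYPED = "6D"  # first N packets, sessions of a predetermined type only
    ALL = "6E"            # all packets, all sessions


def packets_to_store(mode, session_packets, is_selected_type, n):
    """Return the packets of one session that the given mode keeps."""
    if mode is StorageMode.NONE:
        return []
    typed_only = mode in (StorageMode.ALL_TYPED, StorageMode.FIRST_N_TYPED)
    if typed_only and not is_selected_type:
        return []  # sessions of other types are not recorded at all
    if mode in (StorageMode.FIRST_N_ALL, StorageMode.FIRST_N_TYPED):
        return list(session_packets[:n])  # only the N preceding packets
    return list(session_packets)
```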

FIG. 7 is a diagram for explaining a concept capable of reducing a security gap in time. First, FIG. 7A exemplarily illustrates the operation concept of a conventional network security system (e.g., DPI). For example, a new network threat may occur at a predetermined time point t1. A network inspection rule corresponding to the new network threat (e.g., a packet signature indicating the new threat) may be set only after a certain period of time (t2), and the system can respond to the new network threat only from that point. That is, there is a problem in that even if an actual network attack occurs between time points t1 and t2, it cannot be recognized.

Of course, when both a network inspection system (e.g., DPI) and a network recording system are used in the prior art, a network attack that occurred between time points t1 and t2 may be recognized. However, even in this case, conventional network recording had to store a far larger number of packets than the technical idea of the present invention requires, and thus there was a problem in that a past network attack could not be recognized or responded to at high speed.

In contrast, according to the network inspection method according to the technical idea of the present invention, as shown in FIG. 7B, network recording is performed between time points t1 and t2, and it is performed by storing only a small number of packets. Through this, retrospective network inspection of past traffic can be performed at high speed, so that quick action can be taken for the attacked target. Of course, high-speed network inspection in real time is also possible after time point t2.

Referring to FIG. 8 , the system 100 includes a logical configuration for each function as shown in FIG. 1 , and may physically include the configuration shown in FIG. 8 .

The system 100 may include a memory (storage device) 120-1 in which a program for implementing the technical idea of the present invention is stored, and a processor 110-1 for executing the program stored in the memory 120-1.

An average expert in the technical field of the present invention can easily infer that the processor 110-1 may be called by various names, such as a CPU or a mobile processor, depending on the implementation of the system 100. In addition, as described with reference to FIG. 1, the system 100 may be implemented by organically combining a plurality of physical devices. In this case, an average expert in the technical field of the present invention can easily infer that the system 100 of the present invention can be implemented with at least one processor 110-1 provided for each physical device.

The memory 120-1 stores the program and may be implemented as any type of storage device that the processor can access to drive the program. Also, depending on the hardware implementation, the memory 120-1 may be implemented as a plurality of storage devices instead of any one storage device. Also, the memory 120 - 1 may include a temporary memory as well as a main memory. In addition, it may be implemented as a volatile memory or a non-volatile memory, and may be defined to include all types of information storage means implemented so that the program can be stored and driven by the processor.

The system 100 may be, depending on the embodiment, a system implemented by an entity that directly or indirectly operates a security device or provides a security service. It can be implemented in various ways, such as a web server, a computer, or a security device implemented independently for the functions defined in the present invention, and it can be defined to include any type of data processing device capable of performing the functions defined in this specification.

In addition, various peripheral devices (peripheral device 1 130 to peripheral device N 130 - 1) may be further provided according to an embodiment of the system 100 . For example, an average expert in the art can easily infer that a keyboard, monitor, graphic card, communication device, etc. may be further included in the system 100 as peripheral devices.

Hereinafter, in this specification, an average expert in the technical field of the present invention can easily infer that when the system 100 or a predetermined module included in the system 100 is said to perform an operation or function, this means that the operation or function is performed by the program driven by the processor 110-1.

As described above, in order to implement the security method according to the technical idea of the present invention, that is, the abnormal behavior learning and detection method using the regression security check, the system 100 may first perform a packet storage process of selectively storing at least some of the plurality of packets passing through the network.

At this time, the packet storage process may be performed by the packet storage module 120 and, as described above, may be a process of selectively storing the initial N (N is a natural number) preceding packets of a session among the session-forming packets that form the session.
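The selective storage of the initial N preceding packets can be illustrated with a small stateful store. This is a sketch only: the class name `FirstNPacketStore`, the method `offer`, and the use of a five-element tuple as the session key are assumptions made for illustration and are not taken from the specification.

```python
from collections import defaultdict


class FirstNPacketStore:
    """Keep only the first n packets of each session (hypothetical helper)."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("n must be a natural number")
        self.n = n
        self.sessions = defaultdict(list)   # session key -> stored payloads

    def offer(self, session_key, payload):
        """Store the packet if its session holds fewer than n; return True if stored."""
        stored = self.sessions[session_key]
        if len(stored) >= self.n:
            return False                     # later packets of the session are dropped
        stored.append(payload)
        return True


# Assumed session key: (source IP, source port, destination IP, destination port, protocol).
key = ("203.0.113.5", 51000, "10.0.0.7", 80, "TCP")
store = FirstNPacketStore(n=2)
results = [store.offer(key, payload) for payload in (b"first", b"second", b"third")]
```

Here only `b"first"` and `b"second"` are kept for the session, so the stored volume per session is bounded by N regardless of how long the session runs.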

Then, the control module 130 may perform a zero-day penetration (attack) determination, examining the stored packets stored through the packet storage process to determine whether there has been a zero-day penetration corresponding to a new security rule to be applied to the network. In addition, it is possible to extract the packet or packets used for the zero-day penetration. The determination of the zero-day penetration and the extraction of the packets used for it may be performed whenever a new security rule is updated.
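The regression check described above amounts to re-applying a newly arrived rule to traffic that was stored before the rule existed. The sketch below assumes, purely for illustration, that a security rule reduces to a byte signature and that each stored packet carries a capture timestamp; `StoredPacket`, `SecurityRule`, and `regression_check` are hypothetical names, and real detection rules are considerably richer than a substring match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredPacket:
    timestamp: float   # capture time in seconds
    src_ip: str
    dst_ip: str
    payload: bytes


@dataclass(frozen=True)
class SecurityRule:
    name: str
    signature: bytes   # assumed: a byte pattern that identifies the threat
    applied_at: float  # time t2 at which the rule became available


def regression_check(stored, rule):
    """Return stored packets captured before the rule existed that match it."""
    return [
        p for p in stored
        if p.timestamp < rule.applied_at and rule.signature in p.payload
    ]


stored = [
    StoredPacket(100.0, "203.0.113.5", "10.0.0.7", b"GET /x?cmd=EVIL_MARKER"),
    StoredPacket(150.0, "10.0.0.8", "10.0.0.9", b"GET /index.html"),
    StoredPacket(300.0, "203.0.113.5", "10.0.0.7", b"EVIL_MARKER again"),
]
rule = SecurityRule("rule-for-new-threat", b"EVIL_MARKER", applied_at=200.0)
hits = regression_check(stored, rule)   # packets used for the zero-day penetration
```

The packet captured at 300.0 is excluded because it arrived after the rule was applied and would be handled by real-time inspection rather than by the retrospective pass.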

Then, the control module 130 may extract necessary information for generating standardized data by examining the detected packets.

For example, the control module 130 may perform DPI (Deep Packet Inspection) on a packet used for a zero-day attack to extract a protocol and metadata of the protocol (e.g., source address, destination address, etc.), and these may be included in the necessary information. The necessary information may be information that can define or express a series of continuous actions, that is, a behavior pattern, of the device that was attacked or received the packet and/or of other devices that communicated with that device. It goes without saying that, in addition to the protocol and the metadata, various other pieces of information that can define the behavior pattern may be included.

The extracted necessary information may be transmitted to the data generation module 170 , and the data generation module 170 may generate standardized data based on the received necessary information.

This standardized data can define a series of actions, and the data generation module 170 may generate, as learning data, a plurality of standardized data including the behavior pattern observed when the zero-day attack is received and the behavior pattern observed when it is not (e.g., a normal case with no attack and/or a case of receiving another attack).

The data generation module 170 may generate standardized data representing a series of actions of various devices based on various packets, such as a normal packet in the network, a packet carrying malicious code corresponding to another attack, and a packet corresponding to the zero-day attack, and the generated standardized data may be used as learning data for the behavior pattern determination model generated by the control module 130.

Based on the result of the packet inspection performed by the control module 130, the data generation module 170 can automatically label standardized data generated from a stored packet carrying malicious code corresponding to a zero-day attack as a behavior pattern under a zero-day attack, which has the effect of enabling automated labeling of learning data.

The control module 130 may train the behavior pattern determination model using, as learning data, the standardized data corresponding to each of a plurality of behavior patterns and the labeling information attached to each standardized data (e.g., a normal behavior pattern, a behavior pattern under a zero-day attack, a behavior pattern under another attack, etc.).
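The automated labeling step can be sketched as follows. The label strings, the functions `label_pattern` and `build_training_set`, and the choice to let a zero-day hit take precedence over another attack are all assumptions made for illustration; the specification only states that labels are derived from the packet inspection result.

```python
ZERO_DAY = "zero_day_attack"
OTHER_ATTACK = "other_attack"
NORMAL = "normal"


def label_pattern(packet_ids, zero_day_hits, other_hits):
    """Label one behavior pattern from the ids of the packets it was built from."""
    ids = set(packet_ids)
    if ids & zero_day_hits:      # any packet flagged by the regression check
        return ZERO_DAY
    if ids & other_hits:         # any packet flagged by an existing rule
        return OTHER_ATTACK
    return NORMAL


def build_training_set(patterns, zero_day_hits, other_hits):
    """Pair each pattern id with its label; patterns maps pattern id -> packet ids."""
    return [
        (pattern_id, label_pattern(packet_ids, zero_day_hits, other_hits))
        for pattern_id, packet_ids in patterns.items()
    ]


patterns = {"p1": [1, 2, 3], "p2": [4, 5], "p3": [6]}
training_set = build_training_set(patterns, zero_day_hits={2}, other_hits={5})
```

Because the labels follow directly from which stored packets matched which rule, no manual annotation is needed to assemble the learning data in this sketch.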

Then, when predetermined standardized data is input, the learned behavior pattern determination model can determine whether the standardized data (ie, behavior pattern) has the characteristics of the zero-day attack.

To this end, it may be preferable that the standardized data be defined in a form in which the behavior pattern determination model, that is, the deep learning model, can learn and infer well while well defining a series of continuous actions.

According to the technical idea of the present invention, the standardized data may be defined as a directed graph, and a case in which the behavior pattern determination model is implemented as a Recurrent Neural Network (RNN) or a Long Short-Term Memory (LSTM) network will be described as an example, but the scope of the present invention is not limited thereto.

An example of standardized data according to an embodiment of the present invention is shown in FIG. 9 .

Referring to FIG. 9 , the standardized data according to the technical idea of the present invention may be implemented as a directed graph.

In addition, each of the at least one host used for intrusion behaviors caused by zero-day penetration in the directed graph may be a node. For example, a host that has received a malicious code corresponding to a zero-day attack may become a node, and another host that communicates with such a host may be set as another node.

In addition, the communication behavior that occurs from each node to another node can be set as an edge.

In addition, each node included in the directed graph may include information on the time at which an action occurred (e.g., T1, T2, T3, T4, T5) and the IP of the host corresponding to the node (e.g., IP1, IP2, IP3, IP4, IP5). In FIG. 9, T1, T2, T3, T4, and T5 indicate the order in which the actions occurred, independently of the actual time at which each action occurred at each host.

In addition, each edge included in the directed graph may include the protocol (e.g., Protocol 1, Protocol 2, Protocol 3, Protocol 5) and/or the performed behavior (e.g., Behavior 1, Behavior 2, Behavior 3, Behavior 5) corresponding to the communication behavior that the edge represents.

For example, FIG. 9 shows an example of standardized data in which one behavior pattern is defined, and the directed graph may include five nodes N1, N2, N3, N4, and N5.

For example, the first node N1 may be a host having IP1 that received a packet containing malicious code corresponding to a zero-day attack, or a host that exists outside the protected network and transmitted the malicious code to an internal host. A predetermined host that communicated with the first node N1 at a predetermined time may be the second node N2. After that, a predetermined host that communicated with the second node N2 at a predetermined time may be the third node N3, and the hosts that subsequently communicated with the third node N3 at predetermined times may be the fourth node N4 and the fifth node N5, respectively.

Here, communication with the fourth node N4 corresponding to T4 may be performed before communication with the fifth node N5 corresponding to T5.

In addition, the first edge E1 may include information indicating that the first node N1 and the second node N2 communicated through the first protocol (Protocol 1) and that the action performed through the communication is Behavior 1.

Similarly, the second edge E2 may include information indicating that the second node N2 and the third node N3 communicated through the second protocol (Protocol 2) and that the action performed through the communication is Behavior 2. In addition, the third edge E3 may include information indicating that the third node N3 and the fourth node N4 communicated through the third protocol (Protocol 3) and that the action performed through the communication is Behavior 3, and the fourth edge E4 may include information indicating that the third node N3 and the fifth node N5 communicated through the fifth protocol (Protocol 5) and that the action performed through the communication is Behavior 5.

As such, a series of actions starting from a host directly subjected to a zero-day attack, that is, information defining a behavior pattern may be defined as a directed graph, and such a directed graph may be an example of standardized data.
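The directed graph of FIG. 9 can be written down as a small data structure. The following is a minimal sketch, assuming that a node carries a name, an IP, and an ordinal for its position in the sequence of actions, and that an edge carries a protocol and a behavior; the class and field names (`Node`, `Edge`, `BehaviorGraph`, `order`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str    # e.g. "N1"
    ip: str      # IP of the host, e.g. "IP1"
    order: int   # position in the sequence of actions (T1 is 1, T2 is 2, ...)


@dataclass(frozen=True)
class Edge:
    src: str       # name of the node the communication starts from
    dst: str       # name of the node the communication goes to
    protocol: str
    behavior: str


@dataclass
class BehaviorGraph:
    nodes: dict = field(default_factory=dict)   # node name -> Node
    edges: list = field(default_factory=list)   # kept in insertion order

    def add_node(self, name, ip, order):
        self.nodes[name] = Node(name, ip, order)

    def add_edge(self, src, dst, protocol, behavior):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append(Edge(src, dst, protocol, behavior))

    def successors(self, name):
        """Names of the nodes reached directly from the given node."""
        return [e.dst for e in self.edges if e.src == name]


fig9 = BehaviorGraph()
for i in range(1, 6):
    fig9.add_node(f"N{i}", f"IP{i}", i)
fig9.add_edge("N1", "N2", "Protocol 1", "Behavior 1")   # E1
fig9.add_edge("N2", "N3", "Protocol 2", "Behavior 2")   # E2
fig9.add_edge("N3", "N4", "Protocol 3", "Behavior 3")   # E3
fig9.add_edge("N3", "N5", "Protocol 5", "Behavior 5")   # E4
```

In this sketch `fig9.successors("N3")` lists N4 before N5, mirroring the statement that the communication at T4 precedes the one at T5.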

An example of generating standardized data through a more specific example is shown in FIG. 10 .

Referring to FIG. 10, FIG. 10 may show standardized data defining a behavior pattern that occurred in an example in which the detection rule for malicious code γ has been newly updated in the system 100 and, accordingly, the system 100 performs packet inspection on the stored packets and detects a zero-day penetration of malicious code γ from external IP α to internal IP β.

The system 100 may detect that an attempt to steal the ID and password of the internal database (IP δ) occurred from internal IP β, which had been penetrated by the zero-day attack, and that periodic data transfer from internal IP β to the external Blacklist IP ε was occurring.

Then, the hosts corresponding to each of the external IP α, internal IP β, internal IP δ, internal IP β, and external Blacklist IP ε can be nodes (N10, N11, N12, N13, N14) included in the standardized data.

And, as can be seen from nodes N11 and N13, it can be seen that even the same host can be expressed as a different node when the timestamp, that is, the time point of the action is different.

Also, the edge E10 may include information indicating that the node N10 and the node N11 communicated through a protocol (HTTP) and that malicious code was transmitted through the communication. The edge E11 may include information indicating that the node N11 and the node N12 communicated through a protocol (HTTP) and that multiple login attempts and failures occurred through the communication.

The edge E12 may include information indicating that the node N12 and the node N13 communicated through a protocol (unknown), and that predetermined information was transmitted through the communication.

The edge E13 may include information indicating that the node N13 and the node N14 communicated through a protocol (unknown), and periodic data transfer was performed through the communication.

When a malicious code is detected in this way, a series of behavior patterns generated therefrom can be generated as standardized data.
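One way to obtain the nodes and edges of FIG. 10 from observed communications is sketched below. The construction rule is an assumption chosen because it reproduces the figure: events are processed in chronological order, the sending host reuses its most recent node, and the receiving host always gets a fresh node, so that the same host at a later time becomes a different node. The function name `events_to_graph`, the ASCII host labels, and the behavior strings are hypothetical.

```python
def events_to_graph(events, first_index=10):
    """Turn chronological (src_ip, dst_ip, protocol, behavior) events into nodes and edges."""
    nodes = []     # list of (node name, ip) in creation order
    latest = {}    # ip -> most recently created node name for that host
    edges = []     # list of (src node, dst node, protocol, behavior)

    def new_node(ip):
        name = f"N{first_index + len(nodes)}"
        nodes.append((name, ip))
        latest[ip] = name
        return name

    for src_ip, dst_ip, protocol, behavior in events:
        src = latest[src_ip] if src_ip in latest else new_node(src_ip)
        dst = new_node(dst_ip)   # a receiving host always gets a fresh node
        edges.append((src, dst, protocol, behavior))
    return nodes, edges


# alpha, beta, delta and epsilon stand in for the IPs named in FIG. 10.
fig10_events = [
    ("alpha", "beta", "HTTP", "malicious code transfer"),
    ("beta", "delta", "HTTP", "multiple login attempts and failures"),
    ("delta", "beta", "unknown", "information transfer"),
    ("beta", "epsilon", "unknown", "periodic data transfer"),
]
nodes, edges = events_to_graph(fig10_events)
```

With these four events the host beta appears twice, as N11 and as N13, and the last edge runs from N13 to N14, which matches the node and edge numbering described for FIG. 10.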

In addition, when generating standardized data, a plurality of predetermined behavior types may be defined for the behavior included in each edge, and standardization may be achieved by setting each behavior performed through communication between nodes to one of the predetermined behavior types. Of course, each of these behaviors may be identified in various ways, for example from the protocol and meta information detected through DPI or from the log records of each host.

As such, a behavior pattern generated based on predetermined packets is generated as standardized data by the data generation module 170, and labeling information (zero-day attack, normal, etc.) is attached to each piece of generated standardized data to construct the learning data. Then, as described above, the control module 130 may generate the behavior pattern determination model by training on the learning data so that the model determines whether given standardized data is a behavior pattern corresponding to a zero-day attack.

Once the trained behavior pattern determination model is built, when the data generation module 170 generates real-time standardized data for predetermined packets that are about to pass through the network (e.g., the initial packets of each session), the behavior pattern determination model can be used to determine whether the generated real-time standardized data is a behavior pattern corresponding to the zero-day attack.

As described above, as the behavior pattern determination model, a known deep learning model such as RNN or LSTM may be used, and the structure and learning method of such a deep learning model are well known, so a detailed description thereof will be omitted. In any case, the behavior pattern determination model may define characteristic features of a behavior pattern generated by the zero-day attack and determine whether a predetermined behavior pattern is a behavior pattern caused by the zero-day attack using this.
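The specification does not state how a directed graph is presented to an RNN or LSTM. One plausible preparation step, shown here purely as an assumption, is to flatten the edges into their order of occurrence and map each protocol and behavior type to an integer id, producing the kind of sequence that a recurrent model consumes one step at a time. The functions `build_vocab` and `encode_edges` and the sample edge values are hypothetical.

```python
def build_vocab(values):
    """Map each distinct value to an integer id; 0 is reserved for unseen values."""
    vocab = {}
    for v in values:
        if v not in vocab:
            vocab[v] = len(vocab) + 1
    return vocab


def encode_edges(edge_list, protocol_vocab, behavior_vocab):
    """Encode (protocol, behavior) edges, in order of occurrence, as id pairs."""
    return [
        (protocol_vocab.get(protocol, 0), behavior_vocab.get(behavior, 0))
        for protocol, behavior in edge_list
    ]


pattern_edges = [
    ("HTTP", "malicious code transfer"),
    ("HTTP", "multiple login attempts and failures"),
    ("unknown", "information transfer"),
    ("unknown", "periodic data transfer"),
]
protocol_vocab = build_vocab(p for p, _ in pattern_edges)
behavior_vocab = build_vocab(b for _, b in pattern_edges)
sequence = encode_edges(pattern_edges, protocol_vocab, behavior_vocab)
```

Reserving id 0 for unseen values lets a pattern observed in real time be encoded even when it contains a protocol or behavior type that never appeared in the learning data.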

In conclusion, the system 100 according to the technical idea of the present invention not only checks whether there has been a zero-day attack but also learns the characteristic behavior pattern generated by the zero-day attack. As a result, in the future it is possible not only to defend against attacks through packet inspection but also to identify suspected attacks based on behavior patterns within the protected network.

The network security method according to an embodiment of the present invention may be implemented in the form of computer-readable program instructions and stored in a computer-readable recording medium, and the control program and the target program according to the embodiment of the present invention may also be stored in a computer-readable recording medium. The computer-readable recording medium includes all types of recording devices in which data readable by a computer system is stored.

The program instructions recorded on the recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the software field.

Examples of the computer-readable recording medium include magnetic media such as hard disks, floppy disks, and magnetic tapes, optical media such as CD-ROMs and DVDs, magneto-optical media such as floptical disks, and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. In addition, the computer-readable recording medium may be distributed over computer systems connected through a network, so that the computer-readable code can be stored and executed in a distributed manner.

Examples of program instructions include not only machine language codes such as those generated by a compiler, but also high-level language codes that can be executed by an apparatus for electronically processing information using an interpreter or the like, for example, a computer.

The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

The above description of the present invention is for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can be easily modified into other specific forms without changing the technical spirit or essential features of the present invention. Therefore, it should be understood that the embodiments described above are illustrative in all respects and not restrictive. For example, each component described as a single type may be implemented in a distributed manner, and likewise components described as distributed may also be implemented in a combined form.

The scope of the present invention is indicated by the following claims rather than the above detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as being included in the scope of the present invention.

The present invention can be applied to a system and method for learning and detecting anomalies using regression security checks.

Claims

1. A method of learning and detecting abnormal behavior using a regression security check, the method comprising:

performing, by a system, a packet storage process of selectively storing at least some of a plurality of packets passing through a network;

a zero-day penetration determination step in which the system examines stored packets stored through the packet storage process to determine whether there has been a zero-day penetration corresponding to a new security rule to be applied to the network; and

generating, by the system, intrusion behaviors caused by the zero-day penetration as standardized data,

wherein a behavior pattern determination model capable of determining a behavior pattern corresponding to the zero-day penetration is trained based on the generated standardized data.
2. The method of claim 1, wherein performing, by the system, the packet storage process of selectively storing at least some of the plurality of packets passing through the network comprises:

performing, by the system, the packet storage process of storing only the initial N (N is a natural number) preceding packets of a session from among the session-forming packets that form the session from the plurality of packets.
3. The method of claim 1, wherein generating, by the system, the intrusion behaviors caused by the zero-day penetration as standardized data comprises:

generating, by the system, the standardized data as a directed graph in which at least one host used for the intrusion behaviors caused by the zero-day penetration is set as a node and the communication behavior occurring from each node to another node is set as an edge.
4. The method of claim 3, wherein each node included in the directed graph

includes information on the time at which the at least one host corresponding to the node performed a specific action and on the IP of that host.
5. The method of claim 3, wherein each edge included in the directed graph

includes information on the protocol or the performed behavior corresponding to the communication behavior that the edge represents.
6. The method of claim 1, wherein the behavior pattern determination model

is constructed by training an RNN or LSTM on learning data including the standardized data.
7. The method of claim 1, further comprising:

generating real-time standardized data for packets that are about to pass through the network after the behavior pattern determination model is generated; and

determining, using the behavior pattern determination model, whether the generated real-time standardized data corresponds to abnormal behavior.
8. A computer program installed in a data processing apparatus and recorded on a medium for performing the method according to any one of claims 1 to 7.
9. An abnormal behavior learning and detection system using a regression security check, the system comprising:

a packet storage module for performing a packet storage process of selectively storing at least some of a plurality of packets passing through a network;

an intrusion prevention module that, when a new security rule to be applied to the network is updated, examines stored packets stored through the packet storage process to determine whether there has been a zero-day penetration corresponding to the new security rule;

a data generation module for generating the intrusion behaviors caused by the zero-day penetration as standardized data; and

a control module for training on the generated standardized data and generating a behavior pattern determination model for determining a behavior pattern corresponding to the zero-day penetration.
10. An abnormal behavior learning and detection system using a regression security check, the system comprising:

a storage device in which a program is stored; and

a processor for running the program,

wherein the program driven by the processor:

performs a packet storage process of selectively storing at least some of a plurality of packets passing through a network;

when a new security rule to be applied to the network is updated, determines whether there has been a zero-day penetration corresponding to the new security rule by examining the stored packets stored through the packet storage process;

generates intrusion behaviors caused by the zero-day penetration as standardized data; and

trains on the generated standardized data and generates a behavior pattern determination model for determining a behavior pattern corresponding to the zero-day penetration.
11. The system of claim 10, wherein the program driven by the processor

generates the standardized data as a directed graph in which at least one host used for the intrusion behaviors caused by the zero-day penetration is set as a node and the communication behavior occurring from each node to another node is set as an edge.
12. The system of claim 10, wherein the program driven by the processor:

after generating the behavior pattern determination model, generates real-time standardized data for packets that are about to pass through the network; and

determines, using the behavior pattern determination model, whether the generated real-time standardized data corresponds to abnormal behavior.