KR102620130B1

KR102620130B1 - APT attack detection method and device

Info

Publication number: KR102620130B1
Application number: KR1020220004838A
Authority: KR
Inventors: 송중석; 최상수; 김규일; 이준; 권태웅; 이윤수; 최익제; 최윤수
Original assignee: 한국과학기술정보연구원
Priority date: 2021-12-08
Filing date: 2022-01-12
Publication date: 2024-01-03
Also published as: KR20230086538A

Abstract

An APT attack detection method and device according to embodiments can classify security events and, based on similarity to IP packets involved in past security incidents, classify IP addresses as black IPs or suspicious IPs.

Description

APT (Advanced Persistent Threat) attack detection method and device

The APT attack detection method and device according to embodiments detect APT attacks by analyzing network packets, without additional APT-specific security equipment.

An APT (advanced persistent threat) is an attack that persistently targets a specific victim using sophisticated methods, and differs from conventional malware aimed at an unspecified number of victims. The number of APT attack cases continues to grow as patterns of social life change. Various APT attack detection methods, such as endpoint detection and anomaly detection, have been researched and developed to detect and predict such attacks.

Conventional APT attack detection methods require additional technical capability when collecting data (events), such as making the data collected at each endpoint compatible and interworking the equipment installed at each endpoint. Because they collect and use a wide variety of security events, they also need additional data preprocessing, which incurs considerable cost. In particular, recent targeted cyber attacks often imitate normal behavior, so APT attack detection methods based on signatures or anomaly detection may fail to detect them. Because conventional methods detect and analyze attacks at the endpoint, it can be difficult to detect and respond to initial attacks and threats in real time. Moreover, conventional methods focus on security events for preventing actual cyber attacks and therefore do not provide analysis information for preparing against future attacks.

Therefore, the APT attack detection method according to embodiments includes: preprocessing the entire detected data set to classify the IP packets that transmit security events; changing the network direction of at least one of the IP packets that transmit security events; determining whether the IP packets, including the IP packet whose network direction has been changed, correspond to a persistent attack, and outputting those IP packets that correspond to a persistent attack; determining whether the IP packets corresponding to a persistent attack correspond to a targeted attack, and outputting those IP packets that correspond to a targeted attack; measuring the similarity between each output IP packet and one or more IP packets involved in past security incidents; and classifying the address of the IP packet as a black IP address or a suspicious IP address based on the similarity.

The APT attack detection device according to embodiments includes: a preprocessing module that preprocesses the entire detected data set to classify the IP packets that transmit security events; a network direction change module that changes the network direction of at least one of the IP packets that transmit security events; a persistent attack classification module that determines whether the IP packets, including the IP packet whose network direction has been changed, correspond to a persistent attack, and outputs those IP packets that correspond to a persistent attack; a targeted attack classification module that determines whether the IP packets corresponding to a persistent attack correspond to a targeted attack, and outputs those IP packets that correspond to a targeted attack; a similarity measurement module that measures the similarity between each output IP packet and one or more IP packets involved in past security incidents; and a black IP determination module that classifies the address of the IP packet as a black IP address or a suspicious IP address based on the similarity.

Because the APT attack detection method and device according to embodiments rely on network behavior analysis, they can detect APTs without additional equipment and thereby reduce cost.

The APT attack detection method and device according to embodiments can detect and respond to APT attacks more quickly at the network boundary.

The APT attack detection method and device according to embodiments can detect attacks that imitate normal behavior on the basis of minimal behavior, and can provide detection results quickly.

The APT attack detection method and device according to embodiments classify and report not only APTs but also risky behaviors relevant to future attacks, so that future attacks can be blocked and further in-depth analysis can be performed.

The drawings are included to aid understanding of the embodiments and, together with the detailed description, illustrate the embodiments. For a better understanding of the various embodiments described below, reference should be made to the detailed description in conjunction with the following drawings, in which like reference numerals indicate corresponding parts throughout.
Figure 1 is an example of an APT attack detection method according to embodiments.
Figure 2 shows a configuration diagram of an APT attack detection device according to embodiments.
Figure 3 is a block diagram explaining an APT attack detection method according to embodiments.
Figure 4 is a flow chart showing an APT attack detection method according to embodiments.
Figure 5 is a block diagram showing an example of a similarity measurement module according to embodiments.
Figure 6 shows regular expressions according to embodiments.
Figure 7 is a flowchart of an APT attack detection method according to embodiments.

Hereinafter, preferred embodiments are described in detail, examples of which are shown in the accompanying drawings. The detailed description below, given with reference to the accompanying drawings, is intended to explain preferred examples of the embodiments rather than to show the only embodiments that can be implemented. The detailed description includes specific details to provide a thorough understanding of the embodiments. However, it will be apparent to those skilled in the art that the embodiments may be practiced without these details.

Most of the terms used in the embodiments are selected from general terms widely used in the field, but some terms are chosen arbitrarily by the applicant, and their meanings are explained in the following description as needed. Accordingly, the embodiments should be understood based on the intended meaning of each term rather than on its mere name or literal meaning.

Figure 1 is an example of an APT attack detection method according to embodiments.

The top of FIG. 1 shows an example 100 of an endpoint-based APT attack detection method, and the bottom of FIG. 1 shows an example 110 of an APT attack detection method according to embodiments.

The endpoint-based APT attack detection method 100 includes an interworking task that links the endpoints which detect security events by analyzing various packets, such as network packets, email packets, and storage packets. As described above, equipment for detecting security events is installed at each endpoint, so the interworking task includes data compatibility, equipment interworking, and the like. Security events detected at the linked endpoints are analyzed in an endpoint detection and analysis stage, which classifies them into APT behavior and normal behavior.

The APT attack detection method according to embodiments operates on network behavior. It can therefore classify the APT-related security events observed on the network by attack behavior, without per-endpoint equipment. The method can determine the similarity between the attack behaviors to which the events correspond and, based on that similarity, classify the behavior corresponding to each event as APT behavior, normal behavior, or risky behavior. Risky behavior according to embodiments is behavior that is not APT behavior but carries a high risk with respect to future attacks. The APT attack detection method according to embodiments can therefore provide information about future attacks as well as current ones.

Figure 2 shows a configuration diagram of an APT attack detection device according to embodiments.

FIG. 2 is a configuration diagram 200 of an APT attack detection device that performs the APT attack detection method according to the embodiments described with reference to FIG. 1 (for example, the APT attack detection method 110). The configuration shown in FIG. 2 is only an example, and the APT attack detection device according to embodiments is not limited to it.

The APT attack detection device 200 according to embodiments includes a preprocessing module 210, a network direction change module 220, a persistent attack classification module 230, a targeted attack classification module 240, a similarity measurement module 250, and a black IP determination module 260.

The preprocessing module 210 removes security events unrelated to APT attacks from the entire data set of security events (raw data), based on IP patterns and thresholds. Security events according to embodiments are raw data collected from IDS/IPS security equipment and may take the form of packets (for example, IP packets). An IP pattern according to embodiments is a preset IDS/IPS pattern for detecting specific risky IP addresses. When IP patterns are identical, the preprocessing module 210 enters the specific risky IP address into the IDS/IPS detection pattern regardless of aggressiveness. The preprocessing module 210 treats an IP packet in which an IP pattern is detected as having been detected regardless of aggressiveness, and removes it. A threshold according to embodiments is preset outlier data for distinguishing normal from abnormal network traffic, and refers to specific outlier behavior such as flooding, scanning, and brute forcing. The preprocessing module 210 identifies and removes security events at or above the threshold. By removing the IP packets that match the IP pattern and the threshold, the preprocessing module 210 passes only the security events related to APT attacks to the network direction change module 220.
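
The filtering performed by the preprocessing module can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the event field names (`src_ip`, `pattern`), the rule name in `KNOWN_IP_PATTERNS`, and the use of a per-source event count as the threshold measure are all assumptions, since the source only states that the IP patterns and thresholds are preset.

```python
from collections import Counter

# Hypothetical values; the source only says the patterns and threshold are preset.
KNOWN_IP_PATTERNS = {"blacklist-ip-rule"}  # IDS/IPS rules that fire on fixed risky IPs
EVENT_COUNT_THRESHOLD = 1000               # assumed volume treated as flooding/scanning/brute forcing


def preprocess(events):
    """Drop events unrelated to APT analysis and return the remaining ones.

    Each event is a dict with the assumed keys 'src_ip' and 'pattern'.
    """
    per_source = Counter(e["src_ip"] for e in events)
    kept = []
    for e in events:
        if e["pattern"] in KNOWN_IP_PATTERNS:
            continue  # matched a fixed IP pattern, regardless of aggressiveness
        if per_source[e["src_ip"]] >= EVENT_COUNT_THRESHOLD:
            continue  # volumetric outlier at or above the threshold
        kept.append(e)
    return kept
```

Counting events per source is only one way to express a threshold; a deployment could equally use rates over a time window.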

The IP packets that transmit security events related to APT attacks include inbound IP packets and outbound IP packets. An inbound IP packet is an attacker's request entering the internal network from outside, and may include a source IP corresponding to the attacker, a destination IP corresponding to the target, and so on. An outbound IP packet is a response sent from inside to outside, and may likewise include a source IP, a destination IP, and so on. Because an outbound IP packet is a response sent to the attacker, its destination IP corresponds to the attacker. Such a response is related to the attack in the same way as the attacker's request. Therefore, to ensure a consistent network direction, the network direction change module 220 according to embodiments may change the network direction of the IP packets that transmit security events related to APT attacks (for example, by setting the destination IP of an outbound IP packet as the source IP). Changing the network direction according to embodiments includes treating the destination IP of an outbound IP packet as the attacker.
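
A minimal sketch of this direction normalization is shown below. The `Event` type, its field names, and the `"inbound"`/`"outbound"` labels are hypothetical. Swapping both addresses and relabeling the event as inbound is an assumption made for illustration; the source only states that the destination IP of an outbound packet is set as the source IP.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Event:
    src_ip: str
    dst_ip: str
    direction: str  # "inbound" or "outbound" (assumed labels)


def normalize_direction(event: Event) -> Event:
    """Return an event whose src_ip is always the external (attacker-side) address."""
    if event.direction == "outbound":
        # Swap the endpoints so the external destination becomes the source.
        return replace(
            event,
            src_ip=event.dst_ip,
            dst_ip=event.src_ip,
            direction="inbound",
        )
    return event
```

After this step every event can be grouped by `src_ip` as the attacker address, which is what the later classification stages rely on.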

A security event according to embodiments may be transmitted through one or more IP packets. The persistent attack classification module 230 according to embodiments receives the IP packets, including those whose network direction has been changed (for example, outbound IP packets), and determines whether the security events transmitted through them correspond to a persistent attack. A persistent attack according to embodiments is an attack that continues for at least a preset time. Therefore, a single security event that occurs only once, and security events that continue for less than the preset time (θ, for example 24 hours), are not judged to be a persistent attack. The persistent attack classification module 230 removes IP packets that transmit a single security event and IP packets that transmit security events continuing for less than the preset time. The module also groups the IP packets classified as a persistent attack by attacker IP and outputs the resulting set of IP groups. The preset time may be set by the user of the APT attack detection device or set automatically. It may be set in units of hours, but is not limited to this example.
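
The persistence check can be sketched as below. The event shape (a tuple of attacker IP and timestamp) is an assumption, and so is the simplification that "continuing for at least θ" means the span between an attacker's first and last event; the source does not define how gaps between events are handled.

```python
from collections import defaultdict
from datetime import datetime, timedelta

THETA = timedelta(hours=24)  # example value from the text; configurable


def classify_persistent(events, theta=THETA):
    """Group events by attacker IP and keep groups that span at least theta.

    Each event is an (attacker_ip, timestamp) tuple, which is an assumed shape.
    """
    groups = defaultdict(list)
    for ip, ts in events:
        groups[ip].append(ts)

    persistent = {}
    for ip, stamps in groups.items():
        if len(stamps) < 2:
            continue  # single security event
        if max(stamps) - min(stamps) < theta:
            continue  # activity shorter than the preset time
        persistent[ip] = sorted(stamps)
    return persistent
```

The returned dictionary plays the role of the set of IP groups keyed by attacker IP.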

The targeted attack classification module 240 according to embodiments determines whether the security events transmitted through the IP packets belonging to the set of IP groups output by the persistent attack classification module 230 correspond to a targeted attack. A targeted attack according to embodiments is an attack that infiltrates, by various means, major IT infrastructure or the network of a targeted organization such as a company or institution, and remains hidden for a long period with the aim of leaking confidential information or gaining control of major facilities. Such targeted attacks are not one-off events: they take place over a long period and use a variety of malicious code and attack routes, which makes them difficult to detect and respond to. Accordingly, the targeted attack classification module 240 does not judge an IP packet having only a single pattern to be a targeted attack, because such a packet is likely to be scanning. The module also does not judge security events to be a targeted attack when they are transmitted through IP packets that target fewer organizations and/or hosts than a preset number (θ). The targeted attack classification module 240 removes, from the IP packets belonging to the set of IP groups, the IP packets having a single pattern and the IP packets targeting fewer organizations and/or hosts than the preset number, and outputs the resulting set of IP groups. The preset number of organizations and/or hosts may be set by the user of the APT attack detection device or set automatically. Because the scope of persistent attack classification is broader than that of targeted attack classification, the targeted attack classification module 240 operates on the output of the persistent attack classification module 230. The set of IP groups output by the targeted attack classification module 240 therefore contains security events that correspond to both a persistent attack and a targeted attack.
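
A rough sketch of the two targeted-attack filters follows. The input shape (attacker IP mapped to a list of pattern and target pairs) and the parameter name `min_targets` are hypothetical; `min_targets` stands in for the preset number of organizations and/or hosts, whose actual value the source leaves to the user.

```python
def classify_targeted(groups, min_targets=2):
    """Keep attacker groups that use several patterns against enough targets.

    groups maps an attacker IP to a list of (pattern, target) tuples,
    which is an assumed shape. min_targets is the preset target count.
    """
    targeted = {}
    for ip, events in groups.items():
        patterns = {pattern for pattern, _ in events}
        targets = {target for _, target in events}
        if len(patterns) <= 1:
            continue  # a single pattern is likely scanning
        if len(targets) < min_targets:
            continue  # fewer targets than the preset number
        targeted[ip] = events
    return targeted
```

Because the function takes the grouped output of the persistence stage as input, every group it returns has passed both filters.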

The similarity measurement module 250 according to embodiments can measure similarity by comparing the payload of an IP packet belonging to the set of IP groups with the payloads of IP packets involved in past security incidents (for example, inbound and outbound IP packets). The IP packets involved in past security incidents may be stored in memory. The similarity measurement module 250 may preprocess the data in the payload before measuring similarity: it splits the payload word by word, removes personal information contained in the payload (for example, IP addresses, resident registration numbers, phone numbers, and passport numbers), special characters, and stop words, and divides the string into fixed units to generate the data used for similarity measurement. The similarity measurement module 250 may also assign a vector value to each word based on TF-IDF (Term Frequency - Inverse Document Frequency) or a similar measure in order to measure the similarity between payloads.
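
The payload preprocessing can be illustrated with the sketch below. The regular expressions and the stop-word list are assumptions chosen for the example (only IPv4 addresses and one phone number format are covered); the patterns actually used by the embodiments are not given in this text.

```python
import re

# Illustrative patterns only; real deployments would cover more identifier types.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PHONE = re.compile(r"\b\d{2,3}-\d{3,4}-\d{4}\b")
NON_WORD = re.compile(r"[^0-9a-z\s]")
STOP_WORDS = {"the", "a", "an", "of", "http"}  # hypothetical list


def tokenize_payload(payload):
    """Split a payload into lowercase words with identifiers and noise removed."""
    text = payload.lower()
    text = IPV4.sub(" ", text)      # strip IP addresses
    text = PHONE.sub(" ", text)     # strip phone numbers
    text = NON_WORD.sub(" ", text)  # strip special characters
    return [word for word in text.split() if word not in STOP_WORDS]
```

Identifiers are removed before special characters so that the dots and hyphens they depend on are still present when the patterns run.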

TF-IDF according to embodiments is a statistical value, used as a weight, that indicates how important a word is within a specific sentence or document. TF indicates how often a particular word appears; the higher the value, the more important the word is judged to be. However, a word that is used frequently everywhere is simply a common word. IDF is the inverse of the document frequency, which indicates how commonly a word appears across the set of documents. The value of IDF depends on the nature of the sentences or document collection. TF-IDF is the product of TF and IDF. TF can be derived from the frequency of a word within one document, and IDF can be obtained by dividing the total number of documents by the number of documents containing the word and then taking the logarithm. The methods of calculating TF and IDF and of determining similarity are not limited to these examples and may be implemented in various ways.
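
The TF-IDF weighting described above can be sketched in plain Python as follows. Normalizing TF by document length and comparing vectors with cosine similarity are assumptions made for this illustration; the source defines IDF as the logarithm of the total document count divided by the number of documents containing the word, but does not name a specific similarity function.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document; docs is a list of token lists."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (count / len(doc)) * math.log(n / df[t]) for t, count in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (assumed similarity measure)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

With this unsmoothed IDF, a term that appears in every document receives weight zero, so the comparison needs a corpus in which not every payload shares the same vocabulary.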

The black IP determination module 260 according to embodiments determines, based on the similarity received from the similarity measurement module 250, whether the IP address of an IP packet is a black IP address or a suspicious IP address. If the similarity is at or above a preset threshold (for example, 80% similarity), the black IP determination module 260 determines the IP address of the packet to be a black IP address. If the similarity is below the preset threshold, the module determines the IP address to be a suspicious IP address. The black IP determination module 260 may also preset a packet count for each organization or host, and designate packets as black IP only when the number of packets whose similarity is at or above the threshold reaches that preset count. For example, if the packet count set for an organization or host is 3 and only 2 IP packets are detected as black IP, those IP packets are not designated as black IP, regardless of their similarity. The preset threshold and the preset packet count can be set and changed by the system administrator. The black IP determination module 260 may thus decide based on similarity alone, or based on both similarity and packet count. A black IP address according to embodiments is an IP address that corresponds to a persistent attack and a targeted attack and that transmits data similar to past attacks. A suspicious IP address according to embodiments is an IP address that corresponds to a persistent attack and a targeted attack but transmits data with low similarity to past attacks; it is still an IP address that poses a risk.
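
The decision rule can be sketched as follows. The function name, parameter names, and label strings are hypothetical. Falling back to "suspicious" when the packet count is not reached is an assumption; the source states only that such packets are not designated as black IP.

```python
def classify_ip(similarities, threshold=0.8, min_packets=1):
    """Label one attacker IP from the similarity scores of its packets.

    similarities: scores in [0, 1], one per packet sent from this IP.
    threshold: the preset similarity limit (80% in the text's example).
    min_packets: the preset packet count; 1 gives a similarity-only decision.
    """
    hits = sum(1 for score in similarities if score >= threshold)
    return "black" if hits >= min_packets else "suspicious"
```

For instance, `classify_ip([0.9, 0.85, 0.3], min_packets=3)` mirrors the example in the text: only two packets reach the threshold while three are required, so the address is not labeled black.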

Figure 3 is a block diagram explaining an APT attack detection method according to embodiments.

The block diagram of FIG. 3 shows the APT attack detection method performed by the APT attack detection device described with reference to FIG. 2 (for example, the APT attack detection device 200).

The APT attack detection device according to embodiments (for example, the APT attack detection device 200) receives the entire data set 300 corresponding to all detected security events.

The APT attack detection device according to embodiments (for example, the preprocessing module 210 of FIG. 2) removes security events unrelated to APT attacks based on IP patterns and thresholds, and outputs the preprocessed security events. The IP patterns and thresholds are the same as described with reference to FIG. 2, so a detailed description is omitted.

The APT attack detection device according to embodiments (for example, the network direction change module 220) changes the network direction of inbound or outbound IP packets so that the IP packets transmitting security events have a consistent network direction. The direction-changed IP packets and the network direction change are the same as described with reference to FIG. 2, so a detailed description is omitted.

The APT attack detection device according to embodiments (for example, the persistent attack classification module 230) determines whether the security events transmitted by the IP packets correspond to a persistent attack. As described with reference to FIG. 2, the device determines that a single security event, and security events that continue for less than a preset time, do not correspond to a persistent attack. The device may remove the IP packets that transmit security events not corresponding to a persistent attack, create IP groups by attacker IP, and output the set of IP groups. The persistent attack classification operation is the same as described with reference to FIG. 2, so a detailed description is omitted.

The APT attack detection device according to embodiments (for example, the targeted attack classification module 240) determines, for the set of IP groups classified as persistent attacks, whether each IP corresponds to a targeted attack. The device determines that security events carried by IP packets having a single pattern, and by IP packets targeting a preset number of organizations or hosts, do not correspond to a targeted attack. The device may remove the IP packets carrying security events that do not correspond to a targeted attack, create IP groups for each attacker IP, and output a set of IP groups. The targeted attack classification operation is the same as described with reference to FIG. 2, so a detailed description is omitted.

The APT attack detection device according to embodiments (for example, the similarity measurement module 250) measures similarity by comparing the payloads of IP packets corresponding to past attacks (for example, inbound and outbound IP packets) with the payloads of the IP packets belonging to the input set of IP groups. As described with reference to FIG. 2, the device may preprocess the data in the payload to generate data for similarity measurement. The data preprocessing according to embodiments is the same as described with reference to FIG. 2, so a detailed description is omitted.

The APT attack detection device according to embodiments (for example, the black IP determination module 260) determines, based on the measured similarity, whether each IP address is a black IP address or a suspicious IP address. As described with reference to FIG. 2, the device determines an IP address to be a black IP address if the similarity is equal to or greater than a preset value (for example, a similarity of 80%), and determines it to be a suspicious IP address if the similarity is equal to or less than the preset value (for example, a similarity of 80%). In addition, the black IP determination module 260 may preset a packet count for each organization or host and, when the number of packets whose similarity is equal to or greater than the preset threshold is equal to or greater than the preset packet count, designate the corresponding packets as black IPs. The preset threshold and the preset packet count can be set and changed by a system administrator. The black IP determination method according to embodiments is the same as described with reference to FIG. 2, so a detailed description is omitted.
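The threshold rule and the per-host packet-count rule can be sketched together as follows. This is only an illustration under stated assumptions: the function name `classify_ip`, the 0.0 to 1.0 score scale, and the default `min_packets` value are hypothetical, and the sketch combines the two rules by calling an IP black only when enough of its packets reach the threshold.

```python
from typing import List


def classify_ip(similarities: List[float],
                threshold: float = 0.8,
                min_packets: int = 1) -> str:
    """Label one attacker IP from its per-packet payload similarity scores.

    threshold   -- assumed to be the administrator-set similarity limit (e.g. 0.8)
    min_packets -- assumed to be the administrator-set packet count for the host
    """
    # Count packets whose similarity reaches the preset threshold.
    hits = sum(1 for score in similarities if score >= threshold)
    # Enough similar packets -> black IP; otherwise keep it as suspicious.
    return "black" if hits >= min_packets else "suspicious"


label = classify_ip([0.91, 0.85, 0.30], threshold=0.8, min_packets=2)
```

Because both values are passed as parameters, an administrator-style change of the threshold or the packet count needs no code change.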

FIG. 4 is a flowchart showing an APT attack detection method according to embodiments.

The flowchart 400 of FIG. 4 shows the APT attack detection method performed by the APT attack detection device described with reference to FIGS. 1 and 2 (for example, the APT attack detection device 200).

The APT attack detection device according to embodiments collects and stores the entire detected data set (410). The entire data set according to embodiments corresponds to security events.

The APT attack detection device according to embodiments (for example, the preprocessing module 210 described with reference to FIG. 2) classifies the security events included in the stored data set (420). As described with reference to FIG. 2, the device may remove security events unrelated to APT attacks from the entire data set containing the security events (raw data), based on IP patterns and a threshold.

The APT attack detection device according to embodiments (for example, the network direction switching module 220) switches the network direction of the IP packets carrying security events related to APT attacks (430). Network direction switching includes switching from outbound to inbound. The details are the same as described with reference to FIGS. 1 to 3.

The APT attack detection device according to embodiments (for example, the persistent attack classification module 230) determines whether the security events carried by the IP packets correspond to a persistent attack (440). A single security event, and security events that continue for less than a preset time (θ), are not judged to be a persistent attack. The device performs IP grouping per attacker IP and outputs a set of IP groups.
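A minimal sketch of step 440 is given below. It assumes that each security event is reduced to a hypothetical `(attacker_ip, timestamp_seconds)` pair and that "continuing for less than θ" means the span between an attacker's first and last event is shorter than θ; the description does not fix either detail, so both are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def persistent_groups(events: Iterable[Tuple[str, float]],
                      theta_seconds: float) -> Dict[str, List[float]]:
    """Group events by attacker IP and keep only persistent groups."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for attacker_ip, timestamp in events:
        groups[attacker_ip].append(timestamp)

    kept = {}
    for attacker_ip, stamps in groups.items():
        # Drop single events and bursts shorter than the preset time theta.
        if len(stamps) > 1 and max(stamps) - min(stamps) >= theta_seconds:
            kept[attacker_ip] = sorted(stamps)
    return kept


sample = [("1.1.1.1", 0), ("1.1.1.1", 7200),   # spans two hours
          ("2.2.2.2", 0),                      # single event
          ("3.3.3.3", 0), ("3.3.3.3", 10)]     # ten-second burst
result = persistent_groups(sample, theta_seconds=3600)
```

The returned dictionary plays the role of the set of IP groups, keyed by attacker IP.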

The APT attack detection device according to embodiments (for example, the targeted attack classification module 240) determines whether the security events carried through the set of IP groups correspond to a targeted attack (450). Security events carried by IP packets having a single pattern, and by IP packets targeting a preset number of organizations or hosts, are not judged to be a targeted attack. The device then performs IP grouping per attacker IP and outputs a set of IP groups.

The APT attack detection device according to embodiments (for example, the similarity measurement module 250) determines the similarity between each IP belonging to the set of IP groups and existing IPs, that is, IPs previously involved in security event incidents (460). The device may determine the similarity by comparing the payloads of the IP packets belonging to the set of IP groups with the payloads of IP packets (for example, inbound and outbound IP packets) previously involved in security event incidents. The device may also preprocess the data in the payload for similarity measurement.

The APT attack detection device according to embodiments (for example, the black IP determination module 260) may determine, based on the similarity, whether the corresponding IP is a black IP or a suspicious IP (470). The device may determine the IP address of an IP packet to be a black IP address if the similarity is equal to or greater than a preset threshold (for example, a similarity of 80%). In addition, the black IP determination module 260 may preset a packet count for each organization or host and, when the number of packets whose similarity is equal to or greater than the preset threshold is equal to or greater than the preset packet count, designate the corresponding packets as black IPs. The preset threshold and the preset packet count can be set and changed by a system administrator. The black IP determination method according to embodiments is the same as described with reference to FIG. 2, so a detailed description is omitted.

FIG. 5 is a block diagram showing an example of a similarity measurement module according to embodiments.

The similarity measurement module 500 shown in FIG. 5 is an example of the similarity measurement module 250 described with reference to FIGS. 1 to 4. As described above, in order to preprocess data for measuring the similarity between the payload of a given IP and the payloads of existing IPs, the similarity measurement module 500 may include a decoding unit 510, a personal information removal unit 520, a special character removal unit 530, an unused character removal unit 540, and a string segmentation unit 550.

Security data according to embodiments may be IP payload data (or payload data). The way the payload is composed may differ depending on the protocol in which the security data was generated, and the security data may have a key-value data structure. Accordingly, the original meaning of the security data may change if the combination of machine-language data and natural-language data changes, or if the order of keys and values, or the relationship between the values that appear for a given key, changes.

The similarity measurement module according to embodiments may further include one or more elements, not shown in the drawing, for preprocessing data.

The decoding unit 510 according to embodiments may decode the input security data according to the way the security data was encoded. For example, the decoding unit 510 may convert a hexadecimal-encoded string into an ASCII string.
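The hexadecimal-to-ASCII case can be illustrated with the Python standard library. The function name `decode_hex_payload` and the choice to replace non-ASCII bytes rather than fail are assumptions made for this sketch only.

```python
def decode_hex_payload(hex_text: str) -> str:
    """Convert a hexadecimal-encoded payload string into ASCII text."""
    # bytes.fromhex accepts pairs of hex digits and skips ASCII whitespace.
    raw = bytes.fromhex(hex_text)
    # Bytes outside the ASCII range are replaced instead of raising an error
    # (an assumption; the description does not say how they are handled).
    return raw.decode("ascii", errors="replace")


decoded = decode_hex_payload("47 45 54 20 2f")
```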

The personal information removal unit 520 according to embodiments may remove personal information from the decoded security data. Personal information according to embodiments may be information about a system administrator or a general user, and includes, for example, IP addresses, resident registration numbers, phone numbers, and passport numbers. The personal information removal unit 520 may identify and remove personal information contained in the security data using patterns stored as regular expressions. The regular expressions may be preset by a system administrator or a general user. By doing so, the personal information removal unit 520 removes data that is unnecessary for the similarity measurement used to classify black IPs, and maintains security by removing the personal information of users or administrators.

The special character removal unit 530 according to embodiments may remove special characters from the security data from which personal information has been removed. That is, special character removal by the special character removal unit 530 may be performed after personal information removal by the personal information removal unit 520. If special characters were removed before personal information, some personal information might not be removed. For example, if the special character "-" (bar) is first removed from a resident registration number, the personal information removal unit 520 can no longer identify the resident registration number. Placing the special character removal unit 530 after the personal information removal unit 520 avoids this problem. The special character removal unit 530 may remove all special characters other than those needed for similarity measurement. For example, it may remove special characters other than @ (at), _ (underbar), . (dot), and / (slash). The special character removal unit 530 may identify and remove special characters contained in the security data using patterns stored as regular expressions, which may be preset by a system administrator or a general user. By removing unnecessary data that does not contribute to similarity measurement, the special character removal unit 530 can generate optimal learning data.
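The keep-list described above (@, _, ., /) can be expressed as a single negated character class. This is a hedged sketch: the name `remove_special_chars` is hypothetical, and replacing each removed character with a space (rather than deleting it) is an assumption.

```python
import re

# Anything that is not a word character, whitespace, or one of @ . / is removed.
# \w already covers letters, digits, and the underscore.
_SPECIAL = re.compile(r"[^\w\s@./]")


def remove_special_chars(text: str) -> str:
    """Replace special characters other than @ _ . / with a space."""
    return _SPECIAL.sub(" ", text)


cleaned = remove_special_chars("user@host.com/a_b?x=1&y")
```

Note that the hyphen is not on the keep-list, so a hyphenated identifier loses its hyphens at this stage.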

The unused character removal unit 540 according to embodiments may remove unused characters from the security data from which special characters have been removed. Among the data that does not contribute to similarity measurement, the unused character removal unit 540 may remove whatever was not removed by the personal information removal and special character removal described above. That is, unused character removal may be performed after personal information removal and special character removal. For example, the unused character removal unit 540 may remove data consisting only of special characters, data consisting of only one digit and one letter, data consisting of only one special character and letters, and the like. The unused character removal unit 540 may identify and remove unused characters contained in the security data using patterns stored as regular expressions, which may be preset by a system administrator or a general user.

The string segmentation unit 550 according to embodiments may segment the security data from which unused characters have been removed. String segmentation according to embodiments may be a process of tokenizing a string on tab-type characters (for example, '\t', '\f', '\v'). A string (or data) longer than a certain length may not be usable for similarity determination. Therefore, the string segmentation unit 550 may split (or tokenize) the security data, from which personal information, special characters, and unused characters have been removed in that order, into strings no longer than a certain length, thereby generating the data for similarity measurement.
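The two parts of segmentation (tokenizing on tab-type characters, then capping the length) can be sketched as below. The function name `segment` and the default cap of 10 characters are assumptions; the description only speaks of "a certain length".

```python
import re
from typing import List


def segment(text: str, max_len: int = 10) -> List[str]:
    """Tokenize on tab-type characters, then split long tokens into pieces."""
    # Split on tab, form feed, and vertical tab; drop empty tokens.
    tokens = [tok for tok in re.split(r"[\t\f\v]+", text) if tok]
    pieces: List[str] = []
    for tok in tokens:
        # Cut each token into consecutive slices of at most max_len characters.
        pieces.extend(tok[i:i + max_len] for i in range(0, len(tok), max_len))
    return pieces


pieces = segment("010-0000-0000/desk/1r40", max_len=10)
```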

String segmentation by the string segmentation unit 550 according to embodiments may be performed after the personal information removal, special character removal, and unused character removal described above. If string segmentation is performed before these removal processes, the output data of the similarity measurement module may differ. For example, performing personal information removal, special character removal, and unused character removal on the data "010-0000-0000/desk/1r40" gives the following:

Personal information removal: /desk/1r40

Special character removal: desk 1r40

Unused character removal: desk

By contrast, performing string segmentation (for example, in units of 10 characters) on the data "010-0000-0000/desk/1r40" before personal information removal, special character removal, and unused character removal gives the following:

String segmentation: 010-0000-0 000/desk/1 r40

Personal information removal: 010-0000-0 000/desk/1 r40

Special character removal: 010-0000-0 000/desk/1

Unused character removal: 010-0000-0 000 desk 1

That is, if string segmentation by the string segmentation unit 550 is performed before personal information removal, special character removal, and unused character removal, the output of the similarity measurement module may contain data that should not be part of the similarity measurement. Therefore, the similarity measurement module according to embodiments performs string segmentation last, so that data not used for similarity measurement is removed first.
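The effect of the ordering can be checked on the same sample string. The sketch below is illustrative only: it uses a phone-number pattern written with Python's `re` module and fixed 10-character slices as a stand-in for segmentation, both of which are assumptions about how the units would be realized.

```python
import re

# Phone-number shape: 2-3 digits, 3-4 digits, 4 digits, separated by hyphens.
PHONE = re.compile(r"[0-9]{2,3}-[0-9]{3,4}-[0-9]{4}")

data = "010-0000-0000/desk/1r40"

# Order used by the module: remove personal information from the whole string.
removed_first = PHONE.sub("", data)

# Reverse order: cut into 10-character chunks first, then look for the number.
chunks = [data[i:i + 10] for i in range(0, len(data), 10)]
found_in_chunks = any(PHONE.search(chunk) for chunk in chunks)
```

After segmentation the number is split across two chunks, so the pattern no longer finds it and the personal information survives.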

FIG. 6 shows regular expressions according to embodiments.

FIG. 6 shows examples of the regular expressions used in the similarity measurement module (for example, the regular expressions described with reference to FIG. 5). The personal information removal, special character removal, and unused character removal described with reference to FIG. 5 may each be performed according to regular expressions preset by a system administrator or a general user.

The personal information removal unit according to embodiments (the personal information removal unit 520 of FIG. 5) may perform personal information removal according to preset regular expressions. The regular expressions for the personal information removal unit may include an IP pattern, a resident registration number pattern, a phone number pattern, and/or a passport number pattern.

An IP address according to embodiments may be in IPv4 (IP version 4) form. An IP address may have a pattern of four groups of up to three digits each, such as any address from 000.000.000.000 to 255.255.255.255. That is, the pattern representing an IP address may be ((25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9]). Accordingly, the personal information removal unit may identify and delete IP addresses contained in the security data (the security data described with reference to FIG. 5) according to the regular expression ip_pattern = r'((25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])'.
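As a usage illustration, the IP pattern can be applied with `re.sub`. The wrapper name `remove_ip_addresses` and the sample payload are hypothetical; deleting the match outright (rather than masking it) follows the description's wording that the address is deleted.

```python
import re

# IPv4 pattern as given in the description.
ip_pattern = (r'((25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.){3}'
              r'(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])')


def remove_ip_addresses(text: str) -> str:
    """Delete every IPv4-shaped substring from the payload text."""
    return re.sub(ip_pattern, "", text)


scrubbed = remove_ip_addresses("src=192.168.0.1 GET /index")
```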

A resident registration number according to embodiments may have a pattern of 13 digits in total, with six digits before and seven digits after a "-" (bar).

That is, the pattern representing a resident registration number may be ([0-9]{2}(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0,1])[-][1-4][0-9]{6}). Accordingly, the personal information removal unit may identify and delete resident registration numbers contained in the security data (the security data described with reference to FIG. 5) according to the regular expression res_pattern = r'([0-9]{2}(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0,1])[-][1-4][0-9]{6})'.
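The sketch below applies the resident registration number pattern and also shows why the hyphen must still be present when it runs. The wrapper name and the sample number (a made-up value with a valid shape) are assumptions for illustration.

```python
import re

# Resident registration number pattern as given in the description.
res_pattern = (r'([0-9]{2}(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0,1])'
               r'[-][1-4][0-9]{6})')


def remove_resident_numbers(text: str) -> str:
    """Delete substrings shaped like a resident registration number."""
    return re.sub(res_pattern, "", text)


with_hyphen = remove_resident_numbers("id=900101-1234567;")
# If special characters were stripped first, the hyphen would already be gone
# and the pattern would no longer recognize the number.
without_hyphen = remove_resident_numbers("id=9001011234567;")
```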

A phone number according to embodiments may include an area code of two or three digits, a prefix of three or four digits, and a line number of four digits, each separated by a "-" (bar). That is, the pattern representing a phone number may be ([0-9]{2,3}[-][0-9]{3,4}[-][0-9]{4}). Accordingly, the personal information removal unit may identify and delete phone numbers contained in the security data according to the regular expression phone_pattern = r'([0-9]{2,3}[-][0-9]{3,4}[-][0-9]{4})'.
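A short check of the phone number pattern follows. The quantifiers are written without spaces inside the braces, because Python's `re` module treats a form such as `{2, 3}` as literal text rather than a repetition count. The wrapper name and the sample number are hypothetical.

```python
import re

phone_pattern = r'([0-9]{2,3}[-][0-9]{3,4}[-][0-9]{4})'
# Same pattern with spaces inside the braces, kept only for comparison.
spaced_pattern = r'([0-9]{2, 3}[-][0-9]{3, 4}[-][0-9]{4})'


def remove_phone_numbers(text: str) -> str:
    """Delete substrings shaped like a hyphen-separated phone number."""
    return re.sub(phone_pattern, "", text)


scrubbed = remove_phone_numbers("tel 02-123-4567 end")
spaced_matches = re.search(spaced_pattern, "02-123-4567") is not None
```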

A passport number according to embodiments may have a pattern of one letter among M, T, S, R, G, and D followed by eight digits. That is, the pattern representing a passport number may be ([MTSRGD][0-9]{8}). Accordingly, the personal information removal unit may identify and delete passport numbers contained in the security data (the security data described with reference to FIG. 5) according to the regular expression passport_pattern = r'([MTSRGD][0-9]{8})'.

Personal information removal according to the preset regular expressions of the personal information removal unit is not limited to the embodiments described above.

The special character removal unit according to embodiments (the special character removal unit 530 of FIG. 5) may perform special character removal according to a preset regular expression (not shown in this figure). As described with reference to FIG. 5, the special character removal unit may remove special characters other than those needed for similarity measurement (for example, @ (at), _ (underbar), . (dot), and / (slash)).

The unused character removal unit according to embodiments (the unused character removal unit 540 of FIG. 5) may remove unused characters according to preset regular expressions. The regular expressions for unused characters may include a pattern for strings consisting only of special characters, a pattern for the special characters in strings ending with special characters, a pattern for the special characters in strings starting with special characters, a pattern for three-character strings consisting of one letter, one special character, and one letter, a pattern for three-character strings consisting of one digit, one letter, and one digit, and a pattern for strings consisting solely of a single number.

The unused character removal unit according to embodiments may identify and remove strings in the security data that consist only of special characters (for example, @, -, _, and .) according to the regular expression substitution row = re.sub('\s[@\-_.]{2,}\s', ' ', row).

The unused character removal unit according to embodiments may identify and remove the special characters in strings ending with special characters in the security data according to the regular expression special_char_pattern = '([@\-_.]{1,})(\w)'.

The unused character removal unit according to embodiments may identify and remove the special characters in strings starting with special characters in the security data according to the regular expression special_char_pattern = '(\w)([@\-_.]{1,})'.

The unused character removal unit according to embodiments may identify and remove three-character strings in the security data consisting of one letter, one special character, and one letter according to the regular expression special_char_pattern = '\s[a-zA-Z]{1}[0-9]{1}[a-zA-Z]{1}\s'.

The unused character removal unit according to embodiments may identify and remove three-character strings in the security data consisting of one digit, one letter, and one digit according to the regular expression special_char_pattern = '\s[0-9]{1}[a-zA-Z]{1}[0-9]{1}\s'.

The unused character removal unit according to embodiments may identify and remove strings in the security data consisting solely of a single number according to the regular expression special_char_pattern = '\s[0-9]{1,}\s'.

Unused character removal according to the preset regular expressions of the unused character removal unit is not limited to the embodiments described above.
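Two of the unused-character patterns (special-character-only tokens and number-only tokens) can be chained as in the sketch below. The function name `remove_unused` is hypothetical, the patterns are written as raw strings without stray spaces, and replacing each match with a single space follows the `re.sub` call quoted in the description.

```python
import re


def remove_unused(row: str) -> str:
    """Drop whitespace-delimited tokens that carry no similarity signal."""
    # Tokens made only of the special characters @ - _ . (two or more).
    row = re.sub(r'\s[@\-_.]{2,}\s', ' ', row)
    # Tokens made only of digits.
    row = re.sub(r'\s[0-9]{1,}\s', ' ', row)
    return row


cleaned = remove_unused("desk -- 42 end")
```

Both patterns require whitespace on each side of the token, so a token at the very start or end of the row is left untouched by this sketch.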

Accordingly, the personal information removal unit, the special character removal unit, and the unused character removal unit of the similarity measurement module according to embodiments can each perform their removal process efficiently according to preset regular expressions.

FIG. 7 is a flowchart of an APT attack detection method according to embodiments.

The flowchart 700 of FIG. 7 shows the APT attack detection method of the APT attack detection device described with reference to FIGS. 1 to 6 (for example, the APT attack detection device 200 of FIG. 2).

The APT attack detection device according to embodiments (for example, the preprocessing module 210) preprocesses the entire detected data set and classifies the IP packets carrying security events (710). The device may remove security events unrelated to APT attacks based on IP patterns and a threshold, and thereby classify the IP packets carrying security events related to APT attacks. The method of classifying IP packets according to embodiments is the same as described with reference to FIGS. 2 to 6, so a detailed description is omitted.

The APT attack detection device according to embodiments (for example, the network direction change module 220) switches the network direction of at least one IP packet among the IP packets transmitting security events (720). If the at least one IP packet is an outbound IP packet, the device switches the network direction by setting the destination IP of the outbound IP packet as the attacker IP. The network direction switching according to embodiments is the same as described with reference to FIGS. 2 to 6, so a detailed description is omitted.
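
The direction switch of step 720 can be sketched as follows. The rule for outbound packets (destination IP becomes the attacker IP) comes from the text; the `Packet` and `DirectedPacket` types, the `outbound` flag, and the handling of inbound packets (source IP treated as the attacker) are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    # Hypothetical record layout.
    src_ip: str
    dst_ip: str
    outbound: bool


@dataclass(frozen=True)
class DirectedPacket:
    attacker_ip: str
    victim_ip: str


def normalize_direction(packet: Packet) -> DirectedPacket:
    """Express every packet from the attacker's point of view."""
    if packet.outbound:
        # From the description: the destination of an outbound packet
        # is set as the attacker IP.
        return DirectedPacket(attacker_ip=packet.dst_ip, victim_ip=packet.src_ip)
    # Assumption: for inbound packets the source is already the attacker.
    return DirectedPacket(attacker_ip=packet.src_ip, victim_ip=packet.dst_ip)


inbound = normalize_direction(Packet("203.0.113.5", "10.0.0.1", outbound=False))
outbound = normalize_direction(Packet("10.0.0.1", "203.0.113.5", outbound=True))
print(inbound == outbound)
```

After normalization, inbound and outbound traffic involving the same external address share one attacker IP, which is what the later grouping steps rely on.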

The APT attack detection device according to embodiments (for example, the persistent attack classification module 230) determines whether the IP packets, including the IP packet whose network direction has been switched, correspond to a persistent attack, and outputs, from among the IP packets transmitting the security events, the IP packets corresponding to the persistent attack (730). The device may remove IP packets that transmit a single security event and IP packets that transmit consecutive security events lasting less than a preset time. The persistent attack classification method according to embodiments is the same as described with reference to FIGS. 2 to 6, so a detailed description is omitted.
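
A minimal sketch of the filtering in step 730 is given below. It assumes events are available as (attacker IP, timestamp) pairs and approximates the duration of a run of consecutive events by the span between the first and last event of each attacker IP; both choices, and the name `persistent_attackers`, are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def persistent_attackers(events, min_duration):
    """Return attacker IPs whose events look like a persistent attack.

    events: iterable of (attacker_ip, timestamp) pairs (assumed layout).
    """
    by_ip = defaultdict(list)
    for ip, timestamp in events:
        by_ip[ip].append(timestamp)

    result = set()
    for ip, stamps in by_ip.items():
        if len(stamps) < 2:
            continue  # a single security event is removed
        if max(stamps) - min(stamps) < min_duration:
            continue  # events lasting less than the preset time are removed
        result.add(ip)
    return result


t0 = datetime(2022, 1, 12, 9, 0)
events = [
    ("203.0.113.5", t0),
    ("203.0.113.5", t0 + timedelta(hours=2)),
    ("198.51.100.9", t0),
    ("192.0.2.7", t0),
    ("192.0.2.7", t0 + timedelta(minutes=5)),
]
print(persistent_attackers(events, min_duration=timedelta(hours=1)))
```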

The APT attack detection device according to embodiments (for example, the targeted attack classification module 240) determines whether the IP packets corresponding to the persistent attack correspond to a targeted attack, and outputs, from among the IP packets corresponding to the persistent attack, the IP packets corresponding to the targeted attack (740). The device may remove IP packets having only one pattern and IP packets targeting fewer than a preset number of organizations and/or hosts. The device may also group the IP packets corresponding to the targeted attack, among the IP packets corresponding to the persistent attack, by attacker IP to generate a set of IP groups. The targeted attack classification method according to embodiments is the same as described with reference to FIGS. 2 to 6, so a detailed description is omitted.
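
Step 740 can be sketched as a group-then-filter operation. The dictionary keys `attacker_ip`, `target`, and `pattern`, the function name `targeted_groups`, and the reading of "one pattern" as "only one distinct pattern value per attacker IP" are assumptions made for this illustration.

```python
from collections import defaultdict


def targeted_groups(packets, min_targets):
    """Group packets by attacker IP and keep groups that look targeted.

    packets: iterable of dicts with the hypothetical keys
    "attacker_ip", "target" (organization or host), and "pattern".
    """
    groups = defaultdict(list)
    for packet in packets:
        groups[packet["attacker_ip"]].append(packet)

    kept = {}
    for ip, group in groups.items():
        patterns = {p["pattern"] for p in group}
        targets = {p["target"] for p in group}
        if len(patterns) <= 1:
            continue  # only one pattern: removed
        if len(targets) < min_targets:
            continue  # fewer targets than the preset number: removed
        kept[ip] = group
    return kept


packets = [
    {"attacker_ip": "203.0.113.5", "target": "org-a", "pattern": "webshell"},
    {"attacker_ip": "203.0.113.5", "target": "org-b", "pattern": "sql-injection"},
    {"attacker_ip": "198.51.100.9", "target": "org-a", "pattern": "port-scan"},
    {"attacker_ip": "198.51.100.9", "target": "org-b", "pattern": "port-scan"},
    {"attacker_ip": "192.0.2.7", "target": "org-a", "pattern": "webshell"},
    {"attacker_ip": "192.0.2.7", "target": "org-a", "pattern": "sql-injection"},
]
ip_groups = targeted_groups(packets, min_targets=2)
print(sorted(ip_groups))
```

The returned mapping plays the role of the set of IP groups: one entry per attacker IP, holding that attacker's remaining packets.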

The APT attack detection device according to embodiments (for example, the similarity measurement module 250) measures the similarity between each of the output IP packets and one or more IP packets associated with past security event incidents (750). The similarity according to embodiments indicates the similarity between the data in the payload of each IP packet and the data in the payloads of the one or more IP packets associated with past security event incidents. The device may preprocess the data in the payload of each IP packet in order to measure the similarity. For example, the device may remove personal information included in the data in the payload of each IP packet, remove special characters from the data from which the personal information has been removed, remove unused characters from the data from which the special characters have been removed, and segment the strings in the data from which the unused characters have been removed into segments of a certain length. The similarity measurement method according to embodiments is the same as described with reference to FIGS. 2 to 6, so a detailed description is omitted.
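
The payload preprocessing order in step 750 can be sketched as below. The order of the four steps and the tab-based tokenization (stated in the claims) come from the document. The concrete regular expressions, the segment length, and the use of Jaccard similarity over segment sets are assumptions standing in for details this passage does not give.

```python
import re

# Hypothetical patterns; the actual expressions may differ.
EMAIL = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')   # stand-in for personal information
SPECIAL = re.compile(r'[^\w\s]')                      # anything that is not a word char or whitespace
NUMERIC = re.compile(r'(?<!\S)[0-9]+(?!\S)')          # digit-only tokens (lookaround variant)


def preprocess_payload(payload: str, seg_len: int = 4) -> set:
    """Apply the four preprocessing steps and return the set of segments."""
    text = EMAIL.sub("", payload)      # 1. remove personal information
    text = SPECIAL.sub("", text)       # 2. remove special characters
    text = NUMERIC.sub("", text)       # 3. remove unused characters
    segments = set()
    for token in text.split("\t"):     # tokenize on the tab character
        token = token.strip()
        # 4. cut each token into pieces of at most seg_len characters
        for start in range(0, len(token), seg_len):
            segments.add(token[start:start + seg_len])
    return segments


def jaccard(a: set, b: set) -> float:
    """Assumed similarity measure: shared segments over all segments."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


current = preprocess_payload("cmd.exe\tuser@example.com\t123\t/bin/sh")
past = preprocess_payload("cmd.exe\t/bin/sh")
print(current, jaccard(current, past))
```

In this example the e-mail address and the numeric token contribute nothing to the segment set, so the two payloads compare as identical.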

The APT attack detection device according to embodiments (for example, the black IP determination module 260) classifies the address of an IP packet as a black IP address or a suspicious IP address based on the similarity (760). The device classifies the address of the IP packet as a black IP address when the similarity is greater than a preset value, and as a suspicious IP address when the similarity is less than the preset value. The black IP determination module 260 according to embodiments may also preset a packet count for each organization or host and, when the number of packets whose similarity is greater than or equal to a preset limit value is greater than or equal to the preset packet count, designate those packets as black IPs. The preset limit value and the preset packet count may be set and changed by a system administrator. The black IP determination method according to embodiments is the same as described with reference to FIGS. 2 to 6, so a detailed description is omitted.
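
The decision in step 760 can be sketched as a threshold plus a packet-count check. The function name `classify_ip`, the string labels, the example values, and the per-organization lookup table are hypothetical. The sketch uses "greater than or equal to" for both comparisons, following the per-organization rule; how exact equality is handled under the simpler rule is not stated and is an assumption here.

```python
def classify_ip(similarities, similarity_limit, min_packets=1):
    """Classify one attacker IP from the similarities of its packets.

    similarities: similarity scores of the packets sent by this IP.
    similarity_limit: preset limit value (administrator-configurable).
    min_packets: preset packet count for the targeted organization or host.
    """
    hits = sum(1 for s in similarities if s >= similarity_limit)
    return "black" if hits >= min_packets else "suspicious"


# Hypothetical per-organization packet counts set by an administrator.
packet_count_by_org = {"org-a": 2, "org-b": 1}

print(classify_ip([0.9, 0.95, 0.2], 0.8, packet_count_by_org["org-a"]))
print(classify_ip([0.9, 0.2], 0.8, packet_count_by_org["org-a"]))
```

With `min_packets=1` the function reduces to the simple rule: one sufficiently similar packet is enough to mark the address as a black IP address.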

The components of the APT attack detection device according to the embodiments described with reference to FIGS. 1 to 7 may be implemented as hardware including one or more processors coupled with memory, software, firmware, or a combination thereof. The components of the APT attack detection device according to embodiments may be implemented as a single chip, for example, a single hardware circuit. Alternatively, the components may each be implemented as separate chips. In addition, at least one of the components of the APT attack detection device according to embodiments may be composed of one or more processors capable of executing one or more programs, and the one or more programs may perform, or include instructions for performing, any one or more of the operations/methods of the APT attack detection device described with reference to FIGS. 1 to 7.

For convenience of explanation, the drawings have been described separately, but the embodiments described in the respective drawings may be combined to implement a new embodiment. Designing, as needed by a person skilled in the art, a computer-readable recording medium on which a program for executing the previously described embodiments is recorded also falls within the scope of rights of the embodiments. The devices and methods according to embodiments are not limited to the configurations and methods of the embodiments described above; all or part of each embodiment may be selectively combined so that various modifications can be made. Although preferred examples have been shown and described, the embodiments are not limited to the specific examples described above. Various modifications may of course be made by a person having ordinary skill in the art to which the invention pertains without departing from the gist of the embodiments claimed in the claims, and such modifications should not be understood separately from the technical idea or perspective of the embodiments. The descriptions of the devices and methods according to embodiments may be applied to complement each other.

The various components of the device according to embodiments may be configured by hardware, software, firmware, or a combination thereof. The various components of the embodiments may be implemented as a single chip, for example, a single hardware circuit. Depending on the embodiment, the components may each be implemented as separate chips. Depending on the embodiment, at least one of the components of the device may be composed of one or more processors capable of executing one or more programs, and the one or more programs may perform, or include instructions for performing, any one or more of the operations/methods according to embodiments. Executable instructions for performing the methods/operations of the device according to embodiments may be stored in a non-transitory CRM or other computer program product configured to be executed by one or more processors, or in a transitory CRM or other computer program product configured to be executed by one or more processors. The memory according to embodiments is used as a concept that covers not only volatile memory (for example, RAM) but also non-volatile memory, flash memory, PROM, and the like. It may also include implementation in the form of a carrier wave, such as transmission over the Internet. In addition, a processor-readable recording medium may be distributed over computer systems connected by a network, so that processor-readable code is stored and executed in a distributed manner.

In this document, “/” and “,” are interpreted as “and/or.” For example, “A/B” is interpreted as “A and/or B,” and “A, B” is interpreted as “A and/or B.” Additionally, “A/B/C” means “at least one of A, B, and/or C,” and “A, B, C” also means “at least one of A, B, and/or C.” Furthermore, “or” in this document is interpreted as “and/or.” For example, “A or B” may mean 1) only “A,” 2) only “B,” or 3) “A and B.” In other words, “or” in this document may mean “additionally or alternatively.”

Terms such as first and second may be used to describe various components of the embodiments. However, the interpretation of the various components according to embodiments should not be limited by these terms. The terms are used only to distinguish one component from another. For example, a first user input signal may be referred to as a second user input signal. Similarly, the second user input signal may be referred to as the first user input signal. Such use of these terms should be interpreted as not departing from the scope of the various embodiments. The first user input signal and the second user input signal are both user input signals, but they do not mean the same user input signal unless the context clearly indicates otherwise.

The terminology used to describe the embodiments is used for the purpose of describing specific embodiments and is not intended to limit the embodiments. As used in the description of the embodiments and in the claims, the singular is intended to include the plural unless the context clearly indicates otherwise. The expression “and/or” is used to cover all possible combinations of the listed terms. The expression “includes” states that the named features, numbers, steps, elements, and/or components are present, and does not mean that additional features, numbers, steps, elements, and/or components are excluded. Conditional expressions such as “if” and “when” used to describe the embodiments are not to be interpreted as limited to optional cases. They are intended to mean that, when a specific condition is satisfied, the relevant operation is performed, or the relevant definition is interpreted, in response to that condition.

Claims

An APT attack detection method comprising:
preprocessing the entire detected data set to classify IP packets transmitting security events;
switching the network direction of at least one IP packet among the IP packets transmitting the security events;
determining whether the IP packets, including the IP packet whose network direction has been switched, correspond to a persistent attack, and outputting, among the IP packets transmitting the security events, IP packets corresponding to the persistent attack;
determining whether the IP packets corresponding to the persistent attack correspond to a targeted attack, and outputting, among the IP packets corresponding to the persistent attack, IP packets corresponding to the targeted attack;
measuring a similarity between each of the output IP packets and one or more IP packets associated with a past security event incident; and
classifying the address of the IP packet as a black IP address or a suspicious IP address based on the similarity,
wherein the similarity indicates the similarity between data in the payload of each IP packet and data in the payload of the one or more IP packets associated with the past security event incident, and
wherein measuring the similarity between each of the output IP packets and the one or more IP packets associated with the past security event incident comprises:
removing personal information included in the data in the payload of each IP packet;
removing special characters from the data from which the personal information has been removed;
removing unused characters from the data from which the special characters have been removed; and
segmenting a string in the data from which the unused characters have been removed into segments of a certain length, wherein the string is tokenized based on a tab character.

(deleted)

The method of claim 1, wherein classifying the address of the IP packet as a black IP address or a suspicious IP address based on the similarity comprises:
classifying the address of the IP packet as a black IP address if the similarity is greater than a preset value, and classifying the address of the IP packet as a suspicious IP address if the similarity is less than the preset value.

The method of claim 4, wherein classifying the address of the IP packet as a black IP address or a suspicious IP address based on the similarity comprises:
determining, if the number of IP packets having a similarity greater than the preset value is greater than a preset number of packets, the addresses of the IP packets having a similarity greater than the preset value as black IP addresses, wherein the preset number of packets is set for each organization or host.

The method of claim 4, wherein preprocessing the entire detected data set to classify IP packets transmitting security events comprises:
removing security events unrelated to APT attacks based on IP patterns and thresholds, thereby classifying IP packets that transmit security events related to APT attacks.

The method of claim 6, wherein switching the network direction of at least one IP packet among the IP packets transmitting the security events comprises:
switching the network direction, if the at least one IP packet is an outbound IP packet, by setting the destination IP of the outbound IP packet as the attacker IP.

The method of claim 7, wherein determining whether the IP packets, including the IP packet whose network direction has been switched, correspond to a persistent attack, and outputting, among the IP packets transmitting the security events, the IP packets corresponding to the persistent attack comprises:
removing IP packets that transmit a single security event and IP packets that transmit consecutive security events lasting less than a preset time.

The method of claim 8, wherein determining whether the IP packets corresponding to the persistent attack correspond to a targeted attack, and outputting, among the IP packets corresponding to the persistent attack, the IP packets corresponding to the targeted attack comprises:
removing IP packets having only one pattern and IP packets targeting fewer than a preset number of organizations and/or hosts.

The method of claim 9, wherein the IP packets corresponding to the targeted attack among the IP packets corresponding to the persistent attack form a set of IP groups created by grouping by attacker IP.

An APT attack detection device comprising:
a preprocessing module that preprocesses the entire detected data set to classify IP packets transmitting security events;
a network direction change module that switches the network direction of at least one IP packet among the IP packets transmitting the security events;
a persistent attack classification module that determines whether the IP packets, including the IP packet whose network direction has been switched, correspond to a persistent attack, and outputs, among the IP packets transmitting the security events, IP packets corresponding to the persistent attack;
a targeted attack classification module that determines whether the IP packets corresponding to the persistent attack correspond to a targeted attack, and outputs, among the IP packets corresponding to the persistent attack, IP packets corresponding to the targeted attack;
a similarity measurement module that measures a similarity between each of the output IP packets and one or more IP packets associated with a past security event incident; and
a black IP determination module that classifies the address of the IP packet as a black IP address or a suspicious IP address based on the similarity,
wherein the similarity indicates the similarity between data in the payload of each IP packet and data in the payload of the one or more IP packets associated with the past security event incident, and
wherein the similarity measurement module removes personal information included in the data in the payload of each IP packet, removes special characters from the data from which the personal information has been removed, removes unused characters from the data from which the special characters have been removed, and segments a string in the data from which the unused characters have been removed into segments of a certain length, the string being tokenized based on a tab character.

(deleted)

The device of claim 11, wherein the black IP determination module
classifies the address of the IP packet as a black IP address if the similarity is greater than a preset value, and classifies the address of the IP packet as a suspicious IP address if the similarity is less than the preset value.

The device of claim 14, wherein the black IP determination module
determines, if the number of IP packets having a similarity greater than the preset value is greater than a preset number of packets, the addresses of the IP packets having a similarity greater than the preset value as black IP addresses, wherein the preset number of packets is set for each organization or host.

The device of claim 14, wherein the preprocessing module
removes security events unrelated to APT attacks based on IP patterns and thresholds, thereby classifying IP packets that transmit security events related to APT attacks.

The device of claim 16, wherein the network direction change module
switches the network direction, if the at least one IP packet is an outbound IP packet, by setting the destination IP of the outbound IP packet as the attacker IP.

The device of claim 17, wherein the persistent attack classification module
removes IP packets that transmit a single security event and IP packets that transmit consecutive security events lasting less than a preset time.

The device of claim 18, wherein the targeted attack classification module
removes IP packets having only one pattern and IP packets targeting fewer than a preset number of organizations and/or hosts.

The device of claim 19, wherein the IP packets corresponding to the targeted attack among the IP packets corresponding to the persistent attack form a set of IP groups created by grouping by attacker IP.