KR102120232B1

KR102120232B1 - Cyber targeted attack detect system and method using kalman-filter algorithm

Info

Publication number: KR102120232B1
Application number: KR1020190139413A
Authority: KR
Inventors: 남기효; 정문권; 이희웅
Original assignee: (주)유엠로직스
Priority date: 2019-11-04
Filing date: 2019-11-04
Publication date: 2020-06-16

Abstract

The present invention relates to a cyber targeted attack detection system using a Kalman filter algorithm and a method therefor. More particularly, the system includes: an attack collection unit (100) that collects information related to cyber targeted attacks that have already occurred; an attack analysis unit (200) that constructs big data from the information collected by the attack collection unit (100) and calculates the risk of the corresponding cyber targeted attack by analyzing the big data; a keyword collection unit (300) that collects issue keywords by analyzing morphemes contained in various text materials on the Internet; a keyword analysis unit (400) that selects social issue keywords by performing an association analysis among the issue keywords collected by the keyword collection unit (300); and an integrated analysis unit (500) that determines, through a correlation analysis between the risk information calculated by the attack analysis unit (200) and the social issue keywords selected by the keyword analysis unit (400), whether the corresponding cyber targeted attack is a social-issue-type cyber targeted attack, analyzes server operation data collected in real time, and detects cyber targeted attacks from the attack occurrence risk of the corresponding site obtained using the Kalman filter algorithm.

Description

Cyber targeted attack detect system and method using kalman-filter algorithm

The present invention relates to a cyber targeted attack detection system using a Kalman filter algorithm and a detection method thereof, and more particularly to a system and method that correlate social issues with threat information so that security against cyber targeted attacks can be strengthened and such attacks can be prevented.

The invention further relates to a cyber targeted attack detection system and detection method that save analysis time (detection time) by performing the data analysis with a Kalman filter algorithm.

A cyber targeted attack is carried out through a set of stealthy, persistent computer hacking processes by people who target a specific entity, usually an individual, an organization, a nation, a business, or a political group (or its servers).

Such an attack requires a considerable degree of stealth over a long period. It uses malicious software to exploit vulnerabilities in the targeted system, and to build that software the attacker continuously monitors and extracts data about the targets from outside.

Because cyber targeted attacks exploit unknown vulnerabilities, the signature-based detection of existing security systems has difficulty defending against them. The attacks also proceed very slowly over a long period in order to defeat systems that detect abnormal traffic, so the victim may not even recognize the damage.

To address this problem, Korean Patent Laid-Open Publication No. 10-2014-0077405 ("Cyber Attack Detection Apparatus and Method") discloses a technique that collects information sources related to cyber targeted attacks over a preset period and then detects attack behavior by comparing their similarity with pre-stored normal behavior.

Korean Patent Laid-Open Publication No. 10-2014-0077405 (published 2014.06.24)

The present invention has been devised to solve the problems of the prior art described above. Its object is to provide a cyber targeted attack detection system using a Kalman filter algorithm, and a detection method thereof, that correlate past cyber targeted attacks with the social issues that arose in the same period, analyze social-issue-type cyber targeted attacks, and thereby respond preemptively to social-issue-type cyber targeted attacks that may occur.

A cyber targeted attack detection system using a Kalman filter algorithm according to an embodiment of the present invention preferably comprises: an attack collection unit (100) that collects information related to cyber targeted attacks that have already occurred; an attack analysis unit (200) that analyzes the information assembled into big data by the attack collection unit (100) and calculates the risk of the corresponding cyber targeted attack; a keyword collection unit (300) that analyzes the morphemes contained in various text materials on the Internet and collects issue keywords; a keyword analysis unit (400) that performs an association analysis among the issue keywords collected by the keyword collection unit (300) and selects social issue keywords; and an integrated analysis unit (500) that determines, through a correlation analysis between the risk information calculated by the attack analysis unit (200) and the social issue keywords selected by the keyword analysis unit (400), whether the corresponding cyber targeted attack is a social-issue-type cyber targeted attack, analyzes server operation data collected in real time, and detects cyber targeted attacks from the attack occurrence risk of the corresponding site obtained using the Kalman filter algorithm.

Furthermore, the attack collection unit (100) preferably receives log information on cyber targeted attacks that have already occurred from external sources, collects it, and assembles it into big data.

Furthermore, the attack analysis unit (200) preferably further comprises: a formalization unit (210) that formalizes the big data according to preset criteria; a statistical analysis unit (220) that performs preset statistical analyses on the formalized data; a learning unit (230) that trains on the formalized data using deep learning; and a risk calculation unit (240) that calculates, from the learning results of the learning unit (230), a risk level for detecting cyber targeted attacks.

Furthermore, the keyword collection unit (300) preferably further comprises a text collection unit (310) that receives, from outside, the period for which data is to be collected and collects various text materials on the Internet from that period, and an issue keyword setting unit (320) that sets issue keywords through morphological analysis of the collected text materials, wherein the issue keyword setting unit (320) excludes preset blacklist morphemes from the analyzed morphemes and sets the remaining morphemes as the issue keywords.

Furthermore, the keyword analysis unit (400) preferably further comprises a keyword grouping unit (410) that groups the issue keywords set by the issue keyword setting unit (320) according to preset criteria, and a social issue selection unit (420) that performs an association analysis among the issue keywords within each group and selects social issue keywords based on keyword occurrence frequency.
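
The grouping and frequency-based selection attributed to the keyword grouping unit (410) and the social issue selection unit (420) can be sketched roughly as follows. This is only an illustration under stated assumptions: the specification does not fix the grouping criterion, so a month string is used as a hypothetical group key; the association analysis itself is not modeled; and the names `select_social_issue_keywords` and `top_n` are invented for the example.

```python
from collections import Counter, defaultdict


def select_social_issue_keywords(records, top_n=2):
    """Pick the most frequent issue keywords in each group.

    records: iterable of (group_key, keyword) pairs, where group_key is a
    hypothetical grouping criterion (here, a "YYYY-MM" string).
    Returns {group_key: [keywords ordered by descending frequency]}.
    """
    groups = defaultdict(Counter)
    for group_key, keyword in records:
        groups[group_key][keyword] += 1
    return {
        key: [kw for kw, _ in counts.most_common(top_n)]
        for key, counts in groups.items()
    }


# Hypothetical issue keywords, already tagged with their group.
records = [
    ("2018-02", "olympics"), ("2018-02", "olympics"), ("2018-02", "ticket"),
    ("2018-04", "summit"), ("2018-04", "summit"), ("2018-04", "summit"),
    ("2018-04", "border"), ("2018-04", "border"), ("2018-04", "weather"),
]
selected = select_social_issue_keywords(records, top_n=2)
```

A real system would likely replace the plain frequency count with whatever association measure the keyword analysis unit (400) applies within each group.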

Furthermore, the integrated analysis unit (500) preferably further comprises a correlation analysis unit (510) that determines, through a correlation analysis between the risk level for detecting cyber targeted attacks calculated by the attack analysis unit (200) and the social issue keywords selected by the keyword analysis unit (400), whether the corresponding cyber targeted attack is a social-issue-type cyber targeted attack, and a risk prediction unit (520) that analyzes the operation data of a specific server received in real time to predict the occurrence risk of a social-issue-type cyber targeted attack on that server and corrects the predicted occurrence risk using the Kalman filter algorithm.

A cyber targeted attack detection method using a Kalman filter algorithm according to an embodiment of the present invention preferably comprises: an attack analysis step (S100) of collecting information related to cyber targeted attacks that have already occurred, assembling the collected information into big data, and calculating the risk of cyber targeted attacks from it; a social issue analysis step (S200) of collecting issue keywords by analyzing morphemes contained in various text materials on the Internet and selecting social issue keywords from among them through an association analysis of the collected issue keywords; and an integrated analysis step (S300) of determining, through a correlation analysis between the risk information calculated in the attack analysis step (S100) and the social issue keywords selected in the social issue analysis step (S200), whether a past cyber targeted attack was a social-issue-type cyber targeted attack, and of analyzing server operation data collected in real time to determine, using the Kalman filter algorithm, the risk that a social-issue-type cyber targeted attack will occur at the corresponding site.

Furthermore, the attack analysis step (S100) preferably comprises: a big data construction step (S110) of receiving log information on cyber targeted attacks that have already occurred from external sources, collecting it, and assembling it into big data; a formalization step (S120) of formalizing the big data according to preset criteria; a statistical analysis step (S130) of performing preset statistical analyses on the formalized data; a learning step (S140) of training on the formalized data using deep learning; and a risk calculation step (S150) of calculating, from the learning results, a risk level for detecting cyber targeted attacks.

Furthermore, the social issue analysis step (S200) preferably comprises: a text collection step (S210) of receiving, from outside, the period for which data is to be collected and collecting various text materials on the Internet from that period; an issue keyword setting step (S220) of setting issue keywords through morphological analysis of the collected text materials; a keyword grouping step (S230) of grouping the issue keywords according to preset criteria; and a social issue keyword selection step (S240) of performing an association analysis among the issue keywords within each group and selecting social issue keywords based on keyword occurrence frequency.

Furthermore, the issue keyword setting step (S220) preferably excludes preset blacklist morphemes from the analyzed morphemes and sets the remaining morphemes as the issue keywords.

Furthermore, the integrated analysis step (S300) preferably comprises a correlation analysis step (S310) of determining, through a correlation analysis between the risk information for detecting cyber targeted attacks calculated in the attack analysis step (S100) and the social issue keywords selected in the social issue analysis step (S200), whether a past cyber targeted attack was a social-issue-type cyber targeted attack, and a risk prediction step (S320) of analyzing the operation data of a specific server received in real time to predict the occurrence risk of a social-issue-type cyber targeted attack on that server, wherein the risk prediction step (S320) corrects the predicted occurrence risk using the Kalman filter algorithm.
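
The specification does not give the state model or noise parameters used in the risk prediction step (S320), so the following is only a minimal sketch of how a Kalman filter could correct a stream of predicted risk values. It assumes a scalar risk state that follows a random walk; the function name `kalman_correct`, the noise variances `q` and `r`, and the sample data are all hypothetical.

```python
def kalman_correct(observations, q=1e-3, r=0.05, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    observations: raw per-interval risk predictions (hypothetical input).
    q, r: assumed process and measurement noise variances.
    x0, p0: initial state estimate and its variance.
    Returns the corrected risk estimate after each observation.
    """
    x, p = x0, p0
    corrected = []
    for z in observations:
        # Predict: the risk is assumed to persist; uncertainty grows by q.
        p = p + q
        # Update: blend the prediction and the observation via the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        corrected.append(x)
    return corrected


# Hypothetical raw risk predictions with one isolated spike.
raw_risk = [0.20, 0.25, 0.22, 0.90, 0.24, 0.21]
smoothed = kalman_correct(raw_risk, x0=raw_risk[0])
```

With these assumed parameters the isolated spike is damped rather than followed exactly; choosing a larger `q` or a smaller `r` would make the filter track sudden changes more closely.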

With the configuration described above, the cyber targeted attack detection system using a Kalman filter algorithm and the detection method thereof start from the finding, obtained by analyzing data from past cyber targeted attacks, that each attack group (attacker organization) shows a characteristic relationship between the social or economic issues it exploits (for example, summits, the Olympics, cryptocurrency) and its attack techniques. By correlating social issues with the cyber targeted attack techniques that occur, the invention has the advantage of responding preemptively to cyber targeted attacks that a target group (targeted organization) harboring vulnerabilities may suffer.

In particular, correlating social issues with the cyber targeted attacks that occur makes it possible to predict and quickly detect the attacker's attack path, the source of malware infection, and the like, so that potential targeted attacks can be addressed preemptively.

In addition, using the Kalman filter algorithm saves detection time when analyzing the large volume of information on social issues and cyber targeted attacks, so that cyber targeted attacks can be analyzed and detected more quickly.

As a result, damage to a target group from cyber targeted attacks tied to social issues arising in real time can be minimized. When a social issue that was associated with cyber targeted attacks in the past appears again, the expected target groups can strengthen their security and, through continuous monitoring, prevent cyber targeted attacks in advance and take preventive and prompt action.

FIG. 1 is a block diagram showing a cyber targeted attack detection system using a Kalman filter algorithm according to an embodiment of the present invention.
FIGS. 2 to 6 are detailed configuration and operation diagrams of each component of the cyber targeted attack detection system using a Kalman filter algorithm according to an embodiment of the present invention.
FIG. 7 is a flowchart showing a cyber targeted attack detection method using a Kalman filter algorithm according to an embodiment of the present invention.

Hereinafter, the cyber targeted attack detection system using a Kalman filter algorithm and the detection method thereof according to the present invention are described in detail with reference to the accompanying drawings. The drawings introduced below are provided as examples so that the spirit of the present invention can be fully conveyed to those skilled in the art. Accordingly, the present invention is not limited to the drawings presented below and may be embodied in other forms. Like reference numerals denote like elements throughout the specification.

Unless otherwise defined, the technical and scientific terms used herein have the meanings commonly understood by those of ordinary skill in the art to which this invention belongs, and descriptions of well-known functions and configurations that could unnecessarily obscure the gist of the present invention are omitted from the following description and the accompanying drawings.

In addition, a system means a set of components, including devices, mechanisms, and means, that are organized and interact regularly to perform a required function.

The cyber targeted attack detection system using a Kalman filter algorithm and the detection method thereof according to an embodiment of the present invention preferably correlate social issues with cyber targeted attacks that have occurred, and thereby detect the attacker's attack path, the source of malware infection, and the like. The aim is to respond preemptively to cyber targeted attacks, especially social-issue-type cyber targeted attacks, that a target group harboring vulnerabilities (an individual, organization, nation, business, political group, etc.) may suffer.

In particular, using the Kalman filter algorithm reduces the detection time, which can become a problem when large volumes of data are analyzed, and thereby maximizes the effect.

The system and method according to an embodiment of the present invention preferably collect big data on cyber targeted attacks that occurred in the past and analyze it to calculate the risk of cyber targeted attacks, collect keywords on social issues that arose in the past and analyze social issues from those keywords, and then use the results of both analyses to detect 'social-issue-type cyber targeted attacks'.

In particular, by learning the patterns that appear when social-issue-type cyber targeted attacks occur, the system can detect 'social-issue-type cyber targeted attacks' that are likely to appear in the future, and strengthened security in response can prevent damage from cyber attacks in advance.

FIGS. 1 to 6 are diagrams showing the configuration of the cyber targeted attack detection system using a Kalman filter algorithm according to an embodiment of the present invention, and the system is described in detail below with reference to FIGS. 1 to 6.

As shown in FIG. 1, the cyber targeted attack detection system using a Kalman filter algorithm according to an embodiment of the present invention preferably comprises an attack collection unit (100), an attack analysis unit (200), a keyword collection unit (300), a keyword analysis unit (400), and an integrated analysis unit (500).

Preferably, the system further comprises a database unit (600) that receives the data generated by each component, organizes it into a database, and stores and manages it.

Each component is described in detail below.

The attack collection unit (100) preferably collects information related to cyber targeted attacks that have already occurred.

In detail, the attack collection unit (100) preferably collects attack logs of cyber targeted attacks that occurred in the past.

That is, detecting cyber targeted attacks that are occurring now, or that may occur, requires analysis of cyber targeted attacks that occurred in the past. Accordingly, the unit preferably receives log information on all types of cyber targeted attacks, not just one particular type, and collects it into big data.

To this end, as shown in FIG. 2, the attack collection unit (100) preferably designates and manages various cyber security monitoring organizations (security operations centers, etc.) that hold log information related to cyber targeted attacks, and receives from them the log information they hold on all types of cyber targeted attacks.

Rather than simply gathering all the log information into big data, the unit preferably determines whether the received log information is in database format or file format. If it is in database format, the unit connects to it via SQL and stores the content in the database unit (600) as 'cyber targeted attack data'; if it is in file format, the unit extracts the file and stores it in the database unit (600) as 'cyber targeted attack data'.
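
The format check described above could look roughly like the sketch below. The specification names neither a database product nor a file layout, so several assumptions are made purely for illustration: database-format input is taken to be a SQLite file (recognized by its standard header) with a hypothetical `logs(line)` table, file-format input is taken to be plain text with one log entry per line, and the database unit (600) is stood in for by an in-memory table named `cyber_attack_data`.

```python
import os
import sqlite3
import tempfile

SQLITE_MAGIC = b"SQLite format 3\x00"  # header of every SQLite database file


def is_database_format(path):
    """Return True if the file at `path` starts with the SQLite header."""
    with open(path, "rb") as fh:
        return fh.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


def ingest(path, store):
    """Copy log lines from `path` into the hypothetical cyber_attack_data table."""
    if is_database_format(path):
        # Database format: connect via SQL and read the assumed `logs` table.
        src = sqlite3.connect(path)
        rows = [r[0] for r in src.execute("SELECT line FROM logs")]
        src.close()
    else:
        # File format: extract the non-empty lines directly.
        with open(path, encoding="utf-8") as fh:
            rows = [line.rstrip("\n") for line in fh if line.strip()]
    store.executemany(
        "INSERT INTO cyber_attack_data(line) VALUES (?)", [(r,) for r in rows]
    )
    return len(rows)


# Stand-in for the database unit (600).
store = sqlite3.connect(":memory:")
store.execute("CREATE TABLE cyber_attack_data(line TEXT)")

with tempfile.TemporaryDirectory() as tmp:
    # One hypothetical source in database format ...
    db_path = os.path.join(tmp, "soc_a.db")
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE logs(line TEXT)")
    db.execute("INSERT INTO logs VALUES ('2019-01-02 10.0.0.5:443 scan')")
    db.commit()
    db.close()

    # ... and one in file format.
    txt_path = os.path.join(tmp, "soc_b.log")
    with open(txt_path, "w", encoding="utf-8") as fh:
        fh.write("2019-01-03 10.0.0.9:22 bruteforce\n")

    n_db = ingest(db_path, store)
    n_file = ingest(txt_path, store)

total = store.execute("SELECT COUNT(*) FROM cyber_attack_data").fetchone()[0]
```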

The attack analysis unit (200) preferably analyzes the information collected and assembled into big data by the attack collection unit (100) and calculates the risk of the corresponding cyber targeted attack.

As shown in FIG. 1, the attack analysis unit (200) preferably comprises a formalization unit (210), a statistical analysis unit (220), a learning unit (230), and a risk calculation unit (240).

The formalization unit (210) preferably receives the big data from the attack collection unit (100) and formalizes it according to preset criteria, and the statistical analysis unit (220) preferably performs preset statistical analyses on the data formalized by the formalization unit (210).

In addition, the learning unit (230) preferably trains, using deep learning, on the data formalized by the formalization unit (210), and the risk calculation unit (240) preferably calculates, from the learning results of the learning unit (230), a risk level for detecting cyber targeted attacks.

In detail, as shown in FIG. 3, the attack analysis unit (200) applies deep learning and statistical analysis to the information that has been collected from each server (various cyber security monitoring organizations, etc.), assembled into big data, and stored and managed in the database unit (600) as 'cyber targeted attack data'.

Preferably, the formalization performed by the formalization unit (210) precedes this analysis.

Formalization here means converting data of various types, such as characters, strings, and integers, into numeric form so that it can be analyzed. The statistical analysis unit (220) then preferably performs various preset statistical analyses, such as frequency analysis and regression analysis, on the numeric (that is, formalized) data, using the attack date, attack type, attack target, and the like as input variables.
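
As a rough illustration of formalization followed by a frequency analysis, the sketch below encodes attack date, attack type, and attack target as numbers. The encoding scheme (integer codes in order of first appearance, dates as ordinals), the function name `formalize`, and the sample records are assumptions, since the specification only says that the data is converted to numeric form according to preset criteria.

```python
from collections import Counter
from datetime import date


def formalize(records):
    """Map (date, attack_type, target) string records to numeric tuples.

    Categorical values get integer codes in order of first appearance;
    ISO dates become proleptic Gregorian ordinals.
    """
    type_codes, target_codes = {}, {}
    rows = []
    for day, attack_type, target in records:
        t = type_codes.setdefault(attack_type, len(type_codes))
        g = target_codes.setdefault(target, len(target_codes))
        rows.append((date.fromisoformat(day).toordinal(), t, g))
    return rows, type_codes, target_codes


# Hypothetical attack log records.
records = [
    ("2019-03-01", "phishing", "gov"),
    ("2019-03-02", "ddos", "bank"),
    ("2019-03-02", "phishing", "bank"),
]
rows, type_codes, target_codes = formalize(records)

# Frequency analysis over the encoded attack types.
type_freq = Counter(t for _, t, _ in rows)
```

The same numeric rows could feed a regression or the deep learning step; integer codes are used here only because they are the simplest encoding to show.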

In addition, the learning unit (230) trains on the formalized data using deep learning. The risk calculation unit (240) then assigns a weight to each learning result and computes their weighted sum to calculate the risk level for detecting cyber targeted attacks. Examples of learning results are the main attack patterns learned per attack date and attack time slot, the attack patterns learned per attacker group (source IP information, source port information, etc.), the attack patterns learned per main victim organization (destination IP information, destination port information, etc.), and the attack patterns learned from payloads.
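
The weighted sum can be sketched as follows. The specification does not state the weights or the scale of the learning results, so the score names, the values in `scores` and `weights`, and the function name `weighted_risk` are hypothetical; each score is assumed to lie in [0, 1] and the weights are assumed to sum to 1.

```python
def weighted_risk(scores, weights):
    """Return the weighted sum of per-pattern learning scores.

    scores and weights are dicts keyed by learning-result name.
    Every weighted name must have a score.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"no score for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical learning results, one per pattern family named in the text.
scores = {
    "time_pattern": 0.8,     # attack date / time-slot patterns
    "attacker_group": 0.5,   # source IP / port patterns
    "victim_org": 0.4,       # destination IP / port patterns
    "payload": 0.9,          # payload patterns
}
weights = {
    "time_pattern": 0.2,
    "attacker_group": 0.3,
    "victim_org": 0.1,
    "payload": 0.4,
}
risk = weighted_risk(scores, weights)
```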

Preferably, the learning unit (230) learns not only the formalized data but also the result data of the statistical analysis unit (220).

Here, the risk level for detecting cyber targeted attacks calculated by the risk calculation unit (240) means a criterion for detecting cyber targeted attacks. Patterns that appear before a cyber targeted attack are learned from behavior information that occurs frequently in past cyber targeted attacks (payload, IP address, targeted organization, etc.), and the result is preferably stored and managed in the database unit (600) as 'cyber targeted attack analysis result data'.

In other words, the risk level calculated by the risk calculation unit (240) is obtained by learning the characteristic patterns that appear before (for example, immediately before) a cyber targeted attack occurs. When a pattern similar or identical to a learned pattern is later recognized in server operation data analyzed in real time (for example, the operation data of a specific website), it can be determined that a risk of a cyber targeted attack exists.

상기 키워드 수집부(300)는 인터넷 상의 다양한 텍스트 자료에 포함된 형태소를 분석하여, 이슈 키워드를 수집하는 것이 바람직하다.It is preferable that the keyword collection unit 300 analyzes the morphemes included in various text data on the Internet, and collects the issue keywords.

To this end, as shown in FIG. 1, the keyword collection unit 300 may include a text collection unit 310 and an issue keyword setting unit 320.

The text collection unit 310 preferably receives, from the outside, information on the period to be collected and collects various text materials on the Internet for that period. The issue keyword setting unit 320 preferably sets issue keywords through morpheme analysis of the text materials collected by the text collection unit 310.

Here, the issue keyword setting unit 320 preferably excludes preset blacklist morphemes from the morphemes obtained from the text materials of the text collection unit 310 and sets the remaining morphemes as the issue keywords.

In this way, the keyword collection unit 300 uses news sites and social networking services (SNS) on the Internet to collect the various keywords that appeared during a preset period.

In detail, as shown in FIG. 4, all articles uploaded through portal sites (search sites), the sites of various broadcasters, the SNS accounts of those broadcasters, and the like are preferably collected as text materials.

Here, the URL links of the portal sites (search sites), broadcaster sites, and broadcasters' SNS accounts are entered, and the parameter portion that contains the upload date of the articles is separated out and supplied to the text collection unit 310, so that articles can be collected as text materials by date.

To improve the efficiency of the analysis, the period is preferably set either to a preset, recurring interval or to a specific period received as input, and all articles within that period are collected.

For example, when an external administrator (user) enters a desired period, the text collection unit 310 uses a crawling module to collect the text materials contained in all articles that fall within that period.
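
As a rough illustration of separating the date parameter from an article URL and keeping only articles inside the requested period, a sketch might look like the following. The query parameter name `date`, its `YYYYMMDD` format, and the example URL are assumptions made for this example; real sites use their own URL schemes, and the description does not specify one.

```python
from datetime import date, datetime
from urllib.parse import parse_qs, urlparse


def article_date(url, param="date"):
    """Extract the upload date from a URL query parameter (hypothetical scheme)."""
    values = parse_qs(urlparse(url).query).get(param)
    if not values:
        return None
    try:
        return datetime.strptime(values[0], "%Y%m%d").date()
    except ValueError:
        return None  # parameter present but not in the assumed format


def in_period(url, start, end, param="date"):
    """True when the article's upload date falls within [start, end]."""
    uploaded = article_date(url, param)
    return uploaded is not None and start <= uploaded <= end


# Hypothetical article URL used only for illustration.
url = "https://news.example.com/list?section=politics&date=20191104"
keep = in_period(url, date(2019, 11, 1), date(2019, 11, 30))
```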

The issue keyword setting unit 320 extracts keywords by performing morpheme analysis on the text materials collected by the text collection unit 310. As described above, if a keyword is meaningless or appears routinely, it is preferably registered in the blacklist and excluded, and the remaining morphemes are set as issue keywords.

Examples of blacklist entries include reporter names, news outlet names, site names, and weather information, that is, words that appear frequently but without significance in most articles.

Here, 'meaningless' does not refer to the meaning of the morpheme itself. It means that the morpheme is unrelated to cyber target attacks and is present in every (or nearly every) article as part of the article's basic format.
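
The blacklist filtering described above can be sketched as follows. The sketch assumes that morpheme analysis has already been performed by some analyzer and that its output is a plain list of strings; the function name `set_issue_keywords` and the sample blacklist entries are illustrative only.

```python
def set_issue_keywords(morphemes, blacklist):
    """Drop blacklisted morphemes and keep the rest as issue keywords.

    Repeated occurrences are kept on purpose so that a later step can
    still count how often each keyword appears.
    """
    blocked = set(blacklist)
    return [m for m in morphemes if m not in blocked]


# Illustrative data: boilerplate words that appear in most articles.
blacklist = ["reporter", "newsroom", "weather"]
morphemes = ["election", "reporter", "hacking", "weather", "election"]
issue_keywords = set_issue_keywords(morphemes, blacklist)
```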

The issue keywords set in this way are preferably stored and managed as 'collected keyword list data' in the database unit 600.

The keyword analysis unit 400 preferably selects social issue keywords by performing an association analysis among the issue keywords collected by the keyword collection unit 300.

To this end, as shown in FIG. 1, the keyword analysis unit 400 may include a keyword grouping unit 410 and a social issue selection unit 420.

The keyword analysis unit 400 can derive social issues by using the issue keywords collected by the keyword collection unit 300 to analyze the associations among the issue keywords of the corresponding period. Because the text collection unit 310 collects text materials period by period, the 'corresponding period' here means the period over which the text collection unit 310 collected the text materials.

As shown in FIG. 5, in order to perform the association analysis, the keyword analysis unit 400 preferably groups the issue keywords according to preset criteria through the keyword grouping unit 410.

The preset criteria are preferably the broad categories commonly used for news articles, such as economy, politics, society, culture, and entertainment.

The social issue selection unit 420 preferably performs an association analysis among the issue keywords within each group and selects social issue keywords based on keyword occurrence frequency. In other words, when a morpheme extracted from the text materials collected within a specific period occurs with high frequency, that morpheme is preferably selected as a social issue keyword.

A high occurrence frequency means that the keyword was mentioned often in articles, so it is very likely to be a social issue keyword and is preferably selected as one.
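
A frequency-based selection within each category group could be sketched as below. The parameters `min_count` and `top_n`, and the sample groups, are assumptions for illustration; the description only states that selection is based on keyword occurrence frequency and gives no concrete cutoff.

```python
from collections import Counter


def select_social_issue_keywords(grouped_keywords, min_count=3, top_n=5):
    """Pick the most frequent issue keywords in each category group.

    grouped_keywords -- dict mapping a category name to a list of issue
                        keywords, one entry per occurrence
    min_count, top_n -- hypothetical cutoffs, not taken from the patent
    """
    selected = {}
    for group, keywords in grouped_keywords.items():
        counts = Counter(keywords)
        selected[group] = [
            keyword
            for keyword, count in counts.most_common(top_n)
            if count >= min_count
        ]
    return selected


# Illustrative groups following the article categories named in the text.
grouped = {
    "politics": ["election"] * 4 + ["debate"] * 2,
    "economy": ["tariff"] * 3 + ["stock"],
}
social_issue_keywords = select_social_issue_keywords(grouped)
```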

The social issue selection unit 420 stores and manages the selected social issue keywords as 'collected keyword analysis data' in the database unit 600. Preferably, it does not store the social issue keywords alone but also records the specific period in which each social issue keyword occurred.

As a result, when a desired period is later entered in order to look up social issue keywords, the social issue keywords of that period can be retrieved.
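
Storing each keyword set together with its period makes the later lookup a simple interval query. The in-memory class below is only a sketch standing in for the database unit 600; the class name `SocialIssueStore` and its methods are hypothetical.

```python
from datetime import date


class SocialIssueStore:
    """In-memory stand-in for period-indexed social issue keyword storage."""

    def __init__(self):
        self._rows = []  # list of (start, end, keywords)

    def add(self, start, end, keywords):
        self._rows.append((start, end, list(keywords)))

    def lookup(self, start, end):
        """Return keywords whose stored period overlaps [start, end]."""
        found = []
        for row_start, row_end, keywords in self._rows:
            if row_start <= end and start <= row_end:  # intervals overlap
                for keyword in keywords:
                    if keyword not in found:
                        found.append(keyword)
        return found


store = SocialIssueStore()
store.add(date(2019, 10, 1), date(2019, 10, 31), ["election"])
store.add(date(2019, 11, 1), date(2019, 11, 30), ["tariff", "election"])
november = store.lookup(date(2019, 11, 10), date(2019, 11, 20))
```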

The integrated analysis unit 500 determines whether a given cyber target attack is a social issue-type cyber target attack through a correlation analysis between the risk information calculated by the attack analysis unit 200 and the social issue keywords selected by the keyword analysis unit 400.

In addition, it preferably analyzes server operation data collected in real time and uses the Kalman filter algorithm to determine the risk that a social issue-type cyber target attack will occur on the corresponding server.

To this end, the integrated analysis unit 500 preferably includes a correlation analysis unit 510 and a risk prediction unit 520.

The integrated analysis unit 500 preferably uses the result data of the attack analysis unit 200 and the result data of the keyword analysis unit 400 to determine, through an association analysis module, whether an attack is a social issue-type cyber target attack.

That is, the correlation analysis unit 510 preferably determines whether a given cyber target attack is a social issue-type cyber target attack through a correlation analysis between the risk level for cyber target attack detection calculated by the attack analysis unit 200 and the social issue keywords selected by the keyword analysis unit 400.

Put differently, the attack analysis unit 200 calculates the risk level by analyzing the log information of all types of cyber target attacks. Therefore, to identify a 'social issue-type cyber target attack', it is preferable to analyze whether the social issue keywords of the corresponding period, that is, the period in which a specific cyber target attack occurred, are correlated with the group targeted by that attack. If the correlation is equal to or greater than a predetermined reference value, the specific cyber target attack is determined to be a 'social issue-type cyber target attack'.
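
The threshold decision above can be illustrated with a small sketch. The description does not define how the correlation is computed, so the overlap ratio used here, the descriptive terms attached to the target group, and the threshold value 0.5 are all assumptions chosen only to make the example concrete.

```python
def issue_correlation(issue_keywords, target_terms):
    """Share of the target group's descriptive terms found among the
    social issue keywords of the attack period (assumed measure)."""
    target = set(target_terms)
    if not target:
        return 0.0
    return len(set(issue_keywords) & target) / len(target)


def is_social_issue_attack(issue_keywords, target_terms, threshold=0.5):
    """Classify the attack as social issue-type when the correlation
    reaches the (hypothetical) reference value."""
    return issue_correlation(issue_keywords, target_terms) >= threshold


# Illustrative data: keywords of the attack period and terms describing
# the organization group that was attacked.
period_keywords = ["election", "ballot", "tariff"]
target_terms = ["election", "ballot", "commission", "vote"]
flagged = is_social_issue_attack(period_keywords, target_terms)
```

Any other similarity measure could be substituted for `issue_correlation`; only the comparison against a reference value is taken from the text.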

In a social issue-type cyber target attack, the target group can be identified relatively precisely from the social issue. Therefore, if past social issue-type cyber target attacks have been analyzed and countermeasures prepared, the relevant target group can take preemptive action when the same social issue arises again. Because it is practically impossible to prepare defenses against every cyber target attack, it is preferable to begin by analyzing social issue-type cyber target attacks, for which defenses can be prepared most effectively, and to respond to them preemptively.

The risk prediction unit 520 analyzes the operation data of a specific server received in real time and predicts the risk that a social issue-type cyber target attack will occur on that server. As shown in FIG. 6, the predicted occurrence risk is preferably corrected using the Kalman filter algorithm.

In detail, the risk prediction unit 520 predicts the risk of a social issue-type cyber target attack on a specific server from two inputs: the operation data of that server, including the security event logs and web traffic information received in real time from security devices (IPS, IDS, web firewall, etc.) installed at a specific website, and the social issue-type cyber target attack information determined by the correlation analysis unit 510. The predicted occurrence risk is then corrected using the Kalman filter algorithm.

Here, the Kalman filter algorithm is an algorithm that predicts and corrects through repeated calculation using the existing value and the newly input (current) value. It is preferably used to correct the predicted risk of a social issue-type cyber target attack on a specific website.
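
The predict-and-correct cycle can be illustrated with a one-dimensional Kalman filter. The sketch assumes a scalar risk value that follows a random-walk model; the class name `ScalarKalman`, the process noise `q`, and the measurement noise `r` are illustrative assumptions, since the description does not specify the state model or noise parameters.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model."""

    def __init__(self, x0, p0, q, r):
        self.x = x0  # current risk estimate (the "existing value")
        self.p = p0  # variance of that estimate
        self.q = q   # process noise variance (assumed)
        self.r = r   # measurement noise variance (assumed)

    def update(self, z):
        """Fold a newly observed risk value z into the estimate."""
        # Predict: the state is assumed unchanged, uncertainty grows by q.
        x_pred = self.x
        p_pred = self.p + self.q
        # Correct: blend prediction and measurement using the Kalman gain.
        gain = p_pred / (p_pred + self.r)
        self.x = x_pred + gain * (z - x_pred)
        self.p = (1.0 - gain) * p_pred
        return self.x


kf = ScalarKalman(x0=0.0, p0=1.0, q=0.0, r=1.0)
first = kf.update(1.0)
second = kf.update(1.0)
```

A larger `r` makes the filter trust new observations less and smooths out short spikes in the raw risk value, while a larger `q` lets the estimate follow changes more quickly.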

The predicted risk of a social issue-type cyber target attack on a specific website is preferably stored and managed as 'social issue-type attack analysis data' in the database unit 600.

FIG. 7 is a flowchart showing a cyber target attack detection method using a Kalman filter algorithm according to an embodiment of the present invention. The method is described in detail below with reference to FIG. 7.

As shown in FIG. 7, the cyber target attack detection method using a Kalman filter algorithm according to an embodiment of the present invention preferably comprises an attack analysis step (S100), a social issue analysis step (S200), and an integrated analysis step (S300).

Each step is described in detail below.

In the attack analysis step (S100), information related to cyber target attacks that have already occurred is collected, big data is built from the collected information, and the risk level of cyber target attacks is calculated from it.

In detail, the attack analysis step (S100) preferably comprises a big data construction step (S110), a structuring step (S120), a statistical analysis step (S130), a learning step (S140), and a risk calculation step (S150).

Detecting cyber target attacks that are occurring now, or that may occur, requires an analysis of the cyber target attacks that occurred in the past.

Accordingly, in the big data construction step (S110), the attack collection unit 100 collects the attack logs of past cyber target attacks. Preferably, it receives and collects log information on all types of cyber target attacks, not just one particular type, and builds big data from it.

To this end, the various servers that hold log information related to cyber target attacks (cyber security monitoring agencies, security monitoring centers, etc.) are preferably registered and managed, and the log information on all types of cyber target attacks that they hold is received from them.

In the structuring step (S120), the structuring unit 210 preferably structures the big data according to preset criteria. Specifically, structuring means converting various data, such as characters, strings, and integers, into numerical form so that data analysis can be performed.
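
As an illustration of converting mixed log fields into numerical form, a sketch might look like this. The record layout (keys such as `src_ip` and `type`), the list of attack types, and the chosen encodings are all assumptions for the example; the description does not fix a log schema or an encoding.

```python
import ipaddress
from datetime import datetime


def structure_record(record, attack_types):
    """Convert one attack log record (all strings) into a list of numbers."""
    when = datetime.fromisoformat(record["time"])
    return [
        int(ipaddress.ip_address(record["src_ip"])),  # source IP as integer
        int(record["src_port"]),
        int(ipaddress.ip_address(record["dst_ip"])),  # destination IP as integer
        int(record["dst_port"]),
        when.hour,                                    # attack time slot
        when.weekday(),                               # 0 = Monday
        attack_types.index(record["type"]),           # category as index
    ]


# Hypothetical log record and attack type list.
attack_types = ["ddos", "sqli"]
record = {
    "time": "2019-11-04T13:30:00",
    "src_ip": "10.0.0.1",
    "src_port": "4444",
    "dst_ip": "192.168.0.10",
    "dst_port": "443",
    "type": "sqli",
}
row = structure_record(record, attack_types)
```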

In the statistical analysis step (S130), the statistical analysis unit 220 performs preset statistical analyses on the structured data. In detail, various statistical analyses, such as preset frequency analysis and regression analysis, are preferably performed using the attack date, attack type, attack target, and the like as input variables.
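
A frequency analysis and a simple regression over attack records could be sketched as follows. The helper names, the record keys, and the use of an ordinary least-squares trend over evenly spaced periods are assumptions for illustration; the description names the analysis types but not their concrete form.

```python
from collections import Counter


def frequency_by(records, key):
    """Count how often each value of `key` occurs in the attack records."""
    return Counter(record[key] for record in records)


def linear_trend(counts):
    """Least-squares slope of counts observed at periods 0, 1, 2, ..."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    return sxy / sxx


# Illustrative records and per-period attack counts.
records = [{"type": "ddos"}, {"type": "sqli"}, {"type": "ddos"}]
by_type = frequency_by(records, "type")
slope = linear_trend([1, 3, 5, 7])
```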

In the learning step (S140), the learning unit 230 learns the structured data using deep learning. In the risk calculation step (S150), the risk calculation unit 240 calculates the risk level for cyber target attack detection according to the learning results of the learning step (S140).

In detail, the learning step (S140) learns the structured data using deep learning, so that the risk calculation step (S150) can assign a weight to each learning result and compute a weighted sum that gives the risk level used for cyber target attack detection. Examples of learning results include the main attack patterns by attack date and attack time slot, the attack patterns by attacker group (source IP information, source port information, etc.), the attack patterns by main victim organization (destination IP information, destination port information, etc.), and the attack patterns appearing in the payload.

Preferably, the learning step (S140) learns not only the structured data but also the result data of the statistical analysis step (S130).

The risk level for cyber target attack detection calculated in the risk calculation step (S150) serves as the criterion for detecting a cyber target attack. Using the high-frequency behavior information observed in past cyber target attacks (payload, IP address, targeted organization, etc.), the patterns that appear before a cyber target attack are learned, and the resulting information is preferably stored and managed as 'cyber target attack analysis result data' in the database unit 600.

In other words, the risk level calculated by the risk calculation unit 240 is obtained by learning the characteristic patterns that appear before (for example, immediately before) a cyber target attack occurs. If a pattern similar or identical to a learned pattern is later recognized in server operation data analyzed in real time, it can be determined that there is a risk of a cyber target attack occurring.

In the social issue analysis step (S200), issue keywords are preferably collected by analyzing the morphemes contained in various text materials on the Internet, and social issue keywords are selected from among them through an association analysis among the collected issue keywords.

To this end, the social issue analysis step (S200) preferably comprises a text collection step (S210), an issue keyword setting step (S220), a keyword grouping step (S230), and a social issue keyword selection step (S240).

In the text collection step (S210), the text collection unit 310 preferably receives, from the outside, information on the period to be collected and collects various text materials on the Internet for that period.

That is, news sites and SNS on the Internet are used to collect the various keywords that appeared during a preset period.

In detail, all articles uploaded through portal sites (search sites), the sites of various broadcasters, the SNS accounts of those broadcasters, and the like are preferably collected as text materials.

Accordingly, when an external administrator (user) enters a desired period, a crawling module is used to collect the text materials contained in all articles that fall within that period.

In the issue keyword setting step (S220), the issue keyword setting unit 320 preferably sets issue keywords through morpheme analysis of the text materials collected in the text collection step (S210).

Here, the issue keyword setting step (S220) extracts keywords by performing morpheme analysis on the collected text materials. If a keyword is meaningless or appears routinely, it is preferably registered in the blacklist and excluded, and the remaining morphemes are set as issue keywords.

In the keyword grouping step (S230), the keyword grouping unit 410 groups the issue keywords that have been set, according to preset criteria.

In detail, the keyword grouping step (S230) preferably groups the issue keywords according to the preset criteria so that the association analysis can be performed.

In the social issue keyword selection step (S240), the social issue selection unit 420 performs an association analysis among the issue keywords within each group formed in the keyword grouping step (S230) and selects social issue keywords based on keyword occurrence frequency. In other words, when a morpheme extracted from the text materials collected within a specific period occurs with high frequency, that morpheme is preferably selected as a social issue keyword.

That is, social issues can be derived by using the collected issue keywords to analyze the associations among the issue keywords of the corresponding period. Because text materials are collected period by period in the text collection step (S210), the 'corresponding period' here means the period over which the text materials were collected in the text collection step (S210).

In the integrated analysis step (S300), whether a cyber target attack that has already occurred is a social issue-type cyber target attack is determined through a correlation analysis between the risk information calculated in the attack analysis step (S100) and the social issue keywords selected in the social issue analysis step (S200).

In addition, server operation data collected in real time can be analyzed, and the Kalman filter algorithm can be used to determine the risk that a social issue-type cyber target attack will occur at the corresponding site.

To this end, the integrated analysis step (S300) comprises a correlation analysis step (S310) and a risk prediction step (S320).

In the correlation analysis step (S310), the correlation analysis unit 510 preferably determines whether a given cyber target attack is a social issue-type cyber target attack through a correlation analysis between the calculated risk level for cyber target attack detection and the selected social issue keywords.

Put differently, the attack analysis step (S100) calculates the risk level by analyzing the log information of all types of cyber target attacks. Therefore, to identify a 'social issue-type cyber target attack', it is preferable to analyze whether the social issue keywords of the corresponding period, that is, the period in which a specific cyber target attack occurred, are correlated with the group targeted by that attack. If the correlation is equal to or greater than a predetermined reference value, the specific cyber target attack is determined to be a 'social issue-type cyber target attack'.

In the risk prediction step (S320), the risk prediction unit 520 analyzes the operation data of a specific server received in real time and predicts the risk that a social issue-type cyber target attack will occur on that server. The predicted occurrence risk is preferably corrected using the Kalman filter algorithm.

In detail, the risk prediction unit 520 analyzes the operation data of a specific server, including the security event logs and web traffic information received in real time from security devices (IPS, IDS, web firewall, etc.) installed at a specific website, to determine the attack occurrence risk of that server. Using this risk together with the social issue-type cyber target attack information determined by the correlation analysis unit 510, it predicts the risk of a social issue-type cyber target attack on the specific website and corrects the predicted occurrence risk using the Kalman filter algorithm.

As described above, the cyber target attack detection system and detection method using a Kalman filter algorithm according to an embodiment of the present invention have the advantage that an administrator (operator, etc.) who runs a server can use the data stored and managed in the database unit 600 to devise countermeasures (preventive measures) that minimize the damage caused by cyber target attacks tied to social issues (social issue-type cyber target attacks).

In addition, when a social issue associated with a past social issue-type cyber target attack arises again, that is, when a social issue keyword associated with a social issue-type cyber target attack resurfaces in a given period, the security of the servers, websites, and the like operated by the closely related target group can be strengthened and continuously monitored, so that cyber target attacks can be prevented in advance and appropriate action taken.

Although the present invention has been described above with reference to specific matters, such as particular components, and to limited embodiments and drawings, these are provided only to aid overall understanding of the invention. The present invention is not limited to the embodiment described above, and those of ordinary skill in the art to which the invention pertains can make various modifications and variations based on this description.

Accordingly, the spirit of the present invention should not be confined to the described embodiment. The claims set out below, and everything that is equivalent to or an equivalent modification of those claims, fall within the scope of the spirit of the invention.

100: attack collection unit
200: attack analysis unit
210: structuring unit 220: statistical analysis unit
230: learning unit 240: risk calculation unit
300: keyword collection unit
310: text collection unit 320: issue keyword setting unit
400: keyword analysis unit
410: keyword grouping unit 420: social issue selection unit
500: integrated analysis unit
510: correlation analysis unit 520: risk prediction unit
600: database unit

Claims

An attack collection unit 100 that collects related information on a previously generated cyber target attack;
An attack analysis unit 200 for calculating the risk of a corresponding cyber target attack by analyzing information constituting big data in the attack collection unit 100;
A keyword collection unit 300 that collects issue keywords by analyzing morphemes included in various text materials on the Internet;
A keyword analysis unit 400 for selecting a social issue keyword by performing a correlation analysis between the issue keywords collected by the keyword collection unit 300; And
An integrated analysis unit 500 that, using the social issue keywords selected by the keyword analysis unit 400, analyzes the correlation between the social issue keywords at the past time when a specific cyber target attack occurred and the target server on which that specific cyber target attack was made and, when the correlation is equal to or greater than a predetermined reference value, determines that specific cyber target attack to be a social issue-type cyber target attack,
and that analyzes the determined social issue-type cyber target attack information, the cyber target attack risk information calculated by the attack analysis unit 200, and server operation data collected in real time, and detects a cyber target attack using the attack occurrence risk of the corresponding server predicted with the Kalman filter algorithm;
It comprises,
The keyword collection unit 300
Issue that sets the issue keyword by receiving the period information to be collected from the outside and collecting various text data on the Internet for the relevant period, and through morpheme analysis on the collected text data The keyword setting unit 320 further includes,
The issue keyword setting unit 320
A cyber target attack detection system using a Kalman filter algorithm characterized in that the remaining morphemes are set as the issue keywords, except for the preset blacklist morphemes among the analyzed morphemes.
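
As an illustration only, the blacklist exclusion performed by the issue keyword setting unit 320 might be sketched as follows. The function name, the sample morphemes, and the blacklist entries are hypothetical, and the morpheme analysis itself is assumed to have been done by an external analyzer that the claim does not specify.

```python
def set_issue_keywords(morphemes, blacklist):
    """Keep every analyzed morpheme that is not on the preset blacklist."""
    blocked = set(blacklist)  # set lookup keeps the filter linear in the input size
    return [m for m in morphemes if m not in blocked]


# Hypothetical morphemes, as if already produced by a morpheme analyzer.
morphemes = ["election", "the", "hacking", "of", "election", "ransomware"]
blacklist = ["the", "of"]

print(set_issue_keywords(morphemes, blacklist))
```

Repeated morphemes are kept on purpose in this sketch, so that a later frequency-based selection still has occurrence counts to work with.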

The system according to claim 1,
wherein the attack collection unit 100
receives, from the outside, log information on cyber target attacks that have already occurred, collects it, and builds it into big data.

The system according to claim 2,
wherein the attack analysis unit 200 further includes:
a shaping unit 210 that shapes the big data into structured data according to a predetermined criterion;
a statistical analysis unit 220 that performs a predetermined statistical analysis on the structured data;
a learning unit 230 that performs learning on the structured data using deep learning; and
a risk calculation unit 240 that calculates the risk used for detecting a cyber target attack according to the learning result of the learning unit 230.
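
The shaping and statistical analysis performed by units 210 and 220 could look roughly like the sketch below. The comma-separated log format, the field names, and the choice of statistic (share of each attack type) are all assumptions made for illustration, since the claim leaves the "predetermined criterion" and the "predetermined statistical analysis" open. The deep learning step of the learning unit 230 is not shown.

```python
from collections import Counter


def shape_log_line(line):
    """Shape one raw 'timestamp,attack_type,target' line into a record (assumed format)."""
    timestamp, attack_type, target = (field.strip() for field in line.split(","))
    return {
        "timestamp": timestamp,
        "attack_type": attack_type.lower(),  # normalize case so 'DDoS' and 'ddos' match
        "target": target,
    }


def attack_type_shares(records):
    """Return the fraction of records belonging to each attack type."""
    counts = Counter(record["attack_type"] for record in records)
    total = sum(counts.values())
    return {attack_type: count / total for attack_type, count in counts.items()}


# Hypothetical raw attack logs.
raw_logs = [
    "2019-03-01T10:00, DDoS, web-01",
    "2019-03-01T11:30, phishing, mail-01",
    "2019-03-02T09:15, ddos, web-02",
    "2019-03-02T09:20, DDoS, web-01",
]

records = [shape_log_line(line) for line in raw_logs]
print(attack_type_shares(records))
```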

(deleted)

The system according to claim 1,
wherein the keyword analysis unit 400 further includes:
a keyword grouping unit 410 that groups the issue keywords set by the issue keyword setting unit 320 according to a preset criterion; and
a social issue selection unit 420 that performs an association analysis between the issue keywords in each group and selects social issue keywords based on the frequency of keyword occurrence.
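
One possible reading of the grouping and selection performed by units 410 and 420 is sketched below. It assumes that the grouping criterion is the collection date and that "association analysis" means counting how often two keywords appear in the same document. Neither assumption comes from the claim, and all names and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def group_keywords(documents, key):
    """Group per-document keyword lists by a caller-supplied criterion."""
    groups = {}
    for doc in documents:
        groups.setdefault(key(doc), []).append(doc["keywords"])
    return groups


def select_social_issues(keyword_lists, min_count):
    """Select keywords that belong to a pair co-occurring in at least min_count documents."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        # sorted(set(...)) gives each unordered pair one canonical form per document
        for pair in combinations(sorted(set(keywords)), 2):
            pair_counts[pair] += 1
    selected = set()
    for pair, count in pair_counts.items():
        if count >= min_count:
            selected.update(pair)
    return sorted(selected)


# Hypothetical issue keywords per collected document.
documents = [
    {"date": "2019-03-01", "keywords": ["election", "hacking", "weather"]},
    {"date": "2019-03-01", "keywords": ["election", "hacking"]},
    {"date": "2019-03-01", "keywords": ["football", "weather"]},
    {"date": "2019-03-02", "keywords": ["football", "final"]},
]

groups = group_keywords(documents, key=lambda doc: doc["date"])
for date, keyword_lists in groups.items():
    print(date, select_social_issues(keyword_lists, min_count=2))
```

In this sample only "election" and "hacking" co-occur in two documents of the same group, so they are the only keywords selected.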

The system according to claim 5,
wherein the integrated analysis unit 500 further includes:
a correlation analysis unit 510 that determines whether a corresponding cyber target attack is a social issue-type cyber target attack by analyzing the correlation between the risk for detecting a cyber target attack calculated by the attack analysis unit 200 and the social issue keywords selected by the keyword analysis unit 400; and
a risk prediction unit 520 that analyzes operation data of a specific server received in real time, predicts the risk of occurrence of a social issue-type cyber target attack on that server, and corrects the predicted risk using the Kalman filter algorithm.
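
The correction applied by the risk prediction unit 520 can be illustrated with a scalar Kalman filter. The sketch below assumes a random-walk state model in which each raw predicted risk is treated as a noisy measurement of the true risk. The noise variances, the function name, and the sample values are hypothetical, and the claim does not specify the state model.

```python
def kalman_correct(predicted_risks, process_var=1e-3, measurement_var=0.05):
    """Smooth a sequence of raw risk predictions with a 1-D Kalman filter."""
    estimate = predicted_risks[0]  # start from the first raw prediction
    error_var = 1.0                # large initial uncertainty (assumed)
    corrected = []
    for measurement in predicted_risks:
        # Predict: under a random walk the estimate is unchanged, uncertainty grows.
        error_var += process_var
        # Update: blend the estimate and the new measurement by the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate = estimate + gain * (measurement - estimate)
        error_var = (1.0 - gain) * error_var
        corrected.append(estimate)
    return corrected


# Hypothetical raw risk predictions with one isolated spike.
raw = [0.2, 0.2, 0.9, 0.2]
print(kalman_correct(raw))
```

With these settings the isolated spike is damped rather than passed through unchanged, which is the intended effect of correcting a noisy prediction. A larger `measurement_var` damps it more strongly.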

A cyber target attack detection method using a Kalman filter algorithm, consisting of:
an attack analysis step (S100) of collecting related information on previously generated cyber target attacks, building big data from the collected information, and calculating the risk of a cyber target attack from it;
a social issue analysis step (S200) of collecting issue keywords by analyzing morphemes contained in various text data on the Internet, and selecting social issue keywords from among the issue keywords through an association analysis between the collected issue keywords; and
an integrated analysis step (S300) of using the social issue keywords selected in the social issue analysis step (S200) to analyze the correlation between the social issue keywords at the past time when a specific cyber target attack occurred and the target server on which that attack occurred, determining the specific cyber target attack to be a social issue-type cyber target attack when the correlation is equal to or greater than a predetermined reference value, analyzing the determined social issue-type cyber target attack information, the risk information of the cyber target attack calculated in the attack analysis step (S100), and server operation data collected in real time, and determining the occurrence risk of a social issue-type cyber target attack on the corresponding server as predicted using the Kalman filter algorithm,
wherein the social issue analysis step (S200) consists of:
a step (S210) of receiving, from the outside, period information for the collection and collecting various text data on the Internet for that period; an issue keyword setting step (S220) of setting issue keywords through morpheme analysis of the collected text data; a keyword grouping step (S230) of grouping the set issue keywords according to a predetermined criterion; and a social issue keyword selection step (S240) of performing an association analysis between the issue keywords in each group and selecting social issue keywords based on the frequency of keyword occurrence.
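
The determination made in the integrated analysis step (S300), in which an attack is classified as social issue-type when a correlation reaches a reference value, might be sketched as follows. The sketch assumes that the correlation is a Pearson coefficient between the daily frequency of a social issue keyword and the daily number of attacks on the target server. The claim does not name the correlation measure, and the reference value of 0.7 and all sample data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def is_social_issue_attack(keyword_freq, attack_counts, reference=0.7):
    """Classify as social issue-type when the correlation reaches the reference value."""
    return pearson(keyword_freq, attack_counts) >= reference


# Hypothetical daily keyword frequency and daily attack counts on one server.
keyword_freq = [1, 2, 8, 9, 3]
attack_counts = [0, 1, 5, 6, 1]

print(is_social_issue_attack(keyword_freq, attack_counts))
```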

The method according to claim 7,
wherein the attack analysis step (S100) consists of:
a big data configuration step (S110) of receiving, from the outside, log information on cyber target attacks that have already occurred, collecting it, and building it into big data;
a shaping step (S120) of shaping the big data into structured data according to a predetermined criterion;
a statistical analysis step (S130) of performing a predetermined statistical analysis on the structured data;
a learning step (S140) of performing learning on the structured data using deep learning; and
a risk calculation step (S150) of calculating the risk used for detecting a cyber target attack according to the learning result.

(deleted)

The method according to claim 7,
wherein the issue keyword setting step (S220)
sets as the issue keywords the morphemes that remain after the preset blacklist morphemes are excluded from the analyzed morphemes.

The method according to claim 7,
wherein the integrated analysis step (S300) consists of:
a correlation analysis step (S310) of determining whether a previously generated cyber target attack is a social issue-type cyber target attack through a correlation analysis between the risk information for detecting a cyber target attack calculated in the attack analysis step (S100) and the social issue keywords selected in the social issue analysis step (S200); and
a risk prediction step (S320) of analyzing the operation data of a specific server received in real time and predicting the risk of occurrence of a social issue-type cyber target attack on that server,
and wherein the risk prediction step (S320)
corrects the predicted risk of occurrence using the Kalman filter algorithm.