KR101893090B1

KR101893090B1 - Vulnerability information management method and apparatus thereof

Info

Publication number: KR101893090B1
Application number: KR1020170152295A
Authority: KR
Inventors: 김환국; 김태은; 장대일; 유창훈; 손영남; 고은혜; 나사랑
Original assignee: 한국인터넷진흥원
Priority date: 2017-11-15
Filing date: 2017-11-15
Publication date: 2018-08-29

Abstract

A method and apparatus for managing vulnerability information are provided. A vulnerability information management method according to an embodiment of the present invention includes a step of obtaining a vulnerability table including first vulnerability information by parsing and classifying data obtained from a vulnerability information source; a step of extracting a text associated with the first vulnerability information from the vulnerability table; a step of generating first vulnerability basic information from the text; and a step of storing the first vulnerability basic information in the vulnerability table.

Description

VULNERABILITY INFORMATION MANAGEMENT METHOD AND APPARATUS THEREOF

The present invention relates to a method and apparatus for managing vulnerability information. More particularly, it relates to a method and apparatus that generate additional information from stored vulnerability information so that the vulnerability information can be used more effectively.

The content described in this section merely provides background information on the present embodiments and does not describe known art.

Security vulnerabilities inherent in software can easily be exploited to attack computer systems. Attackers can use Internet scanning tools to identify poorly secured web services and carry out malicious actions against them. Security administrators therefore need to identify publicly disclosed vulnerabilities and respond to them quickly. In particular, as Internet of Things (IoT) devices have become widespread, the number of devices connected to the Internet has increased rapidly. It is therefore necessary to quickly identify the security vulnerabilities of the many computer systems connected to the Internet and to perform vulnerability analysis. Vulnerability analysis refers to determining a response by identifying and analyzing vulnerabilities in order to prevent security incidents caused by those vulnerabilities.

Vulnerability information is provided by several vulnerability information sources so that known security vulnerabilities can be shared easily. For example, the National Vulnerability Database (NVD) provides Common Vulnerabilities and Exposures (CVE) information. CVE information provides a way to reference security vulnerability information for software packages. CVE information includes a vulnerability identifier (Common Vulnerabilities and Exposures Identifier; CVE-ID), a vulnerability overview (Overview), a vulnerability score (Common Vulnerability Scoring System; CVSS), Common Platform Enumeration (CPE) information, and a vulnerability type (Common Weakness Enumeration; CWE). (See http://nvd.nist.gov/)

Vulnerability information is also published at http://vuldb.com/ (VulDB) and http://www.securityfocus.com/bid/ (Bugtraq). In addition, manufacturers of Internet-connected devices publish device firmware version information, security patch information, and the like in various forms on their web pages (see http://iptime.com/iptime/?page_id=126 and http://netiskorea.com/atboard.php?grp1=support&grp2=download).

Therefore, so that vulnerability information provided by various vulnerability information sources can be identified and analyzed quickly, that information needs to be collected and shared in an integrated manner. However, the kinds of information included in the vulnerability information often differ from one source to another, as does the form in which the information is stored. For example, some vulnerability information sources provide vulnerability information that includes Common Platform Enumeration (CPE) information indicating the name of the affected product, whereas other sources do not include CPE information. Accordingly, the vulnerability information needs to be processed before it can be identified and analyzed quickly across sources.

The technical problem to be solved by the present invention is to provide a method and an apparatus for managing vulnerability information that collect vulnerability information from various vulnerability information sources and generate additional information from the collected vulnerability information, so that more useful vulnerability information can be shared and analyzed.

The technical problems of the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the following description.

To solve the above technical problem, a method by which a computing device manages vulnerability information according to an embodiment of the present invention may include: obtaining a vulnerability table including first vulnerability information by parsing and classifying data obtained from a vulnerability information source; extracting text associated with the first vulnerability information from the vulnerability table; generating first vulnerability basic information from the text; and storing the first vulnerability basic information in the vulnerability table.

According to another embodiment, the first vulnerability basic information may include a title, keywords, a corpus, and a summary of the first vulnerability information.

According to yet another embodiment, generating the first vulnerability basic information may include extracting a representative sentence or representative word from the text, and generating a title composed of the representative sentence or representative word.

According to yet another embodiment, extracting the representative sentence or representative word may include returning, as the representative sentence or representative word, a sentence or word output by inputting the text into the TextRank algorithm.
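The ranking step described above can be illustrated with a minimal sketch of sentence-level TextRank. This is not the code of the embodiment: the function names, the sentence splitter, and the parameter values (damping factor 0.85, 50 iterations) are assumptions chosen for illustration, and the similarity measure is the word-overlap measure commonly associated with TextRank.

```python
import math
import re


def split_sentences(text):
    # Naive splitter on sentence-ending punctuation (illustrative assumption).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def overlap_similarity(a, b):
    # Shared words normalised by the log of each sentence length.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def textrank_sentences(text, damping=0.85, iterations=50):
    """Return the sentences of `text`, most representative first."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [
        [overlap_similarity(tokens[i], tokens[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Each sentence receives score from its neighbours in proportion
        # to the edge weight between them.
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in ranked]
```

In this sketch the first element of the returned list would serve as the representative sentence; ranking words instead of sentences would use the same iteration over a word co-occurrence graph.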

According to yet another embodiment, generating the title may include determining, from the representative sentence or representative word, words corresponding to a manufacturer name, a product name, and a vulnerability type, and generating, based on the determined words, a title composed of the manufacturer name, the product name, and the vulnerability type.

According to yet another embodiment, generating the first vulnerability basic information may include returning, as the keywords, words output by inputting the text into the TextRank algorithm and the Word2Vec algorithm.

According to yet another embodiment, generating the first vulnerability basic information may include generating clusters of words with similar meanings by applying a topic model to the text and the keywords, and generating the corpus based on the word clusters.

According to yet another embodiment, generating the first vulnerability basic information may include generating combined information by merging the keyword data and the corpus data and removing duplicate information.

According to yet another embodiment, generating the first vulnerability basic information may include generating the summary by condensing the text.

According to yet another embodiment, the vulnerability information management method may further include obtaining a Common Platform Enumeration (CPE) dictionary through a network and determining, based on the CPE dictionary, a CPE name corresponding to the first vulnerability basic information, and the storing may further include storing the CPE name in the vulnerability table.

According to yet another embodiment, determining the CPE name may include performing CPE tree-based keyword analysis on the CPE dictionary, using the first vulnerability basic information as keywords, and determining the result with the highest matching rate as the CPE name.
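As a rough illustration of matching keywords against a CPE dictionary, the sketch below arranges CPE 2.3 formatted names into a vendor/product tree and picks the entry whose product tokens are best covered by the keywords. The tree layout, the matching-rate definition, the function names, and the vendor names in the usage example are all assumptions made for illustration; the text above does not specify how the tree-based analysis is carried out.

```python
def build_cpe_tree(cpe_names):
    """Group CPE 2.3 formatted names as tree[vendor][product] -> [names].

    Assumes the form cpe:2.3:part:vendor:product:version:... and ignores
    escaped colons for simplicity.
    """
    tree = {}
    for name in cpe_names:
        parts = name.split(":")
        vendor, product = parts[3], parts[4]
        tree.setdefault(vendor, {}).setdefault(product, []).append(name)
    return tree


def match_cpe(keywords, tree):
    """Return (cpe_name, matching_rate) for the best-matching entry.

    The matching rate here is the fraction of product tokens that appear
    among the keywords, considered only under vendors named in the keywords.
    """
    kw = {k.lower() for k in keywords}
    best_name, best_rate = None, 0.0
    for vendor, products in tree.items():
        if vendor not in kw:
            continue
        for product, names in products.items():
            product_tokens = set(product.split("_"))
            rate = len(product_tokens & kw) / len(product_tokens)
            if rate > best_rate:
                best_name, best_rate = names[0], rate
    return best_name, best_rate
```

For example, with a dictionary containing a hypothetical `cpe:2.3:h:examplecorp:wifi_router:-:*:*:*:*:*:*:*` entry, the keywords `examplecorp`, `wifi`, and `router` would select that entry with a matching rate of 1.0.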

According to yet another embodiment, the vulnerability information management method may further include generating a vulnerability information classification model by learning from vulnerability information training data, generating first vulnerability type information by inputting the vulnerability basic information into the vulnerability information classification model, and storing the first vulnerability type information in the vulnerability table.

According to yet another embodiment, the vulnerability information training data may include summary information of known vulnerability information and second vulnerability type information, both obtained by parsing a file containing the known vulnerability information.

According to yet another embodiment, the vulnerability information classification model may output the first vulnerability type information in the form of a Common Weakness Enumeration (CWE) code when the vulnerability basic information is input.
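The train-then-classify flow described above can be sketched with a small multinomial naive Bayes classifier. The choice of naive Bayes, the whitespace tokenizer, and the function names are assumptions for illustration only; the text above does not state which learning algorithm the classification model uses.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Learn word counts per CWE label from (summary_text, cwe_code) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in samples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the CWE code with the highest log-probability for `text`."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Here the training pairs play the role of the summary information and second vulnerability type information, and the returned label plays the role of the first vulnerability type information.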

According to yet another embodiment, the vulnerability information management method may further include calculating the similarity, to the first vulnerability basic information, of second vulnerability basic information generated from second vulnerability information in the vulnerability table, and, when the calculated similarity is the highest value or is greater than or equal to a threshold, storing the vulnerability ID of the second vulnerability information in the vulnerability table as a related vulnerability ID.
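The selection rule described above (highest similarity, or similarity at or above a threshold) can be sketched as follows. Jaccard similarity over keyword sets, the default threshold of 0.5, and the identifiers in the usage example are assumptions for illustration; the text above does not fix a particular similarity measure.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_ids(target_id, basic_info, threshold=0.5):
    """Return IDs whose basic information is most similar to the target.

    `basic_info` maps a vulnerability ID to its keyword set. An ID is kept
    when its similarity is the highest found or meets the threshold.
    """
    target = basic_info[target_id]
    scores = {
        vid: jaccard(target, keywords)
        for vid, keywords in basic_info.items()
        if vid != target_id
    }
    if not scores:
        return []
    best = max(scores, key=scores.get)
    return sorted(
        vid for vid, score in scores.items()
        if score >= threshold or vid == best
    )
```

The returned IDs would then be written to the related vulnerability ID field of the target entry.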

To solve the above technical problem, a vulnerability information management apparatus according to an embodiment of the present invention may include: a storage unit that stores a vulnerability table including first vulnerability information obtained by parsing and classifying data obtained from a vulnerability information source; an information extraction unit that extracts text associated with the first vulnerability information from the vulnerability table; and a vulnerability information management unit that generates first vulnerability basic information from the text and stores the first vulnerability basic information in the vulnerability table.

To solve the above technical problem, a computer program recorded on a non-transitory computer-readable medium according to an embodiment of the present invention may be characterized in that, when instructions of the computer program are executed by a processor of a server, operations are performed that include: obtaining a vulnerability table including first vulnerability information by parsing and classifying data obtained from a vulnerability information source; extracting text associated with the first vulnerability information from the vulnerability table; generating first vulnerability basic information from the text; and storing the first vulnerability basic information in the vulnerability table.

To solve the above technical problem, a vulnerability information management system according to an embodiment of the present invention may include: a vulnerability information collection system that obtains first vulnerability information by parsing and classifying data obtained from a vulnerability information source; a storage device that stores a vulnerability table including the first vulnerability information; and a vulnerability information management apparatus that extracts text associated with the first vulnerability information from the vulnerability table, generates first vulnerability basic information from the text, and stores the first vulnerability basic information in the vulnerability table.

FIG. 1 is a diagram illustrating an example of the types of information included in vulnerability information provided by various vulnerability information sources.
FIG. 2 is a diagram illustrating the structure of a vulnerability information management apparatus according to an embodiment.
FIG. 3 is a diagram illustrating a process for managing vulnerability information according to an embodiment.
FIG. 4 is a diagram illustrating an example of code for generating a title and keywords according to an embodiment.
FIG. 5 is a diagram illustrating an example of code executed to generate a corpus according to an embodiment.
FIG. 6 is a diagram illustrating a process for generating a CPE name from vulnerability basic information according to an embodiment.
FIG. 7 is a diagram illustrating a process for generating a CWE code according to an embodiment.
FIG. 8 is a diagram illustrating an example of CVE information.
FIG. 9 is a diagram illustrating vulnerability types classified according to an embodiment.
FIG. 10 is a diagram illustrating a process for generating a vulnerability information classification model according to an embodiment.
FIG. 11 is a diagram illustrating a process for storing a related vulnerability ID according to an embodiment.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those skilled in the art to which the present invention belongs are fully informed of the scope of the invention; the present invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

Unless otherwise defined, all terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art to which the present invention belongs. Terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are clearly and specifically defined. The terminology used herein is for describing the embodiments and is not intended to limit the present invention. In this specification, the singular form also includes the plural form unless the context specifically states otherwise.

Throughout this specification, a vulnerability information source means the origin of vulnerability information. For example, a vulnerability information source may be a database that provides vulnerability information or a server that provides a web page, but the present invention is not limited thereto.

Also throughout this specification, a vulnerability table means data in which collected vulnerability information is stored in a structured form. In general, the vulnerability table may take the form of a table in which information is classified and stored in a plurality of fields, but it is not limited thereto.

A vulnerability means a threat that allows an unauthorized user to access a computer system, a threat that interferes with the normal service of a computer system, a threat of leakage, tampering, or deletion of important data managed by a computer system, and the like. Specifically, vulnerabilities include (i) system security vulnerabilities relating to race conditions, environment variables, accounts and passwords, access rights, system configuration, network configuration, buffer overflows, backdoors, and the like; (ii) network security vulnerabilities relating to unnecessary services and information disclosure, denial-of-service attacks, RPC, HTTP, SMTP, FTP, BIND, FINER, buffer overflows, and the like; and (iii) application vulnerabilities relating to web servers, firewall servers, IDS servers, database servers, source code, and the like.

Hereinafter, some embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a diagram illustrating an example of the types of information included in vulnerability information provided by various vulnerability information sources. FIG. 1 is only an example for describing the present invention, and the actual composition of vulnerability information may differ from that shown in FIG. 1.

Vulnerability information is published through various vulnerability information sources in order to share information about vulnerabilities in products connected to a network such as the Internet. Referring to FIG. 1, a field means a type of information into which vulnerability information is classified. A field may also mean a category of data field in which vulnerability information is classified and stored in the vulnerability table.

In FIG. 1, the vulnerability identifier means an ID by which vulnerability information can be identified. For example, the vulnerability identifier of vulnerability information provided by the NVD may be a CVE-ID such as 'CVE-2016-2222'. The title means a name that represents the vulnerability. In general, a title may be a single sentence or a combination of words, but it is not limited thereto. The overview means information that describes the vulnerability through text or the like. The product name means information indicating the product in which the vulnerability exists. For example, the product name may be a CPE name. A CPE name is information in a predetermined format from which the type, name, and so on of a device can be determined. The vulnerability type means information indicating the nature of the vulnerability, such as whether it relates to information leakage, to execution of source code from outside, to insertion of source code by an external attack, or to a buffer error. For example, the Common Vulnerabilities and Exposures (CVE) information provided by the NVD includes, as vulnerability type information, a Common Weakness Enumeration code (CWE code), which is a code that classifies the vulnerability. The vulnerability score means a value indicating how vulnerable a computer system is because of the vulnerability. For example, the CVE information provided by the NVD includes Common Vulnerability Scoring System (CVSS) information as the vulnerability score. The release means the date or time at which the vulnerability information was published. Remote/local means information on whether the vulnerability is a remote vulnerability or a local vulnerability. The solution means information on how to resolve the vulnerability. The reference means information that can be consulted in connection with the vulnerability.
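The fields described for FIG. 1 can be pictured as one record of the vulnerability table. The sketch below is illustrative only: the class name, attribute names, and types are assumptions, and fields that a given source does not supply are simply left empty.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class VulnerabilityRecord:
    """One entry of the vulnerability table (attribute names are hypothetical)."""
    identifier: str                      # e.g. a CVE-ID
    title: Optional[str] = None          # name representing the vulnerability
    overview: Optional[str] = None       # free-text description
    product_name: Optional[str] = None   # e.g. a CPE name
    vuln_type: Optional[str] = None      # e.g. a CWE code
    score: Optional[float] = None        # e.g. a CVSS score
    release: Optional[str] = None        # publication date or time
    remote_local: Optional[str] = None   # "remote" or "local"
    solution: Optional[str] = None       # remediation information
    references: List[str] = field(default_factory=list)


def missing_fields(record):
    """List the fields a source left empty, i.e. candidates for generation."""
    return [
        f.name for f in fields(record)
        if getattr(record, f.name) in (None, [])
    ]
```

A record built from a source that supplies only an identifier and an overview would report the remaining fields as missing, which is the situation the additional-information generation is meant to address.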

FIG. 2 is a diagram illustrating the structure of a vulnerability information management apparatus according to an embodiment.

The computing device 10 according to an embodiment may include a storage unit 210 and a vulnerability information management unit 220. The computing device 10 may further include a vulnerability information collection system 20. Alternatively, the vulnerability information collection system 20 may be provided as a device separate from the computing device 10.

According to an embodiment, the vulnerability information collection system 20 may collect vulnerability information from a plurality of vulnerability information sources 30 and generate the vulnerability table 1 based on the collected vulnerability information. The vulnerability information collection system 20 may extract vulnerability information by parsing data received from the vulnerability information sources 30. For example, when a vulnerability file containing vulnerability information has been downloaded from the vulnerability information sources 30, the vulnerability information collection system 20 may extract the vulnerability information contained in the file by parsing the downloaded file. As another example, the vulnerability information collection system 20 may extract vulnerability information by parsing the web language (for example, HyperText Markup Language (HTML)) source code of a web page provided by a server. The vulnerability information collection system 20 may also classify the vulnerability information and store it in the vulnerability table according to the classification result. For example, vulnerability information classified as a vulnerability identifier is stored in the item that holds vulnerability identifiers, and vulnerability information classified as a vulnerability type is stored in the item that holds vulnerability types. Each item may be implemented as a field of the vulnerability table, but it is not limited thereto.
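The parse-then-classify step described above can be sketched with Python's standard `html.parser` module. The page layout (label/value table rows), the `LABEL_TO_FIELD` mapping, and the field names are assumptions made for illustration; real vulnerability information sources each use their own markup and would need their own mapping.

```python
from html.parser import HTMLParser

# Hypothetical mapping from labels on a source page to vulnerability-table fields.
LABEL_TO_FIELD = {
    "id": "identifier",
    "description": "overview",
    "type": "vuln_type",
}


class CellCollector(HTMLParser):
    """Collect the text of every <th> and <td> cell in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._in_cell = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_cell:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._in_cell:
            self.cells.append("".join(self._buffer).strip())
            self._in_cell = False


def parse_vulnerability_page(html):
    """Parse label/value rows and classify each value into a table field."""
    collector = CellCollector()
    collector.feed(html)
    collector.close()
    record = {}
    # Cells alternate label, value, label, value, ...
    for label, value in zip(collector.cells[0::2], collector.cells[1::2]):
        field_name = LABEL_TO_FIELD.get(label.lower())
        if field_name:
            record[field_name] = value
    return record
```

Values whose labels are not in the mapping are dropped in this sketch; a fuller collector might keep them in a catch-all field instead.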

The computing device 10 may store the obtained vulnerability table 1 in the storage unit 210. The storage unit 210 may include a storage medium or system capable of storing data.

The vulnerability information management unit 220 may manage the vulnerability table 1 in the storage unit 210. The vulnerability information management unit 220 may generate vulnerability basic information based on the vulnerability information included in the vulnerability table 1. Here, vulnerability basic information means basic information that describes a vulnerability. According to an embodiment, the vulnerability basic information may include one or more of: a title, which is a name representing the vulnerability; keywords extracted from the vulnerability information; a corpus, which is a cluster of words with similar meanings on a particular topic in the vulnerability information; and summary information, which is a condensed description of the vulnerability.

According to another embodiment, the vulnerability information management unit 220 may generate further additional information using the vulnerability basic information. When information of the same type as the generated additional information does not exist in the vulnerability table 1, the vulnerability information management unit 220 may store the generated additional information in the vulnerability table 1. When information of the same type already exists in the vulnerability table 1, the vulnerability information management unit 220 may replace the information stored in the vulnerability table 1 with the generated additional information.
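The store-or-replace rule described above amounts to an upsert on one field of a table entry. A minimal sketch, assuming the vulnerability table is held as a dictionary keyed by vulnerability ID (the function name and the returned status strings are hypothetical):

```python
def store_additional_info(table, vuln_id, field_name, value):
    """Store generated information, replacing any existing value of that type.

    `table` maps a vulnerability ID to a dict of field values.
    Returns "inserted" if the field was empty and "replaced" otherwise.
    """
    row = table.setdefault(vuln_id, {})
    action = "replaced" if row.get(field_name) is not None else "inserted"
    row[field_name] = value
    return action
```

With a relational store the same behavior would typically be expressed as a single update of the relevant column for the matching row.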

다만, 도 2는 일 실시 예를 설명하기 위한 것이며, 컴퓨팅 장치(10)는 동일한 동작을 수행할 수 있는 다른 형태로 변형될 수 있다. 예를 들어, 컴퓨팅 장치는 데이터를 저장하는 스토리지(storage) 장치, 오퍼레이션을 로드(load)하고 저장하는 메모리, 메모리에 저장된 오퍼레이션을 실행하는 하드웨어 프로세서 및 외부와 통신을 수행하기 위한 네트워크 인터페이스가 동일한 동작을 수행하도록 구성될 수 있다. 또는, 컴퓨팅 장치(10)는 하나의 장치가 아닌 다른 시스템이나 복수의 장치를 포함하는 시스템으로 구성될 수도 있다.However, FIG. 2 is intended to illustrate an embodiment, and the computing device 10 may be modified into other forms capable of performing the same operations. For example, the computing device may include a storage device for storing data, a memory for loading and storing operations, a hardware processor for executing operations stored in the memory, and a network interface for performing communication with the outside, As shown in FIG. Alternatively, the computing device 10 may be configured as a system other than a single device or a system including a plurality of devices.

Hereinafter, the operation of the vulnerability information management apparatus 10 is described in more detail.

FIG. 3 is a diagram illustrating a process for managing vulnerability information according to an embodiment.

First, the vulnerability information management apparatus 10 may obtain the vulnerability table 1 through the vulnerability information collection system 20 (S310). The vulnerability information collection system 20 may obtain vulnerability information by parsing data received from the vulnerability information sources 30. The obtained vulnerability information may be classified in order to generate a structured vulnerability table 1 from the vulnerability information obtained from the various vulnerability information sources 30. In addition, when the vulnerability information is not in a standardized form, the vulnerability information collection system 20 may standardize it and organize the standardized vulnerability information into the vulnerability table 1. The obtained vulnerability table 1 may be stored in the storage unit 210.

Thereafter, the vulnerability information management unit 220 of the vulnerability information management apparatus 10 may extract text from the vulnerability information stored in the vulnerability table 1 (S320). For example, the vulnerability information management unit 220 may extract the text stored in the overview field of the vulnerability table. However, text is only a representative example of the information extracted from the vulnerability information, and the form of the extracted information may vary depending on the embodiment.

Thereafter, the vulnerability information management unit 220 of the vulnerability information management apparatus 10 may generate vulnerability basic information using the extracted text (S330). To generate the vulnerability basic information, the vulnerability information management unit 220 may use text summarization and related-keyword generation algorithms. According to one embodiment, the vulnerability basic information generated in step S330 may include at least one of a title, keywords, a corpus, and summary information.

According to one embodiment, the vulnerability basic information may include a title. In this case, the vulnerability information management unit 220 may apply the TextRank algorithm to the text extracted from the vulnerability information in order to generate the title. By applying the TextRank algorithm to the extracted text, the vulnerability information management unit 220 may select a representative sentence from among the sentences in the text. Alternatively, the vulnerability information management unit 220 may extract keywords judged to represent the main content of the text and determine a representative sentence by combining those keywords. For example, the vulnerability information management unit 220 may extract keywords indicating the vendor, the product, and the vulnerability type from the text, and determine a sentence combining the vendor, product, and vulnerability type as the representative sentence. The vulnerability information management unit 220 may determine the text selected as the representative sentence to be the title of the vulnerability basic information.
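To make the representative-sentence selection concrete, the following is a minimal pure-Python sketch of TextRank over sentences. It is not the code of the embodiment: the helper names, the regular-expression sentence splitter, the damping factor of 0.85, the fixed iteration count, and the sample text are all assumptions chosen for illustration.

```python
import math
import re


def split_sentences(text):
    # Split after sentence-ending punctuation that is followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def word_set(sentence):
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    # TextRank sentence similarity: shared words over the sum of log lengths.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def textrank_title(text, damping=0.85, iterations=50):
    """Return the highest-ranked sentence of `text` as a candidate title."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n == 0:
        return ""
    words = [word_set(s) for s in sentences]
    weights = [[similarity(words[i], words[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Each sentence receives score from its neighbours in proportion
        # to the edge weight, as in weighted PageRank.
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] * scores[j] / out_sum[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return sentences[max(range(n), key=scores.__getitem__)]


# Invented sample overview text, for illustration only.
sample = ("Buffer overflow in the FTP server allows remote attackers to "
          "execute code. The FTP server is installed by default. "
          "Remote attackers need no credentials.")
title = textrank_title(sample)
```

In this sample the first sentence shares words with both of the others, so it collects the highest score and is returned as the title.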

In addition, the vulnerability basic information may include keywords for the vulnerability. In this case, the vulnerability information management unit 220 may apply the TextRank algorithm and the Word2Vec algorithm to the text extracted from the vulnerability information in order to determine the keywords. By applying the TextRank and Word2Vec algorithms to the extracted text, the vulnerability information management unit 220 may extract keywords from the text.

FIG. 4 is a diagram illustrating an example of code that generates a title and keywords according to an embodiment. Referring to FIG. 4, the vulnerability information management unit 220 may generate a title by using the summarize() function to return, from among several sentences, the one sentence that best represents the content of the text. The vulnerability information management unit 220 may also use the keywords() function to generate keywords for the text from the input value. The example shown in FIG. 4 is code that generates a title and keywords from the vulnerability information identified by v102674. By executing code such as that shown in FIG. 4, the vulnerability information management unit 220 may generate a title and keywords for the vulnerability information.

The vulnerability basic information may include a corpus for the vulnerability information. To generate the corpus, the vulnerability information management unit 220 may apply a topic model to the text and keywords extracted from the vulnerability information and generate clusters of words with similar meaning. Here, a topic model is a statistical model for discovering the topics of a set of documents. Each topic produced by topic modeling is a cluster of words with similar meaning. The vulnerability information management unit 220 may generate the corpus by selecting from the generated word clusters.

In addition, according to one embodiment, the vulnerability information management apparatus 10 may merge the keyword data and the corpus data and then remove duplicate data from the merged data to generate combined information. Generating the combined information allows the keywords and the corpus to be used together.
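The merge-and-deduplicate step can be sketched as below. The function name `combine_terms`, the order-preserving behaviour, and the case-insensitive comparison are assumptions for this example; the text only states that duplicates are removed from the merged data.

```python
def combine_terms(keywords, corpus):
    """Merge keywords and corpus words, dropping duplicates.

    Order of first appearance is kept and comparison ignores case
    (both are assumptions made for this sketch).
    """
    seen = set()
    combined = []
    for term in [*keywords, *corpus]:
        key = term.casefold()
        if key not in seen:
            seen.add(key)
            combined.append(term)
    return combined


# Illustrative input values.
combined = combine_terms(["wireshark", "dissector"],
                         ["dissector", "packet", "Wireshark"])
```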

FIG. 5 is a diagram illustrating an example of code executed to generate a corpus according to an embodiment. By executing the code shown in FIG. 5, the vulnerability information management unit 220 according to an embodiment may pre-process the text extracted from the vulnerability information and obtain a corpus by generating word clusters.

The vulnerability basic information may include summary information for the vulnerability information. In this case, the vulnerability basic information may be generated by condensing the text extracted from the vulnerability information. For example, when 30 bytes of text are extracted from the vulnerability information, the vulnerability basic information may be generated by condensing the extracted text into 4 bytes of text. The vulnerability information management unit 220 may use a seq2seq algorithm to condense the text.

Referring again to FIG. 3, the vulnerability information management unit 220 may store the generated vulnerability basic information in the vulnerability table (S340). Here, the vulnerability information management unit 220 may store the vulnerability basic information in fields separate from the fields shown in FIG. 1.

According to another embodiment, the vulnerability information management unit 220 may further generate additional information using the vulnerability basic information. According to one embodiment, the vulnerability information management unit 220 may use the vulnerability basic information to additionally generate a CPE (Common Platform Enumeration) name indicating the product in which the vulnerability occurred, a CWE (Common Weakness Enumeration) code indicating the type of the vulnerability, and an associated CVE-ID (Common Vulnerabilities and Exposures ID), which is an identifier of a vulnerability related to the vulnerability basic information.

FIG. 6 is a diagram illustrating a process for generating a CPE name from vulnerability basic information according to an embodiment. FIG. 6(a) shows the process performed when a CPE name corresponding to the vulnerability basic information exists in the CPE dictionary 720-1. According to one embodiment, the vulnerability information management unit 220 may search the CPE dictionary 720-1 for CPE keywords that match the keywords 710 included in the vulnerability basic information. From the CPE keywords found in the CPE dictionary 720-1, the vulnerability information management unit 220 may generate a CPE name 730 that conforms to the CPE format of the CPE dictionary 720-1.

To convert a product name into the CPE format based on the CPE dictionary 920, the vulnerability information management unit 220 may generate a CPE tree from the CPE dictionary. According to one embodiment, the CPE tree may have six levels.

In the CPE tree, which has a plurality of levels and a plurality of nodes, (i) a node at the first level contains vendor information, (ii) a node at the second level contains product name information, a node at the third level contains product version information, a node at the fourth level contains update information, a node at the fifth level contains edition information, and a node at the sixth level contains product language information.

The generated CPE tree may include at least three of the first to sixth levels. The information held by a first-level node and the information held by a second-level node may be identical; that is, the vendor and the product name may be the same.

The CPE tree includes at least one of a parent node, a child node, and a sibling node. A parent node and its child node are connected to each other. Among the plurality of levels, a node at a higher level is a parent node, a node at a lower level is a child node, and nodes at the same level are sibling nodes.

If an intermediate level is omitted, the node at the level above the omitted level is connected directly to the node at the level below it.

The vulnerability information management unit 220 generates the plurality of levels by splitting each character string in the CPE dictionary on the character ':'. At the fifth level, the vulnerability information management unit 220 further splits the character string on the character '~'.
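The splitting rule and the six-level tree described above can be sketched as follows. This is a hedged illustration, not the embodiment's implementation: it assumes dictionary entries in the CPE 2.2 URI form `cpe:/<part>:<vendor>:<product>:...`, represents the tree as nested dicts keyed by `(level, value)` pairs, and uses invented entries (`acme`, `widget`) for the omitted-level and edition cases.

```python
LEVELS = ("vendor", "product", "version", "update", "edition", "language")


def cpe_levels(uri):
    """Split a CPE 2.2-style URI into (level, value) pairs.

    Empty fields are treated as omitted levels and skipped, so the
    level above an omitted one connects directly to the level below.
    """
    if not uri.startswith("cpe:/"):
        raise ValueError(f"not a CPE URI: {uri!r}")
    fields = uri[len("cpe:/"):].split(":")[1:]  # drop the part (a, o, h)
    levels = []
    for name, value in zip(LEVELS, fields):
        if not value:
            continue  # omitted intermediate level
        if name == "edition":
            # The fifth level is split again on '~'.
            value = tuple(v for v in value.split("~") if v)
        levels.append((name, value))
    return levels


def add_to_tree(tree, uri):
    """Insert one dictionary entry into a nested-dict CPE tree."""
    node = tree
    for level in cpe_levels(uri):
        node = node.setdefault(level, {})
    return tree


# Illustrative dictionary entries (assumed values, not taken from a real dictionary).
tree = {}
for entry in ("cpe:/a:wireshark:wireshark:2.0.0",
              "cpe:/a:wireshark:wireshark:2.0.1",
              "cpe:/a:acme:widget::sp1"):
    add_to_tree(tree, entry)
```

In the resulting tree the two Wireshark versions are sibling nodes under the same product node, and the `sp1` update node hangs directly under the `widget` product node because the version level was omitted.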

The vulnerability information management unit 220 combines those keywords of the CPE tree that appear in the product name information and converts them, using the CPE tree, into one or more CPEs that conform to the format of the CPE dictionary.

FIG. 6(b) shows the process performed when no CPE name corresponding to the vulnerability basic information exists in the CPE dictionary 720-1. When the vulnerability information management unit 220 searches the CPE dictionary 720-2 for CPE keywords corresponding to the keywords 740 of the vulnerability basic information and finds none, it may generate a CPE name directly from the keywords 740.

First, the vulnerability information management unit 220 may classify the keywords by analyzing the meaning of each of the keywords 740. To classify the keywords, the vulnerability information management unit 220 may use a text classification model generated through machine learning. The text classification model may be a model trained on the CPE names, overview information, and similar fields contained in previously known vulnerability information (for example, CVE information). The vulnerability information management unit 220 may generate a CPE name 760 that conforms to the CPE format by inserting each of the classified keywords 750 into the corresponding position of the CPE format. The vulnerability information management unit 220 may then register the CPE name 760 in the CPE dictionary 720-2.
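The final assembly step, placing already-classified keywords into their positions in the CPE format and registering the result, might look like the sketch below. The classification model itself is not shown; the input dict is assumed to be its output. The function name `build_cpe_name`, the default part `a` (application), the lower-casing and underscore normalisation, and the use of a Python set as the dictionary are assumptions for illustration.

```python
CPE_ORDER = ("vendor", "product", "version", "update", "edition", "language")


def build_cpe_name(classified, part="a"):
    """Assemble a CPE 2.2-style URI from keywords classified by field.

    `classified` maps a field name to a keyword, e.g. the assumed
    output of a text classification model.
    """
    fields = [classified.get(name, "").strip().lower().replace(" ", "_")
              for name in CPE_ORDER]
    while fields and not fields[-1]:
        fields.pop()  # drop trailing empty fields
    return "cpe:/" + ":".join([part, *fields])


# Illustrative classified keywords and an in-memory stand-in for the dictionary.
cpe_dictionary = set()
name = build_cpe_name({"vendor": "Wireshark",
                       "product": "Wireshark",
                       "version": "2.0.1"})
cpe_dictionary.add(name)  # register the newly generated name
```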

FIG. 7 is a diagram illustrating a process for generating a CWE code according to an embodiment.

First, in step S810, the vulnerability information management apparatus 10 may generate a vulnerability information classification model capable of classifying vulnerability information according to the vulnerability basic information. The vulnerability information classification model may classify vulnerability information by selecting one or more of a set of predetermined vulnerability types. The predetermined vulnerability types may be defined so that the amount of previously known vulnerability information for each type is at or above a threshold. For example, the vulnerability types may be defined so that each type has 1000 or more items of CVE information. Each vulnerability type may also have a corresponding CWE code.

According to one embodiment, the vulnerability information management apparatus 10 may learn from vulnerability information training data in order to generate the vulnerability information classification model. Here, the vulnerability information training data is previously known vulnerability information and may include information on the vulnerability type.

Thereafter, in step S820, the vulnerability information management apparatus 10 may generate vulnerability type information by inputting the vulnerability basic information into the vulnerability information classification model. Here, the vulnerability type information may include a CWE code. When vulnerability basic information is input, the vulnerability information classification model may return the CWE code corresponding to it. For example, if the keywords and summary information of the vulnerability basic information contain words related to buffer errors, the vulnerability information classification model may return the CWE code for buffer errors. In step S830, the vulnerability information management apparatus 10 may store the generated vulnerability type information.

FIG. 8 is a diagram illustrating an example of CVE information. Referring to FIG. 8, the CVE information includes all or some of a CVE-ID 910, vulnerability overview information (Overview) 920, a CVSS 930, a CPE 940, a CWE 950, and a Reference 960. The vulnerability overview information 920 may be composed of "place where a vulnerability was discovered", "(in) related software product names", "(when) conditions of the vulnerability occurrence", "(allow) attacker type", "(to) results of attack", "(via) means of attack", "(aka) vulnerability title in the reference site", "(a different vulnerability than) other CVE-IDs", and the like. The vulnerability overview information 920 may also be labeled with a term such as Description. The vulnerability information management unit 220 may extract the texts that make up the title, keywords, and summary information by using words or strings such as 'allow', 'to', 'via', 'aka', and 'a different vulnerability than' as delimiters.

To generate the vulnerability classification model, the vulnerability information management apparatus 10 removes, from the strings that make up the vulnerability overview information, the information that is commonly used across vulnerability overviews. That is, the vulnerability information management unit 220 selects the meaningful information from the strings that make up the overview. For example, the first explanatory information, the second explanatory information, and the third explanatory information may include at least one of (i) the attacker type, (ii) the results of attack, (iii) the means of attack, and (iv) the vulnerability title in the reference site. Referring to FIG. 8, from the vulnerability overview information 920, "epan/dissectors/packet-hiqnet.c in the HiQnet dissector in Wireshark 2.0.x before 2.0.2 does not validate the data type, which allows remote attackers to cause a denial of service (out-of-bounds read and application crash) via a crafted packet.", the strings 'remote attackers', 'cause a denial of service (out-of-bounds read and application crash)', and 'a crafted packet' may be extracted using 'allows', 'to', and 'via' as delimiters.
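The delimiter-based extraction in the example above can be reproduced with a small regular expression. This is only a sketch: the pattern, the function name `split_overview`, and the field names `attacker`, `result`, and `means` are assumptions for illustration, and real overview texts vary more than this single pattern covers.

```python
import re

# "allow(s) <attacker> to <result> [via <means>]" up to the end of the text.
OVERVIEW_PATTERN = re.compile(
    r"\ballows?\s+(?P<attacker>.+?)\s+to\s+(?P<result>.+?)"
    r"(?:\s+via\s+(?P<means>.+?))?\.?$"
)


def split_overview(overview):
    """Return the attacker type, attack result and attack means, if found."""
    match = OVERVIEW_PATTERN.search(overview.strip())
    return match.groupdict() if match else {}


# Overview text quoted in the description of FIG. 8.
overview = (
    "epan/dissectors/packet-hiqnet.c in the HiQnet dissector in Wireshark "
    "2.0.x before 2.0.2 does not validate the data type, which allows remote "
    "attackers to cause a denial of service (out-of-bounds read and "
    "application crash) via a crafted packet."
)
parts = split_overview(overview)
```

For this overview, `parts` holds the same three strings that the description lists as the extraction result.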

The vulnerability information management apparatus 10 may extract predetermined vulnerability type information from vulnerability information such as that shown in FIG. 8. For example, it may extract 'CWE-20' shown in FIG. 8.

The vulnerability information management apparatus 10 may generate vulnerability information classification models by learning from the first explanatory information, the second explanatory information, the third explanatory information, and the vulnerability type information. The vulnerability information classification models may assume that the first, second, and third explanatory information are not correlated with one another, so that each contributes independently to generating the vulnerability type information. The vulnerability information classification models may be conditional probability models that represent the first, second, and third explanatory information as vectors and classify by finding the vulnerability type information with the maximum probability given that explanatory information. For example, a Naive Bayes classifier may be applied, but this is only an example, and various classification models may of course be used depending on the design.
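As an illustration of the conditional probability model mentioned above, the following is a minimal multinomial Naive Bayes classifier in pure Python with add-one smoothing. It is a sketch under stated assumptions, not the embodiment's model: the class name, the bag-of-words input, and the four-item training set with its CWE labels are invented for this example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over word lists with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for words, label in zip(documents, labels):
            self.word_counts[label].update(words)
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, words):
        total_docs = sum(self.class_counts.values())

        def log_posterior(label):
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(self.class_counts[label] / total_docs)
            # Words are treated as independent given the class.
            for word in words:
                score += math.log((counts[word] + 1) / denominator)
            return score

        return max(self.class_counts, key=log_posterior)


# Toy training data invented for illustration (words -> vulnerability type).
training = [
    (["remote", "attackers", "execute", "arbitrary", "code",
      "buffer", "overflow"], "CWE-119"),
    (["heap", "buffer", "overflow", "crafted", "file"], "CWE-119"),
    (["remote", "attackers", "inject", "arbitrary", "web",
      "script", "html"], "CWE-79"),
    (["inject", "script", "crafted", "url"], "CWE-79"),
]
model = NaiveBayesClassifier().fit([words for words, _ in training],
                                   [label for _, label in training])
predicted = model.predict(["buffer", "overflow", "crafted", "packet"])
```

The smoothing keeps an unseen word such as "packet" from zeroing out a class, which matters when the explanatory information contains vocabulary absent from the training data.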

The vulnerability information management apparatus 10 may generate each vulnerability information classification model by learning from the first explanatory information, the second explanatory information, the third explanatory information, and the pre-classified vulnerability type information.

Using the vulnerability information classification models described above, the vulnerability type can be determined when the vulnerability basic information contains information that matches the explanatory information.

FIG. 9 is a diagram illustrating vulnerability types classified according to an embodiment.

According to one embodiment, the vulnerability information classification model may classify vulnerability information into one of the vulnerability types shown in FIG. 9. The vulnerability types shown in FIG. 9 are 16 types formed by grouping the Common Weakness Enumeration (CWE) entries that NVD provides as part of CVE information. The CWE column in FIG. 9 is the CWE number. The CWE code may include the CWE number. For example, if the CWE number is '119', the CWE code may be 'CWE-119'.

FIG. 10 is a diagram illustrating a process of generating a vulnerability information classification model according to an embodiment.

According to one embodiment, the vulnerability information processing apparatus 10 may pre-process the text extracted from the vulnerability table, that is, the vulnerability information (S1110). The vulnerability information processing apparatus 10 may remove stop words (for example, commas or periods) contained in the text (S1120).

Thereafter, the vulnerability information processing apparatus 10 may extract word stems from the text by performing stemming (S1130). The vulnerability information processing apparatus 10 may also remove high-frequency words from the extracted words (S1140).

The vulnerability information processing apparatus 10 may select features from the text pre-processed as described above (S1150). The vulnerability information processing apparatus 10 may then generate a vulnerability information classification model based on the selected features (S1160), thereby producing a classification model that has learned the vulnerability information.
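The pre-processing steps described for FIG. 10 (stop-word removal, stemming, removal of high-frequency words) can be sketched as follows. Everything concrete here is an assumption for illustration: the small stop-word set, the naive suffix-stripping stemmer (a real system would more likely use an established stemmer such as Porter's), the 80% document-frequency cut-off, and the sample sentences.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "an", "the", "in", "of", "to", "and", "is", "which", "via"}
SUFFIXES = ("ing", "ed", "s")  # naive stemmer, for illustration only


def stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(documents, max_doc_ratio=0.8):
    """Tokenise, drop stop words and punctuation, stem, then drop
    words that occur in more than `max_doc_ratio` of the documents."""
    tokenised = []
    for text in documents:
        words = re.findall(r"[a-z0-9]+", text.lower())  # punctuation is dropped
        tokenised.append([stem(w) for w in words if w not in STOP_WORDS])
    doc_freq = Counter(w for doc in tokenised for w in set(doc))
    limit = max_doc_ratio * len(documents)
    return [[w for w in doc if doc_freq[w] <= limit] for doc in tokenised]


# Invented sample texts.
features = preprocess([
    "Remote attackers cause a crash via crafted packets.",
    "Remote attackers execute code.",
    "Remote attackers read memory.",
])
```

Because "remote" and "attacker" appear in every sample text, they carry no information for telling vulnerability types apart and are removed before feature selection.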

FIG. 11 is a diagram illustrating a process for storing an associated vulnerability ID according to an embodiment.

First, the vulnerability information processing apparatus 10 may calculate the similarity between one item of vulnerability basic information and other items of vulnerability basic information (S1210). That is, when first vulnerability basic information generated from first vulnerability information stored in the vulnerability table 1 and second vulnerability basic information generated from second vulnerability information both exist, the vulnerability information processing apparatus 10 may calculate the similarity between the first vulnerability basic information and the second vulnerability basic information. For example, the vulnerability information processing apparatus 10 may use a document similarity algorithm to calculate the similarity between the summary information of one item of vulnerability information stored in the vulnerability table 1 and the summary information of another.

Thereafter, the vulnerability information processing apparatus 10 may compare the similarities among the items of vulnerability basic information (S1220). For example, the vulnerability information processing apparatus 10 may determine which vulnerability information has the highest similarity, or which items rank highest by similarity. As another example, the vulnerability information processing apparatus 10 may determine which other items of vulnerability information have a similarity to a given item that is at or above a threshold.

The vulnerability information processing apparatus 10 may store an associated vulnerability ID according to the comparison result (S1230). That is, the vulnerability information processing apparatus 10 may determine the vulnerability basic information with high similarity and store the vulnerability ID (for example, the CVE-ID) contained in the corresponding vulnerability information in the vulnerability information table as the associated vulnerability ID.
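Steps S1210 to S1230 can be sketched with a simple bag-of-words cosine similarity. The description does not name a specific document similarity algorithm, so cosine similarity is an assumption here, as are the function names, the threshold of 0.5, and the sample identifiers and summaries.

```python
import math
import re
from collections import Counter


def vectorise(text):
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def associated_ids(target_id, summaries, threshold=0.5):
    """Return IDs whose summary similarity to `target_id` meets the
    threshold, most similar first."""
    target = vectorise(summaries[target_id])
    scores = {vid: cosine(target, vectorise(text))
              for vid, text in summaries.items() if vid != target_id}
    hits = [vid for vid, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda vid: -scores[vid])


# Hypothetical IDs and summaries, for illustration only.
summaries = {
    "V-1": "buffer overflow in ftp server",
    "V-2": "buffer overflow in http server",
    "V-3": "cross site scripting in web form",
}
related = associated_ids("V-1", summaries)
```

Here only `V-2` clears the threshold, so it would be stored as the associated vulnerability ID for `V-1`; selecting the single most similar entry instead corresponds to taking the first element of the sorted result.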

The methods according to the embodiments of the present invention described so far may be performed by executing a computer program implemented as computer-readable code. The computer program may be transmitted from a first computing device to a second computing device over a network such as the Internet and installed on the second computing device, and may thereby be used on the second computing device. The first computing device and the second computing device include server devices, physical servers belonging to a server pool for a cloud service, and stationary computing devices such as desktop PCs.

The computer program may be stored on a recording medium such as a DVD-ROM or a flash memory device.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, a person having ordinary skill in the art to which the present invention pertains will understand that the present invention may be practiced in other specific forms without changing its technical idea or essential features. Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive.

Claims

A vulnerability information management method performed by a computing device, the method comprising:
obtaining a vulnerability table including first vulnerability information by parsing and classifying data obtained from a vulnerability information source;
extracting text associated with the first vulnerability information from the vulnerability table;
generating first vulnerability basic information from the text;
storing the first vulnerability basic information in the vulnerability table;
calculating a degree of similarity between the first vulnerability basic information and second vulnerability basic information generated from second vulnerability information of the vulnerability table; and
storing the vulnerability ID of the second vulnerability information in the vulnerability table as an associated vulnerability ID if the calculated similarity is the highest value or is greater than or equal to a threshold value.

The method of claim 1,
wherein the first vulnerability basic information comprises a title, a keyword, a corpus, and a summary of the first vulnerability information.

The method of claim 2,
wherein generating the first vulnerability basic information comprises:
extracting a representative sentence or a representative word from the text; and
generating a title composed of the representative sentence or the representative word.

The method of claim 3,
wherein extracting the representative sentence or the representative word comprises:
inputting the text into a TextRank algorithm and obtaining the returned sentence or word as the representative sentence or the representative word.

The method of claim 3,
wherein generating the title comprises:
determining, from the representative sentence or the representative word, words corresponding to a manufacturer name, a product name, and a vulnerability type; and
generating a title composed of the manufacturer name, the product name, and the vulnerability type based on the determined words.

The method of claim 2,
wherein generating the first vulnerability basic information comprises:
inputting the text into a TextRank algorithm and a Word2Vec algorithm and obtaining the returned output word as the keyword.

The method of claim 6,
wherein generating the first vulnerability basic information comprises:
generating clusters of words having similar meanings from the text and the keyword using a topic model; and
generating the corpus based on the word clusters.

The method of claim 2,
wherein generating the first vulnerability basic information comprises:
integrating the data of the keyword and the corpus and generating combined information by removing redundant information.

The method of claim 2,
wherein generating the first vulnerability basic information comprises:
generating the summary by abbreviating the text.

The method of claim 1, further comprising:
obtaining a Common Platform Enumeration (CPE) dictionary over a network;
determining a CPE name corresponding to the first vulnerability basic information based on the CPE dictionary; and
storing the CPE name in the vulnerability table.

The method of claim 10,
wherein determining the CPE name comprises:
performing a CPE-tree-based keyword analysis on the CPE dictionary using the first vulnerability basic information as a keyword, and determining the result having the highest matching rate as the CPE name.

The method of claim 1, further comprising:
generating a vulnerability information classification model by learning from vulnerability information training data;
generating first vulnerability type information by inputting the first vulnerability basic information into the vulnerability information classification model; and
storing the first vulnerability type information in the vulnerability table.

The method of claim 12,
wherein the vulnerability information training data comprises summary information and second vulnerability type information of known vulnerability information, obtained by parsing a file containing the known vulnerability information.

The method of claim 12,
wherein the vulnerability information classification model outputs the first vulnerability type information in the form of a Common Weakness Enumeration (CWE) code when the vulnerability basic information is input.

(Deleted)

A vulnerability information management device comprising:
a storage unit that stores a vulnerability table including first vulnerability information obtained by parsing and classifying data obtained from a vulnerability information source;
an information extraction unit that extracts text associated with the first vulnerability information from the vulnerability table; and
a vulnerability information management unit that generates first vulnerability basic information from the text, stores the first vulnerability basic information in the vulnerability table, calculates a degree of similarity between the first vulnerability basic information and second vulnerability basic information generated from second vulnerability information of the vulnerability table, and stores the vulnerability ID of the second vulnerability information in the vulnerability table as an associated vulnerability ID when the calculated similarity is the highest value or is greater than or equal to a threshold value.

A computer program recorded on a non-transitory computer-readable medium, wherein instructions of the computer program, when executed by a processor of a server, cause the server to perform:
obtaining a vulnerability table including first vulnerability information by parsing and classifying data obtained from a vulnerability information source;
extracting text associated with the first vulnerability information from the vulnerability table;
generating first vulnerability basic information from the text;
storing the first vulnerability basic information in the vulnerability table;
calculating a degree of similarity between the first vulnerability basic information and second vulnerability basic information generated from second vulnerability information of the vulnerability table; and
storing the vulnerability ID of the second vulnerability information in the vulnerability table as an associated vulnerability ID when the calculated similarity is the highest value or is greater than or equal to a threshold value.

A vulnerability information management system comprising:
a vulnerability information collection system that acquires first vulnerability information by parsing and classifying data obtained from a vulnerability information source;
a storage that stores a vulnerability table including the first vulnerability information; and
a vulnerability information management device that extracts text associated with the first vulnerability information from the vulnerability table, generates first vulnerability basic information from the text, stores the first vulnerability basic information in the vulnerability table, calculates a degree of similarity between the first vulnerability basic information and second vulnerability basic information generated from second vulnerability information of the vulnerability table, and, when the calculated similarity is the highest value or is greater than or equal to a threshold value, stores the vulnerability ID of the second vulnerability information in the vulnerability table as an associated vulnerability ID.