KR20110119988A

KR20110119988A - Apparatus and method for preventing leakage of secret data

Info

Publication number: KR20110119988A
Application number: KR1020100039449A
Authority: KR
Inventors: 성동수; 박찬정; 이건배
Original assignee: 경기대학교 산학협력단
Priority date: 2010-04-28
Filing date: 2010-04-28
Publication date: 2011-11-03
Also published as: KR101158797B1

Abstract

PURPOSE: An apparatus and method for preventing leakage of confidential information using feature information of the confidential information are provided to block leakage of confidential information and to improve the efficiency and reliability of determining whether exported information is confidential. CONSTITUTION: A storage unit (110) stores confidential information and insider information. A learning unit (120) extracts feature information from the confidential information and generates a learning result for each security level based on the feature information. A detection unit (130) detects an attempt to export information from the internal network. When an export attempt is detected, a determination unit (140) determines the security level of the export attempt information using the learning result. A control unit (150) allows or blocks the export by comparing the security level of the insider with the security level of the export attempt information.

Description

Apparatus and method for preventing leakage of secret data

The present invention relates to preventing leakage of confidential information and, more particularly, to an apparatus and method for preventing leakage of confidential information in information that an insider attempts to take outside.

Recently, damage caused by leakage of internal confidential information from companies and public institutions has increased significantly. Confidential information can be leaked by outsiders or by insiders. Leakage by an outsider occurs when the outsider illegally breaks into a server or database over a network and copies files or data or views information. For example, confidential information is leaked through FTP, Webhard (online storage) services, TELNET, hacking tools, viruses, and theft of administrator IDs.

In contrast, leakage by insiders occurs when internal information creators, administrators, or other insiders handle confidential information in a way that leaks it, whether intentionally or carelessly. For example, an insider may leak confidential information by copying, transferring, or printing files.

Conventional leakage prevention technology has mainly addressed leakage caused by illegal intrusion by outsiders. However, insiders account for a significantly higher proportion of confidential information leakage than outsiders. In addition, the financial damage caused by insider leakage has been found to be greater than that caused by outsider leakage. Accordingly, research on preventing leakage of confidential information by insiders (DLP: Data Loss Prevention) has recently been conducted.

Current methods for preventing leakage by insiders include restricting or blocking access to confidential information, encrypting data or files, logging and monitoring cases in which data or files leave the organization, blocking file copying or transfer outright, destroying or erasing storage devices, and controlling the carrying in and out of portable electronic devices. These methods can block leakage of confidential information, but they also control or block general material that could be taken outside without harm. As a result, work efficiency may fall significantly at companies and public institutions that frequently exchange data with outside parties.

The present invention provides an apparatus and method for preventing leakage of confidential information that block internal confidential information from leaking outside while allowing general, non-confidential material to be exported.

The present invention also provides an apparatus and method for preventing leakage of confidential information with improved efficiency and reliability in determining whether information being exported is confidential.

According to one aspect of the present invention, an apparatus for preventing leakage of confidential information is disclosed.

An apparatus for preventing leakage of confidential information according to an embodiment of the present invention includes: a storage unit that stores confidential information and internal personnel information for each security level; a learning unit that extracts feature information from the confidential information and generates a learning result for each security level using the feature information; a detection unit that detects an attempt to export information from an internal network to the outside; a determination unit that, when the export attempt is detected, determines the security level of the export attempt information using the learning result; and a control unit that allows or blocks export of the export attempt information by comparing the security level of the export attempt information with the security level of the internal personnel. The learning unit includes a plurality of learning modules, and the determination unit includes a plurality of classification modules corresponding to the plurality of learning modules.

According to another aspect of the present invention, a method by which an apparatus for preventing leakage of confidential information prevents such leakage in an internal network is disclosed.

A method for preventing leakage of confidential information according to an embodiment of the present invention includes: storing confidential information and internal personnel information for which security levels have been set; extracting feature information from the confidential information and generating a learning result for each security level using the feature information; detecting an attempt to export information from the internal network to the outside; determining the security level of the export attempt information using the learning result; and allowing or blocking export of the export attempt information by comparing the security level of the export attempt information with the security level of the internal personnel. The apparatus for preventing leakage of confidential information includes a plurality of learning modules and a plurality of classification modules corresponding to the plurality of learning modules.

The present invention not only blocks internal confidential information from leaking outside but also allows general, non-confidential material to be exported.

In addition, the present invention can improve the efficiency and reliability of determining whether information being exported is confidential.

FIG. 1 is a configuration diagram schematically illustrating an internal network for preventing leakage of confidential information.
FIG. 2 is a configuration diagram schematically illustrating an apparatus for preventing leakage of confidential information.
FIG. 3 is a flowchart illustrating a procedure by which the apparatus for preventing leakage of confidential information produces a learning result.
FIG. 4 is a flowchart illustrating a procedure by which the apparatus for preventing leakage of confidential information determines the security level of export attempt information.
FIG. 5 is a flowchart illustrating a method for preventing leakage of confidential information performed by the apparatus for preventing leakage of confidential information.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope.

In describing the present invention, a detailed description of related known technology is omitted where it is judged that it would unnecessarily obscure the subject matter of the invention. In addition, the numerals used in this specification are merely identifiers for distinguishing one component from another.

In addition, in this specification, when one component is described as being "connected" or "coupled" to another component, the one component may be directly connected or coupled to the other component, but it should be understood that, unless specifically stated otherwise, the two may also be connected or coupled through yet another component in between.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. To facilitate overall understanding, the same reference numerals are used for the same elements regardless of the figure number.

FIG. 1 is a configuration diagram schematically illustrating an internal network for preventing leakage of confidential information. The internal network may be the internal network of a company, a public institution, or the like.

Referring to FIG. 1, the internal network includes an apparatus 100 for preventing leakage of confidential information and a plurality of terminals 200.

The terminals 200 are terminal devices, such as PCs or laptops, connected to the apparatus 100 through the internal network in a wired or wireless manner. Internal personnel can use a terminal 200 to export internal information to the outside. For example, internal personnel can export internal information by transferring a file to an external network, copying a file to a removable storage device, or printing a file on office equipment such as a printer.

When the apparatus 100 detects an attempt to export information through a terminal 200, it allows or blocks the export depending on whether the information is confidential.

The apparatus 100 includes a database that stores confidential information and internal personnel information. Security levels may be set in advance for the confidential information and the internal personnel, and the confidential information and internal personnel information may be stored in the database by security level. The security levels may be set by an internal security committee or a security officer.

The apparatus 100 extracts feature information from the confidential information of each level and produces a learning result using the feature information. The apparatus 100 may include a plurality of learning modules and produce a learning result for each of them. Each learning module produces a learning result according to its own learning algorithm, and various learning algorithms may be applied.

When the apparatus 100 detects an attempt by a terminal 200 to export information, it determines the security level of the export attempt information and allows or blocks the export according to the determined security level.

In more detail, the apparatus 100 extracts feature information from the export attempt information. The apparatus 100 includes a plurality of classification modules corresponding to the plurality of learning modules. Each classification module compares the extracted feature information with the learning result and produces a classification result from which the security level of the export attempt information can be determined. The classification result may be the degree to which the export attempt information matches each security level (for example, the number or ratio of matching keywords). The classification results produced by the classification modules may differ from one another. The apparatus 100 then combines the classification results produced by the classification modules to determine the security level of the information. After that, the apparatus 100 compares the determined security level with the security level of the internal personnel and allows or blocks the export. For example, if the security level of the internal personnel is equal to or higher than the security level determined for the export attempt information, the apparatus 100 may allow the export.
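The allow-or-block rule described above can be sketched in a few lines. This is only an illustration: the `Level` enumeration, its numeric ordering (a larger value meaning a more sensitive level, with Grade 1 ranked above Grade 2), and the function name `allow_export` are assumptions made here, not details given in the specification.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical ordering: a larger value means a higher (more sensitive) level.
    # The specification does not state how grades are ranked numerically.
    GENERAL = 0
    GRADE_2 = 1
    GRADE_1 = 2


def allow_export(insider_level: Level, info_level: Level) -> bool:
    """Allow the export only if the insider's level is at or above the information's level."""
    return insider_level >= info_level


print(allow_export(Level.GRADE_1, Level.GRADE_2))  # insider outranks the information
print(allow_export(Level.GRADE_2, Level.GRADE_1))  # information outranks the insider
```

Using an ordered enumeration keeps the comparison in one place, so a different ranking policy would only require changing the enumeration values.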

The apparatus 100 is described in detail below with reference to FIG. 2.

FIG. 2 is a configuration diagram schematically illustrating the apparatus for preventing leakage of confidential information.

Referring to FIG. 2, the apparatus 100 includes a storage unit 110, a learning unit 120, a detection unit 130, a determination unit 140, and a control unit 150.

The storage unit 110 stores confidential information and internal personnel information. A security committee or security officer may set security levels for the confidential information and the internal personnel, and the storage unit 110 may store the confidential information and internal personnel information by security level.

The storage unit 110 also stores the learning results generated by the learning unit 120, organized by learning module.

The learning unit 120 extracts feature information from the confidential information of each security level and uses the feature information to generate a learning result for each of the plurality of learning modules.

The learning unit 120 includes a learning information extraction unit 121 and a learning result generation unit 123.

The learning information extraction unit 121 extracts feature information for each security level from the confidential information of that level.

For example, the learning information extraction unit 121 normalizes the confidential information into a machine-processable form and then extracts keywords or multimedia (images, audio, video, etc.) that reflect the characteristics of the confidential information well. Normalization may include removing special characters and tags, converting Chinese characters to Hangul, and extracting text from images. Keywords may be nouns contained in the confidential information, and words that appear commonly across many documents (stopwords) may be excluded. For example, stopwords may be conjunctions, pronouns, articles, and verbs.
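A minimal sketch of the normalization and keyword-extraction step might look like the following. It covers only tag removal, special-character removal, and stopword filtering; noun selection, Chinese-character conversion, and text extraction from images are left out. The stopword list and the function names `normalize` and `extract_keywords` are illustrative assumptions.

```python
import re

# Hypothetical stopword list for illustration; a real deployment would use a
# language-specific list (and, for Korean text, a morphological analyzer).
STOPWORDS = {"the", "a", "an", "and", "or", "it", "is", "of", "to", "for"}


def normalize(text: str) -> str:
    """Strip markup tags and special characters, collapse whitespace, lowercase."""
    text = re.sub(r"<[^>]+>", " ", text)   # remove tags such as <p> or </div>
    text = re.sub(r"[^\w\s]", " ", text)   # remove special characters
    return re.sub(r"\s+", " ", text).strip().lower()


def extract_keywords(text: str) -> list[str]:
    """Return the normalized tokens that are not stopwords."""
    return [word for word in normalize(text).split() if word not in STOPWORDS]


print(extract_keywords("<p>The transcoding of content!</p>"))
```

Because `\w` matches Unicode word characters in Python 3, the same routine leaves Hangul tokens intact.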

The learning result generation unit 123 includes a plurality of learning modules. Each learning module uses the extracted feature information to generate, according to its configured learning algorithm, a learning result for the confidential information of each security level, and stores the generated learning result in the storage unit 110. The storage unit 110 may store, for each learning module, the learning results for the confidential information of each security level.

Algorithms such as a Bayesian filter, a neural network, or the K-nearest neighbor algorithm may be applied to the learning modules.

In the following, the learning result generation unit 123 is described, by way of example, as including an A learning module to which a Bayesian filter is applied, a B learning module to which a neural network is applied, and a C learning module to which the K-nearest neighbor algorithm is applied. To aid understanding of the invention, the description focuses on the A learning module.

The Bayesian filter applies the theorem of the mathematician Thomas Bayes to text classification. It stores the frequency of occurrence of each individual word in given texts, keeps adding texts of a similar category as sample data, and can then be used to determine whether an arbitrary text belongs to that category.
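As a generic illustration of this idea (not the A learning module itself, which is described in terms of chi-square statistics), a word-frequency filter based on naive Bayes with add-one smoothing could be sketched as follows. The class name `WordFrequencyFilter`, its methods, and the choice of smoothing are assumptions introduced for this example.

```python
import math
from collections import Counter, defaultdict


class WordFrequencyFilter:
    """Toy naive Bayes text filter: stores word frequencies per category."""

    def __init__(self) -> None:
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of samples

    def add_sample(self, category: str, words: list[str]) -> None:
        """Add one sample text (already tokenized) to a category."""
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1

    def classify(self, words: list[str]):
        """Return the most probable category, or None if nothing was learned."""
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # log prior + sum of log likelihoods with add-one (Laplace) smoothing
            score = math.log(self.doc_counts[category] / total_docs)
            for w in words:
                score += math.log((counts[w] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best, best_score = category, score
        return best


f = WordFrequencyFilter()
f.add_sample("Grade 1", ["transcoding", "content", "network"])
f.add_sample("Grade 2", ["benzene", "catalyst", "base"])
print(f.classify(["content", "network"]))
```

Working in log space avoids numeric underflow when a text contains many words.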

The A learning module calculates an information statistic for the feature information extracted by the learning information extraction unit 121. Here, it is assumed that the A learning module uses keywords as the feature information and calculates the chi-square statistic (χ² statistic) as the information statistic.

The chi-square statistic measures the dependency between a keyword (t) and a category (c), and can be calculated using Equation 1 below.

[Equation 1]

$$\chi^2(t, c) = \frac{N\,(AD - CB)^2}{(A + C)(B + D)(A + B)(C + D)}$$

Here, A is the number of documents in category c that contain keyword t, B is the number of documents outside category c that contain keyword t, C is the number of documents in category c that do not contain keyword t, D is the number of documents outside category c that do not contain keyword t, and N is the total number of training documents.

For example, assume that the categories c are Grade 1 and Grade 2 and that 50 items of confidential information are learned for each of Grade 1 and Grade 2, so that N = 100. If keyword t is "transcoding" and A = 20, B = 1, C = 30, and D = 49, the chi-square statistic is calculated as follows.

$$\chi^2(\text{transcoding}, \text{Grade 1}) = \frac{100 \times (20 \times 49 - 30 \times 1)^2}{(20 + 30)(1 + 49)(20 + 1)(30 + 49)} \approx 21.76$$
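The worked example can be checked with a short function. The sketch assumes the standard chi-square form for a 2×2 keyword-by-category table, N(AD − CB)² / ((A + C)(B + D)(A + B)(C + D)), which is consistent with the definitions of A, B, C, D, and N above and with the value 21.76 listed for "transcoding" in Table 1. The function name `chi_square` is illustrative.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic of keyword t and category c from the 2x2 counts.

    a: documents in c containing t       b: documents outside c containing t
    c: documents in c without t          d: documents outside c without t
    """
    n = a + b + c + d  # total number of training documents
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # degenerate table: keyword or category never (or always) occurs
    return n * (a * d - c * b) ** 2 / denominator


# Worked example: 50 documents per grade, keyword "transcoding"
print(round(chi_square(20, 1, 30, 49), 2))  # 21.76
```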

The chi-square statistics calculated in this way can be summarized as in Table 1 below. That is, the A learning module may generate a learning result such as Table 1 and stores the generated learning result in the storage unit 110.

[Table 1]
Grade 1                             Grade 2
----------------------------------  ----------------------------------
χ²(transcoding, Grade 1) = 21.76    χ²(benzene, Grade 2) = 25.00
χ²(content, Grade 1) = 15.00        χ²(base, Grade 2) = 20.00
χ²(network, Grade 1) = 12.00        χ²(composition, Grade 2) = 15.00
χ²(real-time, Grade 1) = 10.00      χ²(cell, Grade 2) = 10.00
χ²(transmission, Grade 1) = 8.00    χ²(catalyst, Grade 2) = 5.00
…                                   …
χ²(device, Grade 1) = 1.01          χ²(material, Grade 2) = 1.50

Each of the other learning modules (the B learning module, the C learning module, etc.) likewise runs its own learning algorithm to calculate information statistics for the feature information, as the A learning module does, and stores a learning result summarizing the calculated statistics in the storage unit 110.

For example, the learning results that the learning result generation unit 123 generates and stores in the storage unit 110 may be represented as in Table 2 below.

[Table 2]
A learning module                                      B learning module   C learning module
-----------------------------------------------------  ------------------  ------------------
Grade 1                                                …                   …
  Keywords: transcoding (21.76), content (15.00), …
  Image: XXXX
Grade 2
  Keywords: benzene (25.00), base (20.00), …
  Image: YYYYY
…
Grade n
  Keywords: …
  Image: …

The detection unit 130 detects attempts to export information from the internal network to the outside. That is, the detection unit 130 can detect an attempt by a terminal 200 to export information.

For example, export attempts by a terminal 200 may include file transfer, file copying, and file printing. The detection unit 130 can detect that a terminal 200 is copying a file to a removable storage device or transferring a file to an external network.

When the detection unit 130 detects an export attempt, the determination unit 140 determines the security level of the information being exported (hereinafter, export attempt information).

The determination unit 140 includes a classification information extraction unit 141, a classification result generation unit 143, and a grade determination unit 145.

The classification information extraction unit 141 extracts feature information from the export attempt information. It may operate in the same way as the learning information extraction unit 121 described above.

For example, the classification information extraction unit 141 may normalize the export attempt information and extract keywords or multimedia from it.

The classification result generation unit 143 includes a plurality of classification modules corresponding to the plurality of learning modules. Each classification module compares the feature information of the export attempt information, extracted by the classification information extraction unit 141, with the learning result. Each classification module can thereby produce a classification result from which the security level of the export attempt information can be determined. The classification result may be the degree to which the export attempt information matches each security level (for example, the number or ratio of matching keywords). The classification results produced by the classification modules may differ from one another.

The following description uses, as an example, the A classification module, which corresponds to the A learning module to which the Bayesian filter is applied.

Like the A learning module, the A classification module may calculate information statistics for the keywords extracted from the export attempt information. The A classification module also compares the extracted keywords with the learning result produced by the A learning module. That is, it checks whether any keyword of the export attempt information matches a keyword in the learning result and outputs the result.

For example, the information statistics that the A classification module calculates for the keywords extracted from the export attempt information may be represented as in Table 3 below. Like the A learning module, the A classification module may calculate the chi-square statistic as the information statistic.

[Table 3]
Keyword        Information statistic
-------------  ---------------------
device         30
technology     20
system         10
…              …
transmission   5
network        3
catalyst       3
content        2
…              …

The A classification module compares the keywords of the export attempt information (Table 3) with the learning result generated by the A learning module (Table 1). That is, it compares the keywords in Table 3 with the keywords in Table 1 to check whether any keywords match.

For example, comparing Table 1 with Table 3, the keywords matching Grade 1 are "transmission", "network", and "content", and the keyword matching Grade 2 is "catalyst". Here, it is assumed that keywords whose information statistic is at or below a certain level (for example, 2) are excluded. Thus "device", which also matches a Grade 1 keyword, may be excluded because its information statistic in the learning result is 1.01.
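The comparison can be sketched as a set lookup with a threshold. The dictionaries below contain only the keywords that Table 1 and Table 3 list explicitly (the rows elided with "…" are not filled in), and the names `LEARNED`, `EXPORT_KEYWORDS`, and `matched_keywords` are hypothetical.

```python
# Learning result of the A learning module (Table 1): keyword -> chi-square statistic
LEARNED = {
    "Grade 1": {"transcoding": 21.76, "content": 15.00, "network": 12.00,
                "real-time": 10.00, "transmission": 8.00, "device": 1.01},
    "Grade 2": {"benzene": 25.00, "base": 20.00, "composition": 15.00,
                "cell": 10.00, "catalyst": 5.00, "material": 1.50},
}

# Keywords extracted from the export attempt information (Table 3)
EXPORT_KEYWORDS = {"device", "technology", "system", "transmission",
                   "network", "catalyst", "content"}


def matched_keywords(learned, export_keywords, threshold=2.0):
    """Per grade, keep learned keywords above the threshold that also occur in the export."""
    return {
        grade: {kw for kw, stat in stats.items()
                if stat > threshold and kw in export_keywords}
        for grade, stats in learned.items()
    }


result = matched_keywords(LEARNED, EXPORT_KEYWORDS)
print(sorted(result["Grade 1"]))  # ['content', 'network', 'transmission']
print(sorted(result["Grade 2"]))  # ['catalyst']
```

With the threshold at 2, "device" (1.01) drops out even though it appears in the export attempt information.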

For example, the classification result that the A classification module produces by checking for matching keywords may be represented as in Table 4 below. Here, the classification result is expressed using binary (Boolean) weighting.

[Table 4]
Grade 1
Keyword                  Transcoding   Content   Communication network   Real time   Transmission
Information statistic    21.76         15.00     12.00                   10.00       8.00
Binary weight            0             1         1                       0           1
Grade 2
Keyword                  Benzene       Base      Composition             Cell        Catalyst
Information statistic    25.00         20.00     15.00                   10.00       5.00
Binary weight            0             0         0                       0           1
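
As a non-limiting illustration, the keyword matching that yields the binary weights of Table 4 may be sketched in Python as follows. The names `LEARNED`, `ATTEMPT_KEYWORDS`, `MIN_STATISTIC`, and `binary_weights` are hypothetical and do not appear in the disclosure; the learned statistics are limited to the keywords listed in Table 4, and the cut-off of 2 is the example value given in the text.

```python
# Learned result of module A, limited to the keywords shown in Table 4
# (keyword -> information statistic, per grade).
LEARNED = {
    1: {"transcoding": 21.76, "content": 15.00, "communication network": 12.00,
        "real time": 10.00, "transmission": 8.00},
    2: {"benzene": 25.00, "base": 20.00, "composition": 15.00,
        "cell": 10.00, "catalyst": 5.00},
}

# Keywords extracted from the export attempt information (Table 3).
ATTEMPT_KEYWORDS = {"device", "technology", "system", "transmission",
                    "communication network", "catalyst", "content"}

MIN_STATISTIC = 2.0  # example cut-off; learned keywords at or below it are ignored


def binary_weights(learned, attempt_keywords, min_statistic=MIN_STATISTIC):
    """Return {grade: {keyword: 0 or 1}}.

    A learned keyword gets weight 1 when it also occurs in the export
    attempt information and its learned statistic is above the cut-off.
    """
    return {
        grade: {
            keyword: int(keyword in attempt_keywords and statistic > min_statistic)
            for keyword, statistic in stats.items()
        }
        for grade, stats in learned.items()
    }


weights = binary_weights(LEARNED, ATTEMPT_KEYWORDS)
```

With the data above, `weights[1]` marks content, communication network, and transmission, and `weights[2]` marks only catalyst, in line with Table 4.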

As shown in Table 4, classification module A may calculate a classification result from which the security level of the export attempt information can be determined. In addition, classification module A may use the classification result to calculate the matching ratio of the export attempt information for each security level.

For example, simply comparing the binary weights of Table 4 gives a matching ratio of 3:1 between grade 1 and grade 2. Alternatively, when each binary weight is multiplied by its information statistic so that both are taken into account, the matching ratio between grade 1 and grade 2 becomes (15+12+8):5 = 35:5.
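
The two ratios above may be reproduced with the following minimal Python sketch. The name `match_scores` and the row layout (pairs of information statistic and binary weight) are assumptions made for illustration only.

```python
# Rows of Table 4 as (information statistic, binary weight) pairs per grade.
TABLE_4 = {
    1: [(21.76, 0), (15.00, 1), (12.00, 1), (10.00, 0), (8.00, 1)],
    2: [(25.00, 0), (20.00, 0), (15.00, 0), (10.00, 0), (5.00, 1)],
}


def match_scores(rows):
    """Return (simple, weighted) match scores per grade.

    simple:   number of matched keywords (sum of binary weights)
    weighted: sum of binary weight times information statistic
    """
    simple = {grade: sum(w for _, w in items) for grade, items in rows.items()}
    weighted = {grade: sum(s * w for s, w in items) for grade, items in rows.items()}
    return simple, weighted


simple, weighted = match_scores(TABLE_4)
# simple   -> grade 1 : grade 2 = 3 : 1
# weighted -> grade 1 : grade 2 = 35 : 5
```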

The grade determination unit 145 determines the security level of the export attempt information using the classification results generated by the classification result generation unit 143. That is, the grade determination unit 145 combines the classification results calculated by the individual classification modules to determine the security level of the export attempt information.

For example, the matching ratios of the export attempt information that each classification module calculates for each security level according to its classification result may be expressed as in Table 5 below.

[Table 5]
           Classification module A   Classification module B   Classification module C
Grade 1    70%                       50%                       40%
Grade 2    20%                       30%                       30%
Grade 3    10%                       20%                       30%

For example, as a Boolean decision method, the grade determination unit 145 may take, for each classification module, the security level with the highest matching ratio for the export attempt information as that module's decision level, and then determine the decision level produced by the largest number of classification modules as the security level of the export attempt information. Referring to Table 5, classification modules A, B, and C all show grade 1 as having the highest matching ratio. Accordingly, the grade determination unit 145 may determine the export attempt information to be grade 1.
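
A minimal Python sketch of this Boolean decision, using the values of Table 5, is given below. The names `MATCH_RATIO` and `majority_grade` are hypothetical. The text does not say how ties are resolved; this sketch simply keeps the first candidate encountered, which is an assumption.

```python
from collections import Counter

# Table 5: matching ratio (%) per classification module and grade.
MATCH_RATIO = {
    "A": {1: 70, 2: 20, 3: 10},
    "B": {1: 50, 2: 30, 3: 20},
    "C": {1: 40, 2: 30, 3: 30},
}


def majority_grade(match_ratio):
    """Each module votes for its best-matching grade; the most common vote wins."""
    votes = [max(ratios, key=ratios.get) for ratios in match_ratio.values()]
    return Counter(votes).most_common(1)[0][0]


grade = majority_grade(MATCH_RATIO)  # all three modules vote for grade 1
```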

For example, as a method of summing the per-grade matching ratios calculated by the classification modules, the grade determination unit 145 may compare the sums of the matching ratios for each grade and determine the grade with the highest sum as the grade of the export attempt information. Referring to Table 5, grade 1 is 70%+50%+40%=160%, grade 2 is 20%+30%+30%=80%, and grade 3 is 10%+20%+30%=60%. Accordingly, the grade determination unit 145 may determine the export attempt information to be grade 1.
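
The summation method may be sketched in Python as follows, again with the values of Table 5. The names `MATCH_RATIO` and `grade_by_sum` are hypothetical.

```python
# Table 5: matching ratio (%) per classification module and grade.
MATCH_RATIO = {
    "A": {1: 70, 2: 20, 3: 10},
    "B": {1: 50, 2: 30, 3: 20},
    "C": {1: 40, 2: 30, 3: 30},
}


def grade_by_sum(match_ratio):
    """Sum the matching ratios per grade over all modules and pick the largest sum."""
    totals = {}
    for ratios in match_ratio.values():
        for grade, percent in ratios.items():
            totals[grade] = totals.get(grade, 0) + percent
    return max(totals, key=totals.get), totals


grade, totals = grade_by_sum(MATCH_RATIO)
# totals: grade 1 = 160, grade 2 = 80, grade 3 = 60, so grade 1 is selected
```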

For example, the grade determination unit 145 may use a weight assigned to each classification module: it multiplies each module's matching ratio by that module's weight and sums the products for each grade to determine the grade of the export attempt information. Assume that the weight of classification module A is 1.0, that of classification module B is 0.8, and that of classification module C is 1.2. Referring to Table 5, grade 1 is 70%*1.0+50%*0.8+40%*1.2=158%, grade 2 is 20%*1.0+30%*0.8+30%*1.2=80%, and grade 3 is 10%*1.0+20%*0.8+30%*1.2=62%. Accordingly, the grade determination unit 145 may determine the export attempt information to be grade 1.
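
The weighted variant may be sketched in Python as follows, using the matching ratios of Table 5 and the example weights 1.0, 0.8, and 1.2. The names `MATCH_RATIO`, `MODULE_WEIGHT`, and `grade_by_weighted_sum` are hypothetical.

```python
# Table 5: matching ratio (%) per classification module and grade.
MATCH_RATIO = {
    "A": {1: 70, 2: 20, 3: 10},
    "B": {1: 50, 2: 30, 3: 20},
    "C": {1: 40, 2: 30, 3: 30},
}

# Example weights assumed in the text for modules A, B, and C.
MODULE_WEIGHT = {"A": 1.0, "B": 0.8, "C": 1.2}


def grade_by_weighted_sum(match_ratio, module_weight):
    """Sum weight * matching ratio per grade and pick the largest total."""
    totals = {}
    for module, ratios in match_ratio.items():
        for grade, percent in ratios.items():
            totals[grade] = totals.get(grade, 0.0) + module_weight[module] * percent
    return max(totals, key=totals.get), totals


grade, totals = grade_by_weighted_sum(MATCH_RATIO, MODULE_WEIGHT)
# totals are approximately 158, 80, and 62 for grades 1, 2, and 3
```

Because the weights are floating-point values, the totals should be compared with a small tolerance rather than for exact equality.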

The controller 150 controls the components described above (the storage unit 110, the learning unit 120, the sensing unit 130, and the determination unit 140).

For example, when the sensing unit 130 detects an export attempt, the controller 150 may control the determination unit 140 to determine the security level of the export attempt information.

In addition, the controller 150 allows or blocks export of the export attempt information by comparing the security level of the export attempt information with the security level of the internal person attempting the export. For example, if the security level of the internal person attempting the export is lower than the security level of the export attempt information, the export may be blocked.
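
The comparison may be sketched in Python as follows. This is only an illustration under stated assumptions: security levels are represented as comparable integer ranks in which a larger rank means a higher level, how the grade numbers 1, 2, and 3 map to such ranks is left to the caller, and the case of equal levels, which the text does not address, is treated here as allowed. The name `decide_export` is hypothetical.

```python
def decide_export(insider_rank, info_rank):
    """Return "block" when the insider's rank is lower than the information's rank.

    Assumption: equal ranks are allowed; the description only states the
    "lower than" case explicitly.
    """
    return "block" if insider_rank < info_rank else "allow"
```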

When the level of the internal person attempting the export is lower than the level of the export attempt information and a level re-determination request is received, the controller 150 may control the determination unit 140 to re-determine the security level of the export attempt information. In doing so, the controller 150 may control the determination unit 140 to determine the level while adjusting the number of classification modules used. For example, for the first level determination after an export attempt is detected, the determination unit 140 may determine the security level of the export attempt information using a preset number of classification modules among the plurality of classification modules. Then, upon level re-determination, the determination unit 140 may determine the security level using the remaining classification modules, excluding those used in the first determination. Upon a further re-determination, the determination unit 140 may determine the security level using all of the classification modules.
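
The staged use of classification modules may be sketched in Python as follows. The name `modules_for_round` is hypothetical, and choosing the first `preset_count` modules for the initial determination is an assumption; the description only says that a preset number of modules is used.

```python
def modules_for_round(all_modules, preset_count, round_no):
    """Select the classification modules for a given determination round.

    round 1: a preset number of modules (here, the first `preset_count`)
    round 2: the remaining modules, excluding those used in round 1
    round 3 and later: all modules
    """
    if round_no == 1:
        return list(all_modules[:preset_count])
    if round_no == 2:
        return list(all_modules[preset_count:])
    return list(all_modules)


modules = ["A", "B", "C", "D", "E"]
first = modules_for_round(modules, 3, 1)   # ["A", "B", "C"]
second = modules_for_round(modules, 3, 2)  # ["D", "E"]
third = modules_for_round(modules, 3, 3)   # all five modules
```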

In another embodiment, when the level of the internal person attempting the export is lower than the level of the export attempt information and a level re-determination request is received, the controller 150 may request re-determination of the security level of the export attempt information from the terminal of a security officer (or security committee) among the plurality of terminals 200. The security officer or security committee may then review the security level of the export attempt information, re-determine it, and transmit the re-determined security level to the confidential information leakage prevention apparatus 100 through the terminal 200. The controller 150 may then allow or block export of the export attempt information by comparing the received security level of the export attempt information with the security level of the internal person attempting the export.

Here, if the re-determined security level differs from the existing security level, the controller 150 adds the export attempt information whose security level was re-determined to the storage unit 110 as confidential information of the corresponding security level. In this way, the controller 150 can update the confidential information for each security level, and the learning unit 120 can subsequently improve the accuracy of the learning results by using the updated confidential information when calculating them.
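
The update of the stored confidential information may be sketched in Python as follows. The storage unit is modeled as a plain dictionary from security level to a list of documents, and the name `record_redetermination` is hypothetical; neither is specified in the disclosure.

```python
def record_redetermination(store, document, old_level, new_level):
    """Add `document` under `new_level` only when the level actually changed.

    Returns True when the store was updated, so that a caller could
    trigger relearning on the updated confidential information.
    """
    if new_level == old_level:
        return False
    store.setdefault(new_level, []).append(document)
    return True


store = {1: ["design spec"], 2: []}
changed = record_redetermination(store, "meeting notes", old_level=1, new_level=2)
```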

FIG. 3 is a flowchart illustrating the procedure by which the confidential information leakage prevention apparatus calculates learning results.

In step S310, an internal security officer or security committee sets a security level for the confidential information and internal personnel information stored in the confidential information leakage prevention apparatus 100. Here, the confidential information leakage prevention apparatus 100 may store the confidential information and internal personnel information in the storage unit 110 by security level.

In step S320, the confidential information leakage prevention apparatus 100 extracts feature information for each level from the confidential information of that level. For example, the confidential information leakage prevention apparatus 100 normalizes the confidential information, converting it into a form that can be processed mechanically, and then extracts keywords or multimedia (images, audio, video, etc.) that reflect the characteristics of the confidential information well.

In step S330, the confidential information leakage prevention apparatus 100 generates learning results for each level using a plurality of learning modules. Here, the confidential information leakage prevention apparatus 100 includes a plurality of learning modules, and each learning module generates a learning result for the confidential information of each level from the extracted feature information according to its configured learning algorithm. Generation of the learning results was described above with reference to FIG. 2, so a detailed description is omitted.

In step S340, the confidential information leakage prevention apparatus 100 stores the learning result of each learning module. Here, the confidential information leakage prevention apparatus 100 may store, for each learning module, the learning results organized by level.
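
Steps S320 to S340 may be sketched in Python as follows. This is a simplified illustration, not the disclosed implementation: normalization is reduced to lower-casing and splitting English text into words, only keyword features are handled, and the single learning module shown uses a raw keyword count as a stand-in for the information statistic. All function names are hypothetical.

```python
import re
from collections import Counter


def extract_keywords(text):
    """S320 (sketch): normalize to lower case and split into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_module(keywords_by_level):
    """Hypothetical learning module: keyword -> occurrence count, per level."""
    return {level: dict(Counter(kws)) for level, kws in keywords_by_level.items()}


def train(documents_by_level, learning_modules):
    """S320-S340 (sketch): extract features per level, run every learning
    module on them, and keep the results per module, organized by level."""
    keywords_by_level = {
        level: [kw for doc in docs for kw in extract_keywords(doc)]
        for level, docs in documents_by_level.items()
    }
    return {name: module(keywords_by_level)
            for name, module in learning_modules.items()}


documents = {1: ["Network transmission", "network content"], 2: ["benzene catalyst"]}
results = train(documents, {"A": frequency_module})
```

Further learning modules would be added as extra entries of the `learning_modules` dictionary, each producing its own per-level result.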

FIG. 4 is a flowchart illustrating the procedure by which the confidential information leakage prevention apparatus determines the security level of export attempt information.

In step S410, the confidential information leakage prevention apparatus 100 detects an export attempt. An internal person may use a terminal 200 to export internal information to the outside. For example, the internal person may export internal information by transmitting a file to an external network, copying a file to a removable storage device, or printing a file on office equipment such as a printer.

In step S420, the confidential information leakage prevention apparatus 100 extracts feature information from the export attempt information. For example, the confidential information leakage prevention apparatus 100 may normalize the export attempt information and extract keywords or multimedia.

In step S430, the confidential information leakage prevention apparatus 100 generates classification results using a plurality of classification modules. The confidential information leakage prevention apparatus 100 includes a plurality of classification modules corresponding to the plurality of learning modules. Each classification module compares the feature information extracted from the export attempt information with the learning result. Each classification module can thereby calculate a classification result from which the security level of the export attempt information can be determined. Here, the confidential information leakage prevention apparatus 100 may control a preset number of its classification modules, rather than all of them, to calculate classification results. Generation of the classification results was described above with reference to FIG. 2, so a detailed description is omitted.

In step S440, the confidential information leakage prevention apparatus 100 determines the security level of the export attempt information according to the classification result of each classification module. The confidential information leakage prevention apparatus 100 may combine the classification results of the classification modules to determine the security level. For example, using the per-level matching ratios of the export attempt information generated by each classification module, the confidential information leakage prevention apparatus 100 may find the level with the highest matching ratio for each classification module and determine the level found most often as the level of the export attempt information.

FIG. 5 is a flowchart illustrating the confidential information leakage prevention method performed by the confidential information leakage prevention apparatus.

In step S510, the confidential information leakage prevention apparatus 100 detects an export attempt and determines the level of the export attempt information. Step S510 is performed according to the security level determination procedure described above with reference to FIG. 4.

In step S520, the confidential information leakage prevention apparatus 100 compares the level of the internal person who attempted the export with the level of the export attempt information determined in step S510.

In step S530, if the level of the internal person attempting the export is higher than the level of the export attempt information, the confidential information leakage prevention apparatus 100 allows export of the export attempt information.

In step S540, if the level of the internal person attempting the export is lower than the level of the export attempt information, the confidential information leakage prevention apparatus 100 determines whether a level re-determination has been requested. When the export is not allowed because the internal person's level is lower than the level of the export attempt information, the internal person may send a level re-determination request to the confidential information leakage prevention apparatus 100 through the terminal 200.

If no level re-determination is requested, the confidential information leakage prevention apparatus 100 proceeds to step S580 and blocks export of the export attempt information.

In step S550, upon receiving a level re-determination request, the confidential information leakage prevention apparatus 100 re-determines the level of the export attempt information. Here, the confidential information leakage prevention apparatus 100 may re-determine the level using the remaining classification modules, excluding those used in step S430. If it subsequently receives another level re-determination request, the confidential information leakage prevention apparatus 100 may re-determine the security level of the export attempt information using all of its classification modules.

In another embodiment, upon receiving a level re-determination request, the confidential information leakage prevention apparatus 100 may request re-determination of the security level of the export attempt information from the terminal of a security officer (or security committee) among the plurality of terminals 200, and may receive through that terminal the security level re-determined by the security officer or security committee.

In step S560, the confidential information leakage prevention apparatus 100 determines whether the level has changed. If the re-determined level is the same as the existing level, the confidential information leakage prevention apparatus 100 proceeds to step S580 and blocks export of the export attempt information.

In step S570, if the re-determined level differs from the existing level, the confidential information leakage prevention apparatus 100 adds the export attempt information whose security level was re-determined to the storage unit 110 as confidential information of the corresponding security level. In this way, the confidential information leakage prevention apparatus 100 can update the confidential information for each security level and, when subsequently calculating learning results, can improve their accuracy by using the updated confidential information.

Thereafter, the confidential information leakage prevention apparatus 100 returns to step S520 and, if the level of the internal person attempting the export is higher than the re-determined level of the export attempt information, allows export of the export attempt information.
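
The control flow of FIG. 5 after step S510 may be sketched in Python as follows. The sketch assumes integer ranks in which a larger rank means a higher level, and it treats equal ranks as allowed, a case the description does not address. The name `handle_export` and the `redetermine` callback are hypothetical; the callback stands in for either module-based or officer-based re-determination.

```python
def handle_export(insider_rank, info_rank, redetermine=None):
    """Return "allow" or "block" following steps S520 to S580.

    redetermine: a callable returning the re-determined rank, or None when
    the insider does not request re-determination.
    """
    # S520/S530: compare the insider's rank with the information's rank.
    if insider_rank >= info_rank:
        return "allow"
    # S540: without a re-determination request, go to S580 and block.
    if redetermine is None:
        return "block"
    # S550: re-determine the level of the export attempt information.
    new_rank = redetermine()
    # S560: an unchanged level leads to S580 (block).
    if new_rank == info_rank:
        return "block"
    # S570 (storing the information under the new level) is omitted here;
    # the flow then returns to S520 with the re-determined level.
    return "allow" if insider_rank >= new_rank else "block"
```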

Meanwhile, the confidential information leakage prevention method according to this embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination.

The program instructions recorded on the computer-readable medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the computer software field. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. The medium may also be a transmission medium, such as an optical or metal line or a waveguide, that carries a carrier wave transmitting a signal specifying program instructions, data structures, and the like. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like.

The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

Although the present invention has been described above with reference to preferred embodiments, those of ordinary skill in the art will understand that the invention may be modified and changed in various ways without departing from the spirit and scope of the invention as set forth in the claims below.

100: confidential information leakage prevention apparatus
200: terminal

Claims

An apparatus for preventing leakage of confidential information, comprising:
a storage unit that stores confidential information and internal personnel information for each security level;
a learning unit that extracts feature information from the confidential information and generates a learning result for each security level using the feature information;
a sensing unit that detects an attempt to export information from an internal network to the outside;
a determination unit that, when the export attempt is detected, determines a security level of the export attempt information using the learning result; and
a controller that allows or blocks export of the export attempt information by comparing the security level of the export attempt information with the security level of the internal person,
wherein the learning unit includes a plurality of learning modules, and the determination unit includes a plurality of classification modules corresponding to the plurality of learning modules.

The apparatus of claim 1,
wherein the learning unit comprises:
a learning information extraction unit that normalizes the confidential information to extract feature information including at least one of a keyword and multimedia; and
a learning result generation unit that calculates an information statistic for the feature information and generates a learning result in which the feature information and the information statistic are organized by security level.

The apparatus of claim 1,
wherein the determination unit comprises:
a classification information extraction unit that normalizes the export attempt information to extract feature information including at least one of a keyword and multimedia;
a classification result generation unit having the plurality of classification modules, each of which compares the feature information of the export attempt information with the learning result and generates a classification result that yields the ratio at which the export attempt information matches each security level; and
a grade determination unit that determines the security level of the export attempt information using the classification result.

The apparatus of claim 3,
wherein the grade determination unit uses the classification result to find, for each classification module, the security level having the highest matching ratio, and determines the security level found most often as the security level of the export attempt information.

The apparatus of claim 3,
wherein the grade determination unit compares, for each security level, the sum of the matching ratios of the classification modules, and determines the security level having the highest sum as the security level of the export attempt information.

The apparatus of claim 5,
wherein the grade determination unit applies to the matching ratio a weight set in advance for each classification module.

The apparatus of claim 1,
wherein the controller controls the determination unit to determine the security level of the export attempt information while adjusting the number of classification modules used.

The apparatus of claim 1,
wherein, upon receiving a request to re-determine the security level of the export attempt information, the controller sends a re-determination request to a terminal of a security officer and receives from the terminal the security level of the export attempt information re-determined by the security officer.

The apparatus of claim 1,
wherein, when the security level determined in response to a request to re-determine the level of the export attempt information differs from the existing security level, the controller adds the export attempt information to the storage unit as confidential information of the corresponding security level.

A method of preventing leakage of confidential information, performed by a confidential information leakage prevention apparatus in an internal network, the method comprising:
storing confidential information and internal personnel information for which security levels have been set;
extracting feature information from the confidential information and generating a learning result for each security level using the feature information;
detecting an attempt to export information from the internal network to the outside;
determining a security level of the export attempt information using the learning result; and
allowing or blocking export of the export attempt information by comparing the security level of the export attempt information with the security level of the internal person,
wherein the confidential information leakage prevention apparatus includes a plurality of learning modules and a plurality of classification modules corresponding to the plurality of learning modules.

The method of claim 10,
wherein generating a learning result for each security level using the feature information comprises:
normalizing the confidential information to extract feature information including at least one of a keyword and multimedia;
calculating an information statistic for the feature information through the plurality of learning modules; and
generating, for each learning module, a learning result in which the feature information and the information statistic are organized by security level.

The method of claim 11,
wherein determining the security level of the export attempt information using the learning result comprises:
normalizing the export attempt information to extract feature information including at least one of a keyword and multimedia;
generating, for each classification module, a classification result by comparing the feature information of the export attempt information with the learning result, the classification result yielding the ratio at which the export attempt information matches each security level; and
determining the security level of the export attempt information using the classification result.

The method of claim 12,
wherein determining the security level of the export attempt information comprises
using the classification result to find, for each classification module, the security level having the highest matching ratio, and determining the security level found most often as the security level of the export attempt information.

The method of claim 12,
wherein determining the security level of the export attempt information comprises
comparing, for each security level, the sum of the matching ratios of the classification modules, and determining the security level having the highest sum as the security level of the export attempt information.

The method of claim 14,
wherein a weight set in advance for each classification module is applied to the matching ratio.

The method of claim 10,
wherein determining the security level of the export attempt information using the learning result comprises
determining a first security level of the export attempt information using a preset number of classification modules among the plurality of classification modules, and
wherein allowing or blocking export of the export attempt information by comparing the security level of the export attempt information with the security level of the internal person comprises:
when the first security level is higher than the security level of the internal person and a level re-determination request is received, determining a second security level of the export attempt information using classification modules other than the preset number of classification modules; and
allowing or blocking export of the export attempt information by comparing the second security level with the security level of the internal person.

The method of claim 16,
wherein determining the second security level of the export attempt information comprises
requesting level re-determination from a terminal of a security officer, and receiving from the terminal the second security level of the export attempt information re-determined by the security officer.

The method of claim 16, further comprising,
after determining the second security level of the export attempt information,
when the second security level differs from the first security level, adding the export attempt information as confidential information of the corresponding security level.