KR102656541B1

KR102656541B1 - Device, method and program that analyzes large log data using a distributed method for each log type

Info

Publication number: KR102656541B1
Application number: KR1020220092139A
Authority: KR
Inventors: 공용식; 강현숙; 김종완; 류승환
Original assignee: 주식회사 이글루코퍼레이션
Priority date: 2022-01-05
Filing date: 2022-07-26
Publication date: 2024-04-11
Also published as: KR20230106083A; KR102426889B1

Abstract

The present invention relates to a device that analyzes and processes large-volume event logs by log type. The device parses large volumes of log data received from security devices according to log type classification criteria and analyzes the parsed log data based on the ruleset set for each log type classification criterion, which makes it possible to process the log data received from the security devices in a parallel, distributed manner.

Description

Device, method and program for analyzing large log data using a distributed method for each log type

The present invention relates to a log data analysis device and, more specifically, to a device that analyzes large volumes of log data by distributing the data by log type.

SIEM (Security Information and Event Management) or CEP (Complex Event Processing) systems analyze, in real time, the event log data generated by various devices to determine whether there is a threat from inside or outside.

One option is to install an additional SIEM/CEP module on every server in operation and allocate resources separately on each server so that all log data is handled there, from reception through analysis. In practice, however, this places a very heavy load on the servers, and smooth operation is often impossible for reasons such as insufficient resources.

In addition, when analysis is performed within each server, the results are highly fragmentary, and it is difficult to link them with the results from other servers to obtain a continuous result covering all devices.

Therefore, if a separate SIEM/CEP server is provided so that all event logs are gathered in one place and processed together, comprehensive and continuous analysis results can be obtained from the log data collected from many devices.

Conventionally, the various types of log data received from various security devices were analyzed together in a single pass. As data volumes have surged recently, however, this approach can no longer process all of the log data.

Accordingly, a technique for analyzing and processing large volumes of log data in a way different from the conventional one is needed, but no such technique has been disclosed to date.

Korean Laid-open Patent Publication No. 10-2018-0061891 (2018.06.08)

To solve the problems described above, the present invention parses large volumes of log data received from security devices according to log type classification criteria and analyzes the parsed log data based on the ruleset set for each log type classification criterion to determine whether the data constitutes threat information.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the description below.

A device for analyzing and processing event log data by log type according to an embodiment of the present invention for solving the above problem includes: a communication unit that communicates with a plurality of different security devices; and a processor that, when a large volume of log data of at least a preset size is received in real time from the plurality of different security devices through the communication unit, parses the received log data according to preset log type classification criteria, analyzes the parsed log data based on the ruleset set for each log type classification criterion to determine whether it constitutes threat information, and sets the log type classification criteria so that the received log data is uniformly distributed across a plurality of nodes, based on the amount of data classified into and accumulated at the plurality of nodes during a preset period.

The device may further include a cluster composed of the plurality of nodes, wherein each node is individually provided with an analysis application corresponding to the log type set for that node, and the processor may process the log data received from the plurality of different security devices in a parallel, distributed manner by executing the analysis application provided on each node.

In addition, the ruleset includes at least one rule for determining whether the log data received from the plurality of security devices constitutes threat information, a separate analysis module is assigned to each type of log data, and the processor may analyze the log data using the ruleset (Rule Set) set for each log type.

In addition, when log data of a specific log type accounts for at least a preset share, the processor may expand the log type classification criterion for the corresponding node into a plurality of criteria.

In addition, each log type classification criterion is determined based on the types of the plurality of different security devices, and when log data of a specific log type accounts for at least a preset share, the processor may rearrange the assignment of security device categories to nodes so that the received log data is distributed.

In addition, the data analysis processing device further includes at least one database classified according to the period for which the parsed log data is stored, and the processor may store the parsed log data in the at least one database and delete the log data once a preset storage period has elapsed from the time of storage.

The device may further include an analysis module that analyzes past log data, and the processor may generate a past data analysis result by analyzing the corresponding database based on the target device and target period for which past log data analysis is required.

In addition, the processor may generate an alarm signal when the determination indicates that the log data constitutes threat information, and may generate an inspection request signal for the ruleset that detected the threat information when the number of alarm signals generated during a preset time exceeds a threshold range.

In addition, a method for analyzing and processing event log data by log type according to an embodiment of the present invention for solving the above problem is performed by a data analysis processing device and includes: receiving real-time log data from a plurality of different security devices; parsing the received log data according to preset log type classification criteria; analyzing the parsed log data based on the ruleset set for each log type classification criterion to determine whether it constitutes threat information; and setting the log type classification criteria so that the received log data is uniformly distributed across a plurality of nodes, based on the amount of data classified into and accumulated at the plurality of nodes during a preset period.

In addition, other methods and other systems for implementing the present invention, and a computer-readable recording medium storing a computer program for executing the method, may further be provided.

According to the present invention as described above, large volumes of log data received from security devices are parsed according to log type classification criteria, and the parsed log data is analyzed based on the ruleset set for each log type classification criterion, so that the log data received from the security devices can be processed in a parallel, distributed manner.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

FIG. 1 is a block diagram of a device for analyzing and processing event log data by log type according to an embodiment of the present invention.
FIG. 2 is a flowchart of a method for analyzing and processing event log data by log type according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating the configuration of a cluster in an embodiment of the present invention.
FIG. 4 is a diagram illustrating a conventional method for analyzing and processing event log data by log type.
FIG. 5 is a diagram illustrating a method for analyzing and processing event log data by log type according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating the configuration of a device for analyzing and processing event log data by log type according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods for achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that those skilled in the art to which the present invention pertains are fully informed of its scope, and the present invention is defined only by the scope of the claims.

The terminology used in this specification is for describing the embodiments and is not intended to limit the present invention. In this specification, singular forms also include plural forms unless the context specifically states otherwise. As used in the specification, "comprises" and/or "comprising" does not exclude the presence or addition of one or more elements other than those mentioned. Like reference numerals refer to like elements throughout the specification, and "and/or" includes each of the mentioned elements and every combination of one or more of them. Although "first", "second", and the like are used to describe various elements, these elements are of course not limited by these terms. These terms are used only to distinguish one element from another. Therefore, a first element mentioned below may of course be a second element within the technical spirit of the present invention.

Unless otherwise defined, all terms used in this specification (including technical and scientific terms) have the meanings commonly understood by those skilled in the art to which the present invention pertains. In addition, terms defined in commonly used dictionaries are not to be interpreted in an idealized or excessive manner unless they are clearly and specifically defined.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

A cluster (or computer cluster) is a concept in which multiple computers are connected through a network so that they can be used as if they were a single computer. Computer clusters are used in many fields, including computer operating systems, computer hardware, and statistical data. The components of a computer cluster are typically connected by a high-speed local area network.

A cluster generally offers higher performance and reliability than a single computer and is considerably more efficient.

Referring to FIG. 1, the device 100 for analyzing and processing event log data by log type according to an embodiment of the present invention includes a processor 110, a communication unit 120, a storage unit 130, an input/output unit 140, a parsing unit 150, an analysis unit 160, a search unit 170, and a log receiving module.

In some embodiments, however, the data analysis processing device 100 may include fewer or more components than those shown in FIG. 1.

The communication unit 120 communicates with the plurality of different security devices, a field manager terminal, and the like. Specifically, it receives real-time log data from the plurality of different security devices, and when the processor 110 generates a threat-information detection signal for the log data, it may transmit that signal to the field manager terminal.

The communication unit 120 may further include a log receiving module 190 dedicated to receiving logs.

The storage unit 130 includes at least one storage means and stores, for example, the log data parsed and classified by the parsing unit 150.

The storage unit 130 may also store the various instructions and algorithms for executing the data analysis processing method, and various artificial intelligence models may be stored alongside them.

The storage unit 130 may further include a log type storage unit in which log types are stored.

In the input/output unit 140, the input means and the output means may be configured as separate components. Various control signals can be received from an administrator or field manager through the input means, and the result of the threat-information determination, error messages, and the like can be output through the output means.

The parsing unit 150 parses the log data received through the log receiving module according to preset log type classification criteria.

The parsing unit 150 may include at least one parsing module. It may include a separate parsing module for each log type, so that each parsing module parses the log data according to the log type classification criteria.

Although the embodiment of the present invention illustrates the parsing unit 150 as parsing the received log data according to the log type classification criteria, these criteria are only the top-priority consideration in parsing. The parsing and classification criteria are not limited to the log type classification criteria alone.

For example, further classification criteria such as the WHERE, GROUP BY, and HAVING clauses of an SQL query may also be applied.
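As a rough illustration of such additional criteria, the sketch below applies WHERE, GROUP BY, and HAVING to parsed log records using Python's built-in sqlite3 module. The table name `logs`, its columns, and the sample rows are assumptions made for this example only and are not taken from the specification.

```python
import sqlite3

# In-memory database holding parsed log records (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (log_type TEXT, src_ip TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [
        ("FW", "10.0.0.1", "deny"),
        ("FW", "10.0.0.1", "deny"),
        ("FW", "10.0.0.2", "deny"),
        ("IDS", "10.0.0.1", "alert"),
    ],
)

# WHERE narrows to one log type; GROUP BY / HAVING keep only sources
# that were denied at least twice.
rows = conn.execute(
    "SELECT src_ip, COUNT(*) AS hits FROM logs "
    "WHERE log_type = 'FW' AND action = 'deny' "
    "GROUP BY src_ip HAVING COUNT(*) >= 2"
).fetchall()
conn.close()
```

Here `rows` contains a single entry for 10.0.0.1 with two hits, since the other firewall source was denied only once and the IDS record is excluded by the WHERE clause.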

The analysis unit 160 may include at least one analysis module. It may include a separate analysis module for each log type, so that each analysis module analyzes the log data of its own log type as parsed and classified by the parsing unit 150.

Each analysis module is provided with an analysis application and can analyze the log data using a ruleset (Rule Set) optimized for its log type.

The search unit 170 can execute a search function on the log data stored in the storage unit 130 and includes at least one search module. Specifically, it may include a first search module for integrated analysis of past data, a second search module for real-time search, and the like.

In one embodiment, a search module may be included in the analysis unit 160 for real-time search.

The processor 110 controls the components within the data analysis processing device 100 and can carry out the data analysis processing method according to an embodiment of the present invention by executing or using the instructions, algorithms, artificial intelligence models, and the like stored in the storage unit 130.

The process of the data analysis processing method according to an embodiment of the present invention is described in detail below with reference to FIG. 2.

The processor 110 receives real-time log data from a plurality of different security devices through the communication unit 120. (S100)

Specifically, the plurality of different security devices whose log data is received through the communication unit 120 are the targets of analysis. As shown in FIGS. 4 and 5, the various security devices operated by a client, such as IDS, IPS, WAF, FW, and WEB, are applicable, and other security devices may be grouped as ETC.

The processor 110 parses the log data received in S100 according to preset log type classification criteria. (S200)

In one embodiment, as described above, the parsing unit 150 may include a separate parsing module for each log type, and each parsing module may parse the log data received through the communication unit 120 according to the log type classification criteria.
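Steps S100 and S200 can be sketched roughly as follows. The raw log format (`TYPE|key=value;key=value`), the function names, and the use of a single shared field parser in place of per-type parsing modules are all assumptions made for illustration; the specification does not define a log format.

```python
from typing import Dict, List

# Log types named in the description; anything else is grouped as ETC.
KNOWN_TYPES = {"IDS", "IPS", "WAF", "FW", "WEB"}


def parse_fields(payload: str) -> Dict[str, str]:
    """Split a hypothetical 'k=v;k=v' payload into a dict."""
    return dict(item.split("=", 1) for item in payload.split(";") if "=" in item)


def classify_and_parse(raw_logs: List[str]) -> Dict[str, List[Dict[str, str]]]:
    """Group parsed records by log type (the classification criterion)."""
    buckets: Dict[str, List[Dict[str, str]]] = {}
    for line in raw_logs:
        device, _, payload = line.partition("|")
        log_type = device if device in KNOWN_TYPES else "ETC"
        buckets.setdefault(log_type, []).append(parse_fields(payload))
    return buckets


sample = ["FW|src=10.0.0.1;action=deny", "IDS|sig=1001", "NAC|user=kim"]
parsed = classify_and_parse(sample)
```

In this sketch the `NAC` record lands in the `ETC` bucket because its device type is not one of the five listed types.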

The processor 110 then stores the log data parsed and classified by the parsing unit 150 in the storage unit 130 according to the classification result.

In detail, the processor 110 stores the log data in the storage unit 130 as classified by the parsing unit 150, which facilitates real-time analysis as well as later search and analysis of past data.

Here, the storage unit 130 includes at least one database classified according to the period for which the log data classified by the parsing unit 150 is stored.

For example, the storage unit 130 may include a first database that stores log data for 1 week, a second database that stores log data for 1 month, and a third database that stores log data for 1 year.

The processor 110 stores the parsed log data in the at least one database and deletes the log data once a preset storage period has elapsed from the time of storage.
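The retention behavior described above might look like the following minimal sketch. The class name `RetentionStore`, the database labels, and the approximation of one month as 30 days and one year as 365 days are assumptions for illustration; an actual implementation would use real database tables rather than in-memory lists.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Retention period per database (month and year are approximations).
RETENTION = {
    "db1": timedelta(weeks=1),
    "db2": timedelta(days=30),
    "db3": timedelta(days=365),
}


class RetentionStore:
    """In-memory stand-in for the period-classified databases."""

    def __init__(self) -> None:
        self.tables: Dict[str, List[Tuple[datetime, dict]]] = {
            name: [] for name in RETENTION
        }

    def save(self, db: str, record: dict, stored_at: datetime) -> None:
        self.tables[db].append((stored_at, record))

    def purge(self, now: datetime) -> int:
        """Delete records older than their database's retention period."""
        removed = 0
        for db in list(self.tables):
            rows = self.tables[db]
            keep = [(t, r) for t, r in rows if now - t < RETENTION[db]]
            removed += len(rows) - len(keep)
            self.tables[db] = keep
        return removed


store = RetentionStore()
t0 = datetime(2024, 1, 1)
store.save("db1", {"type": "FW"}, t0)
store.save("db3", {"type": "FW"}, t0)
removed = store.purge(t0 + timedelta(days=10))
```

After ten days, only the record in the one-week database has expired, so `removed` is 1 and the one-year database still holds its record.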

The processor 110 analyzes the log data parsed in S200 based on the ruleset set for each log type classification criterion to determine whether it constitutes threat information. (S300)

In an embodiment of the present invention, the ruleset stores at least one rule for determining whether log data constitutes threat information, and these rules can be set by security monitoring personnel or the person in charge.

When S300 determines that specific log data constitutes threat information, the processor 110 may issue a warning to the input/output unit 140, a security monitoring device, or a terminal so that the security monitoring personnel or the person in charge can check it.
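One way to picture step S300 is a ruleset modeled as a list of predicates per log type, as in the sketch below. The two example rules and the field names (`action`, `dst_port`, `uri`) are invented for illustration; the specification leaves the rules to the security personnel who configure them.

```python
from typing import Callable, Dict, List

Rule = Callable[[dict], bool]

# Hypothetical rulesets: one list of rules per log type.
RULESETS: Dict[str, List[Rule]] = {
    "FW": [lambda r: r.get("action") == "deny" and r.get("dst_port") == "22"],
    "WAF": [lambda r: "union select" in r.get("uri", "").lower()],
}


def is_threat(log_type: str, record: dict) -> bool:
    """Apply only the ruleset registered for this record's log type."""
    return any(rule(record) for rule in RULESETS.get(log_type, []))


fw_hit = is_threat("FW", {"action": "deny", "dst_port": "22"})
fw_miss = is_threat("FW", {"action": "allow", "dst_port": "22"})
waf_hit = is_threat("WAF", {"uri": "/item?id=1 UNION SELECT password"})
```

Because each record is checked only against the rules for its own log type, a firewall record never pays the cost of evaluating WAF rules, which is the basis of the reduction in operations discussed for FIG. 5.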

The data analysis processing device 100 according to an embodiment of the present invention includes a cluster composed of a plurality of nodes.

Each node is individually provided with an analysis application corresponding to the log type set for that node.

By executing the analysis application provided on each node, the processor 110 can process the large volumes of log data received from the plurality of different security devices in a parallel, distributed manner.
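The per-node parallelism can be sketched on a single machine with a thread pool, where each worker stands in for one cluster node handling one log type. This substitution, the function names, and the `suspicious` flag are assumptions for illustration only; an actual cluster would run the analysis applications on separate machines.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def analyze_node(log_type: str, records: List[dict]) -> int:
    """Stand-in for a node's analysis application: count flagged records."""
    return sum(1 for r in records if r.get("suspicious"))


def analyze_in_parallel(buckets: Dict[str, List[dict]]) -> Dict[str, int]:
    """Submit one analysis task per log type and collect the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(buckets))) as pool:
        futures = {
            log_type: pool.submit(analyze_node, log_type, records)
            for log_type, records in buckets.items()
        }
        return {log_type: f.result() for log_type, f in futures.items()}


buckets = {
    "FW": [{"suspicious": True}, {"suspicious": False}],
    "IDS": [{"suspicious": True}],
    "WEB": [],
}
threat_counts = analyze_in_parallel(buckets)
```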

In an embodiment of the present invention, the classification criterion for each log type may be determined based on the types of the plurality of different security devices.

In the example used for this embodiment, the client operates five types of security devices, and there are accordingly five log types.

Referring to FIG. 4, in the conventional approach, when 100 pieces of log data are received, every one of the 100 pieces is evaluated against all 5 log types with 100 rules each, so 100 x 5 x 100 = 50,000 operations are performed.

When the data analysis processing device according to an embodiment of the present invention with the above configuration is applied, however, the 100 pieces of log data received from the security devices are parsed and classified into 5 log types according to the 5 log type classification criteria, as shown in FIG. 5 (which assumes an exact five-way split, that is, 20 pieces per type). Each piece of log data is then analyzed only with the ruleset set for its own log type, so 5 x 20 x 100 = 10,000 operations are performed.

With the conventional method, the rapidly growing volume of data cannot be handled, and delays and errors in analysis occur frequently. The data analysis processing device 100 according to an embodiment of the present invention can resolve these problems.

In addition, although FIG. 5 assumes that the rulesets for log types #1 to #5 each contain 100 rules, in practice the rules are classified in finer detail and each ruleset contains fewer rules, so the actual number of operations is reduced even further.
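The operation counts given for FIGS. 4 and 5 can be reproduced with a few lines of arithmetic. The function names below are illustrative, and the per-type counts follow the even five-way split and the 100-rules-per-type assumption stated in the description.

```python
from typing import Dict


def conventional_ops(n_logs: int, n_types: int, rules_per_type: int) -> int:
    """Every log is checked against every rule of every log type."""
    return n_logs * n_types * rules_per_type


def per_type_ops(logs_per_type: Dict[str, int], rules_per_type: Dict[str, int]) -> int:
    """Each log is checked only against the ruleset of its own log type."""
    return sum(n * rules_per_type[t] for t, n in logs_per_type.items())


types = ["IDS", "IPS", "WAF", "FW", "WEB"]
before = conventional_ops(n_logs=100, n_types=5, rules_per_type=100)
after = per_type_ops({t: 20 for t in types}, {t: 100 for t in types})
```

With these inputs `before` is 50,000 and `after` is 10,000, a fivefold reduction. If a ruleset holds fewer than 100 rules, lowering the corresponding entry in `rules_per_type` shows the further reduction mentioned above.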

Referring to FIG. 7, the data analysis processing device 100 according to an embodiment of the present invention includes a first search module capable of searching past log data.

Based on the target device and target period for which past log data analysis is required, the processor 110 can search the corresponding database for log data and control the analysis unit 160 to analyze the retrieved log data.

In one embodiment, when the determination in S300 indicates that the log data constitutes threat information, the processor 110 generates an alarm signal and stores a record of the alarm in the database.

When the number of alarm signals generated during a preset time exceeds a threshold range, the processor 110 generates an inspection request signal for the ruleset that detected the threat information.

In detail, when a specific rule in the ruleset flags threat situations more often than a reference value, the processor determines that the rule itself may be faulty and asks the monitoring personnel or administrator to inspect that rule.
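The alarm-count check can be sketched with a sliding time window per rule. The class name `AlarmMonitor`, the window length, and the limit are hypothetical values chosen for illustration; the specification only states that the time span and threshold are preset.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class AlarmMonitor:
    """Tracks alarm timestamps per rule within a sliding time window."""

    def __init__(self, window_seconds: float, max_alarms: int) -> None:
        self.window = window_seconds
        self.max_alarms = max_alarms
        self.alarms: Dict[str, Deque[float]] = defaultdict(deque)

    def record_alarm(self, rule_id: str, timestamp: float) -> bool:
        """Record an alarm; return True if the rule should be inspected."""
        q = self.alarms[rule_id]
        q.append(timestamp)
        # Drop alarms that fell out of the window.
        while q and timestamp - q[0] > self.window:
            q.popleft()
        return len(q) > self.max_alarms


monitor = AlarmMonitor(window_seconds=60, max_alarms=3)
results = [monitor.record_alarm("rule-1", t) for t in (0, 10, 20, 30)]
later = monitor.record_alarm("rule-1", 200)
```

In this run the fourth alarm within 60 seconds exceeds the limit of 3 and triggers an inspection request, while the alarm at t = 200 does not, because the earlier alarms have aged out of the window.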

일 실시예로, 프로세서(110)는 기 설정된 기간 동안 복수 개의 노드들에 분류되어 누적된 데이터 양을 기반으로, 보안 장비로부터 수신되는 로그 데이터가 복수 개의 노드들에 균일하게 분산되도록 로그 타입 분류 기준을 설정할 수 있다.In one embodiment, the processor 110 sets log type classification criteria so that log data received from security equipment is uniformly distributed across a plurality of nodes, based on the amount of data classified and accumulated in a plurality of nodes over a preset period of time. can be set.

Specifically, when the log data received from at least one first security device among a plurality of different security devices accounts for a preset level or more of the total log data, the processor 110 may set the log type classification criteria for the first security device to be more finely subdivided than the preset log type classification criteria.

Here, subdividing a log type classification criterion means dividing it into a plurality of sub-criteria, that is, splitting the log type classification criterion (the node) for the relevant security device into several.

① The number of nodes may be expanded to subdivide the log type classification criteria.

For example, assume that the following log data were collected over one hour: IDS: 20,000 entries, IPS: 20,000, WAF: 20,000, FW: 120,000, and WEB: 20,000.

In this case, the log data is concentrated too heavily on the firewall (FW), so log data analysis may be delayed on that node alone.

Accordingly, the processor 110 subdivides FW into a first FW to a fourth FW. This expands the log type classification criteria for FW to four and increases the total number of log type classification criteria (the number of nodes) to eight. The expected future log data collection then becomes IDS: 20,000, IPS: 20,000, WAF: 20,000, WEB: 20,000, first FW: 30,000, second FW: 30,000, third FW: 30,000, and fourth FW: 30,000, so the log data is appropriately distributed.
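
The arithmetic of this example can be reproduced with a short sketch. The function name `split_overloaded_types`, the share limit of 0.5, the target of 30,000 entries per node, and the `FW-1` style labels are assumptions chosen so that the numbers match the example; the source only states that a preset level exists and that FW ends up as four criteria.

```python
import math


def split_overloaded_types(counts, share_limit, target_per_node):
    """Split any log type whose share of the total reaches share_limit.

    counts: log type -> number of entries collected in the period.
    Returns a new mapping in which each overloaded type is replaced by
    numbered sub-types of roughly target_per_node entries each.
    """
    total = sum(counts.values())
    result = {}
    for log_type, count in counts.items():
        if count / total >= share_limit and count > target_per_node:
            parts = math.ceil(count / target_per_node)
            base, extra = divmod(count, parts)
            for i in range(parts):
                # Spread any remainder over the first sub-types.
                result[f"{log_type}-{i + 1}"] = base + (1 if i < extra else 0)
        else:
            result[log_type] = count
    return result


hourly = {"IDS": 20_000, "IPS": 20_000, "WAF": 20_000, "FW": 120_000, "WEB": 20_000}
expanded = split_overloaded_types(hourly, share_limit=0.5, target_per_node=30_000)
```

With these inputs FW holds 60% of the 200,000 entries, so it is replaced by `FW-1` to `FW-4` at 30,000 entries each, and the number of classification criteria grows from five to eight.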

② The assignment of security equipment to each node may be rearranged to subdivide the log type classification criteria.

Dividing the total number of log entries (200,000) by the number of nodes (5) gives 40,000 log entries per node.

The processor 110 may reassign IDS and IPS to a first node and WAF and WEB to a second node, and may assign a first FW to a third node, a second FW to a fourth node, and a third FW to a fifth node.

Once the rearrangement is complete, each of the first to fifth nodes is expected to receive 40,000 log entries in the future, so the log data is appropriately distributed.
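
The rearrangement above can be illustrated with a simple first-fit packing. This is only one possible heuristic, chosen here because it reproduces the example; the source does not say which placement algorithm is used. The function name `rearrange`, the `FW-1` style labels, and the fallback to the least-loaded node are assumptions.

```python
import math


def rearrange(counts, node_count):
    """Assign log types to a fixed number of nodes with roughly equal load.

    Types larger than the per-node target are cut into target-sized parts,
    then every item goes to the first node that still has room (first fit).
    """
    total = sum(counts.values())
    target = max(1, math.ceil(total / node_count))  # 40,000 in the example

    items = []
    for log_type, count in counts.items():
        if count <= target:
            items.append((log_type, count))
            continue
        part = 1
        while count > 0:
            size = min(target, count)
            items.append((f"{log_type}-{part}", size))
            count -= size
            part += 1

    nodes = [{"types": [], "load": 0} for _ in range(node_count)]
    for name, size in items:
        fitting = [n for n in nodes if n["load"] + size <= target]
        # Assumed fallback: if nothing fits, use the least-loaded node.
        node = fitting[0] if fitting else min(nodes, key=lambda n: n["load"])
        node["types"].append(name)
        node["load"] += size
    return nodes


hourly = {"IDS": 20_000, "IPS": 20_000, "WAF": 20_000, "FW": 120_000, "WEB": 20_000}
nodes = rearrange(hourly, node_count=5)
```

For the example data this places IDS and IPS on the first node, WAF and WEB on the second, and one 40,000-entry FW part on each of the remaining three nodes, keeping the node count at five instead of growing it to eight.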

The method according to an embodiment of the present invention described above may be implemented as a program (or application) and stored in a medium so that it can be executed in combination with a server, which is hardware.

The above-described program may include code written in a computer language such as C, C++, JAVA, or machine language that the processor (CPU) of the computer can read through the device interface of the computer, so that the computer reads the program and executes the methods implemented as the program. This code may include functional code related to functions that define the functions needed to execute the methods, and may include execution-procedure control code needed for the processor of the computer to execute those functions according to a predetermined procedure. The code may further include memory-reference code indicating at which location (address) in the internal or external memory of the computer the additional information or media needed for the processor to execute the functions should be referenced. In addition, if the processor of the computer needs to communicate with any other remote computer or server to execute the functions, the code may further include communication-related code specifying how to communicate with the remote computer or server using the communication module of the computer, and what information or media should be transmitted and received during communication.

The storage medium refers not to a medium that stores data for a short moment, such as a register, cache, or memory, but to a medium that stores data semi-permanently and can be read by a device. Specific examples of the storage medium include, but are not limited to, ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage devices. That is, the program may be stored in various recording media on various servers that the computer can access, or in various recording media on the user's computer. The medium may also be distributed across computer systems connected by a network, so that computer-readable code is stored in a distributed manner.

The steps of the method or algorithm described in connection with the embodiments of the present invention may be implemented directly in hardware, in a software module executed by hardware, or in a combination of the two. The software module may reside in RAM (Random Access Memory), ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any form of computer-readable recording medium well known in the technical field to which the present invention pertains.

Embodiments of the present invention have been described above with reference to the attached drawings, but a person of ordinary skill in the technical field to which the present invention pertains will understand that the present invention may be implemented in other specific forms without changing its technical idea or essential features. Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive.

100: data analysis processing device
110: processor
120: communication unit
130: storage unit
140: input/output unit
150: parsing unit
160: analysis unit
170: search unit

Claims

A large-capacity log data analysis device comprising:
a communication unit that communicates with a plurality of different security devices; and
a processor that, when log data larger than a preset size is received in real time from the plurality of different security devices through the communication unit, parses the received log data according to preset log type classification criteria, analyzes the parsed log data based on a ruleset set for each log type classification criterion to determine whether it is threat information, and sets the log type classification criteria so that the received log data is uniformly distributed across a plurality of nodes based on the amount of data classified into and accumulated in the plurality of nodes during a preset period,
wherein, when the proportion of log data for a specific log type occupies a preset level or more,
the processor expands the log type classification criterion for the corresponding node into a plurality of criteria.

The large-capacity log data analysis device of claim 1,
further comprising a cluster consisting of a plurality of nodes,
wherein each of the plurality of nodes is individually provided with an analysis application corresponding to the log type set for that node, and
the processor performs parallel distributed processing of the log data received from the plurality of different security devices by executing the analysis application provided on each node.

The large-capacity log data analysis device of claim 1,
wherein the ruleset includes at least one rule for determining whether log data received from the plurality of security devices corresponds to threat information,
a separate analysis module is allocated according to the type of log data, and
the processor analyzes the log data using the ruleset set for each log type.

(Deleted)

The large-capacity log data analysis device of claim 1,
wherein the log type classification criteria are determined based on the types of the plurality of different security devices, and
when the proportion of log data for a specific log type occupies a preset level or more, the processor rearranges the security device classification for each node so that the received log data is distributed.

The large-capacity log data analysis device of claim 1,
further comprising at least one database classified according to the period for which the parsed log data is stored,
wherein the processor stores the parsed log data in the at least one database and deletes the log data once a preset storage period has elapsed from the time of storage.

The large-capacity log data analysis device of claim 6,
further comprising an analysis module that analyzes past log data,
wherein the processor generates past data analysis results by analyzing the corresponding database based on the target equipment and target period for which past log data analysis is required.

The large-capacity log data analysis device of claim 1,
wherein the processor
generates an alarm signal if, as a result of the determination, the log data corresponds to threat information, and
generates an inspection request signal for the ruleset in which the threat information was detected when the number of alarm signals generated during a preset time exceeds a threshold range.

A method for analyzing large-capacity log data, performed by a data analysis processing device, the method comprising:
receiving log data in real time from a plurality of different security devices;
parsing the received log data according to preset log type classification criteria;
analyzing the parsed log data based on a ruleset set for each log type classification criterion to determine whether it is threat information; and
setting the log type classification criteria so that the received log data is uniformly distributed across a plurality of nodes based on the amount of data classified into and accumulated in the plurality of nodes during a preset period,
wherein, when the proportion of log data for a specific log type occupies a preset level or more,
the data analysis processing device expands the log type classification criterion for the corresponding node into a plurality of criteria.

A computer-readable recording medium that stores a large-capacity log data analysis program which, in combination with a computer that is hardware, executes the large-capacity log data analysis method of claim 9.