KR102669475B1

KR102669475B1 - Data management device, data management method and a computer-readable storage medium for storing data management program

Info

Publication number: KR102669475B1
Application number: KR1020230086343A
Authority: KR
Inventors: 송균상; 최정규; 신동하
Original assignee: 인스피언 주식회사
Filing date: 2023-07-04
Publication date: 2024-05-27

Abstract

One embodiment of the present invention provides a data management method comprising: receiving a first markdown cell; receiving a first code cell for the first markdown cell; and executing a first analysis task based on the first code cell.

Description

Data management device, data management method, and computer-readable storage medium storing a data management program {DATA MANAGEMENT DEVICE, DATA MANAGEMENT METHOD AND A COMPUTER-READABLE STORAGE MEDIUM FOR STORING DATA MANAGEMENT PROGRAM}

The present invention relates to a data management device, a data management method, and a computer-readable storage medium storing a data management program.

Log data is generated from a variety of sources. For example, log data is recorded to monitor the operation of a system or application, to monitor the security status of a system and track user activity, and to track and debug errors that occur in an application.

In addition, log data is used in business analysis to support decision-making. For example, it can be used for customer behavior analysis, marketing effectiveness analysis, and quality control.

For these reasons, large volumes of log data are generated, and processing and analyzing this data efficiently is an important challenge.

Accordingly, to solve the problems of the existing technology, the present invention seeks to provide a data management device that provides a user interface for processing data, a data management method, and a computer-readable storage medium storing a data management program.

In the data management method, the first markdown cell includes description data for an execution script, and the first code cell includes first Lua script information for analysis.

The data management method further includes interpreting the first Lua script information within an analysis engine; and returning a result for the first code cell.
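
The pairing of a markdown cell with a code cell, and the engine returning a result for the code cell, can be sketched as follows. This is a minimal illustration in Python rather than the claimed implementation: the class names `MarkdownCell`, `CodeCell`, `CellResult`, and `AnalysisEngine` are hypothetical, and the Lua interpreter is replaced by an injected stand-in callable (`fake_interpreter`), since the source does not describe how the analysis engine embeds Lua.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class MarkdownCell:
    """Description data for the execution script (hypothetical structure)."""
    text: str


@dataclass
class CodeCell:
    """Script information for analysis; the source names Lua as the script language."""
    script: str


@dataclass
class CellResult:
    """Result returned for one code cell, kept together with its description."""
    description: str
    output: Any


class AnalysisEngine:
    """Stand-in analysis engine that delegates interpretation to a callable.

    A real engine would embed a Lua interpreter; that part is an assumption
    and is not shown here.
    """

    def __init__(self, interpret: Callable[[str], Any]) -> None:
        self._interpret = interpret

    def run(self, markdown: MarkdownCell, code: CodeCell) -> CellResult:
        # Interpret the script held by the code cell and return the cell's result.
        return CellResult(description=markdown.text,
                          output=self._interpret(code.script))


def fake_interpreter(script: str) -> int:
    # Toy interpreter used only in this sketch: counts non-empty script lines.
    return sum(1 for line in script.splitlines() if line.strip())
```

Injecting the interpreter keeps the cell model independent of the script runtime, so the same structure would hold if an actual Lua binding were substituted for `fake_interpreter`.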

The data management method further includes registering a second analysis task with an analysis task scheduler, wherein the second analysis task is executed periodically based on a preset condition.
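
Registering an analysis task and running it periodically can be sketched as below. The source does not specify what the preset condition is; this sketch assumes it is a fixed interval in seconds. The names `ScheduledTask`, `AnalysisTaskScheduler`, `register`, and `tick` are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ScheduledTask:
    name: str
    interval_s: float              # assumed preset condition: run every interval_s seconds
    action: Callable[[], None]     # the analysis task to execute
    next_run: float = 0.0          # time at which the task is next due


@dataclass
class AnalysisTaskScheduler:
    tasks: List[ScheduledTask] = field(default_factory=list)

    def register(self, name: str, interval_s: float,
                 action: Callable[[], None], now: float = 0.0) -> None:
        """Register a task; its first run is due one interval after `now`."""
        if interval_s <= 0:
            raise ValueError("interval_s must be positive")
        self.tasks.append(
            ScheduledTask(name, interval_s, action, next_run=now + interval_s))

    def tick(self, now: float) -> List[str]:
        """Run every task that is due at `now`; return the names of tasks run.

        If several intervals have elapsed since the last tick, the task is
        run once per missed interval (a design choice made for this sketch).
        """
        ran: List[str] = []
        for task in self.tasks:
            while task.next_run <= now:
                task.action()
                ran.append(task.name)
                task.next_run += task.interval_s
        return ran
```

A caller would invoke `tick` from a timer loop with the current time; a cron-style condition could replace the interval without changing the registration interface.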

One embodiment of the present invention provides a data management device including a database that stores data and a processor that processes the data, wherein the processor receives a first markdown cell, receives a first code cell for the first markdown cell, and executes a first analysis task based on the first code cell.

In the data management device, the first markdown cell includes description data for an execution script, and the first code cell includes first Lua script information for analysis.

The processor interprets the first Lua script information within an analysis engine and returns a result for the first code cell.

The processor registers a second analysis task with an analysis task scheduler, and the second analysis task is executed periodically based on a preset condition.

One embodiment of the present invention provides a computer-readable storage medium storing a data management program that includes receiving a first markdown cell, receiving a first code cell for the first markdown cell, and executing a first analysis task based on the first code cell.

According to one embodiment of the present invention, providing an innovative technology for data management in ERP systems makes it possible to meet enterprise requirements for data security and monitoring.

In addition, according to an embodiment of the present invention, all access records to the server are generated as logs, which secures the stability of log recording.

In addition, according to an embodiment of the present invention, even when the Diffie-Hellman algorithm is used, SSL encryption is decrypted, logs are recorded, and the data is transmitted to the business system in a secure manner, which can ensure the integrity and confidentiality of the data.

In addition, according to an embodiment of the present invention, the accuracy of personal information extraction can be increased through a multidimensional extraction method that uses personal information metadata and an exception handling list.

In addition, according to an embodiment of the present invention, user behavior can be mapped to all log data containing personal information.

In addition, according to an embodiment of the present invention, a large amount of personal information data can be decrypted and re-encrypted in real time.

In addition, according to an embodiment of the present invention, the safety of the original data can be maintained by providing altered data to the user.

In addition, according to an embodiment of the present invention, updating, patching, and rolling back at least one agent can be performed easily.

In addition, according to an embodiment of the present invention, users who have difficulty entering regular expressions can be given partial assistance with the input, making it more convenient to complete the entire regular expression.

In addition, according to an embodiment of the present invention, a convenience function for scanning pre-registered parsers can be provided to increase the reusability of registered parsers.

In addition, according to an embodiment of the present invention, a large amount of data can be searched simultaneously and the results can be checked in real time.

In addition, according to an embodiment of the present invention, a user can use scripts to analyze and manipulate various data that is not provided through the basic UI.

Figure 1 is a diagram illustrating hardware for explaining the data management device of the present invention.
Figure 2 is a diagram disclosing an embodiment of the data management device of the present invention.
Figure 3 is a diagram disclosing an embodiment of the data management platform of the present invention.
Figure 4 is a flow chart explaining an embodiment of the data management method of the present invention.
Figure 5 is a diagram disclosing an embodiment of the data management method of the present invention.
Figure 6 is a diagram illustrating an embodiment of the packet collection method of the present invention.
Figure 7 is a diagram illustrating another embodiment of the packet collection method of the present invention.
Figure 8 is a diagram illustrating an example of RFC protocol-based packet analysis data collected according to the packet collection method of the present invention.
Figure 9 is a diagram illustrating an embodiment of a device for analyzing GUI protocol-based packets collected according to the packet collection method of the present invention.
Figure 10 is a diagram illustrating an example of a method for analyzing packets based on a GUI protocol collected according to the packet collection method of the present invention.
Figure 11 is a diagram illustrating an example of GUI protocol-based packet analysis data collected according to the packet collection method of the present invention.
Figure 12 is a diagram illustrating an embodiment of a method for analyzing packets based on HTTP/HTTPS collected according to the packet collection method of the present invention.
Figure 13 is a diagram illustrating an example of HTTP/HTTPS-based packet analysis data collected according to the packet collection method of the present invention.
Figure 14 is a diagram illustrating an example of extracting personal information from data included in an analyzed packet of the present invention.
Figure 15 is a diagram illustrating an embodiment of visualizing data included in an analyzed packet of the present invention.
Figure 16 is a diagram illustrating an embodiment of retrieving data included in an analyzed packet of the present invention.
Figure 17 is a diagram illustrating an embodiment of monitoring data collected according to the data management method of the present invention.
Figure 18 is a diagram explaining an embodiment in which the data management platform of the present invention distributes collected packets.
Figure 19 is a diagram explaining an embodiment in which the data management platform of the present invention analyzes packets based on the RFC protocol.
Figure 20 is a diagram explaining an embodiment in which the data management platform of the present invention analyzes packets.
Figure 21 is a diagram illustrating an embodiment in which the data management platform of the present invention stores and monitors audit logs.
Figure 22 is a diagram illustrating an embodiment of distributing collected packets by the data management method of the present invention.
Figure 23 is a diagram illustrating an embodiment in which the data management platform of the present invention analyzes HTTPS-based packets.
Figure 24 is a diagram explaining an embodiment of the data management method of the present invention analyzing HTTPS-based packets.
Figure 25 is a diagram explaining an embodiment of the data management method of the present invention.
Figure 26 is a diagram explaining an embodiment of extracting and storing personal information in the data management platform of the present invention.
Figure 27 is a diagram illustrating an example of a regular expression pattern used in the data management platform of the present invention.
Figure 28 is a diagram illustrating an example of a masking pattern used in the data management platform of the present invention.
Figure 29 is a diagram illustrating an example of personal information metadata 1036 defined in the data management platform of the present invention.
Figure 30 is a diagram illustrating an embodiment of extracting personal information by classifying architecture types in the data management platform of the present invention.
Figure 31 is a diagram explaining an embodiment of creating a personal information extraction rule in the data management platform of the present invention.
Figure 32 is a diagram illustrating an embodiment of creating a personal information exception processing list in the data management platform of the present invention.
Figure 33 is a diagram explaining an embodiment of creating an exception filter of the data management platform of the present invention.
Figure 34 is a diagram explaining an embodiment of searching and outputting personal information extracted from the data management platform of the present invention.
Figure 35 is a diagram explaining an embodiment of how the data management method of the present invention manages personal information.
Figure 36 is a diagram illustrating another embodiment of the data management method of the present invention extracting and storing personal information.
Figure 37 is a diagram explaining an embodiment of managing personal information in the data management method of the present invention.
Figure 38 is a diagram explaining an embodiment of collecting and mapping user behavior in the data management platform of the present invention.
Figure 39 is a diagram explaining an embodiment of generating user behavior metadata in the data management platform of the present invention.
Figure 40 is a diagram illustrating an embodiment of mapping user behavior metadata and log data in the data management platform of the present invention.
Figure 41 is a diagram explaining an embodiment in which the data management method of the present invention generates user behavior metadata.
Figure 42 is a diagram illustrating an embodiment of the data management method of the present invention mapping user behavior metadata and log data.
Figure 43 is a diagram explaining an embodiment in which the data management method of the present invention generates user behavior metadata.
Figure 44 is a diagram explaining an embodiment of encrypting personal information in the data management platform of the present invention.
Figure 45 is a diagram explaining an embodiment of encrypting personal information in the data management platform of the present invention.
Figure 46 is a diagram explaining the mapping information table and task table of the present invention.
Figure 47 is a diagram explaining an embodiment of generating a new encryption key in the data management platform of the present invention.
Figure 48 is a diagram explaining an embodiment of adding new work data in the data management platform of the present invention.
Figure 49 is a diagram explaining an actual use example of personal information encryption of the present invention.
Figure 50 is a diagram explaining an embodiment of the data management method of the present invention encrypting personal information.
Figure 51 is a diagram explaining an embodiment in which the data management method of the present invention generates a new encryption key.
Figure 52 is a diagram explaining an embodiment in which the data management method of the present invention adds new work data.
Figure 53 is a diagram explaining an embodiment of the data management method of the present invention encrypting personal information.
Figure 54 is a diagram explaining an embodiment of altering data in the data management platform of the present invention.
Figure 55 is a diagram illustrating an embodiment in which the data management method of the present invention generates token values and mapping information corresponding to data.
Figure 56 is a diagram explaining the mapping information table and task table of the present invention.
Figure 57 is a diagram explaining an embodiment in which the data management method of the present invention alters data.
Figure 58 is a diagram explaining the mapping information table and task table in which the altered data of the present invention is stored.
Figure 59 is a diagram illustrating an embodiment of searching for altered data in the data management method of the present invention.
Figure 60 is a diagram explaining an embodiment of altering data in the data management platform of the present invention.
Figure 61 is a diagram explaining an embodiment of altering data in the data management platform of the present invention.
Figure 62 is a diagram explaining an embodiment in which the data management method of the present invention alters data.
Figure 63 is a diagram explaining an embodiment of managing agents in the data management platform of the present invention.
Figure 64 is a diagram explaining a method of managing an agent in the data management method of the present invention.
Figure 65 is a diagram explaining a method of communicating with an agent in the data management method of the present invention.
Figure 66 is a diagram explaining the agent operation method in the data management method of the present invention.
Figure 67 is a diagram explaining the agent operation method in the data management method of the present invention.
Figure 68 is a diagram explaining an embodiment in which the data management platform of the present invention normalizes log data.
Figure 69 is a diagram explaining an embodiment of how the data management method of the present invention generates a parser.
Figure 70 is a diagram illustrating an example of a parser creation screen of the present invention.
Figure 71 is a diagram illustrating an example of a parser test screen of the present invention.
Figure 72 is a diagram illustrating an embodiment of how the data management method of the present invention generates a conversion rule.
Figure 73 is a diagram illustrating an example of a conversion rule creation screen of the present invention.
Figure 74 is a diagram illustrating an embodiment of the parser field batch registration screen of the present invention.
Figure 75 is a diagram illustrating an embodiment of how the data management method of the present invention creates a collection path rule.
Figure 76 is a diagram illustrating an example of a collection path rule creation screen of the present invention.
Figure 77 is a diagram illustrating an embodiment of the data management method of the present invention searching an event log.
Figure 78 is a diagram illustrating an example of an event log search screen of the present invention.
Figure 79 is a diagram explaining an embodiment of the data management method of the present invention.
Figure 80 is a diagram explaining an embodiment of searching logs in the data management platform of the present invention.
Figure 81 is a diagram explaining an embodiment of searching a log in the data management platform of the present invention.
Figure 82 is a diagram illustrating an embodiment in which the data management method of the present invention searches logs.
Figure 83 is a diagram explaining an embodiment of searching a log in the data management method of the present invention.
Figure 84 is a diagram explaining an embodiment of searching a log in the data management method of the present invention.
Figure 85 is a diagram explaining an embodiment of the external merge sort algorithm of the present invention.
Figure 86 is a diagram explaining the search result file of the present invention.
Figure 87 is a diagram illustrating an embodiment in which the data management method of the present invention searches logs.
Figure 88 is a diagram explaining an embodiment of analyzing data in the data management platform of the present invention.
Figure 89 is a diagram explaining an embodiment in which the data management method of the present invention analyzes data.
Figure 90 is a diagram explaining an embodiment of the data management method of the present invention setting a schedule for analysis work.
Figure 91 is a diagram illustrating an embodiment in which the data management method of the present invention performs an analysis task.
Figure 92 is a diagram explaining an embodiment of altering data in the data management platform of the present invention.
Figure 93 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.
Figure 94 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.
Figure 95 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.
Figure 96 is a diagram explaining the user interface provided by the monitoring module of the data management platform of the present invention.
Figure 97 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.
Figure 98 is a diagram explaining an embodiment of the data management method of the present invention analyzing data.

Hereinafter, various embodiments are described in detail with reference to the drawings. The embodiments described below may be modified and implemented in various different forms. To describe the features of the embodiments more clearly, detailed descriptions of matters widely known to those of ordinary skill in the art to which the following embodiments belong are omitted.

In this specification, when a component is said to be “connected” to another component, this includes not only the case where it is “directly connected” but also the case where it is “connected with another component in between.” In addition, when a component is said to “include” another component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them.

In addition, terms including ordinal numbers such as “first” or “second” used in this specification may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

In addition, because the function units included in each module are logical structures for explaining the functions the module performs, the function performed by each function unit can of course be performed by the module. That is, each module need not include all of the function units described as belonging to it, and may include at least one function unit for performing its function.

Hereinafter, embodiments are described in detail by way of example with reference to the accompanying drawings. In the embodiments, a framework, a module, an application program interface, and the like may be implemented as a device combined with a physical device or may be implemented as software. When an embodiment is implemented as software, it may be stored in a storage medium, installed on a computer or the like, and executed by a processor.

Figure 1 is a diagram illustrating hardware for explaining the data management device of the present invention.

This figure describes an embodiment of data transmission and reception between the hardware components associated with the data management platform described later.

The present invention can be provided to users through a platform. To this end, the user/client 1000 accesses the network device 1001 through a web browser or the like and can use data and applications stored in a separate storage/database 1003 under the control of the computing server 1004.

More specifically, the user/client 1000 requests a service (for example, an operation such as searching, modifying, or deleting data) through a client terminal, and can display on a screen the data that is computed by the computing server 1004 and received through the network device 1001. In one embodiment, the user/client 1000 may include any entity that accesses the data management platform of the present invention. That is, in this figure, the user/client 1000 can access the data management platform using the network device 1001, and the network device 1001 is not limited to a particular type.

The network device 1001 mediates data transmission between the user/client 1000 and the computing server 1004. Here, the network device 1001 may include a router, a TAP, a switch, and the like. A router forwards data using IP addresses, and a switch forwards data using MAC addresses. A network device 1001 such as a router can use SPAN (Switched Port Analyzer) mode to mirror only a specific port and deliver the packets to the data management platform 10000. This is described in detail below.

A TAP (Test Access Point) can be used to collect data on the network. More specifically, a TAP is one of the network devices 1001 and corresponds to a device that is added to the backbone line of the network and is dedicated to mirroring. That is, packets can be delivered to the data management platform 10000 of the present invention through a network TAP. Accordingly, packets can be copied and delivered to the data management platform without affecting the flow of data packets transmitted and received on the network.

스토리지/데이터베이스(1003)는 데이터를 저장하고, 관리할 수 있다. 스토리지는 주로 하드 디스크나 SSD를 이용하여 데이터를 저장하고, 데이터베이스는 구조화된 데이터를 관리하며, 검색 및 수정 등의 작업을 수행할 수 있다. The storage/database 1003 can store and manage data. Storage mainly uses hard disks or SSDs to store data, and databases manage structured data and can perform tasks such as search and modification.

컴퓨팅 서버(1004)는 데이터 처리를 담당한다. 클라이언트(1000)로부터 요청된 작업을 처리하고, 처리 결과를 클라이언트(1000)에게 반환한다. 이를 위해 중앙연산장치(Central Processing Unit, CPU), RAM 등의 하드웨어를 이용하여 계산 작업을 수행할 수 있다. 그리고 컴퓨팅 서버(1004)는 여러 가지 데이터의 입출력을 제어하고 데이터 관리 플랫폼(10000)에서 처리된 데이터를 스토리지/데이터베이스(1003)에 저장할 수 있다. 이때, 본 발명의 데이터 관리 플랫폼(10000)이 수행하는 기능은 컴퓨팅 서버(1004)의 프로세서에 의해 수행될 수 있다. 또한, 컴퓨팅 서버(1004)는 하드웨어 구성 요소 또는 데이터 관리 플랫폼 내의 모듈들의 상태를 모니터링하고 제어하는 시스템 매니저를 포함할 수 있다. The computing server 1004 is responsible for data processing. It processes the task requested by the client 1000 and returns the processing result to the client 1000. For this purpose, it can perform computation using hardware such as a central processing unit (CPU) and RAM. The computing server 1004 can also control the input and output of various data and store the data processed by the data management platform 10000 in the storage/database 1003. At this time, the functions performed by the data management platform 10000 of the present invention may be performed by the processor of the computing server 1004. Additionally, the computing server 1004 may include a system manager that monitors and controls the status of hardware components or of modules within the data management platform.

이하, 본 발명을 실시하기 위하여 본 도면의 하드웨어 구성요소가 사용될 수 있으며, 상기 하드웨어 구성 요소 간의 데이터 처리(processing) 방법이 포함됨은 물론이다. Hereinafter, the hardware components of this drawing may be used to implement the present invention, and of course, a data processing method between the hardware components is included.

도 2는 본 발명의 데이터 관리 장치의 일 실시 예를 개시하는 도면이다. Figure 2 is a diagram disclosing an embodiment of the data management device of the present invention.

이 도면의 실시 예는 데이터 관리 장치를 예시하고 있으며, 데이터 관리 장치를 설명하기 위한 물리적인(physical) 장치와 논리적인 요소(logical component)를 포함하고 있다. The embodiment of this figure illustrates a data management device and includes the physical devices and logical components used to describe it.

본 발명은 일 실시 예에서 SaaS(Software as a Service) 플랫폼을 통해 사용자에게 제공될 수 있다. SaaS 플랫폼은 클라우드 컴퓨팅 기술을 이용하여 네트워크를 통해 사용자에게 서비스로 제공되는 소프트웨어를 뜻한다. 이를 위하여, 스토리지/데이터베이스(1003), 컴퓨팅 서버(1004) 및 컨테이너 플랫폼(1005)은 사용자/클라이언트(1000)가 데이터 관리 플랫폼(10000) 및 데이터 관리 소프트웨어 패키지(20000)를 클라우드 상에서 이용할 수 있도록 지원할 수 있다. In one embodiment, the present invention may be provided to users through a SaaS (Software as a Service) platform. A SaaS platform refers to software provided to users as a service over a network using cloud computing technology. To this end, the storage/database 1003, the computing server 1004, and the container platform 1005 may support the user/client 1000 in using the data management platform 10000 and the data management software package 20000 on the cloud.

본 발명은 데이터 처리를 위하여 사용자/클라이언트(1000), 스토리지/데이터베이스(1003), 애플리케이션 서버(1002), 컴퓨팅 서버(1004), 컨테이너 플랫폼(1005), 데이터 관리 플랫폼(10000) 및 데이터 관리 소프트웨어 패키지(20000)를 사용할 수 있다. The present invention provides a user/client (1000), a storage/database (1003), an application server (1002), a computing server (1004), a container platform (1005), a data management platform (10000), and a data management software package for data processing. (20000) can be used.

이때, 스토리지/데이터베이스(1003) 및 컴퓨팅 서버(1004)는 하드웨어일 수 있으며, 애플리케이션 서버(1002), 컨테이너 플랫폼(1005), 데이터 관리 플랫폼(10000) 및 데이터 관리 소프트웨어 패키지(20000)는 소프트웨어에 대응할 수 있다. 하드웨어에 대하여는 상술한 내용을 참고하도록 하고, 이 도면을 참조하여 데이터 관리 장치의 실시 예를 설명하면 다음과 같다. At this time, the storage/database 1003 and the computing server 1004 may be hardware, and the application server 1002, the container platform 1005, the data management platform 10000, and the data management software package 20000 may correspond to software. For the hardware, refer to the description above; an embodiment of the data management device is described below with reference to this figure.

사용자/클라이언트(1000)는 데이터 처리를 위해 데이터 관리 소프트웨어(20000)에 접속할 수 있다. User/client 1000 may access data management software 20000 for data processing.

컨테이너 플랫폼(1005)은 OS(Operating System), 컨테이너(container), 도커(docker) 등으로 구성되어 데이터 처리를 위한 가상 환경을 제공할 수 있다. The container platform 1005 is composed of an operating system (OS), a container, a docker, etc., and can provide a virtual environment for data processing.

데이터 관리 플랫폼(10000)은 데이터 관리 소프트웨어 패키지(20000) 내에 포함된 적어도 하나의 엔진 또는 모듈을 제어할 수 있다. 이를 위하여, 데이터 관리 플랫폼(10000)은 내부 데이터베이스(여기에서, 데이터베이스는 데이터 관리 소프트웨어 패키지(20000) 내의 내부 데이터베이스를 의미한다.), 스토리지, 분산 파일 시스템 등의 기술을 사용하여 데이터를 관리할 수 있다. 또한, 데이터 관리 플랫폼(10000)은 데이터 관리 소프트웨어 패키지(20000) 내에 포함된 적어도 하나의 엔진 또는 모듈을 관리하기 위한 시스템 매니저 또는 관리콘솔을 포함할 수 있다. The data management platform 10000 may control at least one engine or module included in the data management software package 20000. To this end, the data management platform 10000 may manage data using technologies such as an internal database (here, the database means an internal database within the data management software package 20000), storage, and a distributed file system. Additionally, the data management platform 10000 may include a system manager or management console for managing at least one engine or module included in the data management software package 20000.

데이터 관리 소프트웨어 패키지(20000)는 수집 모듈(20001), 분석 모듈(20002), 키 관리 모듈(20003), 개인정보 관리 모듈(20004), 모니터링 모듈(20005) 및 AI 엔진(20006) 중 적어도 하나를 포함할 수 있다. 다만, 데이터 관리 소프트웨어 패키지(20000)에 포함된 모듈 및 엔진은 필수적인 구성요소가 아니며 본 발명을 설명하기 위한 요소에 해당한다. 따라서, 데이터 실시 예를 수행하기 위한 다른 이름의 모듈이 포함될 수 있음은 물론이다. The data management software package 20000 may include at least one of a collection module 20001, an analysis module 20002, a key management module 20003, a personal information management module 20004, a monitoring module 20005, and an AI engine 20006. However, the modules and engines included in the data management software package 20000 are not essential components; they are elements used to explain the present invention. Therefore, modules with other names that carry out the embodiments may of course be included.

수집 모듈(20001)은 다양한 소스에서 데이터를 수집하고 데이터를 처리 파이프 라인으로 전송할 수 있다. 수집 모듈(20001)은 로그, 이벤트, 센서, 웹 서버 등 다양한 소스에서 데이터(예를 들어, 패킷)를 수집할 수 있다. 특히, 수집 모듈(20001)은 로그 수집을 위해 에이전트를 사용할 때 에이전트를 중앙 관리할 수 있다. Collection module 20001 may collect data from various sources and transmit the data to a processing pipeline. The collection module 20001 may collect data (eg, packets) from various sources such as logs, events, sensors, and web servers. In particular, the collection module 20001 can centrally manage the agent when using the agent for log collection.

분석 모듈(20002)은 데이터를 분석하고 가치 있는 인사이트를 도출하는 데 사용될 수 있다. 분석 모듈(20002)은 수집된 패킷을 분석하여 데이터를 추출할 수 있다. 이때, 포함된 데이터가 개인정보인 경우, 개인정보 관리 모듈(20004)을 통하여 개인정보 보호와 관련된 기능을 제공할 수 있다. 또한, 포함된 데이터를 미리 수집된 행위 정보(예를 들어, 조회, 삭제, 추가, 변경, 출력 등을 포함한다.)와 매핑할 수 있다. The analysis module 20002 can be used to analyze data and derive valuable insights. The analysis module 20002 may extract data by analyzing the collected packets. At this time, if the included data is personal information, functions related to personal information protection can be provided through the personal information management module 20004. Additionally, the included data can be mapped with previously collected behavior information (for example, including inquiry, deletion, addition, change, output, etc.).

키 관리 모듈(20003)은 데이터 암호화 및 복호화를 위한 키를 생성, 저장 및 관리할 수 있다. 키 관리 모듈(20003)은 토큰, 대칭 키, 공개 키, 디지털 인증서 등의 기술을 사용하여 키를 관리할 수 있다. 키 관리 모듈(20003)은 데이터에 개인정보가 포함되어 있는 경우, 개인정보를 암호화 및 복호화를 위한 키를 생성, 저장 및 관리할 수 있다. 또한, 키 관리 모듈(20003)은 데이터에 개인정보가 포함되어 있지 않더라도 데이터의 보안을 위하여 토큰, 대칭 키, 공개 키, 디지털 인증서 등의 기술을 사용할 수 있다. The key management module 20003 can generate, store, and manage keys for data encryption and decryption. The key management module 20003 may manage keys using technologies such as tokens, symmetric keys, public keys, and digital certificates. If the data includes personal information, the key management module 20003 can generate, store, and manage keys for encrypting and decrypting personal information. Additionally, the key management module 20003 can use technologies such as tokens, symmetric keys, public keys, and digital certificates to secure data even if the data does not contain personal information.

개인정보 관리 모듈(20004)은 개인정보 보호와 관련된 기능을 제공할 수 있다. 개인정보 관리 모듈(20004)은 데이터에 개인정보가 포함되어 있는 경우, 개인정보의 수집, 추출, 암호화, 저장, 처리, 검색, 삭제 등을 제어할 수 있다. The personal information management module 20004 may provide functions related to personal information protection. If the data includes personal information, the personal information management module (20004) can control the collection, extraction, encryption, storage, processing, search, and deletion of personal information.

모니터링 모듈(20005)은 데이터의 검색 및 탐지를 수행할 수 있다. 모니터링 모듈(20005)은 데이터 처리 및 데이터 처리 환경을 모니터링하고 문제를 식별할 수 있다. 또한, 모니터링 모듈(20005)은 로그, 성능 지표, 이벤트 등을 모니터링하고 경고를 생성할 수 있다. The monitoring module 20005 may perform search and detection of data. Monitoring module 20005 may monitor data processing and the data processing environment and identify problems. Additionally, the monitoring module 20005 can monitor logs, performance indicators, events, etc. and generate alerts.

AI 엔진(20006)은 인공지능 기술(기계 학습을 포함한다.)을 이용하여 데이터 처리 및 데이터 분석 작업을 수행할 수 있다. 특히, 수집되어 저장된 로그에 텍스트가 포함되어 있는 경우, AI 엔진(20006)은 수집된 텍스트를 인공지능을 기반으로 행위를 분류할 수 있다. AI Engine (20006) can perform data processing and data analysis tasks using artificial intelligence technology (including machine learning). In particular, if the collected and stored log contains text, the AI Engine (20006) can classify the behavior of the collected text based on artificial intelligence.

도 3은 본 발명의 데이터 관리 플랫폼의 일 실시 예를 개시하는 도면이다. Figure 3 is a diagram disclosing an embodiment of the data management platform of the present invention.

이 도면의 실시 예는 데이터 관리 플랫폼(10000)을 예시하고 있으며, 물리적인(physical) 장치와 논리적인 요소(logical component)를 포함하고 있다. 특히, 본 도면에서 데이터 관리 플랫폼(10000)은 상술한 데이터 관리 플랫폼보다 더 넓은 범위에 대응할 수 있다. 예를 들어, 데이터 관리 플랫폼(10000)은 상술한 데이터 관리 소프트웨어 패키지(20000)에서 구현하는 모듈 중 적어도 하나를 포함하고, 물리 장치 상에서 구동되는 응용 프로그래밍 인터페이스(Application Programming Interface, API)를 포함할 수 있다. 물리 장치에 대하여는 상술한 내용을 참고하도록 한다. The embodiment of this figure illustrates a data management platform 10000 and includes physical devices and logical components. In particular, the data management platform 10000 in this figure may correspond to a broader scope than the data management platform described above. For example, the data management platform 10000 may include at least one of the modules implemented in the data management software package 20000 described above, and may include an application programming interface (API) running on a physical device. For the physical devices, refer to the description above.

데이터 관리 플랫폼(10000)은 컴퓨팅 서버(1004)와 스토리지/데이터베이스(1003)의 리소스(resource)를 이용하여 데이터 관리 플랫폼(10000) 내에 포함된 모듈의 기능을 수행할 수 있다. 이때, 시스템 매니저는 데이터 관리 플랫폼(10000) 내의 모듈 또는 엔진 중 적어도 하나를 제어할 수 있으며, 시스템 매니저는 컴퓨팅 서버(1004) 내에 위치하거나 별도로 위치하여 데이터 관리 플랫폼(10000)을 제어할 수 있다. The data management platform 10000 may use the resources of the computing server 1004 and the storage/database 1003 to perform the functions of modules included in the data management platform 10000. At this time, the system manager may control at least one of the modules or engines within the data management platform 10000, and the system manager may be located within the computing server 1004 or separately and control the data management platform 10000.

데이터 관리 플랫폼(10000)은 수집 모듈(20001), 분석 모듈(20002), 키 관리 모듈(20003), 개인정보 관리 모듈(20004), 모니터링 모듈(20005) 및 AI 엔진(20006) 중 적어도 하나를 포함할 수 있다. 각각의 모듈에 대한 설명은 상술한 바와 같다. The data management platform 10000 includes at least one of a collection module 20001, an analysis module 20002, a key management module 20003, a personal information management module 20004, a monitoring module 20005, and an AI engine 20006. can do. The description of each module is as described above.

데이터 관리 플랫폼(10000)은 사용자/클라이언트(1000)와 데이터를 송수신하며, 송수신된 데이터에 대하여 수집 모듈(20001), 분석 모듈(20002), 키 관리 모듈(20003), 개인정보 관리 모듈(20004), 모니터링 모듈(20005) 및 AI 엔진(20006)이 수행하는 적어도 하나의 기능을 적용할 수 있다. 이때, 데이터 관리 플랫폼(10000)은 사용자/클라이언트(1000)의 요청에 의해 데이터 관리 플랫폼(10000) 내에 포함된 개별적인 모듈을 단독으로 사용할 수 있다. 예를 들어, 사용자/클라이언트(1000)는 데이터 관리 플랫폼(10000) 내의 키 관리 모듈(20003) 또는 개인정보 관리 모듈(20004)의 기능만을 선택적으로 사용할 수 있다. The data management platform 10000 transmits data to and receives data from the user/client 1000, and may apply to that data at least one function performed by the collection module 20001, the analysis module 20002, the key management module 20003, the personal information management module 20004, the monitoring module 20005, and the AI engine 20006. At this time, at the request of the user/client 1000, the data management platform 10000 may use an individual module included in the data management platform 10000 on its own. For example, the user/client 1000 may selectively use only the functions of the key management module 20003 or the personal information management module 20004 within the data management platform 10000.

또한, 도면에 도시되지는 않았으나 데이터 관리 플랫폼(10000)은 내부에 포함된 모듈의 기능을 수행하기 위하여 외부 스토리지/데이터베이스(1003)과는 다른 내부 데이터베이스를 사용할 수 있다. In addition, although not shown in the drawing, the data management platform 10000 may use an internal database different from the external storage/database 1003 to perform the functions of modules included therein.

도 4는 본 발명의 데이터 관리 방법의 일 실시 예를 설명하는 플로우 차트이다. Figure 4 is a flow chart explaining an embodiment of the data management method of the present invention.

본 발명이 개시하는 데이터 관리 방법은 단계(S1000)에서, 패킷을 수집할 수 있다. The data management method disclosed by the present invention can collect packets in step S1000.

일 실시 예에서, 패킷을 수집하는 방법은 에이전트(agent) 방식의 클라우드 기반의 패킷 수집 방법 또는 패킷 미러 방식의 패킷 수집 방법을 이용할 수 있다. 자세한 설명은 후술하도록 한다. In one embodiment, a method of collecting packets may use an agent-type cloud-based packet collection method or a packet mirror-type packet collection method. A detailed explanation will be provided later.

일 실시 예에서, NIC(Network Interface Card)로부터 패킷을 수집할 수 있다. 여기에서, NIC는 네트워크 연결을 위해 사용되며, 유선 또는 무선 방식으로 네트워크와 연결될 수 있다. NIC는 컴퓨터와 네트워크 간에 데이터 전송을 가능하게 하기 위해 패킷을 보내고 받는 역할을 하며, 네트워크 카드, LAN 카드, 이더넷 카드 등을 포함할 수 있다. In one embodiment, packets may be collected from a Network Interface Card (NIC). Here, the NIC is used for network connection and can be connected to the network in a wired or wireless manner. The NIC is responsible for sending and receiving packets to enable data transfer between the computer and the network, and may include a network card, LAN card, Ethernet card, etc.

또한, 본 발명이 개시하는 데이터 관리 방법은 수집된 패킷에 필터를 적용할 수 있다. Additionally, the data management method disclosed by the present invention can apply a filter to the collected packets.

보다 상세하게는, 본 발명의 데이터 관리 방법은 수집된 패킷을 재조합하고 필터링하여 분석 프로세스로 분배할 수 있다. 보다 상세하게는, 패킷의 재조합은 수집된 각각의 네트워크 패킷을 분석 가능한 패킷 형태로 합치는 것을 의미하며, 필터 적용은 기 설정된 필터링 규칙을 이용하여 해당 패킷을 분석할지 여부를 결정하는 설정을 의미한다. 이에 따라, 분석할 대상이 되는 패킷인 경우, 적절한 분석 프로세스로 분배될 수 있다. More specifically, the data management method of the present invention can reassemble and filter the collected packets and distribute them to the analysis process. More specifically, packet recombination refers to combining each collected network packet into an analyzable packet, and filter application refers to a setting that determines whether to analyze the packet using preset filtering rules. . Accordingly, if the packet is a target for analysis, it can be distributed to an appropriate analysis process.
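The reassemble, filter, and distribute sequence described above can be illustrated with a minimal Python sketch. The `Segment` fields, the rule of filtering on the server port, and the hash-based assignment to analysis processes are all assumptions made for illustration; the source does not specify the filtering rules or the distribution policy.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One captured network segment (all field names are illustrative)."""
    flow: tuple      # (client_ip, client_port, server_ip, server_port)
    seq: int         # position of this segment within its flow
    payload: bytes


def reassemble(segments):
    """Join the segments of each flow, in sequence order, into one message."""
    by_flow = defaultdict(list)
    for seg in segments:
        by_flow[seg.flow].append(seg)
    return {
        flow: b"".join(s.payload for s in sorted(segs, key=lambda s: s.seq))
        for flow, segs in by_flow.items()
    }


def should_analyze(flow, allowed_server_ports):
    """Hypothetical preset rule: analyze only traffic to the listed server ports."""
    return flow[3] in allowed_server_ports


def distribute(messages, allowed_server_ports, num_processes):
    """Place each message that passes the filter on one analysis-process queue."""
    queues = [[] for _ in range(num_processes)]
    for flow, data in messages.items():
        if should_analyze(flow, allowed_server_ports):
            queues[hash(flow) % num_processes].append((flow, data))
    return queues
```

Keying the assignment on the flow keeps every message of one connection on the same analysis process, which is one plausible way to preserve per-session ordering.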

단계(S2000)에서, 패킷을 분석할 수 있다. 일 실시 예에서, 패킷의 기반이 되는 프로토콜의 종류에 따라 패킷을 분석할 수 있다. 보다 상세하게는, 패킷이 RFC(Remote Function Call) 프로토콜 기반인지, GUI/SNC 프로토콜 기반인지, HTTP/HTTPS 기반인지에 따라 패킷을 분석할 수 있다. 이때, 데이터 관리 방법은 이용 가능한 컴퓨팅 서버의 코어 수 및 트랜잭션 양을 고려하여 프로세스의 개수를 설정할 수 있다. 자세한 설명은 후술하도록 한다. In step S2000, the packet may be analyzed. In one embodiment, the packet may be analyzed according to the type of protocol on which it is based. More specifically, the packet may be analyzed depending on whether it is based on the RFC (Remote Function Call) protocol, the GUI/SNC protocol, or HTTP/HTTPS. At this time, the data management method may set the number of processes in consideration of the number of available cores of the computing server and the transaction volume. A detailed description is given later.
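As a rough illustration of setting the number of processes from the available cores and the transaction volume, the following Python sketch caps the process count at the core count. The function name and the `transactions_per_process` capacity figure are hypothetical; the source does not give a formula.

```python
import os


def choose_process_count(available_cores, transactions_per_second,
                         transactions_per_process=500):
    """Pick a number of analysis processes.

    transactions_per_process is a hypothetical capacity figure, not a value
    taken from the source. The result is at least 1 and never exceeds the
    number of available cores.
    """
    # Ceiling division: processes needed to absorb the transaction volume.
    needed = -(-transactions_per_second // transactions_per_process)
    return max(1, min(available_cores, needed))


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print(choose_process_count(cores, transactions_per_second=1800))
```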

단계(S3000)에서, 데이터를 모니터링할 수 있다. 일 실시 예에서, 분석된 패킷에 포함된 데이터를 모니터링할 수 있다. 또한, 저장된 데이터를 실시간 모니터링하여 이상행위 이벤트를 감지하고, 감지된 이벤트에 대한 경고를 발생시킬 지 여부를 결정할 수 있다. In step S3000, data can be monitored. In one embodiment, data included in the analyzed packet may be monitored. In addition, it is possible to monitor stored data in real time to detect abnormal behavior events and decide whether to issue a warning for the detected event.

이를 위하여, 일 실시 예에서, 데이터 관리 방법은 분석된 패킷으로부터 데이터를 추출하고, 추출된 데이터를 저장할 수 있다. 데이터 관리 방법은 데이터에 포함된 정보에 따라 다른 방식으로 추출된 데이터를 저장할 수 있다. 예를 들어, 데이터에 개인정보가 저장되어 있는 경우, 개인정보를 추출한 뒤 개인정보를 암호화한 뒤 추출된 데이터를 저장할 수 있다. 또한, 데이터 관리 방법은 패킷 내의 중요 데이터만을 추출함으로써 중요 필드의 검색을 용이하게 하고, 결과적으로 데이터 관리 방법의 성능을 높일 수 있다. To this end, in one embodiment, the data management method may extract data from the analyzed packet and store the extracted data. Data management methods can store extracted data in different ways depending on the information contained in the data. For example, if personal information is stored in the data, you can extract the personal information, encrypt the personal information, and then store the extracted data. Additionally, the data management method facilitates searching for important fields by extracting only important data from packets, and as a result, the performance of the data management method can be improved.

도 5는 본 발명의 데이터 관리 방법의 일 실시 예를 개시하는 도면이다. Figure 5 is a diagram disclosing an embodiment of the data management method of the present invention.

단계(S1010)에서, 패킷을 수집할 수 있다. 일 실시 예에서, 패킷을 수집하는 방법은 에이전트(agent) 방식의 클라우드 기반의 패킷 수집 방법 또는 패킷 미러 방식의 패킷 수집 방법을 이용할 수 있다. 자세한 설명은 후술하도록 한다. In step S1010, packets may be collected. In one embodiment, a method of collecting packets may use an agent-type cloud-based packet collection method or a packet mirror-type packet collection method. A detailed explanation will be provided later.

단계(S1020)에서, 수집된 패킷에 필터를 적용할 수 있다. 보다 상세하게는, 본 발명의 데이터 관리 방법은 수집된 패킷을 재조합하고 필터링하여 분석 프로세스로 분배할 수 있다. In step S1020, a filter may be applied to the collected packets. More specifically, the data management method of the present invention can reassemble and filter the collected packets and distribute them to the analysis process.

단계(S1030)에서, 필터가 적용된 패킷을 분석할 수 있다. In step S1030, the packet to which the filter is applied can be analyzed.

여기에서, 패킷의 기반이 되는 프로토콜의 종류에 따라 다르게 분석할 수 있다. 보다 상세하게는, 단계(S1031)에서, RFC(Remote Function Call) 프로토콜 기반의 패킷을 분석하고, 단계(S1032)에서, GUI 프로토콜 기반의 패킷을 분석하고, 단계(S1033)에서, HTTP/HTTPS 기반의 패킷을 분석할 수 있다. Here, the packet may be analyzed differently depending on the type of protocol on which it is based. More specifically, packets based on the RFC (Remote Function Call) protocol are analyzed in step S1031, packets based on the GUI protocol are analyzed in step S1032, and packets based on HTTP/HTTPS are analyzed in step S1033.
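The per-protocol branching of steps S1031 to S1033 amounts to a dispatch table. The Python sketch below assumes the protocol label has already been determined upstream; the analyzer functions are placeholders that only report the step and payload size, since the actual parsing logic is not given in the source.

```python
def analyze_rfc(payload):
    """Placeholder for step S1031 (RFC protocol-based analysis)."""
    return {"step": "S1031", "protocol": "RFC", "size": len(payload)}


def analyze_gui(payload):
    """Placeholder for step S1032 (GUI protocol-based analysis)."""
    return {"step": "S1032", "protocol": "GUI", "size": len(payload)}


def analyze_http(payload):
    """Placeholder for step S1033 (HTTP/HTTPS-based analysis)."""
    return {"step": "S1033", "protocol": "HTTP/HTTPS", "size": len(payload)}


# Hypothetical protocol labels mapped to the analyzer for that protocol.
ANALYZERS = {
    "RFC": analyze_rfc,
    "GUI": analyze_gui,
    "HTTP": analyze_http,
    "HTTPS": analyze_http,
}


def analyze_packet(protocol, payload):
    """Route a packet to its analyzer; return None for unknown protocols."""
    analyzer = ANALYZERS.get(protocol.upper())
    return analyzer(payload) if analyzer else None
```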

단계(S1031)에서, RFC 프로토콜 기반의 패킷 분석은 네트워크에서 RFC 프로토콜을 사용하는 패킷을 분석하는 과정을 의미한다. RFC는 분산 환경에서 서로 다른 시스템 또는 컴퓨터 간에 함수 호출을 수행하기 위한 프로토콜과 메커니즘으로, RFC는 클라이언트와 서버 모델을 기반으로 작동하며 클라이언트가 원격 시스템에 있는 서버의 함수를 호출하여 원격에서 실행할 수 있도록 한다. 즉, RFC 프로토콜은 원격 함수 호출을 위한 통신 프로토콜이기 때문에 RFC 프로토콜을 사용하는 패킷은 이러한 원격 함수 호출에 대한 정보를 담고 있다. In step S1031, RFC protocol-based packet analysis refers to the process of analyzing packets on the network that use the RFC protocol. RFC is a protocol and mechanism for performing function calls between different systems or computers in a distributed environment. It operates on a client-server model and allows a client to call a function of a server on a remote system so that the function is executed remotely. In other words, because the RFC protocol is a communication protocol for remote function calls, packets using the RFC protocol contain information about those remote function calls.

단계(S1032)에서, GUI 프로토콜 기반의 패킷 분석은 클라이언트와 애플리케이션 서버 간에 통신하는 GUI 프로토콜 기반의 패킷을 수집하여, 패킷에 포함된 클라이언트 IP 또는 Port, 서버 IP 또는 Port, 패킷 데이터(byte stream) 등을 추출하는 방식이다. In step S1032, GUI protocol-based packet analysis collects the GUI protocol-based packets exchanged between the client and the application server and extracts the client IP or port, the server IP or port, the packet data (byte stream), and the like contained in the packets.

단계(S1033)에서, HTTP/HTTPS 기반의 패킷 분석은 웹 브라우저와 서버간 송수신하는 패킷을 미러링하거나 SSL 프로세싱하여 패킷에 포함된 데이터를 추출하는 방식이다. In step S1033, HTTP/HTTPS-based packet analysis is a method of extracting data included in packets by mirroring or SSL processing packets transmitted and received between a web browser and a server.

각각에 대한 자세한 분석 방법은 후술하도록 한다. Detailed analysis methods for each will be described later.

단계(S1040)에서, 분석된 패킷으로부터 개인정보 메타데이터를 활용하여 개인정보를 추출할 수 있다. 일 실시 예에서, 분석된 패킷에 개인정보가 포함되어 있는지 확인하기 위하여, 저장된 개인정보 메타데이터를 사용하여 개인정보를 추출할 수 있다. 이에 대하여는, 후술하도록 한다. In step S1040, personal information can be extracted from the analyzed packet using personal information metadata. In one embodiment, in order to check whether the analyzed packet contains personal information, the personal information may be extracted using stored personal information metadata. This will be described later.
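One way to picture extraction driven by stored personal information metadata is a table of named patterns applied to the analyzed text. In the Python sketch below, the metadata is modeled as regular expressions; the two patterns (email address and hyphenated phone number) are illustrative assumptions, not the metadata format used by the invention.

```python
import re

# Hypothetical personal information metadata: item name -> detection pattern.
PERSONAL_INFO_METADATA = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b\d{2,3}-\d{3,4}-\d{4}\b"),
}


def extract_personal_info(text, metadata=PERSONAL_INFO_METADATA):
    """Return every metadata item found in the text, keyed by item name."""
    found = {}
    for name, pattern in metadata.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


def count_personal_info(found):
    """Total number of personal information values that were extracted."""
    return sum(len(values) for values in found.values())
```

The count returned by `count_personal_info` corresponds to the kind of per-packet personal information count that can be stored with the log record.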

단계(S1050)에서, 로그를 저장할 수 있다. 이때, 분석된 패킷에 포함된 유의미한 정보는 로그로 저장될 수 있다. 일 실시 예에서, 데이터 관리 방법은 분석된 정보를 감사 로그(Audit log)에 저장할지 여부를 결정할 수 있다. 또한, 개인정보가 포함된 경우, 개인정보는 패턴화 되어 데이터베이스에 저장될 수 있다. 마지막으로, 데이터 관리 방법은 로그(log) 저장 속도를 높이기 위해 멀티 쓰레딩(multithreading) 방식으로 동작하며, 일시적으로 데이터베이스에 접근이 되지 않는 경우에 대비하여 메모리 큐잉(queuing) 및 파일 큐잉을 수행할 수 있다. In step S1050, the log may be stored. At this time, meaningful information contained in the analyzed packet may be stored as a log. In one embodiment, the data management method may determine whether to store the analyzed information in an audit log. Additionally, if personal information is included, the personal information may be patterned and stored in the database. Finally, the data management method operates in a multithreaded manner to increase log storage speed, and may perform memory queuing and file queuing in case the database is temporarily inaccessible.
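The combination of multithreaded storage with memory queuing and file queuing can be sketched in Python as follows. This is only an illustration under stated assumptions: `store` stands in for the database write, a `ConnectionError` stands in for "database temporarily inaccessible", and the spool file holds one JSON record per line. None of these names come from the source.

```python
import json
import queue
import threading


class LogWriter:
    """Worker threads drain a memory queue; when the database write fails,
    the record is appended to a spool file (file queuing) instead."""

    def __init__(self, store, spool_path, workers=2):
        self._store = store                # callable writing one record to the database
        self._spool_path = spool_path      # fallback file used while the database is down
        self._queue = queue.Queue()        # memory queue shared by all workers
        self._spool_lock = threading.Lock()
        self._threads = [
            threading.Thread(target=self._run, daemon=True) for _ in range(workers)
        ]
        for thread in self._threads:
            thread.start()

    def submit(self, record):
        """Enqueue a log record without waiting for the database."""
        self._queue.put(record)

    def _run(self):
        while True:
            record = self._queue.get()
            try:
                if record is None:         # shutdown marker
                    return
                try:
                    self._store(record)
                except ConnectionError:
                    with self._spool_lock, open(self._spool_path, "a", encoding="utf-8") as f:
                        f.write(json.dumps(record) + "\n")
            finally:
                self._queue.task_done()

    def close(self):
        """Wait for pending records, then stop the worker threads."""
        self._queue.join()
        for _ in self._threads:
            self._queue.put(None)
        for thread in self._threads:
            thread.join()
```

A complete implementation would also replay the spool file once the database becomes reachable again; that step is omitted here.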

단계(S1060)에서, 저장된 로그를 이용하여 이상행위를 감지할 수 있다. 이때, 이상행위가 감지된 경우, 이상행위 감지에 대한 정보를 기록한 새로운 로그를 생성하여 다시 단계(S1050)를 통해 로그를 저장할 수 있다. 또한, 데이터 관리 방법은 사용자로부터 감사 로그(Audit Log)를 요청받는 경우, 수집된 데이터의 화면을 재구현할 수 있다. 보다 상세하게는, GUI 프로토콜은 그래픽 사용자 인터페이스를 표시하고 상호 작용하는데 사용되는 프로토콜로, 이러한 프로토콜을 분석함으로써 사용자의 작업 흐름, 입력, 출력 등을 시각적으로 이해할 수 있으며 시스템의 동작 상태를 파악하고 문제를 진단하는데 도움을 줄 수 있다. 이에 대하여는 이하의 도면에서 자세히 설명하도록 한다. In step S1060, abnormal behavior can be detected using the stored log. At this time, if abnormal behavior is detected, a new log recording information about the abnormal behavior detection can be created and the log can be saved again through step S1050. Additionally, the data management method can recreate the screen of the collected data when an audit log is requested from the user. More specifically, the GUI protocol is a protocol used to display and interact with graphical user interfaces. By analyzing these protocols, you can visually understand the user's workflow, input, output, etc., identify the operating state of the system, and solve problems. It can help in diagnosing. This will be explained in detail in the drawings below.
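The loop in which a detected abnormal behavior becomes a new log that is stored again through step S1050 can be sketched as below. The detection rule (repeated login failures per account), the threshold, and the record keys are hypothetical examples; the source does not define the detection criteria.

```python
from collections import Counter


def detect_login_failures(logs, threshold=3):
    """Return new log records for accounts with repeated login failures.

    The rule and the threshold are illustrative assumptions. The returned
    records are meant to be handed back to the log storage step.
    """
    failures = Counter(
        log["account"] for log in logs if log.get("event") == "login_failure"
    )
    return [
        {
            "event": "anomaly_detected",
            "reason": "repeated_login_failure",
            "account": account,
            "count": count,
        }
        for account, count in failures.items()
        if count >= threshold
    ]
```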

도 6은 본 발명의 패킷 수집 방법의 일 실시 예를 개시하는 도면이다. Figure 6 is a diagram illustrating an embodiment of the packet collection method of the present invention.

본 도면은 에이전트(agent) 방식의 클라우드 기반의 패킷 수집 방법에 대하여 설명한다. 보다 상세하게는, 일 실시 예는 적어도 하나 이상의 클라이언트(1000a, 1000b, 1000c), 네트워크(1001), 클라우드 AP(1010), 데이터 관리 플랫폼(10000, 20000)을 포함한다. This diagram explains an agent-based cloud-based packet collection method. More specifically, one embodiment includes at least one client (1000a, 1000b, 1000c), a network (1001), a cloud AP (1010), and a data management platform (10000, 20000).

적어도 하나의 클라이언트(1000a, 1000b, 1000c)는 수집 대상 패킷을 생성하는 기기로, 예를 들어, 서버, 노트북, 스마트폰 등을 포함할 수 있다. At least one client (1000a, 1000b, 1000c) is a device that generates packets to be collected and may include, for example, a server, a laptop, a smartphone, etc.

네트워크(1001)는 패킷 수집 대상 네트워크로, 클라이언트(1000a, 1000b, 1000c)는 네트워크(1001)에 연결되어 있을 수 있다. The network 1001 is a packet collection target network, and clients 1000a, 1000b, and 1000c may be connected to the network 1001.

클라우드 AP(1010)는 적어도 하나의 AP(1011, 1012, 1013, 1014)를 포함할 수 있다. 여기에서, 적어도 하나의 AP(1011, 1012, 1013, 1014)는 패킷 수집 에이전트와 연결되고, 패킷 수집 에이전트는 AP(1011, 1012, 1013, 1014)에서 생성된 모든 패킷을 수집하고 처리할 수 있다. The cloud AP 1010 may include at least one AP (1011, 1012, 1013, 1014). Here, the at least one AP (1011, 1012, 1013, 1014) is connected to a packet collection agent, and the packet collection agent can collect and process all packets generated by the APs (1011, 1012, 1013, 1014).

보다 상세하게는, 패킷 수집 에이전트는 수집된 패킷을 인식할 수 있는 형태로 필터링하고 추출하여 압축하고 암호화할 수 있다. 패킷 수집 에이전트는 AP(1011, 1012, 1013, 1014)와 연결되고, 수집된 패킷을 처리하고 데이터 관리 플랫폼(10000, 20000)으로 전송할 수 있다. More specifically, the packet collection agent can filter, extract, compress, and encrypt the collected packets into a recognizable form. The packet collection agent may be connected to the AP (1011, 1012, 1013, 1014), process the collected packets, and transmit them to the data management platform (10000, 20000).
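The agent-side sequence of filtering, compressing, and encrypting before transmission might look like the Python sketch below. The packet dictionaries, the port-based filter, and the JSON serialization are assumptions for illustration. Because the Python standard library has no symmetric cipher, encryption is passed in as a caller-supplied `encrypt` callable rather than implemented here.

```python
import json
import zlib


def prepare_batch(packets, wanted_ports, encrypt):
    """Filter, serialize, compress, then encrypt a batch of collected packets.

    `encrypt` is a caller-supplied function (bytes -> bytes); which cipher
    the agent uses is not specified in the source.
    """
    selected = [p for p in packets if p["server_port"] in wanted_ports]
    raw = json.dumps(selected).encode("utf-8")
    return encrypt(zlib.compress(raw))


def open_batch(blob, decrypt):
    """Platform side: decrypt, decompress, and parse a received batch."""
    return json.loads(zlib.decompress(decrypt(blob)).decode("utf-8"))
```

Compressing before encrypting is the usual order, since encrypted data no longer contains the redundancy that compression relies on.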

데이터 관리 플랫폼(10000, 20000)은 수집된 패킷을 수신하고, 패킷을 분석하고 저장하며, 분석 모듈을 통해 데이터를 분석할 수 있다. The data management platform (10000, 20000) can receive collected packets, analyze and store the packets, and analyze the data through an analysis module.

이러한 방식으로 클라우드 지원(agent 방식) 기반의 패킷 수집 방법은 높은 확장성과 유연성을 제공하며, 다양한 네트워크 환경에서 적용할 수 있다. 또한, 클라우드 환경에서 데이터를 처리하므로 하드웨어나 소프트웨어 업그레이드나 유지보수가 용이하며, 안정적인 서비스를 제공할 수 있다. In this way, the cloud-supported (agent method)-based packet collection method provides high scalability and flexibility and can be applied in various network environments. In addition, since data is processed in a cloud environment, hardware and software upgrades and maintenance are easy, and stable services can be provided.

본 도면과 같은 에이전트 방식의 클라우드 기반의 패킷 수집 방법은 클라우드를 사용하는 경우 적용이 가능하다. The agent-based cloud-based packet collection method as shown in this figure can be applied when using the cloud.

도 7은 본 발명의 패킷 수집 방법의 다른 일 실시 예를 개시하는 도면이다. Figure 7 is a diagram illustrating another embodiment of the packet collection method of the present invention.

본 도면은 패킷 미러 방식의 패킷 수집 방법에 대하여 설명한다. 보다 상세하게는, 일 실시 예는 적어도 하나 이상의 클라이언트(1000a, 1000b, 1000c), 네트워크(1001), 전용선/VPN(1001a), CSP(1015) 및 데이터 관리 플랫폼(10000, 20000)을 포함한다. 상술한 내용과 중복되는 설명은 생략하도록 한다. This figure describes a packet collection method using packet mirroring. More specifically, one embodiment includes at least one client (1000a, 1000b, 1000c), a network (1001), a dedicated line/VPN (1001a), a CSP (1015), and a data management platform (10000, 20000). Descriptions that overlap with the above are omitted.

네트워크(1001)는 전용선/VPN(1001a)에 연결되며, 전용선/VPN(1001a)은 데이터 관리 플랫폼(10000, 20000)과 CSP(Cloud Service Provider, 클라우드 서비스 제공자, 1015) 각각에 연결될 수 있다. 여기에서, CSP는 예를 들어, AWS(Amazon Web Services), GCP(Google Cloud Platform), Azure 와 같은 클라우드 서비스 제공자를 의미한다. The network 1001 is connected to a dedicated line/VPN 1001a, and the dedicated line/VPN 1001a may be connected to each of the data management platform (10000, 20000) and a CSP (Cloud Service Provider) 1015. Here, CSP refers to a cloud service provider such as AWS (Amazon Web Services), GCP (Google Cloud Platform), or Azure.

패킷 미러 방식을 통하여 데이터 관리 플랫폼(10000, 20000)은 네트워크 트래픽을 복제하여 전용선/VPN(1001a)로부터 패킷을 전달받고, 동일한 패킷은 CSP(1015)에 전달될 수 있다. Through the packet mirror method, the data management platforms (10000, 20000) replicate network traffic to receive packets from the dedicated line/VPN (1001a), and the same packets can be delivered to the CSP (1015).

이 방법을 통하여 데이터 관리 플랫폼(10000, 20000)은 수집된 모든 패킷을 캡처하고 분석할 수 있다. Through this method, the data management platform (10000, 20000) can capture and analyze all collected packets.

본 도면과 같은 패킷 미러 방식의 패킷 수집 방법은 CSP, 자체 클라우드, 애플리케이션 서버에서 제공하는 프라이빗 클라우드(예를 들어, SAP PCE)를 사용하는 경우 적용이 가능하다. The packet mirror method of packet collection as shown in this figure can be applied when using a CSP, its own cloud, or a private cloud (for example, SAP PCE) provided by an application server.

본 발명은 상술한 바와 같이 수집된 패킷을 분석할 수 있다. 구체적으로 수집된 패킷은 RFC 프로토콜 기반, GUI 프로토콜 기반, HTTP/HTTPS 기반 중 어느 하나에 해당할 수 있다. 이에 따라, 본 발명의 일 실시 예는 수집된 패킷을 다른 방식으로 분석하고, 저장할 수 있다. 이에 대하여 자세히 설명하도록 한다. The present invention can analyze the collected packets as described above. Specifically, the collected packets may correspond to one of RFC protocol-based, GUI protocol-based, and HTTP/HTTPS-based. Accordingly, an embodiment of the present invention can analyze and store collected packets in different ways. This will be explained in detail.

도 8은 본 발명의 패킷 수집 방법에 따라 수집한 RFC 프로토콜 기반의 패킷 분석 데이터의 일 실시 예를 개시하는 도면이다. Figure 8 is a diagram illustrating an example of RFC protocol-based packet analysis data collected according to the packet collection method of the present invention.

데이터 관리 플랫폼은 프로토콜을 기반으로 한 패킷을 통하여 본 도면에서 도시하는 다양한 데이터를 추출할 수 있다. 본 도면에서는, RFC 프로토콜 기반의 패킷 분석 데이터를 예로 들어 설명하나, RFC 프로토콜 기반의 패킷이 이하의 모든 정보를 포함하는 것은 아니다. 특히, 후술하는 (6) 및 (8)에 포함된 정보는 GUI 프로토콜 기반의 패킷 분석 데이터에 해당한다. The data management platform can extract various data shown in this figure through protocol-based packets. In this drawing, packet analysis data based on the RFC protocol is used as an example, but the packet based on the RFC protocol does not include all of the information below. In particular, the information included in (6) and (8) described later corresponds to packet analysis data based on the GUI protocol.

보다 상세하게는, 데이터 관리 플랫폼은 RFC 구조 정보를 이용하여 패킷을 분석하고 데이터를 추출할 수 있다. 이때, RFC 구조 정보가 없는 경우, 데이터 관리 플랫폼은 RFC 정보 요청 파일을 생성하여, RFC 정보를 요청할 수 있다. 자세한 설명은 후술하도록 한다. More specifically, the data management platform can use RFC structure information to analyze packets and extract data. At this time, if there is no RFC structure information, the data management platform can request RFC information by creating an RFC information request file. A detailed explanation will be provided later.

(1) 시간(요청 시간, 세션 정보의 시작 시간을 포함한다.) (1) Time (including the request time and the start time of the session information)

(2) 프로토콜 정보(RFC 프로토콜, GUI 프로토콜, HTTP/HTTPS 등을 포함한다.) (2) Protocol information (including RFC protocol, GUI protocol, HTTP/HTTPS, etc.)

(3) 서버 IP 또는 Port (3) Server IP or port

(4) 클라이언트 IP 또는 Port(클라이언트가 접속한 지역(region)을 포함한다.) (4) Client IP or port (including the region from which the client connects)

(5) 계정(사용자 ID, 사번, 조직이름 등을 포함한다.) (5) Account (including user ID, employee number, organization name, etc.)

(6) 트랜잭션명(트랜잭션 코드(T-code)를 포함한다. 여기서 트랜잭션 코드는 트랜잭션을 명시하는 일종의 단축코드로써 논리적인 업무 프로세스이다.) (6) Transaction name (including the transaction code (T-code). Here, the transaction code is a kind of short code that identifies a transaction, which is a logical business process.)

(7) RFC 함수 명 (7) RFC function name

(8) 이벤트 정보(이벤트는 로그인(성공/실패), 개인정보 입력, 특정 T-Code 접근 여부, SAP ALL 접속 여부, 공통계정 접속 여부, 사용자 오류, 파일 다운로드, 프로그램 덤프, 사용자 암호 변경 여부, 메타 데이터 수정 여부, 중복접속 조회 여부, 특정 GL 계정(GUI 프로토콜의 경우, 계좌(account) 정보) 누적 조회 여부 등을 포함한다.) (8) Event information (events include login (success/failure), personal information input, access to a specific T-Code, SAP ALL access, common account access, user error, file download, program dump, user password change, metadata modification, duplicate-access lookup, cumulative lookup of a specific GL account (account information, in the case of the GUI protocol), and the like.)

(9) 개인정보 개수 (9) Number of personal information items

In one embodiment, the data management platform can determine the task that a packet is intended to perform from the transaction name and the event information.
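
As a non-limiting illustration, the nine items above could be held in a single record type. The sketch below is written in Python; the class name `PacketRecord`, its field names, the helper `describe_task`, and the sample values are all hypothetical and do not appear in the source. Fields (6) and (7) are optional because, as stated above, no single protocol supplies every item.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PacketRecord:
    """Hypothetical container for items (1) to (9) of one analyzed packet."""
    request_time: str                          # (1) time
    protocol: str                              # (2) e.g. "RFC", "GUI", "HTTP", "HTTPS"
    server_endpoint: str                       # (3) server IP or port
    client_endpoint: str                       # (4) client IP or port
    account: str                               # (5) user ID, employee number, ...
    transaction_name: Optional[str] = None     # (6) present for GUI packets
    rfc_function: Optional[str] = None         # (7) present for RFC packets
    events: List[str] = field(default_factory=list)  # (8) event information
    personal_info_count: int = 0               # (9) number of personal information items


def describe_task(record: PacketRecord) -> str:
    """Combine the transaction (or RFC function) name with the events."""
    name = record.transaction_name or record.rfc_function or "unknown"
    events = ", ".join(record.events) if record.events else "no events"
    return f"{name}: {events}"


example = PacketRecord(
    request_time="2023-07-04T09:00:00",
    protocol="GUI",
    server_endpoint="10.0.0.1:3200",
    client_endpoint="10.0.0.9:50123",
    account="user01",
    transaction_name="SE16",        # illustrative T-code value only
    events=["file_download"],
)
print(describe_task(example))
```

Keeping the transaction name and the event list on the same record is what allows the intended task to be judged from both at once.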

Figure 9 is a diagram illustrating an embodiment of an apparatus for analyzing GUI protocol-based packets collected according to the packet collection method of the present invention.

This figure describes an embodiment in which packets collected according to the packet collection method of the present invention are analyzed when they are based on the GUI protocol. One embodiment may include a user/client 1000 (hereinafter, the client), a network device 1001, an application server 1002, a storage/database 1003, and a computing server 1004. Descriptions that overlap with the configuration described above are omitted.

In one embodiment, GUI protocol-based packet analysis can be performed using the components described above. This requires capturing and analyzing the GUI protocol-based traffic generated by the client 1000, which involves the following operations.

The client 1000 can use the network device 1001 to run an application corresponding to the application server 1002 and log in to the application system. The client 1000 can perform the necessary tasks through the application. The client 1000 can display on its screen information about the input data it enters or the output data received from the application server 1002.

The network device 1001 can capture the network traffic between the client 1000 and the application server 1002. In particular, the network device 1001 may include a TAP. For example, when the application server 1002 is an SAP production server, this makes it possible to log the GUI protocol-based packets that SAP application users, SAP EP servers, and legacy systems exchange with the SAP server. The network device 1001 may also include a switch. Through port mirroring on the switch that carries traffic to and from the SAP production server, the SAP server can be audited internally and the personal information used on the SAP server can be monitored.

GUI protocol-based traffic may be generated at the application server 1002.

The data management platform 10000 or the data management software 20000 can capture the traffic generated at the application server 1002 and analyze the captured traffic. The traffic can be collected and analyzed through the collection module 20001 and the analysis module 20002 in the data management platform 10000 or the data management software 20000. This makes it possible to understand and analyze the structure and content of the GUI protocol. The application server 1002 can use the resources of the computing server 1004.

The storage/database 1003 can store the analysis results or information needed by the application server 1002.

Figure 10 is a diagram illustrating an embodiment of a method for analyzing GUI protocol-based packets collected according to the packet collection method of the present invention.

In step S2010, packets exchanged between the application displayed on the client screen and the application server can be collected. More specifically, the GUI protocol-based packets exchanged between the client and the application server can be analyzed, and the input/output data and the information needed for monitoring can be collected. Here, the input data may include a user command that the client enters using the client terminal.

When the client enters a user command as input data through the client terminal, the input data is transmitted to the application server over the network. In one embodiment, the input data can be monitored as it is transmitted to the application server.

More specifically, the data management platform can receive and analyze the input data. If the input packet contains a preset user command (for example, one that triggers a specific event), the data management platform can determine that access restriction is necessary. The data management platform can then control the client's access to the application server 1002 by means of a user command (this user command may be expressed in a language different from that of the user command contained in the input packet). One example of input data is data exchanged with an SAP application server through the SAP DIAG protocol.

In addition, the client terminal can display the data output by the application server. That is, the application server can look up the content of the transmitted input data and transmit output data containing the lookup results to the client terminal, so that the information is displayed on the screen of the client terminal. The output data produced by the application server may include at least one of server information, a transaction code (T-code), a program name, and a status message.

As with the input data, the data management platform can receive and analyze the output data. As with the input data, the data management platform can determine from the analyzed output data that access restriction is necessary, and can control access accordingly.

One example of output data is data exchanged between an SAP application server and the client terminal through the SAP DIAG protocol.
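
As a non-limiting illustration, the access-restriction decision described above for input data and output data might be sketched as follows. The preset command names, the preset T-code, and the function `needs_restriction` are hypothetical placeholders; the source does not enumerate which commands or codes are preset.

```python
from typing import Iterable, Optional

# Hypothetical preset values chosen only for this sketch.
RESTRICTED_COMMANDS = frozenset({"EXPORT_ALL", "CHANGE_PASSWORD"})
RESTRICTED_TCODES = frozenset({"ZSECRET"})


def needs_restriction(input_commands: Iterable[str],
                      output_tcode: Optional[str] = None) -> bool:
    """Return True if the input or the output data matches a preset rule."""
    # Input side: any preset user command found in the input data.
    if any(command in RESTRICTED_COMMANDS for command in input_commands):
        return True
    # Output side: a preset T-code found in the output data.
    return output_tcode in RESTRICTED_TCODES


print(needs_restriction(["OPEN", "EXPORT_ALL"]))
print(needs_restriction(["OPEN"], output_tcode="ZSECRET"))
```

The same decision function is applied to both directions, mirroring the statement that output data is handled in the same way as input data.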

In step S2020, it can be determined whether to analyze the collected packets. To this end, the present invention can perform filtering that decides whether the collected information is to be analyzed, thereby improving performance. If it is decided not to analyze the collected information, the process ends without performing the analysis.

In step S2030, the packets can be analyzed. In one embodiment, packet decoding and protocol analysis can be performed in order to control the security of the information contained in the packets. In one embodiment, the input data or output data can be analyzed in detail in terms of client IP or port, server IP or port, packet data (byte stream), and so on.

In step S2040, the data contained in the analyzed packets can be organized into session information and stored. Session information can be generated and stored using the analysis information obtained from the GUI protocol-based packets together with the input/output data. In one embodiment, the stored session information can be displayed on the client screen in a language that is easy for the user to understand.

That is, by analyzing the network connection information and the input/output packets, the input/output data, which is the meaningful information, can be output easily, so that the client can view a screen substantially similar in form to the GUI screen of the application server.

Because the client can thus easily reproduce a form substantially similar to the GUI screen, the application server information, transaction code (T-code), program name, status message, output data, and so on can be analyzed easily while being monitored on the screen.
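
As a non-limiting illustration, steps S2010 to S2040 can be sketched as a small pipeline. The packet dictionary layout, the filter rule in `should_analyze`, and the session fields are assumptions made for this sketch; actual GUI protocol decoding is not shown and is replaced by a stand-in that only records the endpoints and the payload size.

```python
from typing import Any, Dict, List

Packet = Dict[str, Any]


def should_analyze(packet: Packet) -> bool:
    """S2020: hypothetical filter that keeps only GUI packets with a payload."""
    return packet.get("protocol") == "GUI" and bool(packet.get("payload"))


def analyze(packet: Packet) -> Dict[str, Any]:
    """S2030: stand-in for decoding; records endpoints and payload size."""
    return {
        "client": packet["client"],
        "server": packet["server"],
        "payload_size": len(packet["payload"]),
    }


def run_pipeline(packets: List[Packet]) -> List[Dict[str, Any]]:
    """S2010 input (already collected packets) through S2040 (stored sessions)."""
    sessions: List[Dict[str, Any]] = []
    for packet in packets:
        if not should_analyze(packet):
            continue                      # S2020: end without analysis
        sessions.append(analyze(packet))  # S2040: store as session information
    return sessions


collected = [
    {"protocol": "GUI", "client": "10.0.0.9:50123",
     "server": "10.0.0.1:3200", "payload": b"abc"},
    {"protocol": "HTTP", "client": "10.0.0.9:50124",
     "server": "10.0.0.1:80", "payload": b"x"},
]
print(run_pipeline(collected))
```

Placing the filter before the decoder reflects the stated purpose of step S2020, which is to avoid spending analysis effort on packets that will not be used.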

Figure 11 is a diagram illustrating an embodiment of GUI protocol-based packet analysis data collected according to the packet collection method of the present invention.

In one embodiment, data on at least one of time, protocol information, server/client IP or port, account, transaction name, event information, and number of personal information items can be extracted from a GUI protocol-based packet. Each of these is defined as described above.

In one embodiment, the analysis result of a GUI protocol-based packet can be provided to the client in the form of a “screen reproduction”, as shown in this figure.

More specifically, the data management platform of the present invention can output the data contained in the analyzed packet to the client in a way that is easy for the user to understand.

Referring to this figure as an example, when a first user connects to the application server and enters “1175”, the data management platform reproduces, through the “Request Screen”, the screen on which the first user made that entry.

In addition, when the first user, after entering “1175”, receives output from the application server and displays it, the data management platform can reproduce, through the “Response Screen”, the output screen that the first user received from the application server.

In this way, a second user (for example, the first user's manager) can use the data management platform to view the screen containing the data entered by the first user and the screen containing the output data through the “Request Screen” and the “Response Screen”, respectively.

Figure 12 is a diagram illustrating an embodiment of a method for analyzing HTTP/HTTPS-based packets collected according to the packet collection method of the present invention.

HTTP (Hypertext Transfer Protocol) is one of the protocols for exchanging data over the Internet. Because HTTP communicates in plain text by default, anyone who can collect HTTP packets can read their contents. HTTP-based packets are therefore vulnerable from a security standpoint, and to compensate for this vulnerability packets are in some cases exchanged over HTTPS (HTTP Secure).

HTTPS is the secure version of HTTP and communicates using the SSL (Secure Sockets Layer) or TLS (Transport Layer Security) protocol. Because the SSL protocol transmits data in encrypted form, even if an arbitrary party collects the packets, the packets cannot be analyzed unless the SSL encryption is decrypted.

Accordingly, the present invention can perform SSL processing in order to analyze HTTPS-based packets. Here, SSL processing may include the process of encrypting and decrypting data in HTTPS communication. SSL processing can perform encryption key exchange between the client and the server, authentication, data encryption, decryption, and so on.

The method of analyzing HTTP/HTTPS-based packets within the packet collection method of the present invention is described in more detail as follows.

When the user runs a web browser 1030, the web browser 1030 can send a request to a web server over HTTP/HTTPS, receive the response from the web server, and display the data to the user.

A proxy server can relay traffic between the client and the web server (for example, an SAP server). When the web browser 1030 connects to the web server, requests and responses can be exchanged through the proxy server. This makes it possible to perform functions such as hiding the user's IP address or improving network performance through caching.

SSL processing can verify the SSL certificate of an HTTPS connection and perform key exchange and encryption/decryption.

The data management platform can extract and analyze HTTP/HTTPS packets, using the collection module and the analysis module included in the data management platform. If a collected packet is HTTP-based, it is analyzed directly; if it is HTTPS-based, it can be analyzed after SSL processing. HTTPS packets decrypted through SSL processing can be extracted at the web browser or the proxy server and analyzed.

The data management platform can store HTTP packets, or the packets produced by SSL processing, in a queue 1032, and can extract packets from the queue 1032 for analysis. Here, the size of the queue 1032 can be determined according to the bandwidth and processing capacity of the SSL processing. The queue 1032 may include a memory queue and a file queue.

The present invention can thus analyze HTTP/HTTPS-based packets. The method of performing SSL processing in order to analyze HTTPS-based packets is described in detail below.
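
As a non-limiting illustration, the routing described above (HTTP packets queued directly, HTTPS packets queued after SSL processing) might be sketched as follows. The function `enqueue_for_analysis` and the packet layout are hypothetical. The `decrypt` argument is a placeholder for SSL processing, which is not implemented here; the byte-reversing stand-in used in the example is not a cipher. Only a bounded in-memory queue is shown, and the file queue is omitted.

```python
import queue
from typing import Any, Callable, Dict


def enqueue_for_analysis(packet: Dict[str, Any],
                         decrypt: Callable[[bytes], bytes],
                         q: "queue.Queue[Dict[str, Any]]") -> None:
    """Queue a packet for analysis, decrypting HTTPS payloads first."""
    payload = packet["payload"]
    if packet["scheme"] == "https":
        payload = decrypt(payload)   # stand-in for SSL processing
    q.put({"scheme": packet["scheme"], "payload": payload})


# Queue size is an arbitrary value; the text ties it to SSL processing capacity.
analysis_queue: "queue.Queue[Dict[str, Any]]" = queue.Queue(maxsize=8)


def fake_decrypt(data: bytes) -> bytes:
    """Placeholder that merely reverses the bytes; not real decryption."""
    return data[::-1]


enqueue_for_analysis({"scheme": "http", "payload": b"GET /"},
                     fake_decrypt, analysis_queue)
enqueue_for_analysis({"scheme": "https", "payload": b"dcba"},
                     fake_decrypt, analysis_queue)

while not analysis_queue.empty():
    print(analysis_queue.get_nowait())
```

A bounded queue makes `put` block once the consumer falls behind, which is one simple way to couple the queue size to the processing capacity mentioned above.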

Figure 13 is a diagram illustrating an embodiment of HTTP/HTTPS-based packet analysis data collected according to the packet collection method of the present invention.

In one embodiment, data on at least one of time, protocol information, server/client IP or port, account, transaction name, event information, and number of personal information items can be extracted from an HTTP/HTTPS-based packet. Each of these is defined as described above.

In one embodiment, the analysis result of an HTTP/HTTPS-based packet can be provided to the client in the form of “data”, as shown in this figure.

When the data management platform analyzes an HTTP/HTTPS-based packet, it can extract data in HTML form or in function form. For HTML-form data, the data management platform can output the web page as it is; for function-form data, it can output the function as it is.

Figure 14 is a diagram illustrating an embodiment of extracting personal information from the data contained in an analyzed packet of the present invention.

In one embodiment, the data management platform 10000 can use personal information metadata to extract personal information from an audit log 1033 that includes log data and an index.

More specifically, the audit log 1033 is a collection of logs recording the events that occur on the network, and may include log data and an index. Here, the audit log 1033 may be a collection of log data for which indexing has been completed.

The data management platform 10000 can use personal information metadata to extract personal information from the audit log. The personal information metadata defines the form in which personal information is stored, the field in which it is stored, and the regular expression pattern and masking pattern used for each type of personal information. For example, metadata can be configured for personal information types such as name, address, and telephone number. The analysis module can then identify and extract personal information from the log data based on the personal information metadata.
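
As a non-limiting illustration, personal information metadata holding a regular expression pattern and a masking pattern per type might be applied to a log line as sketched below. The metadata layout, the two patterns, the masking strings, and the sample log line are assumptions for this sketch; the patterns are deliberately simple and are not offered as complete validators.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical metadata: one regular expression and one masking string per type.
PII_METADATA = {
    "phone": {
        "pattern": re.compile(r"\b01[016789]-\d{3,4}-\d{4}\b"),
        "mask": "***-****-****",
    },
    "email": {
        "pattern": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
        "mask": "***@***",
    },
}


def extract_and_mask(log_line: str) -> Tuple[str, Dict[str, List[str]]]:
    """Return the masked line and the personal information found per type."""
    found: Dict[str, List[str]] = {}
    masked = log_line
    for kind, meta in PII_METADATA.items():
        matches = meta["pattern"].findall(masked)
        if matches:
            found[kind] = matches
            masked = meta["pattern"].sub(meta["mask"], masked)
    return masked, found


line = "user=kim phone=010-1234-5678 mail=kim@example.com"
masked_line, items = extract_and_mask(line)
print(masked_line)
print(items)
```

The per-type counts needed for the “number of personal information items” field can be derived from the lengths of the lists in the returned dictionary.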

In one embodiment, the extracted personal information can be encrypted according to a method preset for each type of personal information and stored back in the audit log 1033 or in an internal database of the data management platform 10000. Details are described later.

In one embodiment, the data management platform 10000 can use the extracted personal information to search the log data for a specific period, and an administrator or user can run a search or review the retrieved log data when needed. This is described later.

Figure 15 is a diagram illustrating an embodiment in which the data contained in an analyzed packet of the present invention is visualized.

The data contained in a packet analyzed through the embodiments described above is as follows.

(1) Session information: start time, Duration Time, Log ID, Session ID, Context ID

(2) Connection information: server IP, server port, server MAC, client IP, client port, client MAC

(3) SAP information: SID, protocol, SAP instance, client

(4) Program information: OK Code, T-code, Title App, Title Main

(5) CUA (Central User Administration) information: CUA Name, CUA Status

(6) User information: user ID, user UID, user name

(7) Event information: event category, event code, event name, event description, event value, notification level

(8) User-defined information: event, warning, event type, number of events, number of warnings, architecture, presence of personal information, number of personal information types, number of personal information items

In one embodiment, the data management platform can extract the above data from the analyzed packet. If the data extracted from the packet contains personal information, the data management platform can encrypt the personal information before storing it. In particular, the data management platform can display personal information on a separate screen.

In this way, all of the items required by law and by certification (account information, access date and time, access location information, information on the data subjects processed, and tasks performed) can be collected. More specifically, the account information is determined from the user ID, employee number, and organization name in the user information; the access date and time from the start time in the session information; the access location information from the client IP or port in the connection information; the information on the data subjects processed from the user-defined information (presence of personal information, number of personal information items, resident registration number, alien registration number, passport information, card information, etc.); and the tasks performed from the program information and the user-defined information.
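
As a non-limiting illustration, the correspondence between the five required items and the extracted fields could be written as a lookup table and used to check whether a record is complete. The key names (`user_id`, `start_time`, `client_ip`, `personal_info_present`, `t_code`) and the function `missing_items` are hypothetical simplifications; each required item is reduced to a single representative field for brevity.

```python
from typing import Any, Dict, List

# Hypothetical mapping: required item -> record fields that evidence it.
REQUIRED_ITEMS: Dict[str, List[str]] = {
    "account": ["user_id"],
    "access_time": ["start_time"],
    "access_location": ["client_ip"],
    "data_subject": ["personal_info_present"],
    "task": ["t_code"],
}


def missing_items(record: Dict[str, Any]) -> List[str]:
    """List the required items that the record cannot evidence."""
    return [
        item
        for item, fields in REQUIRED_ITEMS.items()
        if not all(record.get(name) is not None for name in fields)
    ]


record = {
    "user_id": "user01",
    "start_time": "2023-07-04T09:00:00",
    "client_ip": "10.0.0.9",
    "personal_info_present": True,
}
print(missing_items(record))
```

A value of `False` for `personal_info_present` still counts as evidence, because the check only tests that the field was recorded.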

Figure 16 is a diagram illustrating an embodiment of searching the data contained in an analyzed packet of the present invention.

In one embodiment, the data management platform can search the data described above and provide the search results to the user.

To this end, as shown in (a) of this figure, the data management platform can provide a search interface so that the user can search easily. The data management platform can search the data by the fields of the analyzed data. For example, the data fields may include protocol, system, account, employee number, server IP, server port, client IP, client port, instance name, transaction name, program name, and so on. The data management platform can search the data by at least one of the fields entered by the user and output the search results.

In another embodiment, as shown in (b) of this figure, the data management platform can provide a query search function. For example, when the user enters protocol_type=GUI as a first query and server_ip=175.117.145.125 as a second query, the data management platform can search the data based on the first query and the second query.

In one embodiment, the search results can be output in the “screen reproduction” form or the “data” form described above.
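
As a non-limiting illustration, the query search using `protocol_type=GUI` and `server_ip=175.117.145.125` might be sketched as follows. Combining the queries with a logical AND and comparing values as exact strings are assumptions of this sketch, as are the function names `parse_query` and `search` and the sample records; the source states only that the search is based on both queries.

```python
from typing import Any, Dict, List, Tuple


def parse_query(query: str) -> Tuple[str, str]:
    """Split a 'field=value' query into its field name and value."""
    name, separator, value = query.partition("=")
    if not separator or not name.strip():
        raise ValueError(f"not a field=value query: {query!r}")
    return name.strip(), value.strip()


def search(records: List[Dict[str, Any]],
           queries: List[str]) -> List[Dict[str, Any]]:
    """Return the records that satisfy every query (assumed AND semantics)."""
    conditions = [parse_query(query) for query in queries]
    return [
        record
        for record in records
        if all(str(record.get(name)) == value for name, value in conditions)
    ]


records = [
    {"protocol_type": "GUI", "server_ip": "175.117.145.125", "account": "user01"},
    {"protocol_type": "RFC", "server_ip": "175.117.145.125", "account": "user02"},
    {"protocol_type": "GUI", "server_ip": "10.0.0.1", "account": "user03"},
]
print(search(records, ["protocol_type=GUI", "server_ip=175.117.145.125"]))
```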

Figure 17 is a diagram illustrating an embodiment of monitoring the data collected according to the data management method of the present invention.

As described above, the data management platform 10000 can collect and analyze packets from at least one client 1000a, 1000b, 1000c through the network 1001 and the TAP 1001b.

The collected packets can be classified and stored based on predefined events. In one embodiment, the data management platform 10000 can monitor the data collected from the analyzed packets.

More specifically, the data management platform 10000 can detect abnormal behavior in the collected data through the monitoring module 20005 and provide an alarm to the users of the clients 1000a, 1000b, 1000c. For example, the data management platform 10000 can report abnormal behavior in the collected data to the user through a dashboard, e-mail, SMS, and so on.

In one embodiment, when abnormal behavior is detected, the data management platform 10000 can control the connection of the clients 1000a, 1000b, 1000c. For example, when an event performed by a user connected through a client 1000a, 1000b, 1000c is judged to be abnormal behavior, the monitoring module 20005 can provide an alarm to the user through the dashboard or the like and at the same time block the connection of that client 1000a, 1000b, 1000c.

In particular, the data management platform 10000 can monitor the clients 1000a, 1000b, 1000c accessing the database 1003 containing archive files and generating events there, and can provide an alarm or control the access. Here, the database 1003 corresponds to storage located outside or inside the data management platform 10000. For information about events, refer to the description above.

In addition, even when no abnormal behavior is involved, the data management platform 10000 can control the access of the clients 1000a, 1000b, 1000c if a preset event occurs.

In addition, when data entered from a client 1000a, 1000b, 1000c is personal information, the data management platform 10000 can detect this and send a screen containing the entered data to a security officer, for example by e-mail.
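
As a non-limiting illustration, the monitoring behavior described above (alarm and block on abnormal behavior, block on a preset event even without abnormal behavior) might be sketched as follows. The `Monitor` class, the event dictionary layout, the abnormality rule, and the preset event name are hypothetical; alarms are collected in a list rather than sent to a dashboard, e-mail, or SMS.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class Monitor:
    """Hypothetical stand-in for a monitoring module."""
    is_abnormal: Callable[[Dict[str, Any]], bool]
    preset_events: Set[str]
    blocked: Set[str] = field(default_factory=set)
    alerts: List[str] = field(default_factory=list)

    def handle(self, event: Dict[str, Any]) -> None:
        abnormal = self.is_abnormal(event)
        if abnormal:
            # Alarm: recorded here instead of being delivered to a channel.
            self.alerts.append(f"abnormal: {event['name']} from {event['client']}")
        if abnormal or event["name"] in self.preset_events:
            # Access control: block the client that produced the event.
            self.blocked.add(event["client"])


monitor = Monitor(
    is_abnormal=lambda e: e.get("download_count", 0) > 100,  # illustrative rule
    preset_events={"archive_access"},                         # illustrative preset
)
monitor.handle({"name": "file_download", "client": "1000a", "download_count": 500})
monitor.handle({"name": "archive_access", "client": "1000b"})
monitor.handle({"name": "login", "client": "1000c"})
print(sorted(monitor.blocked), monitor.alerts)
```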

도 18은 본 발명의 데이터 관리 플랫폼이 수집된 패킷을 분배하는 실시 예를 설명하는 도면이다. Figure 18 is a diagram explaining an embodiment in which the data management platform of the present invention distributes collected packets.

일 실시 예에서, 데이터 관리 플랫폼은 수집 모듈(20001)을 통하여 패킷을 수집할 수 있다. In one embodiment, the data management platform may collect packets through collection module 20001.

보다 상세하게는, 수집 모듈(20001)은 네트워크(예를 들어, 상술한 NIC를 통하여)에서 패킷을 수집할 수 있다. 여기에서, 패킷 수집은 상술한 물리적인 장치 또는 가상 머신에서 실행되는 소프트웨어로 수행될 수 있다. 여기에서, 패킷은 프로토콜 헤더와 페이로드 데이터로 구성된 패킷을 포함할 수 있다. More specifically, collection module 20001 may collect packets from the network (e.g., via the NIC described above). Here, packet collection can be performed by software running on the physical device or virtual machine described above. Here, the packet may include a packet consisting of a protocol header and payload data.

In one embodiment of the present invention, the data management platform may collect packets for every action a user performs while accessing the application server. For example, an application user may access an SAP server using various protocols such as the SAP GUI protocol, the RFC protocol, and HTTP. An application user may also access the SAP server using the SAP SNC protocol, HTTPS, and the like. Each case is described in detail below.

The analysis module 20002 may distribute the packets collected through the collection module 20001 according to protocol. More specifically, the analysis module 20002 may reassemble fragmented packets and distribute them to a first analysis module 20002a, a second analysis module 20002b, and so on, based on TCP information, protocol, port, IP address, and the like. The first analysis module 20002a and the second analysis module 20002b are only examples; the data management platform may create a module for analyzing packets for each protocol.
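
The protocol-based distribution described above can be sketched as follows. This is a minimal illustration, not the implementation of the analysis module 20002: the `Packet` and `Dispatcher` names, the port table, and the choice of destination port as the only selector are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Packet:
    """A reassembled packet; the field names are hypothetical."""
    src_ip: str
    dst_ip: str
    dst_port: int
    payload: bytes


# Illustrative port-to-protocol table (an assumption; a real deployment
# would configure this and could also inspect TCP session information).
PORT_TO_PROTOCOL: Dict[int, str] = {80: "HTTP", 443: "HTTPS", 3200: "GUI", 3300: "RFC"}


class Dispatcher:
    """Routes each packet to the analyzer registered for its protocol."""

    def __init__(self) -> None:
        self._analyzers: Dict[str, Callable[[Packet], dict]] = {}
        self.unmatched: List[Packet] = []

    def register(self, protocol: str, analyzer: Callable[[Packet], dict]) -> None:
        self._analyzers[protocol] = analyzer

    def dispatch(self, packet: Packet) -> Optional[dict]:
        protocol = PORT_TO_PROTOCOL.get(packet.dst_port)
        analyzer = self._analyzers.get(protocol) if protocol else None
        if analyzer is None:
            # No analysis module exists for this protocol; keep the packet aside.
            self.unmatched.append(packet)
            return None
        return analyzer(packet)
```

Registering one callable per protocol mirrors the idea that an analysis module can be created for each protocol as needed.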

The first analysis module 20002a and the second analysis module 20002b may analyze the packets and store the data contained in the packets in the database 20007.

The database 20007 may store log data of the analyzed packets and, when necessary, provide search and query functions.

Hereinafter, the packet analysis method for each protocol is described in detail.

FIG. 19 is a diagram illustrating an embodiment in which the data management platform of the present invention analyzes packets based on the RFC protocol.

In one embodiment, the data management platform 10000 may exchange RFC information, and analysis and structure information for packets, with the application server 1002 through the collection module 20001 and the analysis module 20002. In the following, the data management platform 10000 may provide an API for supporting the functions performed by the collection module 20001 or the analysis module 20002.

The collection module 20001 may include an RFC packet collection unit 2001. The RFC packet collection unit 2001 may extract the RFC packets from the collected packets and pass them to the analysis module 20002.

The analysis module 20002 may include an RFC information request unit 2002 and an RFC packet analysis unit 2003. That is, the analysis module 20002 may analyze the RFC packets received from the collection module 20001. More specifically, when RFC structure information is present in the data management platform 10000, the RFC packets may be analyzed using that RFC structure information. When no RFC structure information is present in the data management platform 10000, the RFC information request unit 2002 may generate an RFC information request file. When the RFC information request unit 2002 detects an RFC information request file, it may request the RFC structure information from the application server 1002. In this way, the data management platform 10000 may request RFC information from the application server 1002 and receive the RFC structure information. The RFC packet analysis unit 2003 may then analyze the RFC packets using the RFC structure information received from the application server 1002.
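
The lookup-then-request flow described above can be sketched as follows. This is only an illustration under stated assumptions: the class and function names are hypothetical, the request file is modeled as a small JSON file, the `fetch` callable stands in for the request to the application server 1002, and the structure is simplified to fixed-width ASCII fields, which is not a description of the actual RFC wire format.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Optional, Tuple

# A structure is modeled here as an ordered list of (field name, byte width).
Structure = List[Tuple[str, int]]


class RfcStructureStore:
    """Local cache of RFC structures plus a directory of pending request files."""

    def __init__(self, request_dir: Optional[Path] = None) -> None:
        self._cache: Dict[str, Structure] = {}
        self._request_dir = request_dir or Path(tempfile.mkdtemp())

    def lookup(self, function_name: str) -> Optional[Structure]:
        return self._cache.get(function_name)

    def write_request(self, function_name: str) -> Path:
        """Create a request file marking that a structure is missing."""
        path = self._request_dir / f"{function_name}.request.json"
        path.write_text(json.dumps({"function": function_name}), encoding="utf-8")
        return path

    def process_requests(self, fetch: Callable[[str], Structure]) -> List[str]:
        """Detect request files, fetch each structure, cache it, remove the file."""
        handled = []
        for path in sorted(self._request_dir.glob("*.request.json")):
            name = json.loads(path.read_text(encoding="utf-8"))["function"]
            self._cache[name] = fetch(name)
            path.unlink()
            handled.append(name)
        return handled


def analyze_rfc_packet(
    store: RfcStructureStore,
    function_name: str,
    payload: bytes,
    fetch: Callable[[str], Structure],
) -> Dict[str, str]:
    """Parse a payload with the cached structure, requesting it first if absent."""
    structure = store.lookup(function_name)
    if structure is None:
        store.write_request(function_name)
        store.process_requests(fetch)
        structure = store.lookup(function_name)
    fields: Dict[str, str] = {}
    offset = 0
    for field_name, width in structure:
        fields[field_name] = payload[offset:offset + width].decode("ascii").strip()
        offset += width
    return fields
```

In this sketch the structure is fetched only on the first miss; later packets for the same function are parsed from the local cache.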

FIG. 20 is a diagram illustrating an embodiment in which the data management platform of the present invention analyzes packets.

In one embodiment, when the packets collected through the collection module 20001 are HTTP/HTTPS-based packets or GUI/SNC protocol-based packets, the data management platform 10000 may analyze them through the analysis module 20002. In this case, the data management platform 10000 may maintain a database 20007 that manages the certificates used to analyze the packets.

More specifically, the analysis module 20002 may include an HTTP packet analysis unit 2004, an HTTPS packet analysis unit 2005, a GUI packet analysis unit 2006, and an SNC packet analysis unit 2007.

Here, the HTTP packet analysis unit 2004 may analyze HTTP packets, the HTTPS packet analysis unit 2005 may analyze HTTPS packets, the GUI packet analysis unit 2006 may analyze GUI protocol-based packets, and the SNC packet analysis unit 2007 may analyze SNC protocol-based packets.

For HTTP packets and GUI protocol-based packets, whose security level is relatively low, the analysis module 20002 may analyze the packets according to the embodiment described above.

In contrast, HTTPS packets and SNC packets, whose security level is relatively high, may be analyzed after the encrypted communication is decrypted using the SSL certificate stored in the database 20007.

Accordingly, the data management platform 10000 may extract data from encrypted HTTPS-based packets in the same way as from HTTP-based packets. Likewise, the data management platform 10000 may extract data from encrypted SNC protocol-based packets in the same way as from GUI protocol-based packets.

The method of analyzing HTTPS packets is described later. The certificates used for HTTPS packets and for SNC packets may be the same certificate or different certificates.

FIG. 21 is a diagram illustrating an embodiment in which the data management platform of the present invention stores and monitors audit logs.

In one embodiment, the data management platform 10000 may generate an audit log 1033 by converting and processing the original data, extracted according to the embodiment described above, in accordance with defined field rules. The generated audit log 1033 may be stored in the database 20007.

In addition, the data management platform 10000 may extract personal information from the processed data using personal information metadata. Accordingly, the data management platform 10000 may store the audit log 1033 and the personal information separately in the database 20007. To this end, the analysis module 20002 may further include an audit log storage unit 2008.

In addition, the analysis module 20003 may further include a correlation analysis rule generation unit 2009. The data management platform 10000 may generate correlation analysis rules through the analysis module 20002. Here, a correlation rule is a rule for determining abnormal behavior. For example, the data management platform 10000 may judge behavior to be normal when one or two data items are input per second, and abnormal when ten or more data items are input per second. The term "correlation" here refers to correlation between logs. The data management platform 10000 may generate a further event (log) as a result of a correlation rule, and this log may be defined as an incident. That is, a subject who wishes to perform monitoring through the data management platform 10000 creates a correlation rule, and the data management platform 10000 analyzes logs based on that rule and generates an incident, which is itself another log.
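
The rate-based example above (ten or more inputs per second treated as abnormal) can be sketched as a sliding-window rule. This is a minimal sketch under assumptions: the `RateRule` name, the per-user grouping, the incident dictionary layout, and the requirement that timestamps arrive in non-decreasing order are all choices made for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class RateRule:
    """Flags a user whose input count within a time window reaches a threshold."""

    def __init__(self, threshold: int = 10, window_seconds: float = 1.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def observe(self, user: str, timestamp: float) -> Optional[dict]:
        """Record one input log; return an incident log if the rule fires."""
        window = self._events[user]
        window.append(timestamp)
        # Drop entries that fell out of the window (timestamps assumed ordered).
        while timestamp - window[0] >= self.window_seconds:
            window.popleft()
        if len(window) >= self.threshold:
            # The incident is itself another log derived from the input logs.
            return {"type": "incident", "user": user,
                    "count": len(window), "at": timestamp}
        return None
```

Returning the incident as a plain record reflects the description that an incident is another log produced from existing logs.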

In addition, a correlation analysis rule may correspond to a rule predefined by another platform. Accordingly, the data management platform 10000 may analyze the audit log 1033 using the generated correlation analysis rules and generate abnormal behavior events. Information about the generated correlation analysis rules and about the abnormal behavior events may be stored in the database 20007.

The monitoring module 20005 of the data management platform 10000 may further include an abnormal behavior monitoring unit 2010. The abnormal behavior monitoring unit 2010 may search or monitor the stored audit log 1033, personal information, and the like based on the generated abnormal behavior events.

Accordingly, the data management platform 10000 may extract personal information from the data that has been collected, analyzed, and stored, and may compile statistics on user behavior using the extracted personal information. Thereafter, when a violation occurs among the collected user actions, the data management platform 10000 may block the user or issue a warning to an administrator.

FIG. 22 is a diagram illustrating an embodiment in which the data management method of the present invention distributes collected packets.

In step S10010, the data management method may collect packets and apply a filter. The packets may be collected through network equipment such as a NIC. Thereafter, the data management method may reassemble the collected packets, combining the individual network packets into a form that can be analyzed.

In one embodiment, the data management method may decide whether to analyze a packet based on filtering rules. The filtering rules may be determined by the data management platform. More specifically, the data management method may collect packets through network equipment such as the NIC or router described above. Because analyzing every collected packet may cause performance problems, the packets to be analyzed may be selected by the filtering rules. For example, packets based on HTTP or the SAP GUI protocol may need to be analyzed, while packets collected over other protocols may not. In this case, the data management method may filter the collected packets by port or IP so that only packets of the desired protocols are analyzed. The user may also set the filtering rules directly using the data management method.
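
A port- and IP-based filtering rule of the kind described above can be sketched as follows. The `FilterRule` shape (a CIDR network plus an optional port) and the `should_analyze` helper are assumptions made for this example; the text does not specify how the rules are represented.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional, Sequence


@dataclass(frozen=True)
class FilterRule:
    """Selects traffic to a destination network, optionally on one port."""
    network: str            # destination network in CIDR notation
    port: Optional[int]     # None means any destination port

    def matches(self, dst_ip: str, dst_port: int) -> bool:
        if self.port is not None and self.port != dst_port:
            return False
        return ip_address(dst_ip) in ip_network(self.network)


def should_analyze(rules: Sequence[FilterRule], dst_ip: str, dst_port: int) -> bool:
    """A packet is analyzed only if at least one rule selects it."""
    return any(rule.matches(dst_ip, dst_port) for rule in rules)
```

Packets that no rule selects are skipped before any parsing takes place, which is where the performance benefit mentioned above would come from.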

In step S10020, the data management method may identify the protocol based on TCP session information. As described above, the data management method may classify and analyze the collected packets by protocol.

The data management method may analyze the packets differently depending on the protocol identified in step S10020. The analysis method for each protocol is as described above.

In step S10030, the data management method may analyze RFC protocol-based packets.

In step S10040, the data management method may analyze GUI/SNC protocol-based packets.

In step S10050, the data management method may analyze HTTP/HTTPS-based packets.

In steps S10030 to S10050, the data management method may parse the packets and then apply processing rules for analysis. The data management method may analyze the binary data of a packet to extract various data (account data, SID, screen input data, screen output data, and the like). The processing rules are settings that determine whether the data contained in an analyzed packet is stored in the audit log 1033, whether that data is processed or logged, and whether events and warnings are generated. In addition, the data management method may apply filter rules that filter out unnecessary logs and loading rules that load the data needed to apply the processing rules.

In step S10060, the data management method may generate the audit log 1033 based on the analyzed packets. To increase log storage speed, the data management method may operate in a multi-threaded manner, and it may perform memory queuing and file queuing in case the database is temporarily inaccessible.

In step S10070, the data management method may extract personal information from the generated audit log 1033 using the personal information metadata.

In step S10080, the data management method may store the audit log 1033. The audit log 1033 and the personal information may be stored separately.

In step S10090, the data management method may monitor for abnormal behavior.

FIG. 23 is a diagram illustrating an embodiment in which the data management platform of the present invention analyzes HTTPS-based packets.

User actions in a web browser protected by HTTPS (SSL) cannot be logged with conventional network traffic mirroring. In particular, logs cannot be recorded when the Diffie-Hellman algorithm is used for key exchange during the HTTPS connection, because the session key is never transmitted and cannot be recovered from the mirrored traffic. Here, the Diffie-Hellman algorithm is one of the algorithms used together with symmetric-key encryption and corresponds to a method for performing key exchange securely.

However, in order to monitor abnormal behavior and the misuse or abuse of personal information within a company, logs of user actions in the web browser need to be recorded and monitored.

According to one embodiment of the present invention, even when the Diffie-Hellman algorithm is used for key exchange, logs of user actions in the web browser can be recorded so that abnormal behavior and the misuse or abuse of personal information within a company can be monitored.

To this end, the data management platform 10000 makes it possible to monitor user actions even when the Diffie-Hellman algorithm is used.

More specifically, the analysis module 20002 of the data management platform 10000 may further include a proxy server configuration unit 2011, an SSL setting unit 2012, an HTTP request/response data analysis unit 2013, and a message data generation unit 2014.

Here, the proxy server configuration unit 2011 may configure a proxy server 1031 between the web browser 1030 and the web server 1034. The proxy server 1031 is a server that relays network communication between a client and a server; when the proxy server 1031 is used, the client does not communicate with the web server 1034 directly but communicates indirectly through the proxy server 1031.

The SSL setting unit 2012 may configure SSL on the proxy server 1031. More specifically, the SSL setting unit 2012 may configure the proxy server 1031 and, using the client SSL settings, establish an SSL environment between the web browser 1030 and the proxy server 1031 to process HTTPS requests and responses. In one embodiment, the SSL setting unit 2012 may use an SSL certificate for the SSL configuration. This SSL certificate may be different from the certificate used to analyze packets described above; that is, it serves to support the SSL configuration.

On receiving HTTPS request data, the proxy server 1031 may temporarily terminate the SSL connection, process the HTTP request/response data through the HTTP request/response data analysis unit 2013, and then establish the SSL connection again.

Accordingly, the message data generation unit 2014 may generate message data 1035 by combining the HTTP request/response data and store the generated message data 1035 in a queue 1032. The queue 1032 may include a memory queue and a file queue.

Thereafter, the data management platform 10000 may transmit the message data 1035 stored in the queue 1032 to the outside.

Finally, the data management platform 10000 may use the server SSL settings to establish an SSL environment between the proxy server 1031 and the web server 1034, process the HTTPS requests and responses, and deliver the HTTPS response data.

In this way, logs of user actions in the web browser can be recorded even when the Diffie-Hellman algorithm is used.

FIG. 24 is a diagram illustrating an embodiment in which the data management method of the present invention analyzes HTTPS-based packets.

In step S20010, the data management method may configure a proxy server and set up client SSL in order to collect HTTPS-based packets, and in step S20020 it may configure the proxy server and set up server SSL. That is, the data management method may configure the proxy server and, using the server SSL settings, establish an SSL environment between the proxy server and the web server to process HTTPS requests and responses and deliver the HTTPS response data.

Once the proxy server is configured and SSL is set up, the data management method may collect HTTPS packets in step S20030. Unlike the description above, the data management method of this figure is described using the collection of HTTPS packets as an example.

An HTTPS connection uses the Diffie-Hellman algorithm for key exchange. When packets are encrypted with keys established by the Diffie-Hellman algorithm, conventional network traffic mirroring cannot record logs of user actions, so the present invention proposes the following method.

In step S20040, the data management method may process HTTPS requests and responses. That is, the data management method may configure the proxy server and, using the client SSL settings, establish an SSL environment between the web browser and the proxy server to process the HTTPS requests and responses. Thereafter, the proxy server that has received the HTTPS request data may terminate the SSL connection, process the HTTP request/response data, and then establish the SSL connection again.

In step S20050, the data management method may generate the message data 1035 by combining the HTTP request/response data, and may filter out unnecessary data in doing so. More specifically, when combining the HTTP request/response data into the message data 1035, data that is unnecessary for logging (for example, image data) may be omitted.
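
Combining a request/response pair into message data while skipping image responses can be sketched as follows. The dictionary layout of the request, response, and resulting message, as well as the use of the `Content-Type` header to detect images, are assumptions made for this illustration.

```python
from typing import Optional

# Media-type prefixes treated as unnecessary for logging (an assumption).
SKIPPED_TYPE_PREFIXES = ("image/",)


def build_message(request: dict, response: dict) -> Optional[dict]:
    """Combine one HTTP request/response pair, or return None if it is skipped."""
    # Header names are case-insensitive, so normalize them before the lookup.
    headers = {k.lower(): v for k, v in response.get("headers", {}).items()}
    media_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if media_type.startswith(SKIPPED_TYPE_PREFIXES):
        return None
    return {
        "method": request["method"],
        "url": request["url"],
        "request_body": request.get("body", ""),
        "status": response["status"],
        "response_body": response.get("body", ""),
    }
```

Returning `None` for skipped pairs lets the caller decide whether to drop them silently or count them.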

In step S20060, the data management method may load the generated message data 1035 into at least one of a memory queue and a file queue. More specifically, the data management method may use a memory queue and a file queue as the queues for storing the message data 1035. Using only a file queue risks lower performance, and using only a memory queue risks data loss, so the two queues may be used in combination.
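
One way to combine a memory queue with a file queue is to keep messages in memory up to a limit and spill the rest to disk. The sketch below illustrates that idea only; the `HybridQueue` name, the JSON-lines spill file, and the spill-on-full policy are assumptions, and the text does not state how the two queues are actually combined.

```python
import json
import tempfile
from collections import deque
from pathlib import Path
from typing import Deque, List, Optional


class HybridQueue:
    """Bounded in-memory queue that spills overflow to a JSON-lines file."""

    def __init__(self, spill_path: Optional[Path] = None, memory_limit: int = 1000) -> None:
        self._memory: Deque[dict] = deque()
        self._limit = memory_limit
        self._spill_path = spill_path or Path(tempfile.mkdtemp()) / "spill.jsonl"

    def put(self, message: dict) -> None:
        # Once spilling has started, keep writing to the file to preserve order.
        if len(self._memory) < self._limit and not self._spill_path.exists():
            self._memory.append(message)
            return
        with self._spill_path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(message) + "\n")

    def drain(self) -> List[dict]:
        """Return all queued messages in arrival order and empty both queues."""
        items = list(self._memory)
        self._memory.clear()
        if self._spill_path.exists():
            with self._spill_path.open("r", encoding="utf-8") as handle:
                items.extend(json.loads(line) for line in handle if line.strip())
            self._spill_path.unlink()
        return items
```

The memory queue keeps the common path fast, while the file bounds memory use during bursts; messages still held only in memory would be lost on a crash, so a stricter design would write to the file first.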

In step S20070, the data management method may transfer the message data 1035 loaded in the queue to a repository and store it there. Here, the repository corresponds to the database included in the data management platform described above.

FIG. 25 is a diagram illustrating an embodiment of the data management method of the present invention.

In step S30010, the data management method may collect packets through a network. For the method of collecting packets through a network, refer to the embodiments of FIGS. 5 to 7, 18, and 19.

In step S30020, the data management method may identify the protocol based on information contained in the collected packets. For the method of identifying the protocol, refer to the embodiments of FIGS. 5 and 18.

In step S30030, the data management method may analyze the packets based on the protocol. For the method of analyzing packets based on the protocol, refer to the embodiments of FIGS. 5, 8 to 13, and 18 to 20.

In step S30040, the data management method may generate a log based on information contained in the analyzed packets. For the method of generating the log, refer to the embodiments of FIGS. 18 and 21.

In this way, the requirements that laws or regulations impose on personal information processing systems can be satisfied. For example, according to the present invention, the condition under Article 8 of the "Standards for Ensuring the Safety of Personal Information" that access records for a personal information processing system must be retained in a meaningful form can be satisfied.

Personal information is mainly extracted using regular expressions. However, extraction based on regular expressions alone carries a risk of incorrect matches.

To address this, the present invention can improve the accuracy of personal information extraction through a multidimensional extraction method that uses personal information metadata and an exception handling list in addition to regular expressions.

FIG. 26 is a diagram illustrating an embodiment in which the data management platform of the present invention extracts and stores personal information.

The personal information management module 20004 included in the data management platform 10000 of the present invention may define personal information types in order to extract personal information from the audit log 1033. Here, the audit log 1033 may include log data for which indexing has been completed. The personal information types may include, for example, resident registration numbers, credit card numbers, and account numbers.

More specifically, the personal information type analysis unit 2019 of the personal information management module 20004 may analyze the personal information types and extraction methods used in the stored audit log 1033 in order to generate the personal information metadata 1036 used for extraction. The result of this analysis may be stored in the personal information metadata 1036. The personal information management module 20004 may define the fields included in the personal information metadata 1036 based on the architecture type, where the architecture includes the application development environment and the screen user interface (UI). The architecture type corresponds to information that identifies which parser analyzes the variables and values of the log data. In one embodiment, the personal information management module 20004 may classify the architecture type for the field information in the log data based on protocol type and URL, and generate the field information of the index accordingly. The personal information metadata 1036 may include the architecture type (screen type), screen information (level 1, level 2), and personal information extraction rules.

To this end, the personal information pattern storage unit (2020) of the personal information management module (20004) may store the patterns used for each personal information type. The personal information management module (20004) may define the personal information types and store the regular expression pattern and masking pattern used for each type. The module defines the types and stores the per-type regular expression patterns in order to generate the information included in the personal information metadata (1036).

In conclusion, by defining personal information types and storing the regular expression patterns used for each type, the data management platform (10000) can generate the personal information metadata (1036). It can then generate personal information extraction rules using the personal information metadata (1036) and the personal information exception handling list (1037) described later, and extract personal information using those rules.

In addition, the personal information exception handling list generation unit (2021) of the personal information management module (20004) may generate the personal information exception handling list (1037), and the personal information extraction rule generation unit (2022) may generate the personal information extraction rules. To generate the exception handling list (1037), the personal information management module (20004) may use the personal information metadata (1036) to check the personal information items that are provisionally extracted from the audit log (1033), and may define the exception handling rules included in the exception handling list (1037) based on the extracted values or the analyzed variables.

In another embodiment, the personal information hash value collection unit (2023) of the personal information management module (20004) may collect hash values for each personal information type and generate a value filter for each type. Here, a value filter is a filter built with a Bloom filter data structure over the hash values of personal information values. The value filter has the advantage that membership of a comparison value in a large data set can be checked quickly.

More specifically, the personal information management module (20004) may collect hash values of personal information values across the various types and store them by personal information type. The personal information value filter generation unit (2024) of the personal information management module (20004) may generate a value filter file for each personal information type based on the collected personal information data.

Thereafter, the personal information extraction unit (2025) of the personal information management module (20004) may extract personal information, the personal information encryption unit (2026) may encrypt the extracted personal information, and the personal information storage unit (2027) may store the encrypted personal information. More specifically, the personal information management module (20004) may analyze the log data included in the audit log (1033) by architecture type to extract variables and values, and may then apply the personal information metadata (1036), which contains the personal information (extraction) rules, to the extracted values in order to extract personal information.

In one embodiment, the personal information management module (20004) may verify whether the hash value of an extracted personal information value is contained in the value filter. If the hash value is contained in the value filter, the module may store the extracted personal information in the index of the audit log (1033). If the hash value is not contained in the value filter, the module may determine that the value was extracted in error and remove the extracted personal information.
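The value filter check described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the class `ValueFilter`, the helper names, the use of SHA-256, and the filter size and hash count are all assumptions chosen for the example. A Bloom filter never yields a false negative but can yield a false positive, so a positive result means only that the value may be present.

```python
import hashlib


def hash_value(value: str) -> str:
    """Hash a personal information value (SHA-256 is an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


class ValueFilter:
    """Small Bloom filter over hash values, one instance per personal information type."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, hashed: str):
        # Derive num_hashes bit positions by salting the input with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{hashed}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, hashed: str) -> None:
        for pos in self._positions(hashed):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, hashed: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(hashed)
        )


def verify_extracted(value: str, value_filter: ValueFilter) -> bool:
    """True: keep the extracted value. False: treat it as extracted in error."""
    return value_filter.might_contain(hash_value(value))
```

A caller would build one filter per personal information type from the collected hash values and then drop any extracted value for which `verify_extracted` returns `False`.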

Thereafter, when encrypted personal information is searched, the personal information management module (20004) may decrypt the encrypted personal information and then output the personal information as processed by the masking rules.

Hereinafter, functions performed by the personal information management module (20004) may be described as being performed by the data management platform (10000).

This can increase the accuracy of personal information extraction. The present invention is described in detail below with reference to the drawings.

Figure 27 is a diagram illustrating an example of the regular expression patterns used in the data management platform of the present invention.

In the embodiment described above, the data management platform may store the patterns used for each personal information type. In particular, the data management platform may store the regular expression pattern and the masking pattern to be used for each personal information type.

This figure shows the regular expression pattern for each personal information type used in the data management platform. For example, when the personal information is a resident registration number, the regular expression pattern may be “(\d{6}[ ,-]-?[1-4]\d{6})|(\d{6}[ ,-]?[1-4])”. As another example, when the personal information is a driver's license number, the regular expression pattern may be “(\d{2}-\d{2}-\d{6}-\d{2})”. In this way, the data management platform may define a regular expression pattern for each personal information type (resident registration number, driver's license number, phone number, email address, address, date of birth, passport number, account number, credit card number, health insurance number, alien registration number, and so on). The regular expression pattern for each type may be preset and stored in the data management platform. The personal information types shown in this figure are examples only, and other types of personal information may of course be included.
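As a rough illustration of how the two patterns quoted for Figure 27 behave, the sketch below compiles them with Python's `re` module and scans a text for matches. The patterns are copied from the description; the dictionary keys and the function name `find_personal_info` are hypothetical names introduced for this example.

```python
import re

# Patterns copied verbatim from the description of Figure 27.
PATTERNS = {
    "resident_registration_number": re.compile(
        r"(\d{6}[ ,-]-?[1-4]\d{6})|(\d{6}[ ,-]?[1-4])"
    ),
    "driver_license_number": re.compile(r"(\d{2}-\d{2}-\d{6}-\d{2})"),
}


def find_personal_info(text: str) -> list:
    """Return (personal information type, matched text) pairs for every hit."""
    hits = []
    for info_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((info_type, match.group(0)))
    return hits
```

Note that the second alternative of the resident registration number pattern accepts a shortened form (six digits followed by a single digit from 1 to 4), so it can also match fragments of other numbers. That kind of false hit is what the exception handling list and value filter in the description are meant to reduce.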

Figure 28 is a diagram illustrating an example of the masking patterns used in the data management platform of the present invention.

This figure shows the masking pattern for each personal information type used in the data management platform. The masking rules may include partial masking, full masking, range masking, random masking, and consistency-preserving masking.

For example, when the personal information is a resident registration number, a masking criterion exists and the masking pattern may be “11111-2******”. In other words, a partial masking pattern is applied to the resident registration number in this case.

The masking criterion may differ depending on preset values. This figure is based on Java, but masking rules implemented in other ways may of course be applied.

The data management platform may mask personal information by type according to the masking rules. In the example above, the data management platform may hide six of the digits in the back part of a resident registration number, and may mask only the last three digits of a driver's license number.
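The partial masking described for these two types can be sketched as below. The description states that the figure is based on Java; Python is used here only for illustration. The function names and the `MASK_COUNTS` table are assumptions, and the sketch masks the last N digit characters while leaving separators such as hyphens in place.

```python
# Number of trailing digits to hide per type, taken from the example in the text.
MASK_COUNTS = {
    "resident_registration_number": 6,
    "driver_license_number": 3,
}


def mask_last_digits(value: str, count: int, mask_char: str = "*") -> str:
    """Replace the last `count` digit characters of value, keeping separators."""
    chars = list(value)
    remaining = count
    for i in range(len(chars) - 1, -1, -1):
        if remaining == 0:
            break
        if chars[i].isdigit():
            chars[i] = mask_char
            remaining -= 1
    return "".join(chars)


def mask(info_type: str, value: str) -> str:
    """Apply the partial masking rule registered for the given type."""
    return mask_last_digits(value, MASK_COUNTS[info_type])
```

For instance, `mask("resident_registration_number", "820130-1037329")` keeps the birth date part and the first digit after the hyphen and hides the remaining six digits.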

Personal information masked according to the embodiment described above may be output in its masked state when a personal information lookup is later requested. Using the data management platform therefore adds a layer of security to personal information.

Figure 29 is a diagram illustrating an example of the personal information metadata (1036) defined in the data management platform of the present invention.

In one embodiment, the data management platform may use the personal information metadata (1036) to extract the personal information contained in the audit log. To this end, the data management platform may generate and store the personal information metadata (1036).

The data management platform may define, for each architecture type, the personal information types to extract and the extraction method, and store them in the personal information metadata (1036). Here, the architecture type may indicate, for example, whether the data is GUI data, json data, or WEBGUI data. That is, the data management platform may define the personal information types and extraction method for GUI data and store them in the personal information metadata (1036), and may likewise define the personal information types and extraction method for json data.

More specifically, the personal information metadata (1036) may include the architecture type, screen information (level 1), screen information (level 2), and personal information extraction rules.

Here, the architecture type corresponds to the screen type, as described above. It carries information about which screen is actually displayed. In one embodiment, the personal information extraction rule may be determined based on the architecture type. The personal information metadata (1036) may also include the level 1 and level 2 screen information corresponding to the architecture type.

A personal information extraction rule may include an extraction method and a value. Examples of extraction methods include value extraction, variable extraction, value content extraction, full-text regular expression extraction, and composite personal information extraction. The extraction methods are described below.

The value extraction method extracts personal information based on the extracted value. For the value of a value extraction rule, the data management platform of the present invention may use a list of personal information types.

The variable extraction method extracts personal information based on the variables included in the architecture type. For the value of a variable extraction rule, the data management platform of the present invention may use a list of personal information types together with a list of variable names. Here, the variable name list is the list of variable names associated with each personal information type. For example, on screen A (type A) the variable name list may include “주민” (resident), and on screen B (type B) it may include “SSN”.
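A minimal sketch of variable-based extraction, assuming the log entry has already been parsed into a name-to-value mapping. The table `VARIABLE_NAMES`, its keys, and the function name are hypothetical; only the two variable names “주민” and “SSN” come from the example above.

```python
# (screen type, personal information type) -> variable names that carry that type.
VARIABLE_NAMES = {
    ("A", "resident_registration_number"): {"주민"},
    ("B", "resident_registration_number"): {"SSN"},
}


def extract_by_variable(screen: str, parsed: dict) -> list:
    """Return (personal information type, value) pairs for matching variable names."""
    hits = []
    for (rule_screen, info_type), names in VARIABLE_NAMES.items():
        if rule_screen != screen:
            continue
        for name, value in parsed.items():
            if name in names:
                hits.append((info_type, value))
    return hits
```

Because the lookup is keyed by screen type, the same variable name can be treated as personal information on one screen and ignored on another.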

The full-text regular expression extraction method extracts personal information from the full text when no variables exist and the text cannot be parsed. For the value of a full-text regular expression rule, the data management platform of the present invention may use a list of personal information types together with a pattern list. Here, the pattern list may include not only the regular expression for the value but also variants with a pattern added before or after that regular expression. For example, when the regular expression targets a resident registration number, the data management platform may include in the pattern list the case where “:” precedes the number.
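The “:” prefix example can be sketched with a lookbehind assertion, so that the colon is required but not included in the extracted value. This is one possible way to express a prefixed pattern and is not stated in the description. The sketch uses only the first alternative of the resident registration number pattern, and the names `FULL_TEXT_PATTERNS` and `extract_full_text` are hypothetical.

```python
import re

# First alternative of the resident registration number pattern from the description.
RRN = r"\d{6}[ ,-]-?[1-4]\d{6}"

# Pattern list entry: the value regex with a required ':' immediately before it.
FULL_TEXT_PATTERNS = [
    ("resident_registration_number", re.compile(r"(?<=:)" + RRN)),
]


def extract_full_text(text: str) -> list:
    """Scan unparsed text and return (type, value) pairs for prefixed matches."""
    hits = []
    for info_type, pattern in FULL_TEXT_PATTERNS:
        for match in pattern.finditer(text):
            hits.append((info_type, match.group(0)))
    return hits
```

Requiring surrounding context in this way narrows matches in free text, where a bare digit pattern would otherwise hit unrelated numbers.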

The composite personal information extraction method extracts personal information based on two or more items of personal information rather than a single item. In one embodiment, the data management platform may treat data as personal information, per architecture type, only when both a “name” and a “resident registration number” are present. Conversely, when only one of the “name” and the “resident registration number” is present, the data management platform may determine that the data is not personal information.

In this way, the data management platform of the present invention may extract personal information based on the information included in the personal information metadata (1036).
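The metadata record described for Figure 29 could be modeled along the following lines. The field list (architecture type, level 1 and level 2 screen information, extraction rule with method and value) follows the description. The class names, enum member names, and the composite check `composite_satisfied` are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ExtractionMethod(Enum):
    VALUE = "value"
    VARIABLE = "variable"
    VALUE_CONTENT = "value_content"
    FULL_TEXT_REGEX = "full_text_regex"
    COMPOSITE = "composite"


@dataclass(frozen=True)
class ExtractionRule:
    method: ExtractionMethod
    # Personal information types, variable names, or patterns, depending on the method.
    value: tuple


@dataclass(frozen=True)
class PersonalInfoMetadata:
    architecture_type: str
    screen_level1: str
    screen_level2: str
    rule: ExtractionRule


def composite_satisfied(rule: ExtractionRule, found_types: set) -> bool:
    """Composite rule: every listed type must be present in the same entry."""
    return rule.method is ExtractionMethod.COMPOSITE and set(rule.value) <= found_types
```

With a composite rule whose value is `("name", "resident_registration_number")`, an entry containing only one of the two types would not be treated as personal information.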

Figure 30 is a diagram illustrating an embodiment in which the data management platform of the present invention classifies the architecture type and extracts personal information.

In one embodiment, the architecture type analysis unit (2029) of the data management platform (10000) may classify the architecture type from the audit log (1033), after which the personal information extraction unit (2025) may extract personal information using the personal information metadata (1036) defined for each architecture type.

More specifically, the data management platform (10000) may define the fields included in the personal information metadata (1036) based on the architecture type. Here, the architecture type is the information that identifies which parser analyzes the variables and values of the log data. In other words, the parsing method used to analyze variables and values differs by architecture type. The data management platform (10000) therefore needs to define the fields included in the personal information metadata (1036) based on the architecture type.

In one embodiment, the architecture type analysis unit (2029) of the data management platform (10000) may classify the architecture type from the field information in the log data contained in the audit log (1033), using the protocol type and the URL as criteria, and may generate the field information of the index accordingly.
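Classification by protocol type and URL might look like the rule table below. The description does not give the actual criteria, so every rule here (the protocol labels, the URL path prefixes, and the mapping to the GUI, json, and WEBGUI types) is a hypothetical placeholder, as is the function name.

```python
# (protocol, URL path prefix, architecture type); all entries are hypothetical.
RULES = [
    ("http", "/webgui", "WEBGUI"),
    ("http", "/api", "json"),
    ("gui", "", "GUI"),
]


def classify_architecture(protocol: str, url_path: str) -> str:
    """Return the first architecture type whose protocol and path prefix match."""
    for rule_protocol, path_prefix, architecture_type in RULES:
        if protocol.lower() == rule_protocol and url_path.startswith(path_prefix):
            return architecture_type
    return "unknown"
```

The returned type would then select the parser and the per-type personal information metadata used for extraction.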

Accordingly, the personal information extraction unit (2025) of the data management platform (10000) may extract personal information using at least one of the personal information metadata (1036) defined for each architecture type and the exception handling list (1037) described above.

Thereafter, the data management platform may encrypt the extracted personal information and store the encrypted personal information.

Figure 31 is a diagram illustrating an embodiment in which the data management platform of the present invention generates a personal information extraction rule.

In one embodiment, the data management platform may generate personal information extraction rules. This figure describes a user interface for generating such rules. To generate a personal information extraction rule, the user interface may include at least one of basic information, value regular expression extraction information, and variable-based extraction information.

More specifically, the basic information indicates the target to which the personal information extraction rule applies. The basic information may include at least one of a log type, level 1 (screen, transaction), and level 2 (program, service). The user may thus enter at least one of a log type (for example, SAP GUI log), level 1, and level 2 as the target of the personal information extraction rule.

The value regular expression extraction information may include the personal information to extract for each extraction method. More specifically, the full set of personal information types is available, and the data management platform may receive from the user a selection of the personal information to extract. The data management platform may then extract values from the log according to the extraction method and compare them against the regular expression patterns to identify personal information. For this purpose, the data management platform may use the regular expression pattern file stored for each personal information type, as described in the embodiment above.

The data management platform may also generate personal information extraction rules using the values of the variables analyzed for each log type. For this purpose, the data management platform may receive variable-based extraction information from the user.

Figure 32 is a diagram illustrating an embodiment in which the data management platform of the present invention generates a personal information exception handling list.

In one embodiment, the data management platform may generate the personal information exception handling list (1037). More specifically, the data management platform may use the personal information metadata (1036) to check the personal information items that are provisionally extracted from the audit log (1033). That is, the data management platform may provisionally extract personal information using the items included in the personal information metadata (1036) (for example, the key/value and screen name/field name items). Here, a provisionally extracted personal information item is candidate data: data output from the original log data according to the items of the personal information metadata (1036) generated in accordance with an embodiment of the present invention.

The data management platform may generate the exception handling list (1037) based on the extracted values or the analyzed variables. More specifically, the data management platform may generate the exception handling list (1037) based on the items included in the personal information metadata (1036).

The data management platform may generate the exception handling list (1037) using at least one of the architecture type, the screen information, and the personal information extraction rule included in the personal information metadata (1036). For example, when generating the exception handling list, the data management platform may register entries such as “in the first architecture (UI) type, this key is not personal information and is therefore not extracted” or “in the second architecture (UI) type, this value is not personal information and is therefore not extracted”.

For example, the data management platform may provisionally extract an “account number” from the audit log (1033) as personal information. Here, the “account number” is candidate data. Suppose the screen name/field name of the personal information metadata (1036) contains product A/serial number, and that the serial number of product A is not personal information. If the serial number of product A has the same form as an “account number”, the data management platform may determine that the extracted “account number” is not personal information and register it in the exception handling list (1037).

That is, the data management platform may check the provisionally extracted personal information by referring to the personal information metadata (1036) defined according to the embodiment described above. Because the personal information at this stage is candidate data rather than confirmed personal information, the exception handling list (1037) may be generated for the candidate data based on the items included in the personal information metadata (1036).

In one embodiment, when extracting personal information from the audit log (1033), the data management platform may exclude the items included in the exception handling list (1037).

This has the advantage of lowering the probability of extracting incorrect personal information.

Figure 33 is a diagram illustrating an embodiment in which the data management platform of the present invention generates an exception filter.

In one embodiment, the data management platform may generate the exception handling list in the manner described above. This figure describes the exception filter user interface used to generate the exception handling list. The exception filter user interface may include at least one personal information item.

In one embodiment, the data management platform may receive exception filter information for a personal information item from the user. Here, the exception filter information may include at least one of the architecture type (UI type), level 1 (screen, transaction), value, level 2 (program, service), and variable.

In the example of this figure, the data management platform may receive from the user the exception filter information “architecture type=SAP GUI”, “level 1=SE16”, “level 2=SAPMF02D-7230”, “variable=RDTMS_VAL2”, and “value=8201301037329”.

The data management platform may then generate the exception handling list using the exception filter information entered by the user.
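One way to represent an exception filter entry and apply it to candidate data is sketched below, using the field values from the Figure 33 example. The class `ExceptionFilter`, its field names, the wildcard behavior for empty fields, and the candidate dictionary layout are assumptions. The description states only which fields the filter information may include.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExceptionFilter:
    architecture_type: Optional[str] = None
    level1: Optional[str] = None
    level2: Optional[str] = None
    variable: Optional[str] = None
    value: Optional[str] = None

    def matches(self, candidate: dict) -> bool:
        # A field left as None is treated as a wildcard (an assumption).
        return all(
            expected is None or candidate.get(name) == expected
            for name, expected in vars(self).items()
        )


def drop_exceptions(candidates: list, filters: list) -> list:
    """Remove candidate items that match any entry of the exception handling list."""
    return [c for c in candidates if not any(f.matches(c) for f in filters)]


# Exception filter built from the example values shown for Figure 33.
EXAMPLE_FILTER = ExceptionFilter(
    architecture_type="SAP GUI",
    level1="SE16",
    level2="SAPMF02D-7230",
    variable="RDTMS_VAL2",
    value="8201301037329",
)
```

Candidates that survive `drop_exceptions` would then proceed to the remaining extraction steps.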

Figure 34 is a diagram illustrating an embodiment in which the data management platform of the present invention searches for and outputs extracted personal information.

The personal information decryption unit (2028) of the personal information management module (20004) included in the data management platform (10000) of the present invention may decrypt encrypted personal information. More specifically, when a user requests a personal information search through the monitoring module (20005) of the data management platform (10000), the data management platform (10000) may decrypt the encrypted personal information. In doing so, the data management platform (10000) may decrypt the personal information using the data included in the audit log (1033).

In one embodiment, the personal information management module (20004) may decrypt the encrypted personal information using the encryption key held in the database (20007). The masking processing unit (2029) of the personal information management module (20004) may then mask the decrypted personal information and provide it to the monitoring module (20005). The masking method is as described above. The monitoring module (20005) may then output the personal information masked by the personal information management module (20004) and provide it to the user.

In this way, when a search request is made for extracted personal information, the data management platform may decrypt the personal information based on the encryption key, mask it, and provide it to the user. This contributes to the protection of personal information.

Figure 35 is a diagram illustrating an embodiment in which the data management method of the present invention manages personal information.

In step S40010, the data management method may analyze personal information types. To this end, the data management method may analyze the personal information types used in the audit log.

In step S40020, the data management method may define personal information types and store the patterns used for each type. The data management method may store the regular expression pattern and masking pattern used for each personal information type. The data management method may thereby generate the personal information metadata (1036), which includes key/value or screen name/field name items. The specific method of generating the personal information metadata (1036) is as described above.

In step S40030, the data management method may generate a personal information exception handling list 1037. Using the personal information metadata 1036, the data management method may check which personal information items a trial (virtual) extraction pulls from the audit log, and may define the exception handling rules included in the exception handling list 1037 based on the extracted values or the analyzed variables. In this way, the data management method may generate the personal information exception handling list 1037.

단계(S40040)에서, 데이터 관리 방법은 개인정보 메타데이터(1036) 및 예외처리 리스트(1037)에 기초하여 개인정보 추출 규칙을 생성할 수 있다. In step S40040, the data management method may generate a personal information extraction rule based on the personal information metadata 1036 and the exception handling list 1037.

단계(S40050)에서, 데이터 관리 방법은 개인정보 추출 규칙에 기초하여 개인정보를 추출할 수 있다. In step S40050, the data management method may extract personal information based on a personal information extraction rule.
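Steps S40020 to S40050 can be illustrated with a minimal Python sketch. It is only one possible reading of the text: the type names, field names, regular expressions, and the exception entry are hypothetical examples, and the real personal information metadata 1036 and exception handling list 1037 may be structured differently.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonalInfoType:
    name: str               # personal information type (hypothetical name)
    regex: re.Pattern       # regular expression pattern stored for the type
    field_names: frozenset  # key or field names that may carry this type


# Stand-in for the personal information metadata 1036 (example entries only).
METADATA = [
    PersonalInfoType("phone", re.compile(r"01[0-9]-\d{3,4}-\d{4}"),
                     frozenset({"tel", "mobile"})),
    PersonalInfoType("email", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
                     frozenset({"email"})),
]

# Stand-in for the exception handling list 1037: (field, value) pairs
# that must never be extracted, e.g. a dummy number used in test screens.
EXCEPTIONS = {("tel", "010-0000-0000")}


def extract_personal_info(record: dict) -> list:
    """Apply the extraction rule (metadata plus exceptions) to one log record."""
    found = []
    for field, value in record.items():
        if not isinstance(value, str) or (field, value) in EXCEPTIONS:
            continue
        for info_type in METADATA:
            if field in info_type.field_names and info_type.regex.fullmatch(value):
                found.append((info_type.name, field, value))
    return found
```

In this sketch a value is extracted only when both its field name and its pattern match, so a phone-like string in an unrelated field such as a memo is ignored.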

단계(S40060)에서, 데이터 관리 방법은 추출된 개인정보를 암호화하여 저장할 수 있다. 일 실시 예에서, 데이터 관리 방법은 추출된 개인정보를 암호화하기 위한 암호화 키를 생성할 수 있다. In step S40060, the data management method may encrypt and store the extracted personal information. In one embodiment, the data management method may generate an encryption key to encrypt the extracted personal information.

In step S40070, the data management method may decrypt the encrypted personal information when a personal information search is requested. At this time, the data management method may decrypt the encrypted personal information using the previously generated encryption key.

단계(S40080)에서, 데이터 관리 방법은 복호화된 개인정보를 마스킹 처리할 수 있다. 개인정보를 마스킹 처리하는 방법에 대하여는 상술한 바와 같다. In step S40080, the data management method may mask the decrypted personal information. The method for masking personal information is the same as described above.

단계(S40090)에서, 데이터 관리 방법은 마스킹 처리된 개인정보를 출력할 수 있다. 이를 통하여, 복호화된 개인정보는 마스킹 처리되어 출력되기 때문에 개인정보의 검색을 요청한 사용자는 개인정보를 침해하지 않으면서 데이터베이스 내의 개인정보를 검색하고 이용할 수 있다. In step S40090, the data management method may output masked personal information. Through this, since the decrypted personal information is masked and output, users who request a search for personal information can search and use the personal information in the database without infringing on personal information.
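The search path of steps S40070 to S40090 can be sketched as follows. This is a hedged illustration: the masking rule (keep a short prefix, hide the rest) is an assumption, since the actual masking patterns are defined elsewhere in the specification, and decryption is passed in as a callable so that no particular cipher is implied.

```python
from typing import Callable, List


def mask(value: str, visible_prefix: int = 3, mask_char: str = "*") -> str:
    """Keep the first few characters and hide the rest (hypothetical pattern)."""
    if len(value) <= visible_prefix:
        return mask_char * len(value)
    return value[:visible_prefix] + mask_char * (len(value) - visible_prefix)


def search_personal_info(stored: List[bytes],
                         decrypt: Callable[[bytes], str]) -> List[str]:
    """Decrypt each stored value (S40070), mask it (S40080), and return
    only the masked text for output (S40090)."""
    return [mask(decrypt(ciphertext)) for ciphertext in stored]
```

The caller never receives the decrypted value itself, which mirrors the point made above that only masked personal information is output.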

또한, 정규식 패턴 방식으로 추출된 개인정보의 경우 잘못 추출될 우려가 있다. 이에 따라, 본 발명에서는 상술한 실시 예 이외에도 개인정보가 잘못 추출될 위험을 줄이기 위하여 실제로 로그 상에 존재하는 개인정보를 블룸-필터(Bloom filter)화 시킬 수 있다. 즉, 추출된 개인정보에 블룸-필터를 적용하여 개인정보라고 판단되는 경우에만 추출하여 개인정보 추출의 정확도를 높일 수 있다. 이하에서 자세히 설명하도록 한다. Additionally, there is a risk that personal information extracted using the regular expression pattern method may be extracted incorrectly. Accordingly, in the present invention, in addition to the above-described embodiments, personal information actually existing in the log can be bloom-filtered to reduce the risk of personal information being extracted incorrectly. In other words, the accuracy of personal information extraction can be improved by applying a Bloom-filter to the extracted personal information and extracting it only when it is judged to be personal information. This will be explained in detail below.

도 36은 본 발명의 데이터 관리 방법이 개인정보를 추출하고 저장하는 다른 실시 예를 설명하는 도면이다.Figure 36 is a diagram illustrating another embodiment in which the data management method of the present invention extracts and stores personal information.

단계(S50010)에서, 데이터 관리 방법은 개인정보에 대한 해쉬(hash) 값을 수집할 수 있다. 보다 상세하게는, 데이터 관리 방법은 개인정보 암호화 솔루션과 같이 개인정보를 관리하는 시스템/플랫폼으로부터 여러 유형의 개인정보 값에 대한 해쉬 값을 수집할 수 있다. 여기에서, 개인정보를 관리하는 시스템/플랫폼은 외부 서버에 존재할 수 있다. 일 실시 예에서, 데이터 관리 방법은 수집된 해쉬 값을 개인정보 유형 별로 저장할 수 있다. In step S50010, the data management method may collect a hash value for personal information. More specifically, the data management method can collect hash values for various types of personal information values from a system/platform that manages personal information, such as a personal information encryption solution. Here, the system/platform that manages personal information may exist on an external server. In one embodiment, the data management method may store collected hash values for each type of personal information.

단계(S50020)에서, 데이터 관리 방법은 개인정보 유형 별 값 필터(value filter)를 생성할 수 있다. 여기에서, 값 필터는 개인정보 값의 해쉬 값에 대해 블룸-필터 자료 구조(bloom filter data structure)를 사용하여 만든 필터에 대응한다. In step S50020, the data management method may create a value filter for each personal information type. Here, the value filter corresponds to a filter created using a bloom filter data structure for the hash value of the personal information value.

단계(S50030)에서, 데이터 관리 방법은 개인정보를 추출할 수 있다. 일 실시 예에서, 감사 로그 내 로그 데이터를 아키텍쳐 유형 별로 분석하여 변수 및 값을 추출하고, 추출된 값에 개인정보 추출 규칙을 적용하여 개인정보를 추출할 수 있다. 이에 대하여는 상술한 바와 같다. In step S50030, the data management method may extract personal information. In one embodiment, log data in the audit log is analyzed by architecture type to extract variables and values, and personal information can be extracted by applying personal information extraction rules to the extracted values. This is the same as described above.

단계(S50040)에서, 데이터 관리 방법은 개인정보 값을 검증할 수 있다. 즉, 상술한 내용에 더불어 추출된 개인정보 값을 검증하기 위하여 값 필터를 사용할 수 있다. In step S50040, the data management method may verify the personal information value. That is, in addition to the above-described content, a value filter can be used to verify the extracted personal information value.

In step S50050, if the hash value of the extracted personal information is included in the value filter, then in step S50060 the data management method may store the extracted personal information in the index. More specifically, if the hash value of the extracted personal information is included in the value filter built with the Bloom filter data structure, the data management method may determine that the extracted personal information is real personal information. Accordingly, the data management method may store, for the personal information extracted from the audit log, an index entry marking it as real personal information.

In step S50070, if the hash value of the extracted personal information is not included in the value filter, then in step S50080 the data management method may remove the extracted personal information. More specifically, if the hash value of the extracted personal information is not included in the value filter built with the Bloom filter data structure, the data management method may determine that the extracted personal information is fake personal information. Accordingly, the data management method may determine that the personal information was extracted incorrectly and remove it. Here, removing the extracted personal information may mean that the value is no longer treated as personal information, not that the data itself is deleted.

Accordingly, in the present invention, the Bloom filter data structure makes it possible to quickly compare the large volume of managed personal information against values extracted as personal information. In addition, extraction errors can be removed for types of personal information whose values are difficult to verify.
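The value filter of steps S50020 to S50080 can be sketched in Python as below. The filter size, the number of hash functions, and the use of SHA-256 are assumptions chosen for illustration; the specification does not fix them. Note that a Bloom filter never reports a stored value as absent, but it can occasionally report an absent value as present, so the check is probabilistic.

```python
import hashlib


class ValueFilter:
    """Bloom filter over hash values of known personal information."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, hashed_value: str):
        # Derive num_hashes bit positions by salting the input with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{hashed_value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, hashed_value: str) -> None:
        for pos in self._positions(hashed_value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, hashed_value: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(hashed_value))


def hash_value(value: str) -> str:
    """Hash a personal information value (S50010 collects such hashes)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def verify(candidates, value_filter):
    """Keep candidates whose hash is in the filter (S50060) and drop
    the rest as incorrectly extracted (S50080)."""
    return [c for c in candidates if value_filter.might_contain(hash_value(c))]
```

One filter per personal information type would be built from the hash values collected in step S50010, and each value extracted in step S50030 would be passed through `verify` before being indexed.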

도 37은 본 발명의 데이터 관리 방법에서 개인정보를 관리하는 실시 예를 설명하는 도면이다.Figure 37 is a diagram explaining an embodiment of managing personal information in the data management method of the present invention.

단계(S60010)에서, 데이터 관리 방법은 데이터를 분석할 수 있다. 본 발명의 데이터 관리 방법이 데이터를 분석하는 방법은 도 2 내지 도 5, 도 8 내지도 26, 도 29, 도 30 및 도 35의 실시 예를 참고하도록 한다.In step S60010, the data management method may analyze the data. For information on how the data management method of the present invention analyzes data, refer to the embodiments of FIGS. 2 to 5, 8 to 26, 29, 30, and 35.

단계(S60020)에서, 데이터 관리 방법은 분석된 데이터로부터 개인정보를 추출할 수 있다. In step S60020, the data management method may extract personal information from the analyzed data.

본 발명의 데이터 관리 방법이 개인정보를 추출하는 방법은 도 5, 도 14, 도 26, 도 35 및 도 36의 실시 예를 참고하도록 한다. Refer to the embodiments of FIGS. 5, 14, 26, 35, and 36 for the method of extracting personal information by the data management method of the present invention.

How a user's work action is expressed in an application differs depending on the developer's preferences and development standards. The present invention therefore aims to make it easy to classify user actions on data, such as “view, delete, add, change, and print”, based on the user's work actions.

따라서, 본 발명은 애플리케이션의 메뉴 및 아이콘의 텍스트를 수집하고, 인공지능을 기반으로 자동으로 분류하여 법적으로 요구하는 사용자 행위에 대한 상세 구분을 할 수 있다. Therefore, the present invention collects the text of the application's menu and icon, automatically classifies it based on artificial intelligence, and allows detailed classification of user behavior as required by law.

또한, 감사 로그(audit log) 시스템에서 발생하는 이벤트 및 사용자의 활동 등의 정보를 기록한 로그로, 보안 및 감사 추적 등을 위해 사용될 수 있다. 여기에서, 감사 로그는 로그 데이터 및 로그 데이터에 대응하는 인덱스를 포함할 수 있다. Additionally, the audit log is a log that records information such as events occurring in the system and user activities, and can be used for security and audit tracking. Here, the audit log may include log data and an index corresponding to the log data.

Log data is generally divided into information such as time, event/action, user/subject, target/object, and result. Each piece of log data forms one record, and multiple records may be recorded consecutively.

또한, 감사 로그의 데이터는 데이터베이스 등에서 인덱싱 처리되어 관리되는 것이 일반적이다. 이때, 로그 데이터의 검색 및 분석을 효율적으로 수행하기 위해 각 로그 레코드에 대한 인덱스도 함께 관리될 수 있다. 여기에서, 인덱스는 보통 검색에 사용되는 필드와 검색 속도를 향상시키기 위한 키(Key) 등의 정보를 포함한다. Additionally, it is common for audit log data to be indexed and managed in a database, etc. At this time, in order to efficiently search and analyze log data, an index for each log record can also be managed. Here, the index usually includes information such as fields used for search and keys to improve search speed.

For example, to search an audit log with fields such as time, event/action, user/subject, and target/object for records of a user deleting a specific file, an index may be created by combining the time field and the target field. Using an index created in this way has the advantage of greatly reducing search time.
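The composite index described above can be illustrated with a small in-memory sketch. The record layout and the sample values are hypothetical, and the time field is simplified to a date string; a real deployment would rely on the indexing facilities of the database or search engine in use.

```python
from collections import defaultdict


def build_index(records, fields):
    """Map each combination of the given field values to record positions."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[tuple(record[f] for f in fields)].append(position)
    return index


# Hypothetical audit log records.
records = [
    {"time": "2023-07-04", "action": "delete", "user": "kim", "target": "report.xlsx"},
    {"time": "2023-07-04", "action": "view", "user": "lee", "target": "report.xlsx"},
    {"time": "2023-07-05", "action": "delete", "user": "kim", "target": "notes.txt"},
]

# Index that combines the time field and the target field.
index = build_index(records, ("time", "target"))

# Only the records under one index key are inspected, not the whole log.
hits = [records[i] for i in index[("2023-07-04", "report.xlsx")]
        if records[i]["action"] == "delete"]
```

The lookup touches only the positions stored under one key, which is where the reduction in search time comes from.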

뿐만 아니라, 인덱스에 사용자 행위를 매핑하면, 로그 데이터를 사용자 행위를 기준으로 분류할 수 있어 보안 분석이나 모니터링에 용이하다는 장점이 있다. In addition, mapping user behavior to an index has the advantage of being able to classify log data based on user behavior, making it easier for security analysis and monitoring.

본 발명의 데이터 관리 플랫폼은 감사 로그에 포함된 로그 데이터를 인공지능을 기반으로 분류된 사용자 행위를 기준으로 인덱싱 처리해 사용자가 사용자 행위를 기준으로 로그 데이터를 검색할 수 있도록 한다. 이하, 본 발명에 대해 자세히 설명한다. The data management platform of the present invention indexes log data included in the audit log based on user behavior classified based on artificial intelligence, allowing users to search log data based on user behavior. Hereinafter, the present invention will be described in detail.

도 38은 본 발명의 데이터 관리 플랫폼에서 사용자 행위를 수집하고 매핑하는 실시 예를 설명하는 도면이다.Figure 38 is a diagram explaining an embodiment of collecting and mapping user behavior in the data management platform of the present invention.

The data management platform 10000 of the present invention may use the collection module 20001, the analysis module 20002, the monitoring module 20005, and the AI engine 20006 to collect user behavior, map it to log data, and index the log data. Then, in response to a log data query request, it may provide the log data indexed by user behavior.

보다 상세하게는, 수집 모듈(20001)을 사용하여 사용자 행위에 대응하는 키 및 텍스트를 수집할 수 있다. 이때, 수집 모듈(20001)은 애플리케이션 개발환경에 연결하여 사용자 행위를 수집할 수 있다. 애플리케이션 개발환경에서는 사용자 행위를 선택하는 메뉴, 아이콘에 대한 키 및 텍스트를 보관하고 있다. 이에 따라, 본 발명의 데이터 관리 플랫폼(10000)에 포함된 수집 모듈(20001)은 이러한 사용자 행위에 대응하는 키 및 텍스트를 수집할 수 있다. More specifically, the collection module 20001 can be used to collect keys and text corresponding to user actions. At this time, the collection module 20001 can collect user behavior by connecting to the application development environment. In the application development environment, keys and text for menus and icons for selecting user actions are stored. Accordingly, the collection module 20001 included in the data management platform 10000 of the present invention can collect keys and texts corresponding to such user actions.

AI 엔진(20006)은 수집된 텍스트를 인공지능(AI)을 기반으로 사용자 행위(action)을 분류할 수 있다. 예를 들어, 사용자 행위는 조회, 삭제, 추가, 변경, 프린트 등의 사용자의 업무 행위를 포함할 수 있다. 이를 위하여, AI 엔진(20006)은 기계학습(machine learning), 딥 러닝(deep learning), 자연어 처리(Natural Language Processing, NLP), 규칙 기반 접근(Rule-Based Approach) 등의 방법을 사용할 수 있다. AI Engine (20006) can classify user actions based on artificial intelligence (AI) from the collected text. For example, user actions may include user work actions such as viewing, deleting, adding, changing, and printing. For this purpose, AI Engine (20006) can use methods such as machine learning, deep learning, Natural Language Processing (NLP), and Rule-Based Approach.

분석 모듈(20002)은 분류된 사용자 행위에 대한 데이터를 키(key) 및 사용자 행위(action)로 구분하여 사용자 행위 메타데이터(1038)를 생성할 수 있다. 이때, 사용자 행위 메타데이터(1038)는 데이터 관리 플랫폼(10000) 내부 데이터베이스(20007) 안에 저장될 수 있다.The analysis module 20002 may generate user action metadata 1038 by dividing data on classified user actions into keys and user actions. At this time, user behavior metadata 1038 may be stored in the internal database 20007 of the data management platform 10000.

The analysis module 20002 may map the stored user behavior metadata 1038 to the log data included in the audit log 1033. To do so, when indexing the log data, the analysis module 20002 may match a field of the log data that can serve as the key for each application development environment type against the key information in the user behavior metadata 1038, and store the result in the user behavior field of the index.

Here, the key information is identification information that indicates a user action. In one embodiment, the data management platform 10000 may store a user action not as text but as an abbreviated ID (identifier) that identifies the action. For example, the data management platform 10000 may store “R” as the key information when the user action is “view”, “D” when it is “delete”, “U” when it is “modify”, and “P” when it is “print”.

사용자는 모니터링 모듈(20005)를 통하여 로그를 조회할 수 있다. 일 실시 예에서, 모니터링 모듈(20005)은 데이터베이스(20007) 내 감사 로그(1033)에 포함된 로그 데이터를 제공할 때, 인덱스의 사용자 행위 필드를 참조하여 사용자에게 정보를 제공할 수 있다. Users can view logs through the monitoring module (20005). In one embodiment, when providing log data included in the audit log 1033 in the database 20007, the monitoring module 20005 may provide information to the user by referring to the user behavior field of the index.

이하, 데이터 관리 플랫폼(10000) 내부의 각각의 모듈에서 수행되는 기능은 데이터 관리 플랫폼(10000)이 수행하는 것으로 기재하도록 한다. Hereinafter, the functions performed in each module within the data management platform 10000 will be described as being performed by the data management platform 10000.

도 39는 본 발명의 데이터 관리 플랫폼에서 사용자 행위 메타데이터를 생성하는 실시 예를 설명하는 도면이다.Figure 39 is a diagram explaining an embodiment of generating user behavior metadata in the data management platform of the present invention.

일 실시 예에서, 데이터 관리 플랫폼(10000)은 상술한 AI 엔진을 통하여 사용자 행위에 대응하는 키 및 텍스트를 수집할 수 있고, 수집된 텍스트를 인공지능을 기반으로 사용자 행위로 분류할 수 있다. 이에 따라, 데이터 관리 플랫폼(10000)은 수집된 사용자 행위에 대응하는 키 및 텍스트에 대하여 사용자 행위 메타데이터(1038)를 생성할 수 있다. In one embodiment, the data management platform 10000 can collect keys and text corresponding to user behavior through the above-described AI engine, and classify the collected text into user behavior based on artificial intelligence. Accordingly, the data management platform 10000 may generate user behavior metadata 1038 for keys and text corresponding to the collected user behavior.

보다 상세하게는, 데이터 관리 플랫폼(10000)은 사용자 행위 메타데이터(1038)를 생성하기 위하여, 애플리케이션 개발환경 유형 별 키 및 사용자 행위 키를 매핑할 수 있다. 즉, 사용자 행위 메타데이터(1038)는 애플리케이션 개발환경 유형 별 키, 사용자 행위 키, 사용자 행위를 필드로 가질 수 있다. More specifically, the data management platform 10000 may map keys and user behavior keys for each type of application development environment to generate user behavior metadata 1038. That is, the user behavior metadata 1038 may have a key for each application development environment type, a user behavior key, and a user behavior as fields.

여기에서, 애플리케이션 개발환경 유형은 상술한 아키텍쳐 유형, 화면 유저 인터페이스 유형을 포함할 수 있다. 즉, 애플리케이션 개발환경 유형은 애플리케이션 개발환경 내에서 사용자 업무 행위로 사용할 수 있는 정보 필드를 나타낸다. 예를 들어, 화면에 존재하는 메뉴 아이콘, ok 아이콘, cancel 아이콘 등을 포함할 수 있다. 또한, 사용자 행위 키는 상술한 사용자 행위를 식별하기 위한 ID를 나타낸다. 따라서, 데이터 관리 플랫폼(10000)은 사용자 행위 메타데이터(1038) 내에 애플리케이션 개발환경 유형 별 키 및 사용자 행위 키를 매핑할 수 있다. Here, the application development environment type may include the above-described architecture type and screen user interface type. In other words, the application development environment type represents information fields that can be used for user work actions within the application development environment. For example, it may include menu icons, ok icons, cancel icons, etc. that exist on the screen. Additionally, the user action key represents an ID for identifying the above-described user action. Accordingly, the data management platform 10000 can map keys and user behavior keys for each application development environment type within the user behavior metadata 1038.

For example, if the key for an application development environment type is “del key in the first application”, the data management platform 10000 may, through the AI engine and analysis module described above, map it to the user behavior key “D” to generate the user behavior metadata 1038. At this time, the data management platform 10000 may store in the user behavior metadata 1038, together with the key “del key in the first application” and the user behavior key “D”, the fact that the corresponding user action is “delete”.
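A record of the user behavior metadata 1038 with the three fields named above might look like the following sketch. The class and function names are hypothetical; only the abbreviated IDs “R”, “D”, “U”, and “P” come from the description.

```python
from dataclasses import dataclass

# Abbreviated user behavior keys given in the description.
ACTION_BY_KEY = {"R": "view", "D": "delete", "U": "modify", "P": "print"}


@dataclass(frozen=True)
class UserBehaviorMetadata:
    env_key: str     # key for the application development environment type
    action_key: str  # abbreviated ID of the user action
    action: str      # user action in readable form


def make_metadata(env_key: str, action_key: str) -> UserBehaviorMetadata:
    """Build one metadata record; raises KeyError for an unknown action key."""
    return UserBehaviorMetadata(env_key, action_key, ACTION_BY_KEY[action_key])


row = make_metadata("del key in the first application", "D")
```

Storing the short key alongside the readable action lets the index hold a compact ID while queries can still display the action name.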

도 40은 본 발명의 데이터 관리 플랫폼에서 사용자 행위 메타데이터와 로그 데이터를 매핑하는 실시 예를 설명하는 도면이다.Figure 40 is a diagram explaining an embodiment of mapping user behavior metadata and log data in the data management platform of the present invention.

일 실시 예에서, 데이터 관리 플랫폼(10000)은 저장된 사용자 행위 메타데이터(1038)와 감사 로그(1033)에 포함된 로그 데이터를 매핑할 수 있다. In one embodiment, the data management platform 10000 may map stored user behavior metadata 1038 and log data included in the audit log 1033.

More specifically, the data management platform 10000 may index the log data in order to map the log data to the user behavior metadata 1038. Specifically, as in the embodiment described above, the data management platform 10000 may map the key field for each application development environment type to the user behavior key field within the user behavior metadata 1038. In addition, in one embodiment, the data management platform 10000 may store the user behavior from the user behavior metadata 1038 in the user behavior field of the index.

Accordingly, if the first log data stored in the audit log 1033 is a log of the user pressing the del key on the first application screen, then upon receiving a request to query the first log data, the data management platform 10000 can immediately provide the mapped user action, “delete”.

이렇게 인덱스에 사용자 행위를 매핑하면, 로그 데이터를 사용자 행위를 기준으로 분류할 수 있어 보안 분석이나 모니터링에 용이하다는 장점이 있다. Mapping user behavior to an index in this way has the advantage of being able to classify log data based on user behavior, making it easier for security analysis and monitoring.

도 41은 본 발명의 데이터 관리 방법이 사용자 행위 메타데이터를 생성하는 실시 예를 설명하는 도면이다.Figure 41 is a diagram illustrating an embodiment in which the data management method of the present invention generates user behavior metadata.

단계(S60030)에서, 데이터 관리 방법은 애플리케이션 개발환경에서 사용자 행위를 나타내는 메뉴, 버튼 또는 아이콘 등에 대한 키(key) 및 텍스트(text)를 수집할 수 있다. 보다 상세하게는, 애플리케이션 개발환경에서는 사용자 행위를 선택하는 메뉴에 대한 키 및 텍스트를 보관할 수 있다. 이에 따라, 데이터 관리 방법은 애플리케이션 개발환경에 연결하여 사용자 행위를 선택하는 메뉴에 대한 키 및 텍스트를 수집할 수 있다. 예를 들어, 데이터 관리 방법은 제 1 애플리케이션 개발환경에 연결하여, 제 1 애플리케이션 개발 환경에서 사용자 행위인 “삭제”를 나타내는 제 1 아이콘에 대한 키 및 텍스트 “”를 수집할 수 있다. In step S60030, the data management method may collect keys and text for menus, buttons, or icons representing user behavior in an application development environment. More specifically, the application development environment can store keys and text for menus that select user actions. Accordingly, the data management method can collect keys and text for menus that select user actions by connecting to the application development environment. For example, the data management method may connect to a first application development environment to collect the key and text “” for a first icon representing the user action “delete” in the first application development environment.

이때, 인공지능을 통하여 수집된 텍스트를 분류하기 때문에, 완벽하게 동일한 단어를 사용하지 않더라도 그 의미가 동일한 경우 동일한 사용자 행위로 분류될 수 있다. 예를 들어, 애플리케이션 개발환경에 따라 사용자 행위를 나타내는 버튼에 대응하는 텍스트가 다를 수 있다. 예를 들어, 제 1 애플리케이션 개발환경에서는 삭제를 나타내는 버튼에 대응하는 텍스트가 “delete”지만, 제 2 애플리케이션 개발환경에서는 삭제를 나타내는 버튼에 대응하는 텍스트가 “remove”일 수 있다. 이 경우, 본 발명의 데이터 관리 방법에 따르면, 다른 애플리케이션 개발환경에서 다른 텍스트를 쓰더라도 의미가 동일한 경우 동일한 사용자 행위로 분류할 수 있다. At this time, because the text collected through artificial intelligence is classified, even if the exact same word is not used, if the meaning is the same, it can be classified as the same user action. For example, depending on the application development environment, the text corresponding to the button representing user action may be different. For example, in the first application development environment, the text corresponding to the button indicating deletion may be “delete”, but in the second application development environment, the text corresponding to the button indicating deletion may be “remove”. In this case, according to the data management method of the present invention, even if different texts are used in different application development environments, if the meaning is the same, it can be classified as the same user action.
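The idea that different button texts map to the same user action can be shown with a deliberately simple rule-based classifier, one of the approaches the description lists for the AI engine. This is a stand-in and not the AI engine itself: apart from “delete” and “remove”, the vocabulary below is a hypothetical example, and a learned model would replace the lookup table in practice.

```python
from typing import Optional

# Hypothetical vocabulary per user behavior key.
VOCABULARY = {
    "R": {"view", "search", "display", "read"},
    "D": {"delete", "remove"},
    "U": {"modify", "update", "edit", "change"},
    "P": {"print"},
}


def classify(text: str) -> Optional[str]:
    """Return the user behavior key for a menu or button text, or None."""
    words = text.lower().split()
    for action_key, vocabulary in VOCABULARY.items():
        if any(word in vocabulary for word in words):
            return action_key
    return None
```

With this table, the text “delete” from the first application development environment and the text “remove” from the second both resolve to the key “D”.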

본 발명의 데이터 관리 방법이 사용자 행위를 나타내는 키 및 텍스트를 수집하는 방법은 도 38의 실시 예를 참고하도록 한다. Refer to the embodiment of FIG. 38 for how the data management method of the present invention collects keys and text representing user behavior.

단계(S60040)에서, 데이터 관리 방법은 AI 엔진을 통하여 수집된 텍스트를 분류할 수 있다. 상술한 예를 참고하면, 데이터 관리 방법은 AI 엔진을 통하여 “delete”라는 텍스트를 사용자 행위인 “삭제”로 분류할 수 있다. In step S60040, the data management method may classify the collected text through an AI engine. Referring to the above example, the data management method can classify the text “delete” as the user action “delete” through an AI engine.

단계(S60050)에서, 데이터 관리 방법은 분류된 데이터를 사용자 행위 메타데이터로 저장할 수 있다. 상술한 바와 같이, 데이터 관리 방법은 제 1 애플리케이션 개발환경에서, 제 1 아이콘에 대한 키 및 텍스트 “”를 사용자 행위 “삭제”와 매핑하여 사용자 행위 메타데이터에 저장할 수 있다. In step S60050, the data management method may store classified data as user behavior metadata. As described above, the data management method may map the key and text “” for the first icon to the user action “delete” in the first application development environment and store it in user action metadata.

도 42는 본 발명의 데이터 관리 방법이 사용자 행위 메타데이터와 로그 데이터를 매핑하는 실시 예를 설명하는 도면이다.Figure 42 is a diagram illustrating an embodiment of the data management method of the present invention mapping user behavior metadata and log data.

In step S60060, the data management method may cache in memory the information needed to map log data to user behavior metadata. Here, the user behavior metadata has been generated and stored by the method described above. In one embodiment, the data management method may cache the information to be mapped in memory so that log data and user behavior metadata can be mapped quickly. That is, rather than looking up the information to be mapped in the database each time, the data management method may read it into memory in advance so that it can be accessed quickly.

In step S60070, the data management method may map the log data to the user behavior metadata and then index it. More specifically, when indexing log data, the data management method may map the key field for each application development environment type to the user behavior key field and store the result in the user behavior field of the index. In one embodiment, when indexing log data, the data management method may store the user behavior from the user behavior metadata in the user behavior field of the index.

단계(S60080)에서, 데이터 관리 방법은 로그 조회를 요청받는 경우, 인덱스의 사용자 행위 필드를 참조하여 정보를 제공할 수 있다. In step S60080, when a log inquiry is requested, the data management method may provide information by referring to the user behavior field of the index.

이를 통해, 개인정보가 포함된 모든 로그 데이터에 대해 사용자 행위를 매핑할 수 있다. Through this, user behavior can be mapped to all log data containing personal information.
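Steps S60060 to S60080 can be sketched as follows, assuming that each log record carries a field holding the environment-specific key. The field names `ui_key` and `user_action` and the row layout are hypothetical.

```python
def load_cache(metadata_rows):
    """S60060: read the metadata once into a dictionary kept in memory,
    keyed by the environment-specific key."""
    return {row["env_key"]: row["action_key"] for row in metadata_rows}


def index_log(record, cache):
    """S60070: build an index entry whose user action field is filled
    from the cache (None when the key is unknown)."""
    entry = dict(record)
    entry["user_action"] = cache.get(record.get("ui_key"))
    return entry


def search_by_action(index_entries, action_key):
    """S60080: answer a log query by reading the user action field."""
    return [e for e in index_entries if e["user_action"] == action_key]


# Hypothetical metadata rows and log records.
cache = load_cache([{"env_key": "app1:del", "action_key": "D"},
                    {"env_key": "app1:find", "action_key": "R"}])
logs = [{"user": "kim", "ui_key": "app1:del"},
        {"user": "lee", "ui_key": "app1:find"},
        {"user": "park", "ui_key": "app1:unknown"}]
entries = [index_log(record, cache) for record in logs]
```

Because the cache is an in-memory dictionary, the mapping cost per log record is a single lookup rather than a database query.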

도 43은 본 발명의 데이터 관리 방법이 사용자 행위 메타데이터를 생성하는 실시 예를 설명하는 도면이다. Figure 43 is a diagram explaining an embodiment in which the data management method of the present invention generates user behavior metadata.

단계(S70010)에서, 데이터 관리 방법은 데이터를 수집할 수 있다. 본 발명의 데이터 관리 방법이 데이터를 수집하는 방법은 도 2 내지 도 7 및 도 38의 실시 예를 참고하도록 한다. In step S70010, the data management method may collect data. For information on how the data management method of the present invention collects data, refer to the embodiments of FIGS. 2 to 7 and FIG. 38.

단계(S70020)에서, 데이터 관리 방법은 수집된 데이터를 분류할 수 있다. 본 발명의 데이터 관리 방법이 데이터를 분류하는 방법은 도 38, 도 39 및 도 41의 실시 예를 참고하도록 한다.In step S70020, the data management method may classify the collected data. For information on how the data management method of the present invention classifies data, refer to the embodiments of FIGS. 38, 39, and 41.

단계(S70030)에서, 데이터 관리 방법은 데이터에 포함된 적어도 하나의 정보를 매핑하여 사용자 행위 메타데이터를 생성할 수 있다. 본 발명의 데이터 관리 방법이 사용자 행위 메타데이터를 생성하는 방법은 도 39 및 도 41의 실시 예를 참고하도록 한다.In step S70030, the data management method may generate user behavior metadata by mapping at least one piece of information included in the data. Refer to the embodiments of FIGS. 39 and 41 for how the data management method of the present invention generates user behavior metadata.

Personal information encryption keys must be changed periodically to comply with legal standards. When the key used for encryption is changed, the data encrypted with the existing key must be decrypted and then re-encrypted with the new key. In other words, changing the key requires decrypting and re-encrypting tens of millions to hundreds of millions of records, which takes a long time. As a result, many companies do not change their encryption keys even though they are required to comply with legal standards.

본 발명은 암호화 값에 대한 토큰(token, 대체 값)과 암호화 값에 대한 키 ID를 별도로 저장하는 방안을 제안하고자 한다. 이를 통하여 신규로 생성된 키 값으로 실시간으로 암호화할 수 있고, 새로운 키 값으로 운영 서버에 영향을 주지 않고 복호화 및 재암호화하는 기능을 제공할 수 있다. The present invention proposes a method of separately storing the token (replacement value) for the encryption value and the key ID for the encryption value. Through this, it is possible to encrypt in real time with a newly generated key value, and provide decryption and re-encryption functions with the new key value without affecting the operating server.

또한, 상술한 점 이외에도 애플리케이션 서버의 요구에 따라 암호화 데이터를 처리해야 할 필요가 있다. 이때, 데이터베이스의 암호화 방식을 사용하는 경우, 데이터가 로드될 때 복호화되어 원문(original text)으로 전송되기 때문에 다양한 방법에 의해서 원문이 유출될 수 있다. 따라서, 서버 메모리 내에서 암호화 데이터를 처리하고 필요할 때만 복호화 해야 한다. Additionally, in addition to the above-mentioned points, there is a need to process encrypted data according to the requirements of the application server. At this time, when the database encryption method is used, when the data is loaded, it is decrypted and transmitted as the original text, so the original text can be leaked through various methods. Therefore, encrypted data must be processed within server memory and decrypted only when necessary.

본 발명의 데이터 관리 플랫폼에서는 암호화 키가 변경되더라도 데이터의 복호화 및 재암호화가 내부에서 일어나기 때문에 원문의 유출이 없다는 장점이 있다. The data management platform of the present invention has the advantage that even if the encryption key is changed, the original text is not leaked because decryption and re-encryption of the data occurs internally.
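The separation of token, ciphertext, and key ID can be sketched as below. This is one possible reading of the scheme, not the patented implementation. In particular, `_toy_cipher` is a placeholder XOR keystream used only to keep the example self-contained; it is NOT secure, and a vetted algorithm from a cryptographic library would be used in practice. All class and method names are hypothetical.

```python
import hashlib
import secrets


def _toy_cipher(key: bytes, data: bytes) -> bytes:
    """Placeholder symmetric transform (XOR with a SHA-256 keystream).
    NOT secure; applying it twice with the same key restores the input."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class KeyManager:
    def __init__(self):
        self.keys = {}       # key ID -> encryption key
        self.current_id = 0  # ID of the most recent key
        self.vault = {}      # token -> (key ID, ciphertext)
        self.rotate_key()

    def rotate_key(self) -> int:
        """Issue a new key ID and key; new data is encrypted with it at once."""
        self.current_id += 1
        self.keys[self.current_id] = secrets.token_bytes(32)
        return self.current_id

    def encrypt(self, plaintext: str) -> str:
        """Encrypt with the most recent key and return a token (substitute value)."""
        token = secrets.token_hex(8)
        ciphertext = _toy_cipher(self.keys[self.current_id], plaintext.encode("utf-8"))
        self.vault[token] = (self.current_id, ciphertext)
        return token

    def decrypt(self, token: str) -> str:
        key_id, ciphertext = self.vault[token]
        return _toy_cipher(self.keys[key_id], ciphertext).decode("utf-8")

    def reencrypt_all(self) -> None:
        """Move old ciphertexts to the current key; tokens stay unchanged."""
        for token, (key_id, ciphertext) in list(self.vault.items()):
            if key_id != self.current_id:
                plain = _toy_cipher(self.keys[key_id], ciphertext)
                self.vault[token] = (self.current_id,
                                     _toy_cipher(self.keys[self.current_id], plain))
```

Because each vault entry records its key ID, data encrypted under an old key stays readable after a rotation, and re-encryption can run in the background while the tokens held by the operating server remain the same.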

이하 본 발명에 대하여 자세히 설명하도록 한다. Hereinafter, the present invention will be described in detail.

도 44은 본 발명의 데이터 관리 플랫폼에서 개인정보를 암호화하는 실시 예를 설명하는 도면이다.Figure 44 is a diagram explaining an embodiment of encrypting personal information in the data management platform of the present invention.

본 발명의 일 실시 예에서, 데이터 관리 플랫폼(10000)은 키 관리 모듈(20003)을 통하여 암호화 키 및 키 ID를 생성하고, 생성된 암호화 키를 사용하여 개인정보를 암호화하고, 암호화된 개인정보에 대응하는 토큰 값을 생성할 수 있다. In one embodiment of the present invention, the data management platform 10000 generates an encryption key and key ID through the key management module 20003, encrypts personal information using the generated encryption key, and stores the encrypted personal information. A corresponding token value can be created.

이를 위하여, 데이터 관리 플랫폼(10000)의 키 관리 모듈(20003)은 암호화 키 및 키 ID 생성부(2015), 개인정보 암호화부(2016), 토큰 값 생성부(2017) 및 매핑 정보 저장부(2018)를 포함할 수 있다. 여기에서, 키 관리 모듈(20003)은 Java 및 RFC를 사용할 수 있고, 개인정보의 암호화 및 복호화를 담당할 수 있다. To this end, the key management module 20003 of the data management platform 10000 may include an encryption key and key ID generation unit 2015, a personal information encryption unit 2016, a token value generation unit 2017, and a mapping information storage unit 2018. Here, the key management module 20003 may use Java and RFC and may be responsible for encryption and decryption of personal information.

여기에서, 암호화 키 및 키 ID 생성부(2015)는 개인정보 암호화를 위한 암호화 키 및 키 ID를 생성 및 관리할 수 있다. 일 실시 예에서, 암호화 키 및 키 ID 생성부(2015)는 사용자의 제어에 기초하여 암호화 키 및 키 ID를 새로 생성할 수 있다. 이때, 암호화 키는 신규 키 ID가 발급됨으로써 변경될 수 있다. Here, the encryption key and key ID generator 2015 can generate and manage the encryption key and key ID for encrypting personal information. In one embodiment, the encryption key and key ID generator 2015 may generate a new encryption key and key ID based on user control. At this time, the encryption key can be changed by issuing a new key ID.

예를 들어, 사용자는 데이터 관리 플랫폼(10000)이 제공하는 일괄 암호화 키 변경 기능을 선택할 수 있고, 이에 따라 암호화 키 및 키 ID 생성부(2015)는 암호화 키 및 키 ID를 새로운 값으로 생성할 수 있다. 또한, 다른 일 실시 예에서, 암호화 키 및 키 ID 생성부(2015)는 사용자 제어가 없더라도 기 설정된 주기에 기초하여 암호화 키 및 키 ID를 업데이트할 수 있다. For example, the user can select the batch encryption key change function provided by the data management platform 10000, and the encryption key and key ID generation unit 2015 can accordingly generate the encryption key and key ID with new values. Additionally, in another embodiment, the encryption key and key ID generation unit 2015 may update the encryption key and key ID based on a preset cycle even without user control.

개인정보 암호화부(2016)는 암호화 키를 이용하여 개인정보를 암호화할 수 있다. 개인정보 암호화부(2016)는 데이터 관리 플랫폼(10000) 내에 저장된 가장 최신 암호화 키를 이용하여 개인정보를 암호화할 수 있다. The personal information encryption unit (2016) can encrypt personal information using an encryption key. The personal information encryption unit 2016 can encrypt personal information using the most recent encryption key stored in the data management platform 10000.

토큰 값 생성부(2017)는 암호화된 개인정보에 대응하는 토큰 값(대체 값)을 생성할 수 있다. The token value generator 2017 may generate a token value (replacement value) corresponding to the encrypted personal information.

매핑 정보 저장부(2018)는 생성된 토큰 값, 암호화된 개인정보에 대응하는 암호본, 키 ID를 매핑 정보 테이블(1039)에 저장할 수 있다. 또한, 매핑 정보 저장부(2018)는 토큰 값을 업무 테이블(1040)에 별도로 저장할 수 있다. The mapping information storage unit 2018 may store the generated token value, the ciphertext corresponding to the encrypted personal information, and the key ID in the mapping information table 1039. Additionally, the mapping information storage unit 2018 may separately store the token value in the task table 1040.

즉, 암호화 키 및 키 ID 생성부(2015)를 통하여 암호화 키가 변경되더라도 업무 테이블(1040)에 저장된 값은 변하지 않기 때문에, 시스템 운영을 중단하지 않으면서 암호화 키를 변경할 수 있다. In other words, even if the encryption key is changed through the encryption key and key ID generator 2015, the value stored in the task table 1040 does not change, so the encryption key can be changed without stopping system operation.
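
The units described for Figure 44 can be pictured with a minimal sketch. Everything named here is an assumption made for illustration: `KeyManagementSketch`, `toy_cipher`, and the table layouts are hypothetical, and the XOR keystream merely stands in for whichever real cipher is mapped to the key ID. It is not secure and is not the patented implementation.

```python
import base64
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream. Toy stand-in for a real cipher; NOT secure."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class KeyManagementSketch:
    """Hypothetical stand-in for the key management module (20003)."""

    def __init__(self) -> None:
        self.key_store = {}       # key ID -> key bytes, kept apart from both tables
        self.latest_key_id = None
        self.mapping_table = {}   # token -> {"ciphertext": ..., "key_id": ...}
        self.business_table = []  # tokens only

    def generate_key(self) -> str:
        """Create a new key and key ID and mark them as the most recent."""
        key_id = f"KEY{len(self.key_store) + 1:03d}"
        self.key_store[key_id] = secrets.token_bytes(16)
        self.latest_key_id = key_id
        return key_id

    def protect(self, personal_info: str) -> str:
        """Encrypt with the most recent key, then record the token in both tables."""
        key_id = self.latest_key_id
        raw = toy_cipher(self.key_store[key_id], personal_info.encode("utf-8"))
        ciphertext = base64.b64encode(raw).decode("ascii")
        token = secrets.token_hex(3)  # six-character replacement value
        self.mapping_table[token] = {"ciphertext": ciphertext, "key_id": key_id}
        self.business_table.append(token)
        return token


km = KeyManagementSketch()
km.generate_key()                     # first key ID is "KEY001"
token = km.protect("730101-1088123")
```

Because the business table holds only the token, nothing in it has to change when the key behind the mapping row is later replaced.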

도 45는 본 발명의 데이터 관리 플랫폼에서 개인정보를 암호화하는 실시 예를 설명하는 도면이다. Figure 45 is a diagram explaining an embodiment of encrypting personal information in the data management platform of the present invention.

일 실시 예에서, 데이터 관리 플랫폼은 개인정보를 수집할 수 있다. 데이터 관리 플랫폼은 수집된 패킷으로부터 분석한 데이터 또는 직접적으로 수신한 데이터 중 개인정보를 추출할 수 있다. 또한, 데이터 관리 플랫폼은 개인정보 자체(예를 들어, 주민등록번호 “730101-1088123”)를 수신할 수 있다. In one embodiment, the data management platform may collect personal information. The data management platform can extract personal information from data analyzed from collected packets or from data received directly. Additionally, the data management platform may receive the personal information itself (e.g., the resident registration number “730101-1088123”).

일 실시 예에서, 데이터 관리 플랫폼은 상술한 암호화 키 및 키 ID 생성부를 통하여 생성된 암호화 키와 키 ID를 이용하여 개인정보를 암호화할 수 있다. In one embodiment, the data management platform may encrypt personal information using the encryption key and key ID generated through the encryption key and key ID generator described above.

본 도면을 예로 들어 설명하면, 데이터 관리 플랫폼은 생성된 제 1 암호화 키 및 키 ID “KEY001”을 사용하여 개인정보를 암호화할 수 있다. 여기에서, 암호화 키는 binary 형태로 생성될 수 있다. 예를 들어, 제 1 암호화 키는 “01001000 01001010 00110001 00110001 00110011 00111000 00110000 00110001 01011010 00111101 00111101”에 대응할 수 있다. 상술한 예를 들어 설명하면, 개인정보인 주민등록번호 “730101-1088123”은 생성된 암호화 키에 의해 암호화될 수 있다. Taking this drawing as an example, the data management platform can encrypt personal information using the generated first encryption key and the key ID “KEY001”. Here, the encryption key can be generated in binary form. For example, the first encryption key may correspond to “01001000 01001010 00110001 00110001 00110011 00111000 00110000 00110001 01011010 00111101 00111101”. Using the above example, the resident registration number “730101-1088123”, which is personal information, can be encrypted with the generated encryption key.

일 실시 예에서, 암호화는 개인정보 전체를 암호화하거나, 일부를 암호화하는 방법으로 진행될 수 있다. 특히, 본 발명은 개인정보 중 일부를 암호화하는 것을 특징으로 한다. 이때, 키 ID는 암호화 알고리즘이 매핑되어 있다. 일 실시 예에서, 사용 가능한 암호화 알고리즘은 양방향 알고리즘으로 SEED, ARIA128, ARIA192, ARIA256, AES128, AES192, DES, TDES를 사용할 수 있고, 단방향 알고리즘으로 SHA-256을 사용할 수 있다. 이때, 키 ID에 기초하여 암호화되는 알고리즘이 결정될 수 있다. In one embodiment, encryption may be performed by encrypting the personal information in its entirety or by encrypting only part of it. In particular, the present invention is characterized by encrypting part of the personal information. Here, each key ID is mapped to an encryption algorithm. In one embodiment, the available two-way algorithms are SEED, ARIA128, ARIA192, ARIA256, AES128, AES192, DES, and TDES, and SHA-256 is available as a one-way algorithm. The algorithm used for encryption is thus determined by the key ID.
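
As a rough illustration of a key ID being mapped to an algorithm, the sketch below uses a plain lookup table. The algorithm names are the ones listed in the text; the registry layout, the function names, and the particular assignments of "KEY001" and "KEY002" are assumptions made only for this example.

```python
# Algorithm names as listed in the text; the registry itself is hypothetical.
TWO_WAY = {"SEED", "ARIA128", "ARIA192", "ARIA256", "AES128", "AES192", "DES", "TDES"}
ONE_WAY = {"SHA-256"}

# Example assignments (assumed); the source does not say which key uses which algorithm.
KEY_ALGORITHMS = {"KEY001": "AES128", "KEY002": "ARIA256"}


def algorithm_for(key_id: str) -> str:
    """Return the algorithm mapped to a key ID, rejecting unknown algorithm names."""
    algorithm = KEY_ALGORITHMS[key_id]
    if algorithm not in TWO_WAY | ONE_WAY:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return algorithm


def can_decrypt(key_id: str) -> bool:
    """Only two-way algorithms allow the original value to be recovered."""
    return algorithm_for(key_id) in TWO_WAY
```

Keeping the algorithm behind the key ID means a mapping row needs to carry only the key ID for the platform to know how its ciphertext was produced.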

개인정보가 암호화된 이후, 데이터 관리 플랫폼은 암호화된 개인정보에 대응하는 토큰 값을 생성할 수 있다. 본 도면의 예에서 토큰 값은 “abcxxf”에 대응한다. 이때, 토큰 값은 암호화된 개인정보의 자리 수 및 형식(format)을 유지할 수 있다. 예를 들어, 개인정보인 주민등록번호 “730101-1088123” 중 뒤 6자리를 부분 암호화하는 경우, 이를 대체하는 토큰 값은 동일한 자리 수 및 형식인 “abcxxf”에 대응할 수 있다. After the personal information is encrypted, the data management platform can generate a token value corresponding to the encrypted personal information. In the example of this figure, the token value is “abcxxf”. The token value can preserve the number of digits and the format of the encrypted personal information. For example, when the last 6 digits of the resident registration number “730101-1088123” are partially encrypted, the token value that replaces them may be “abcxxf”, which has the same number of digits and format.
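
The partial replacement described above can be sketched as follows. The helper names `make_token` and `tokenize_tail` are hypothetical, and generating the token as random lowercase letters is an assumption that simply mirrors the “abcxxf” example; the text does not specify how token characters are chosen.

```python
import secrets
import string


def make_token(length: int) -> str:
    """Random lowercase token with the same length as the value it replaces."""
    return "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))


def tokenize_tail(value: str, tail_len: int = 6):
    """Split off the last tail_len characters and substitute a token for them."""
    head, tail = value[:-tail_len], value[-tail_len:]
    token = make_token(len(tail))
    return head + token, token, tail


# masked keeps the "730101-1" prefix; protected_part is the portion to be encrypted.
masked, token, protected_part = tokenize_tail("730101-1088123")
```

The masked value has the same overall length as the original, so it fits a fixed-length field without any schema change.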

일 실시 예에서, 데이터 관리 플랫폼은 생성된 토큰 값, 암호화된 개인정보에 대응하는 암호본, 키 ID를 매핑 정보 테이블에 저장하고, 토큰 값을 업무 테이블에 저장할 수 있다. 매핑 정보 테이블과 업무 테이블에 대하여는 후술하도록 한다. In one embodiment, the data management platform may store the generated token value, the encrypted copy corresponding to the encrypted personal information, and the key ID in a mapping information table, and store the token value in a business table. The mapping information table and task table will be described later.

도 46은 본 발명의 매핑 정보 테이블과 업무 테이블을 설명하는 도면이다. Figure 46 is a diagram explaining the mapping information table and task table of the present invention.

본 도면은 매핑 정보 테이블(1039)과 업무 테이블(1040)를 예시하는 도면이다. 본 도면에서 매핑 정보 테이블(1039)과 업무 테이블(1040)은 본 발명과 관련이 있는 필드만을 나타낸 것으로, 이외의 필드를 더 포함할 수 있음은 물론이다. This figure is a diagram illustrating the mapping information table 1039 and the task table 1040. In this drawing, the mapping information table 1039 and the task table 1040 only show fields related to the present invention, and of course, other fields may be included.

일 실시 예에서, 매핑 정보 테이블(1039)은 토큰 값, 암호본 및 키 ID를 포함할 수 있다. 토큰 값, 암호본, 키 ID는 각각의 필드로 구성되어 있으며, 매핑 정보 테이블(1039)은 매핑 값을 각각 필드에 맞게 포함할 수 있다. 상술한 실시 예를 예로 들어 설명하면, 키 ID “KEY001”를 이용하여 개인정보를 암호화할 수 있고, 암호화된 개인정보에 대응하는 암호본은 “HJ113801Z==”이고, 이에 대응하는 토큰 값이 “abcxxf”인 경우, 매핑 정보 테이블(1039)은 토큰 값, 암호본 및 키 ID를 동일한 행으로 매핑하여 저장할 수 있다. In one embodiment, the mapping information table 1039 may include a token value, a ciphertext, and a key ID. The token value, ciphertext, and key ID each form their own field, and the mapping information table 1039 holds the mapped values in the corresponding fields. Taking the above-described embodiment as an example, if personal information is encrypted using the key ID “KEY001”, the ciphertext corresponding to the encrypted personal information is “HJ113801Z==”, and the corresponding token value is “abcxxf”, the mapping information table 1039 can store the token value, ciphertext, and key ID mapped together in the same row.

일 실시 예에서, 업무 테이블(1040)은 토큰 값을 포함할 수 있다. 여기에서, 업무 테이블(1040)은 필드의 길이가 정해져 있어 암호본이나 키 ID를 직접 저장할 수 없기 때문에, 데이터 관리 플랫폼은 토큰에 매핑되어 있는 암호본과 키 ID를 매핑 정보 테이블(1039)에 별도로 저장할 수 있다. In one embodiment, the business table 1040 may include token values. Because the field lengths of the business table 1040 are fixed, the ciphertext and key ID cannot be stored in it directly; the data management platform therefore stores the ciphertext and key ID mapped to the token separately in the mapping information table 1039.

이때, 권한 있는 사용자가 개인정보를 확인하기 위해서는, 매핑 정보 테이블(1039)에 포함된 암호본을 이용할 수 있다. 상술한 예를 들어 설명하면, 권한 있는 사용자가 데이터 관리 플랫폼에게 개인정보의 복호화를 요청하는 경우, 데이터 관리 플랫폼은 업무 테이블(1040)에 포함된 토큰 값 “abcxxf”을 이용해 매핑 정보 테이블(1039)에 있는 암호본 “”을 추출하고, 암호본을 이용하여 개인정보를 복호화할 수 있다. 특히, 본 발명은 매핑 정보 테이블(1039)과 업무 테이블(1040)이 저장된 저장소와 암호화 키 및 키 ID가 저장된 저장소를 별도로 구분한 것을 특징으로 한다. 또한, 본 발명의 데이터 관리 플랫폼은 매핑 정보 테이블(1039)과 업무 테이블(1040)이 저장된 시스템과 암호화 키 및 키 ID가 저장된 시스템을 별도로 구비할 수 있다. 이를 통해, 법적인 보안 요구 조건을 만족할 수 있다. Here, for an authorized user to view personal information, the ciphertext included in the mapping information table 1039 can be used. Taking the above example, when an authorized user requests the data management platform to decrypt personal information, the data management platform uses the token value “abcxxf” in the task table 1040 to look up the ciphertext “” in the mapping information table 1039 and decrypts the personal information using that ciphertext. In particular, the present invention is characterized in that the storage holding the mapping information table 1039 and the task table 1040 is separated from the storage holding the encryption key and key ID. In addition, the data management platform of the present invention may provide one system for storing the mapping information table 1039 and the task table 1040 and a separate system for storing the encryption key and key ID. Through this, legal security requirements can be satisfied.

상술한 실시 예에 따라, 데이터 관리 플랫폼은 업무 테이블(1040)에 토큰 값을 저장할 수 있다. 여기에서, 데이터 관리 플랫폼은 키 ID나 개인정보 암호화 값이 변경되더라도 토큰 값은 변경하지 않은 상태로 유지할 수 있다. According to the above-described embodiment, the data management platform may store the token value in the task table 1040. Here, the data management platform can keep the token value unchanged even if the key ID or personal information encryption value changes.

이를 통하여, 토큰 값은 변하지 않은 상태에서 암호화 키와 키 ID를 새롭게 변경할 수 있다. Through this, the encryption key and key ID can be changed while the token value remains unchanged.

도 47은 본 발명의 데이터 관리 플랫폼에서 신규 암호화 키를 생성하는 실시 예를 설명하는 도면이다. Figure 47 is a diagram explaining an embodiment of generating a new encryption key in the data management platform of the present invention.

일 실시 예에서, 암호화 키 및 키 ID 생성부(2015)는 새로운 암호화 키 및 키 ID를 생성할 수 있다. 예를 들어, 암호화 키 및 키 ID 생성부(2015)는 제 2 암호화 키 및 키 ID “KEY002”를 생성할 수 있다. 여기에서, 제 2 암호화 키는 상술한 실시 예와 마찬가지로 binary로 표현될 수 있다. In one embodiment, the encryption key and key ID generator 2015 may generate a new encryption key and key ID. For example, the encryption key and key ID generator 2015 may generate a second encryption key and key ID “KEY002”. Here, the second encryption key may be expressed in binary, similar to the above-described embodiment.

이에 따라, 데이터 관리 플랫폼은 기존의 매핑 정보 테이블(1039)에 포함된 암호본과 키 ID를 새로운 암호본 및 키 ID로 업데이트할 수 있다. Accordingly, the data management platform can update the encrypted copy and key ID included in the existing mapping information table 1039 with the new encrypted copy and key ID.

보다 상세하게는, 데이터 관리 플랫폼은 새롭게 생성된 키 ID에 기초하여 암호화 알고리즘을 결정하고, 새로운 암호화 키를 이용하여 개인정보 암호화 값을 변경할 수 있다. 이를 위하여, 데이터 관리 플랫폼은 기존 암호본 및 키 ID를 이용하여 개인정보를 복호화한 뒤, 새롭게 생성된 암호화 키 및 키 ID를 이용하여 개인정보를 재암호화할 수 있다. More specifically, the data management platform can determine the encryption algorithm based on the newly generated key ID and change the encrypted value of the personal information using the new encryption key. To this end, the data management platform can decrypt the personal information using the existing ciphertext and key ID, and then re-encrypt the personal information using the newly generated encryption key and key ID.

이에 따라, 매핑 정보 테이블(1039)에 포함된 토큰 값은 유지가 되지만, 개인정보 암호화 값과 키 ID는 새로 생성된 값으로 변경된다. 이때, 업무 테이블(1040)에 포함된 토큰 값은 변함이 없다. Accordingly, the token value included in the mapping information table 1039 is maintained, but the personal information encryption value and key ID are changed to newly generated values. At this time, the token value included in the task table 1040 does not change.
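
The decrypt-then-re-encrypt update can be sketched as below. The function `rotate_keys`, the dictionary layouts, and the fixed demonstration keys are all hypothetical, and `toy_cipher` is an insecure XOR stand-in for the real algorithm mapped to each key ID; the point is only that the token column is never written.

```python
import base64
import hashlib


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream. Toy stand-in for a real cipher; NOT secure."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def rotate_keys(mapping_table: dict, key_store: dict, new_key_id: str) -> None:
    """Re-encrypt every mapping row under the new key; tokens are left untouched."""
    new_key = key_store[new_key_id]
    for row in mapping_table.values():
        if row["key_id"] == new_key_id:
            continue  # already on the newest key
        old_key = key_store[row["key_id"]]
        plaintext = toy_cipher(old_key, base64.b64decode(row["ciphertext"]))
        row["ciphertext"] = base64.b64encode(toy_cipher(new_key, plaintext)).decode("ascii")
        row["key_id"] = new_key_id


# Fixed demo keys so the example is reproducible (assumed values).
key_store = {"KEY001": b"first-demo-key", "KEY002": b"second-demo-key"}
first = base64.b64encode(toy_cipher(key_store["KEY001"], b"730101-1088123")).decode("ascii")
mapping_table = {"abcxxf": {"ciphertext": first, "key_id": "KEY001"}}
business_table = ["abcxxf"]

rotate_keys(mapping_table, key_store, "KEY002")
```

In this sketch the plaintext exists only inside `rotate_keys`, which loosely mirrors the statement that decryption and re-encryption happen inside the platform.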

도 48은 본 발명의 데이터 관리 플랫폼에서 새로운 업무 데이터를 추가하는 실시 예를 설명하는 도면이다. Figure 48 is a diagram explaining an embodiment of adding new work data in the data management platform of the present invention.

본 도면에서는 데이터 관리 플랫폼에서 새로운 개인정보가 추가된 실시 예를 설명한다. 데이터 관리 플랫폼은 새로운 업무 데이터(개인정보인 경우를 예로 한다.)를 수집 또는 수신하는 경우, 가장 최신 암호화 키 및 키 ID를 이용하여 새로운 개인정보를 암호화할 수 있다. This drawing explains an embodiment in which new personal information is added to the data management platform. When the data management platform collects or receives new business data (for example, personal information), it can encrypt the new personal information using the most recent encryption key and key ID.

예를 들어, 새로운 개인정보로 계좌번호 “114-910224-12345”가 수신된 경우, 데이터 관리 플랫폼은 가장 최신 암호화 키인 제 2 암호화 키 및 키 ID “KEY002”를 이용하여 개인정보를 암호화할 수 있다. 이후, 데이터 관리 플랫폼은 개인정보에 대응하는 토큰 값을 생성할 수 있다. 예를 들어, 토큰 값은 “hijklm”에 대응한다. For example, if the account number “114-910224-12345” is received as new personal information, the data management platform can encrypt the personal information using the second encryption key and key ID “KEY002”, which is the most recent encryption key. . Afterwards, the data management platform can generate a token value corresponding to the personal information. For example, the token value corresponds to “hijklm”.

데이터 관리 플랫폼은 토큰 값, 암호본 및 키 ID를 매핑 정보 테이블(1039)에 저장하고, 토큰 값을 업무 테이블(1040)에 저장할 수 있다. The data management platform may store the token value, encrypted copy, and key ID in the mapping information table 1039, and store the token value in the task table 1040.

새로운 개인정보가 추가되기 전 가장 최신 암호화 키 및 키 ID가 반영된 매핑 정보 테이블(1039)은 제 1 행에 토큰 값 “abcxxf”, 암호본 “29AB3801Z==” 및 키 ID “KEY002”를 저장하고 있다. 새로운 개인정보가 추가되면, 매핑 정보 테이블(1039)은 제 2 행에 토큰 값 “hijklm”, 암호본 “AQ348701Z==” 및 키 ID “KEY002”를 더 포함할 수 있다. The mapping information table 1039, which reflects the most recent encryption key and key ID before new personal information is added, stores the token value “abcxxf”, ciphertext “29AB3801Z==”, and key ID “KEY002” in the first row. When new personal information is added, the mapping information table 1039 may further include the token value “hijklm”, ciphertext “AQ348701Z==”, and key ID “KEY002” in the second row.

마찬가지로, 새로운 개인정보가 추가되기 전 업무 테이블(1040)은 제 1 행에 “abcxxf”를 저장하고 있고, 새로운 개인정보가 추가되면 제 2 행에 “hijklm”을 저장할 수 있다. Likewise, before new personal information is added, the business table 1040 stores “abcxxf” in the first row, and when new personal information is added, “hijklm” can be stored in the second row.

즉, 본 발명의 데이터 관리 플랫폼은 가장 최신 암호화 키 및 키 ID를 기준으로 매핑 정보 테이블(1039)에 포함된 정보를 업데이트할 수 있고, 키 ID만 변경하면 개인정보의 복호화 및 재암호화를 진행하기 때문에 수천만 건의 데이터를 실시간으로 변경할 수 있다. In other words, the data management platform of the present invention can update the information included in the mapping information table 1039 based on the most recent encryption key and key ID, and can proceed with decryption and re-encryption of personal information by changing only the key ID. Therefore, tens of millions of pieces of data can be changed in real time.

도 49는 본 발명의 개인정보 암호화의 실 사용 예를 설명하는 도면이다. Figure 49 is a diagram explaining an actual use example of personal information encryption of the present invention.

일 실시 예에서, 사용자는 애플리케이션 서버의 운영 시스템(1041)에 접속하여 개인정보를 입력할 수 있다. 이에 따라, 데이터 관리 플랫폼(10000)은 개인정보를 암호화 및 복호화할 수 있다. 이때, 데이터 관리 플랫폼(10000)은 상술한 암호화 키 및 키 ID를 생성하여 개인정보를 암호화할 수 있다. 또한, 데이터 관리 플랫폼(10000)은 사용자의 명령에 기초하여 실시간으로 키를 변경할 수 있어, 기존의 키로는 개인정보를 복호화할 수 없게 된다. 이에 따라, 데이터 관리 플랫폼(10000)은 가장 최신 암호화 키 및 키 ID를 이용하여 개인정보를 복호화 및 재암호화할 수 있다. In one embodiment, the user may access the operating system 1041 of the application server and enter personal information. Accordingly, the data management platform 10000 can encrypt and decrypt personal information. At this time, the data management platform 10000 can encrypt personal information by generating the above-described encryption key and key ID. Additionally, the data management platform 10000 can change the key in real time based on the user's command, making it impossible to decrypt personal information with the existing key. Accordingly, the data management platform 10000 can decrypt and re-encrypt personal information using the most recent encryption key and key ID.

일 실시 예에서, 데이터 관리 플랫폼(10000)은 암호화된 개인정보에 대응하는 토큰 값을 생성할 수 있다. 이후, 데이터 관리 플랫폼(10000)은 생성된 토큰 값, 암호본, 키 ID를 매핑 정보 테이블(1039)에 저장하고, 토큰 값을 업무 테이블(1040)에 저장할 수 있다. 또한, 매핑 정보 테이블(1039)과 업무 테이블(1040)은 데이터 관리 플랫폼(10000)의 내부 또는 외부 데이터베이스에 저장될 수 있다. 이때, 매핑 정보 테이블(1039)과 업무 테이블(1040)이 저장되는 데이터베이스와 암호화 키 및 키 ID가 저장되는 데이터베이스는 별도로 구비되어야 한다. In one embodiment, the data management platform 10000 may generate a token value corresponding to the encrypted personal information. Thereafter, the data management platform 10000 may store the generated token value, ciphertext, and key ID in the mapping information table 1039, and store the token value in the task table 1040. Additionally, the mapping information table 1039 and the task table 1040 may be stored in an internal or external database of the data management platform 10000. Here, the database in which the mapping information table 1039 and the task table 1040 are stored must be provided separately from the database in which the encryption key and key ID are stored.

이에 따라, 비인가 사용자가 시스템(1041)에 접속하여 개인정보를 조회하는 경우, 데이터 관리 플랫폼(10000)은 암호화된 개인정보에 대응하는 토큰 값 만을 반환할 수 있다. Accordingly, when an unauthorized user accesses the system 1041 and inquires personal information, the data management platform 10000 may return only the token value corresponding to the encrypted personal information.
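
A minimal sketch of the lookup behaviour, assuming a hypothetical `lookup` function with a boolean `authorized` flag (the source does not describe how authorization is decided). `toy_cipher` is an insecure XOR stand-in for the real cipher, and the key store is kept in a separate dictionary to echo the requirement that keys live apart from the tables.

```python
import base64
import hashlib


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream. Toy stand-in for a real cipher; NOT secure."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def lookup(token: str, mapping_table: dict, key_store: dict, authorized: bool) -> str:
    """Return the decrypted value to authorized callers, and only the token otherwise."""
    if not authorized:
        return token
    row = mapping_table[token]
    raw = base64.b64decode(row["ciphertext"])
    return toy_cipher(key_store[row["key_id"]], raw).decode("utf-8")


# Keys and tables live in separate stores (assumed demo values).
key_store = {"KEY001": b"first-demo-key"}
ciphertext = base64.b64encode(toy_cipher(key_store["KEY001"], b"730101-1088123")).decode("ascii")
mapping_table = {"abcxxf": {"ciphertext": ciphertext, "key_id": "KEY001"}}

unauthorized_view = lookup("abcxxf", mapping_table, key_store, authorized=False)
authorized_view = lookup("abcxxf", mapping_table, key_store, authorized=True)
```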

도 50은 본 발명의 데이터 관리 방법이 개인정보를 암호화하는 실시 예를 설명하는 도면이다. Figure 50 is a diagram explaining an embodiment of the data management method of the present invention encrypting personal information.

단계(S80010)에서, 데이터 관리 방법은 암호화 키 및 키 ID를 생성할 수 있다. 이때, 키 ID는 암호화 알고리즘이 매핑되어 있다. At step S80010, the data management method may generate an encryption key and a key ID. At this time, the key ID is mapped to an encryption algorithm.

단계(S80020)에서, 데이터 관리 방법은 생성된 암호화 키를 이용하여 개인정보를 암호화할 수 있다. 일 실시 예에서, 데이터 관리 방법은 가장 최신 암호화 키를 이용하여 개인정보를 암호화할 수 있다. In step S80020, the data management method may encrypt personal information using the generated encryption key. In one embodiment, the data management method may encrypt personal information using the most recent encryption key.

단계(S80030)에서, 데이터 관리 방법은 암호화된 개인정보에 대응하는 토큰(token) 값(대체 값)을 생성할 수 있다. 일 실시 예에서, 토큰 값은 암호화된 개인정보의 자리 수 및 형식(format)을 유지할 수 있다. In step S80030, the data management method may generate a token value (replacement value) corresponding to the encrypted personal information. In one embodiment, the token value may maintain the number and format of the encrypted personal information.

단계(S80040)에서, 데이터 관리 방법은 토큰 값, 암호본 및 키 ID를 포함하는 매핑 정보를 매핑 정보 테이블에 저장할 수 있다. In step S80040, the data management method may store mapping information including a token value, encrypted copy, and key ID in a mapping information table.

단계(S80050)에서, 데이터 관리 방법은 토큰 값을 업무 테이블에 저장할 수 있다. In step S80050, the data management method may store the token value in a business table.

도 51은 본 발명의 데이터 관리 방법이 신규 암호화 키를 생성하는 실시 예를 설명하는 도면이다. Figure 51 is a diagram illustrating an embodiment in which the data management method of the present invention generates a new encryption key.

단계(S80060)에서, 데이터 관리 방법은 신규 암호화 키 및 신규 키 ID를 생성할 수 있다. 일 실시 예에서, 데이터 관리 방법은 사용자 설정에 따라 또는 주기적으로 자동으로 신규 암호화 키 및 신규 키 ID를 생성할 수 있다. In step S80060, the data management method may generate a new encryption key and a new key ID. In one embodiment, the data management method may automatically generate a new encryption key and a new key ID according to user settings or periodically.

단계(S80070)에서, 데이터 관리 방법은 기존 암호화 키를 사용하여 암호화된 개인정보를 복호화할 수 있다. In step S80070, the data management method may decrypt personal information encrypted using an existing encryption key.

단계(S80080)에서, 데이터 관리 방법은 신규 암호화 키를 사용하여 개인정보를 다시 암호화할 수 있다. In step S80080, the data management method may re-encrypt the personal information using a new encryption key.

단계(S80090)에서, 데이터 관리 방법은 토큰 값, 새로운 암호본 및 신규 키 ID를 포함하는 매핑 정보를 매핑 정보 테이블에 저장할 수 있다. 이때, 업무 테이블에 저장된 토큰 값에는 아무 영향이 없다. In step S80090, the data management method may store mapping information including a token value, a new encrypted copy, and a new key ID in a mapping information table. At this time, there is no effect on the token value stored in the business table.

도 52는 본 발명의 데이터 관리 방법이 새로운 업무 데이터를 추가하는 실시 예를 설명하는 도면이다. Figure 52 is a diagram illustrating an embodiment of the data management method of the present invention adding new work data.

일 실시 예에서, 데이터 관리 방법은 새로운 업무 데이터를 수신하는 경우, 새로운 업무 데이터에 포함된 정보 중 암호화가 필요한 정보가 존재하는지 여부를 판단할 수 있다. 보다 상세하게는, 데이터 관리 방법은 새로운 업무 데이터를 수신하는 경우, 새로운 업무 데이터에 개인정보가 포함되어 있는 경우, 개인정보를 암호화하여 데이터베이스에 저장할 수 있다. 이하, 새로 수신한 업무 데이터에 포함된 정보 중 암호화가 필요한 정보가 개인정보인 경우를 예로 들어 설명하도록 한다. 다만, 개인정보가 아니더라도 암호화가 필요한 다른 정보에 적용될 수 있음은 물론이다. In one embodiment, when receiving new business data, the data management method may determine whether information that requires encryption exists among information included in the new business data. More specifically, the data management method may encrypt the personal information and store it in a database when new work data is received and the new work data includes personal information. Hereinafter, an example will be given where the information included in newly received work data that requires encryption is personal information. However, of course, it can be applied to other information that requires encryption even if it is not personal information.

단계(S90010)에서, 데이터 관리 방법은 새로운 개인정보를 수신할 수 있다. 일 실시 예에서, 데이터 관리 방법은 수집된 패킷으로부터 분석한 데이터로부터 또는 직접적으로 수신한 데이터 중 개인정보를 수신할 수 있다. At step S90010, the data management method may receive new personal information. In one embodiment, the data management method may receive personal information from data analyzed from collected packets or from data received directly.

단계(S90020)에서, 데이터 관리 방법은 새로운 개인정보에 대하여 가장 최신 암호화 키 및 키 ID를 적용하여 개인정보를 암호화할 수 있다. In step S90020, the data management method may encrypt personal information by applying the most recent encryption key and key ID to new personal information.

단계(S90030)에서, 데이터 관리 방법은 개인정보에 대응하는 토큰 값을 생성할 수 있다. In step S90030, the data management method may generate a token value corresponding to personal information.

단계(S90040)에서, 데이터 관리 방법은 토큰 값, 암호본 및 키 ID를 포함하는 매핑 정보를 매핑 정보 테이블에 저장할 수 있다. In step S90040, the data management method may store mapping information including a token value, encrypted copy, and key ID in a mapping information table.

단계(S90050)에서, 데이터 관리 방법은 토큰 값을 업무 테이블에 저장할 수 있다. In step S90050, the data management method may store the token value in a business table.

도 53은 본 발명의 데이터 관리 방법이 개인정보를 암호화하는 실시 예를 설명하는 도면이다. Figure 53 is a diagram explaining an embodiment of the data management method of the present invention encrypting personal information.

단계(S90060)에서, 데이터 관리 방법은 제 1 데이터를 수집할 수 있다. 본 발명의 데이터 관리 방법이 제 1 데이터를 수집하는 방법은 도 2 내지 도 7의 실시 예를 참고하도록 한다. In step S90060, the data management method may collect first data. For information on how the data management method of the present invention collects first data, refer to the embodiments of FIGS. 2 to 7.

단계(S90070)에서, 데이터 관리 방법은 제 1 데이터를 분석할 수 있다. 본 발명의 데이터 관리 방법이 제 1 데이터를 분석하는 방법은 도 2 내지 도 5, 도 8 내지 13의 실시 예를 참고하도록 한다. In step S90070, the data management method may analyze the first data. Refer to the embodiments of FIGS. 2 to 5 and 8 to 13 for how the data management method of the present invention analyzes the first data.

단계(S90080)에서, 데이터 관리 방법은 제 1 데이터로부터 제 1 개인정보를 추출할 수 있다. 본 발명의 데이터 관리 방법이 제 1 데이터로부터 제 1 개인정보를 추출하는 방법은 도 2 내지 5, 도 14, 도 44의 실시 예를 참고하도록 한다. In step S90080, the data management method may extract first personal information from first data. For the method of extracting first personal information from first data by the data management method of the present invention, refer to the embodiments of FIGS. 2 to 5, 14, and 44.

단계(S90090)에서, 데이터 관리 방법은 제 1 개인정보를 암호화할 수 있다. 본 발명의 데이터 관리 방법이 제 1 개인정보를 암호화하는 방법은 도 44 내지 도 48의 실시 예를 참고하도록 한다. In step S90090, the data management method may encrypt the first personal information. For the method of encrypting the first personal information in the data management method of the present invention, refer to the embodiments of FIGS. 44 to 48.

데이터에 개인정보가 포함되어 있는 경우, 법적 조건에 따라 개인정보를 암호화하여 별도로 보관해야 할 필요가 있다. 즉, 상술한 바와 같이 본 발명의 데이터 관리 플랫폼에서는 개인 정보를 토큰 값으로 저장할 수 있다. 다만, 시스템 운영자의 요구에 따라 테스트를 위하여 개인정보를 사용하는 경우가 있다. If the data contains personal information, it is necessary to encrypt the personal information and store it separately in accordance with legal conditions. That is, as described above, the data management platform of the present invention can store personal information as a token value. However, there are cases where personal information is used for testing at the request of the system operator.

본 발명에서는, 시스템에 저장된 토큰 값은 유지한 상태로, 암호화 데이터만을 변조하여 테스트 시스템에서 사용할 수 있도록 한다. 즉, 본 발명은 테스트 환경에서 암호화된 개인정보 조회 시 변조된 데이터를 제공함으로써 암호화된 데이터와 다른 데이터를 제공할 수 있다. In the present invention, the token value stored in the system is maintained, and only the encrypted data is altered so that it can be used in the test system. In other words, the present invention can provide data different from the encrypted data by providing altered data when encrypted personal information is searched in a test environment.

이를 통하여, 사용 목적을 다하거나, 내/외부 개발자를 통한 데이터 유출을 막을 수 있다는 장점이 있다. This has the advantage of being able to fulfill its intended purpose and prevent data leakage through internal/external developers.

도 54는 본 발명의 데이터 관리 플랫폼에서 데이터를 변조하는 실시 예를 설명하는 도면이다. Figure 54 is a diagram explaining an embodiment of modulating data in the data management platform of the present invention.

본 발명의 데이터 관리 플랫폼(10000)은 키 관리 모듈(20003)을 통하여 데이터를 변조할 수 있다. 이를 위하여, 데이터 관리 플랫폼(10000)의 키 관리 모듈(20003)은 토큰 값/매핑 정보 생성부(2017), 매핑 정보 저장부(2018), 데이터 변조부(2030) 및 데이터 조회부(2031)를 포함한다. The data management platform 10000 of the present invention can alter data through the key management module 20003. To this end, the key management module 20003 of the data management platform 10000 includes a token value/mapping information generation unit 2017, a mapping information storage unit 2018, a data modulation unit 2030, and a data inquiry unit 2031.

토큰 값/매핑 정보 생성부(2017)는 변조할 데이터에 대응하는 토큰 값 및 매핑 정보를 생성할 수 있다. 여기에서, 토큰 값은 데이터의 변조를 지원하고 원문 데이터를 대신하여 사용될 수 있다. The token value/mapping information generator 2017 may generate a token value and mapping information corresponding to data to be modulated. Here, the token value supports the alteration of data and can be used in place of the original data.

매핑 정보 저장부(2018)는 생성된 토큰 값을 업무 테이블(1040)에 저장하고, 토큰 값, 원문 데이터에 대응하는 암호본 및 기타 정보를 매핑 정보 테이블(1039)에 저장할 수 있다. 이를 통해, 변조된 데이터와 원문 데이터를 연결하고, 변조 대상 정보에 대한 신뢰성을 제공할 수 있다. The mapping information storage unit 2018 may store the generated token value in the task table 1040, and store the token value, the ciphertext corresponding to the original data, and other information in the mapping information table 1039. Through this, the altered data can be linked to the original data, and the reliability of the information subject to alteration can be ensured.

또한, 매핑 정보 저장부(2018)는 외부 요청에 기초하여 업무 테이블(1040) 및 매핑 정보 테이블(1039)에 저장된 정보를 조회할 수 있도록 한다. 이를 통해, 변조된 데이터에 대한 접근 및 검증을 용이하게 할 수 있다. Additionally, the mapping information storage unit 2018 allows the information stored in the task table 1040 and the mapping information table 1039 to be queried in response to external requests. Through this, access to and verification of the altered data can be facilitated.

데이터 변조부(2030)는 변조 정책에 따라 원문 데이터를 변조된 데이터로 변조할 수 있다. 즉, 데이터 변조부(2030)는 업무 테이블(1040)에 저장된 토큰 값은 변경하지 않고, 매핑 정보 테이블(1039)에 저장된 암호본을 변조된 암호본으로 변조하여, 업무 데이터의 변조 효과를 달성할 수 있다. 또한, 일 실시 예에서, 데이터 변조부(2030)는 데이터 변조를 위한 정책의 주기 및 승인을 관리할 수 있다. 보다 상세하게는, 변조를 위한 정책은 어떤 원문 데이터를 어떤 키를 사용하여 변조할 것인지를 포함할 수 있다. 또한, 변조 주기는 사용자에 의해 설정될 수 있다. 또한, 데이터 관리 플랫폼(10000)은 변조 승인 여부에 대하여 데이터 변조를 수행한 후 변조를 적용할 것인지 여부를 사용자에게 요청할 수 있다. 이를 통해, 데이터 변조 작업을 통제하고 변조 작업에 대한 검증 절차를 마련할 수 있다. The data modulation unit 2030 may alter original data into altered data according to an alteration policy. That is, the data modulation unit 2030 leaves the token value stored in the task table 1040 unchanged and replaces the ciphertext stored in the mapping information table 1039 with an altered ciphertext, thereby achieving the effect of altering the business data. Additionally, in one embodiment, the data modulation unit 2030 may manage the cycle and approval of policies for data alteration. More specifically, an alteration policy may specify which original data is to be altered and with which key. The alteration cycle may be set by the user. As for approval, after performing the alteration the data management platform 10000 may ask the user whether to apply it. This makes it possible to control alteration operations and to establish a verification procedure for them.
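
One way to picture the alteration step is the sketch below. The names `alter_row` and `scramble_digits`, the digit-scrambling policy, and the `approved` flag are assumptions introduced for illustration, and `toy_cipher` is an insecure XOR stand-in for the real cipher. The sketch replaces only the ciphertext in the mapping row and leaves the token alone.

```python
import base64
import hashlib
import secrets
import string


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream. Toy stand-in for a real cipher; NOT secure."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def scramble_digits(text: str) -> str:
    """Example alteration policy (assumed): replace each digit with a random digit."""
    return "".join(secrets.choice(string.digits) if ch.isdigit() else ch for ch in text)


def alter_row(mapping_table: dict, key_store: dict, token: str, policy, approved: bool) -> str:
    """Compute the altered value; write its ciphertext back only when approved."""
    row = mapping_table[token]
    key = key_store[row["key_id"]]
    original = toy_cipher(key, base64.b64decode(row["ciphertext"])).decode("utf-8")
    altered = policy(original)
    if approved:
        new_raw = toy_cipher(key, altered.encode("utf-8"))
        row["ciphertext"] = base64.b64encode(new_raw).decode("ascii")
    return altered


key_store = {"KEY001": b"first-demo-key"}  # assumed demo key
ciphertext = base64.b64encode(toy_cipher(key_store["KEY001"], b"730101-1111111")).decode("ascii")
mapping_table = {"abcxxf": {"ciphertext": ciphertext, "key_id": "KEY001"}}
business_table = ["abcxxf"]

altered = alter_row(mapping_table, key_store, "abcxxf", scramble_digits, approved=True)
```

Passing `approved=False` returns the would-be altered value without touching the table, which is one possible reading of asking the user whether to apply an alteration after it has been computed.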

데이터 조회부(2031)는 업무 테이블(1040)에 저장된 토큰 값과 매핑 정보 테이블(1039)에 저장된 매핑 정보를 이용하여 데이터를 조회할 수 있다. 이때, 데이터 조회부(2031)는 조회를 요청한 사용자에 대하여 변조된 데이터를 제공함으로써 원문 데이터의 안전성을 유지할 수 있다. The data search unit 2031 can search data using the token value stored in the task table 1040 and the mapping information stored in the mapping information table 1039. At this time, the data inquiry unit 2031 can maintain the safety of the original data by providing altered data to the user who requested the inquiry.

도 55는 본 발명의 데이터 관리 방법이 데이터에 대응하는 토큰 값 및 매핑 정보를 생성하는 실시 예를 설명하는 도면이다. Figure 55 is a diagram illustrating an embodiment in which the data management method of the present invention generates token values and mapping information corresponding to data.

단계(S11010)에서, 데이터 관리 방법은 데이터를 수신할 수 있다. 여기에서, 수신한 데이터는 변조 대상 데이터에 대응할 수 있다. 일 실시 예에서, 사용자가 변조를 요청한 데이터에 대응할 수 있다. In step S11010, the data management method may receive data. Here, the received data may correspond to data subject to modulation. In one embodiment, it may respond to data that a user has requested to be altered.

단계(S11020)에서, 데이터 관리 방법은 수신한 데이터에 대응하는 토큰 값 및 매핑 정보를 생성할 수 있다. 여기에서, 매핑 정보는 암호본과 암호본에 대한 기타정보를 포함할 수 있다. 이때, 기타정보는 원문 데이터의 실제 암호화 데이터에 대한 정보, 해쉬(hash) 정보 및 키 ID 등을 포함할 수 있다. In step S11020, the data management method may generate a token value and mapping information corresponding to the received data. Here, the mapping information may include the ciphertext and other information about the ciphertext. The other information may include information about the actual encrypted data of the original data, hash information, a key ID, and the like.

단계(S11030)에서, 데이터 관리 방법은 생성된 토큰 값 및 매핑 정보를 매핑 정보 테이블에 저장할 수 있다. 일 실시 예에서, 데이터 관리 방법은 변조 대상 데이터에 대응하는 토큰 값과 매핑 정보를 매핑하여 매핑 정보 테이블에 저장할 수 있다. 또한, 데이터 관리 방법은 토큰 값에 대해서는 업무 테이블에 저장할 수 있다. In step S11030, the data management method may store the generated token value and mapping information in a mapping information table. In one embodiment, the data management method may map token values and mapping information corresponding to data to be altered and store them in a mapping information table. Additionally, the data management method can store token values in a business table.

도 56은 본 발명의 매핑 정보 테이블 및 업무 테이블을 설명하는 도면이다. Figure 56 is a diagram explaining the mapping information table and business table of the present invention.

일 실시 예에서, 매핑 정보 테이블(1039)은 변조 대상 데이터에 대응하는 토큰 값, 암호본 및 기타정보를 포함할 수 있다. 예를 들어, 변조 대상 데이터가 개인정보이고, 주민등록번호인 경우, 원문 데이터가 “730101-1111111”인 경우, 암호화된 상태의 원문 데이터는 “730101-1abcxxf”에 대응할 수 있다. 일 실시 예에서, 데이터 관리 플랫폼은 변조 대상(target) 데이터인 원문 데이터 “730101-1111111”에 대한 토큰 값으로 “abcxxf”을 생성할 수 있다. In one embodiment, the mapping information table 1039 may include a token value, encrypted copy, and other information corresponding to the data to be altered. For example, if the data to be altered is personal information and a resident registration number, and the original data is “730101-1111111”, the encrypted original data may correspond to “730101-1abcxxf”. In one embodiment, the data management platform may generate “abcxxf” as a token value for the original data “730101-1111111”, which is the target data.

이에 따라, 데이터 관리 플랫폼은 매핑 정보 테이블(1039)에 원문 데이터 “730101-1111111”에 대응하는 암호본인 “HJ113801Z==”와 토큰 값 “abcxxf”와 원문 데이터 “730101-1111111”에 대한 기타정보를 매핑하여 저장할 수 있다. 여기에서, 기타정보는 원문 데이터 “730101-1111111”에 대한 암호화 키 정보 및 원문 해쉬 정보 등을 포함할 수 있다. 예를 들어, 기타정보는 원문 데이터 “730101-1111111”에 대한 키 ID 등을 포함할 수 있다. 이에 대하여는 상술한 바와 같다. Accordingly, the data management platform may map and store, in the mapping information table 1039, the encrypted copy “HJ113801Z==” corresponding to the original data “730101-1111111”, the token value “abcxxf”, and other information about the original data “730101-1111111”. Here, the other information may include encryption key information and hash information of the original data “730101-1111111”. For example, the other information may include the key ID for the original data “730101-1111111”. This is as described above.

일 실시 예에서, 업무 테이블(1040)은 토큰 값을 포함할 수 있다. 상술한 예를 들어 설명하면, 변조 대상 데이터가 개인정보이고, 주민등록번호인 경우, 암호화된 상태의 원문 데이터인 “730101-1abcxxf”가 업무 테이블(1040)에 저장될 수 있다. In one embodiment, the business table 1040 may include token values. Continuing the above example, if the data to be altered is personal information, namely a resident registration number, “730101-1abcxxf”, which is the original data in an encrypted state, may be stored in the business table 1040.

이때, 업무 테이블(1040)에 저장된 토큰 값은 데이터가 암호화되거나 변조되어도 변하지 않기 때문에 데이터의 무결성을 보장할 수 있다. At this time, the token value stored in the business table 1040 does not change even if the data is encrypted or altered, so the integrity of the data can be guaranteed.

도 57은 본 발명의 데이터 관리 방법이 데이터를 변조하는 실시 예를 설명하는 도면이다. Figure 57 is a diagram explaining an embodiment in which the data management method of the present invention modifies data.

단계(S11040)에서, 데이터 관리 방법은 변조 정책(policy)에 따른 원문(original) 데이터를 변조할 수 있다. 여기에서, 변조 정책이란 어떤 데이터를 어떤 키로 변조할 것인지에 대한 정보를 포함할 수 있다. 보다 상세하게는, 데이터 변조의 정의는 원문 데이터를 다른 데이터로 변경(change)하는 것을 나타낸다. 본 발명의 데이터 관리 방법에 있어서, 데이터의 형식, 포맷 또는 길이 등을 유지하는 상태에서, 암호화된 데이터를 또 다른 데이터로 암호화하는 과정을 데이터 변조 과정으로 볼 수 있다. 이를 위하여, 데이터 관리 방법은 암호화된 데이터를 다른 데이터로 변조하기 위한 변조 키를 이용할 수 있다. 여기에서, 변조 키는 상술한 실시 예인 암호화 키와 동일하거나 상이할 수 있다. 보다 상세하게는, 변조 키와 암호화 키는 물리적으로 동일한 키일 수 있으나, 다른 용도로 사용할 수 있다. In step S11040, the data management method may alter the original data according to an alteration policy. Here, the alteration policy may include information about which data is to be altered with which key. More specifically, data alteration is defined as changing original data into other data. In the data management method of the present invention, the process of encrypting encrypted data into other data while maintaining the type, format, or length of the data can be regarded as the data alteration process. To this end, the data management method may use an alteration key to alter encrypted data into other data. Here, the alteration key may be the same as or different from the encryption key of the above-described embodiment. More specifically, the alteration key and the encryption key may be physically the same key but used for different purposes.

단계(S11050)에서, 데이터 관리 방법은 변조된 데이터를 매핑 정보 테이블에 저장할 수 있다. 일 실시 예에서, 데이터 관리 방법은 매핑 정보 테이블에 저장된 암호화된 원문 데이터를 변조된 데이터로 대체할 수 있다. In step S11050, the data management method may store the altered data in the mapping information table. In one embodiment, the data management method may replace the encrypted original data stored in the mapping information table with the altered data.

도 58은 본 발명의 변조된 데이터가 저장된 매핑 정보 테이블 및 업무 테이블을 설명하는 도면이다. Figure 58 is a diagram explaining the mapping information table and business table in which the altered data of the present invention is stored.

일 실시 예에서, 매핑 정보 테이블(1039)은 변조 대상 데이터에 대응하는 토큰 값, 원문 데이터 값 및 기타정보를 포함할 수 있다. 상술한 실시 예에 따라, 데이터 변조가 수행되면, 매핑 정보 테이블(1039)은 변조 대상 데이터에 대응하는 토큰 값, 변조된 데이터 값 및 기타정보를 포함할 수 있다. 즉, 데이터 변조가 수행되면, 데이터 관리 플랫폼은 매핑 정보 테이블(1039)에 포함된 원문 데이터를 변조된 데이터로 변경할 수 있다. In one embodiment, the mapping information table 1039 may include a token value corresponding to the data to be altered, an original data value, and other information. According to the above-described embodiment, once data alteration is performed, the mapping information table 1039 may include the token value corresponding to the data to be altered, the altered data value, and other information. That is, when data alteration is performed, the data management platform can replace the original data held in the mapping information table 1039 with the altered data.

상술한 예를 들어 설명하면, 원문 데이터가 암호화된 주민등록번호인 경우, 매핑 정보 테이블(1039)에는 원문 데이터 “730101-1111111”에 대응하는 제 1 암호본인 “HJ113801Z==”이 저장되어 있다. 이때, 데이터 관리 플랫폼을 통하여 데이터 변조가 일어나는 경우, 원문 데이터가 “730101-1222222”로 변조되고, 매핑 정보 테이블(1039)에는 변조 데이터인 “730101-1222222”에 대응하는 제 2 암호본인 “AB123051X==”가 저장될 수 있다. 이때, 데이터 관리 플랫폼은 업무 테이블(1040)에 저장된 토큰 값은 변경하지 않는다. Continuing the above example, when the original data is an encrypted resident registration number, the mapping information table 1039 stores the first encrypted copy “HJ113801Z==” corresponding to the original data “730101-1111111”. When data alteration is then performed through the data management platform, the original data is altered to “730101-1222222”, and the second encrypted copy “AB123051X==” corresponding to the altered data “730101-1222222” may be stored in the mapping information table 1039. At this time, the data management platform does not change the token value stored in the business table 1040.
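The replacement of the first encrypted copy by the second one, with the token left untouched, might look like the following Python sketch. It rests on assumptions that are not in the source: `toy_encrypt` and `toy_decrypt` are placeholders that merely reverse the string, the alteration rule (keep the first eight characters, redraw every later digit) is only inferred from the "730101-1111111" to "730101-1222222" example, and every name is hypothetical.

```python
import random
import string


def toy_encrypt(plaintext: str) -> str:
    """Placeholder only, NOT real encryption: reverses the string."""
    return plaintext[::-1]


def toy_decrypt(ciphertext: str) -> str:
    """Inverse of the placeholder above."""
    return ciphertext[::-1]


def alter_preserving_format(value: str, keep: int = 8, rng=None) -> str:
    """Redraw each digit after the first `keep` characters; length and layout stay the same."""
    rng = rng or random.Random()
    head, tail = value[:keep], value[keep:]
    if not any(ch.isdigit() for ch in tail):
        return value  # nothing to alter
    while True:
        new_tail = "".join(
            rng.choice(string.digits) if ch.isdigit() else ch for ch in tail
        )
        if new_tail != tail:  # make sure the result really differs
            return head + new_tail


def alter_entry(mapping_table: dict, token: str, rng=None) -> str:
    """Swap the stored encrypted copy for one that encodes the altered value."""
    entry = mapping_table[token]
    original = toy_decrypt(entry["encrypted_copy"])   # first encrypted copy
    altered = alter_preserving_format(original, rng=rng)
    entry["encrypted_copy"] = toy_encrypt(altered)    # second encrypted copy
    return altered
```

Because the function only rewrites the mapping table entry, the token key and any business table row that carries the token stay exactly as they were.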

도 59는 본 발명의 데이터 관리 방법에서 변조된 데이터를 조회하는 실시 예를 설명하는 도면이다. Figure 59 is a diagram illustrating an embodiment of searching for altered data in the data management method of the present invention.

단계(S11060)에서, 데이터 관리 방법은 데이터 조회 요청을 수신할 수 있다. 예를 들어, 제 1 사용자는 데이터를 요청할 수 있다. In step S11060, the data management method may receive a data inquiry request. For example, a first user may request data.

단계(S11070)에서, 데이터 관리 방법은 매핑 정보 및 토큰 값을 이용해 변조된 데이터를 조회할 수 있다. 보다 상세하게는 데이터 관리 방법은 업무 테이블에서 토큰 값을 검색하고, 매핑 정보 테이블에서 토큰 값에 대응하는 변조 데이터를 제공할 수 있다. 이에 따라, 데이터를 요청하는 사용자는 변조된 데이터를 확인할 수 있다. In step S11070, the data management method can query the altered data using mapping information and token values. More specifically, the data management method may retrieve a token value from a business table and provide modulated data corresponding to the token value from a mapping information table. Accordingly, the user requesting data can confirm the altered data.

상술한 예를 들어 설명하면, 사용자가 토큰 값 “abcxxf”에 대한 데이터를 요청하는 경우, 데이터 관리 방법은 토큰 값인 “abcxxf”에 대응하는 변조된 데이터 값인 “730101-1222222”을 제 1 사용자에게 제공할 수 있다. 이때, 데이터 관리 방법은 변조된 데이터 값을 제공하기 위하여 상술한 제 2 암호본을 이용할 수 있다. Continuing the above example, when a user requests data for the token value “abcxxf”, the data management method may provide the first user with the altered data value “730101-1222222” corresponding to the token value “abcxxf”. At this time, the data management method may use the above-described second encrypted copy to provide the altered data value.

즉, 사용자는 암호화된 원문 데이터가 아닌 변조된 데이터를 제공받을 수 있다. In other words, users may be provided with altered data rather than encrypted original data.
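The lookup in steps S11060 and S11070 can be sketched in a few lines of Python. As before, this is an illustration only: `toy_decrypt` is a placeholder that reverses a string rather than a real decryption routine, the token is assumed to be everything after the first eight characters of the business table value, and the table contents are made up to match the running example.

```python
def toy_decrypt(ciphertext: str) -> str:
    """Placeholder only, NOT real decryption: reverses the string."""
    return ciphertext[::-1]


def lookup(tokenized_value: str, mapping_table: dict, keep: int = 8) -> str:
    """S11070: take the token from the business table value and resolve it."""
    token = tokenized_value[keep:]                 # e.g. "abcxxf"
    entry = mapping_table[token]                   # mapping information for the token
    return toy_decrypt(entry["encrypted_copy"])    # value behind the stored copy


# The mapping table already holds the second encrypted copy (after alteration).
mapping_table = {"abcxxf": {"encrypted_copy": "730101-1222222"[::-1]}}
result = lookup("730101-1abcxxf", mapping_table)
```

Here `result` is "730101-1222222": the requester sees the altered value, never the original "730101-1111111", because the mapping table no longer holds a copy of it.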

도 60은 본 발명의 데이터 관리 플랫폼에서 데이터를 변조하는 실시 예를 설명하는 도면이다. Figure 60 is a diagram explaining an embodiment of modulating data in the data management platform of the present invention.

상술한 데이터 관리 플랫폼은 운영 ERP(Enterprise Resource Planning) 시스템 및 테스트 ERP 시스템에서 사용될 수 있다. 여기에서, ERP 시스템은 기업 내 다양한 부서와 업무 영역을 통합하여 관리하고, 중앙에서 자원을 효율적으로 계획, 조정, 관리하는 소프트웨어 시스템에 대응한다. 즉, ERP 시스템은 다양한 모듈로 구성되어 있으며, 각 모듈은 특정 업무에 대한 기능과 데이터를 제공할 수 있다. 이때, ERP 시스템에서 사용되는 모듈은 본 발명의 데이터 관리 플랫폼(10000)의 모듈에 대응한다. The data management platform described above can be used in operational ERP (Enterprise Resource Planning) systems and test ERP systems. Here, the ERP system integrates and manages various departments and business areas within a company and corresponds to a software system that efficiently plans, coordinates, and manages resources centrally. In other words, the ERP system is composed of various modules, and each module can provide functions and data for specific tasks. At this time, the module used in the ERP system corresponds to the module of the data management platform (10000) of the present invention.

일 실시 예에서, 운영 ERP 시스템의 사용자(3001)는 토큰 값이 포함된 암호화된 데이터(3002)를 운영 ERP 시스템에 입력하고, 데이터 관리 플랫폼(10000)을 통하여 복호화된 데이터(3003)를 수신할 수 있다. 이에 따라, 운영 ERP 시스템의 사용자(3001)는 암호화가 필요한 개인정보를 복호화하여 확인할 수 있다. In one embodiment, a user 3001 of the operational ERP system may input encrypted data 3002 containing a token value into the operational ERP system and receive decrypted data 3003 through the data management platform 10000. Accordingly, the user 3001 of the operational ERP system can decrypt and check personal information that requires encryption.

이후, 테스트 ERP 시스템의 개발자(3004)는 토큰 값이 포함된 암호화된 데이터(3002)를 테스트 ERP 시스템에 입력할 수 있다. 이때, 데이터 관리 플랫폼은 운영 ERP 시스템에 존재하는 데이터를 복제 후 변조하여 테스트 ERP 시스템에게 변조된 데이터를 전송할 수 있다. 또한, 운영 ERP 시스템 또는 테스트 ERP 시스템에서 데이터 변조를 수행할 수 있다. 운영 ERP 시스템 또는 테스트 ERP 시스템에서 데이터 변조를 수행하는 방법은 상술한 실시 예를 참고하도록 한다. 일 실시 예에서, 데이터의 변조 시점은 기 설정된 주기에 따라 결정될 수 있다. 즉, 데이터 관리 플랫폼(10000)은 데이터 변조를 개발자(3004)의 요청에 따라 실시간으로 수행하지 않는 것을 특징으로 한다. Thereafter, the developer 3004 of the test ERP system may input the encrypted data 3002 including the token value into the test ERP system. At this time, the data management platform may copy the data existing in the operational ERP system, alter the copy, and transmit the altered data to the test ERP system. Data alteration may also be performed in the operational ERP system or the test ERP system; for how to do so, refer to the above-described embodiments. In one embodiment, the time at which data is altered may be determined according to a preset period. That is, the data management platform 10000 is characterized in that it does not perform data alteration in real time at the request of the developer 3004.

이에 따라, 개발자(3004)는 변조된 데이터(3005)를 수신할 수 있다. 이를 통해, 운영 ERP 시스템에서 데이터를 개발 환경으로 이관하여 테스트에 사용해야 하는 사용자(본 도면의 개발자(3004))는 금융감독원의 규정 등을 지킬 수 있다. (참고: 전자금융 감독규정 시행세칙 제9조 제10항 “서비스 개발 또는 개선과 관련한 테스트시 우선적으로 가상 데이터를 사용해야 한다.”) Accordingly, the developer 3004 can receive the altered data 3005. Through this, a user (the developer 3004 in this drawing) who needs to transfer data from the operational ERP system to the development environment and use it for testing can comply with the regulations of the Financial Supervisory Service and the like. (Reference: Article 9, Paragraph 10 of the Detailed Enforcement Rules of the Electronic Financial Supervision Regulations: “Virtual data must be used preferentially when testing related to service development or improvement.”)

도 61은 본 발명의 데이터 관리 플랫폼에서 데이터를 변조하는 실시 예를 설명하는 도면이다. Figure 61 is a diagram explaining an embodiment of modulating data in the data management platform of the present invention.

운영 ERP 시스템의 사용자는 운영 데이터를 테스트 ERP 시스템으로 이관(System copy)할 수 있다. 즉, 마이그레이션(migration)을 수행할 수 있다. 마이그레이션은 한 시스템에서 다른 시스템으로의 이전 또는 전환 과정을 의미하며, 특히, 데이터 마이그레이션은 기존 시스템에서 생성, 저장된 데이터를 새로운 시스템으로 이전하는 작업에 해당한다. 데이터 마이그레이션은 데이터의 일관성과 무결성을 유지하면서, 새로운 시스템에서의 정상적인 운영을 보장해야만 한다. Users of the operational ERP system can transfer operational data to the test ERP system (system copy). In other words, migration can be performed. Migration refers to the process of transferring or converting from one system to another. In particular, data migration involves transferring data created and stored in an existing system to a new system. Data migration must ensure normal operation in the new system while maintaining data consistency and integrity.

본 발명의 데이터 관리 플랫폼(10000)은 운영 ERP 시스템의 사용자가 운영 데이터를 테스트 ERP 시스템으로 이관하면서, 법이나 규정을 지키기 위하여 데이터를 변조해야 하는 점에 도움을 줄 수 있다. The data management platform (10000) of the present invention can help users of an operational ERP system transfer operational data to a test ERP system while altering the data to comply with laws or regulations.

보다 상세하게는, 운영 ERP 시스템의 사용자가 개인정보가 포함된 암호화된 데이터(3002)를 테스트 ERP 시스템에 마이그레이션할 수 있다. 여기에서, 암호화된 데이터(3002)는 토큰 값을 포함할 수 있다. 본 도면을 예로 들어 설명하면, 개인정보가 주민등록번호인 경우, 암호화된 데이터(3002)는 토큰 값을 포함하는 “720403-1kjWkaq”에 대응한다. 여기에서, 토큰 값은 “kjWkaq”에 대응할 수 있다. More specifically, users of an operational ERP system can migrate encrypted data 3002 containing personal information to a test ERP system. Here, the encrypted data 3002 may include a token value. Taking this drawing as an example, if the personal information is a resident registration number, the encrypted data 3002 corresponds to “720403-1kjWkaq” including the token value. Here, the token value may correspond to “kjWkaq”.

일 실시 예에서, 테스트 ERP 시스템은 데이터 관리 플랫폼(10000)에 데이터 변조를 요청할 수 있다. 이에 따라, 데이터 관리 플랫폼(10000)은 매핑 정보 테이블에 저장된 제 1 암호본(3002a)를 이용해 애플리케이션 운영 암호화 서버(1020a)로부터 데이터를 전달받아 암호화된 데이터를 복호화할 수 있다. 암호화된 데이터(3002)가 복호화된 경우, 상술한 예를 들어 설명하면, 복호화된 개인정보는 “720403-1234567”에 대응한다. In one embodiment, the test ERP system may request data alteration from the data management platform 10000. Accordingly, the data management platform 10000 may receive data from the application operation encryption server 1020a using the first encrypted copy 3002a stored in the mapping information table and decrypt the encrypted data. Continuing the above example, when the encrypted data 3002 is decrypted, the decrypted personal information corresponds to “720403-1234567”.

이후, 데이터 관리 플랫폼(10000)은 키 관리 모듈(20003)을 이용하여 복호화된 데이터(3003)를 변조하여 테스트 ERP 시스템에 변조된 데이터(3005)를 전달할 수 있다. 상술한 예를 들어 설명하면, 복호화된 변조된 개인정보는 “720403-1765432”에 대응한다. 즉, 데이터 관리 플랫폼(10000)을 통하여 변조된 데이터는 기존의 변조되기 전의 개인정보와 동일한 형식을 유지하지만 다른 내용을 포함할 수 있다. Thereafter, the data management platform 10000 may alter the decrypted data 3003 using the key management module 20003 and deliver the altered data 3005 to the test ERP system. Continuing the above example, the altered personal information, once decrypted, corresponds to “720403-1765432”. In other words, data altered through the data management platform 10000 keeps the same format as the personal information before alteration but may contain different content.

이후, 데이터 관리 플랫폼(10000)은 변조된 데이터(3005)에 대응하는 제2 암호본(3002b)를 애플리케이션 품질 암호화 서버(1020b)에 전달할 수 있다. 애플리케이션 품질 암호화 서버(1020b)는 제 2 암호본(3002b)을 테스트 ERP 시스템에 전달함으로써, 테스트 ERP 시스템의 개발자가 변조된 개인정보인 “720403-1765432”를 테스트에 이용할 수 있도록 한다. Thereafter, the data management platform 10000 may transmit the second encrypted copy 3002b corresponding to the altered data 3005 to the application quality encryption server 1020b. The application quality encryption server 1020b delivers the second encrypted copy 3002b to the test ERP system, allowing the developer of the test ERP system to use the altered personal information “720403-1765432” for testing.

테스트 ERP 시스템에는 누구나 접근이 가능하기 때문에 데이터가 복호화 되더라도 원래의 데이터를 사용해서는 안 되며, 본 발명의 데이터 관리 플랫폼(10000)을 통하여 이러한 점을 방지할 수 있다. Since anyone can access the test ERP system, even if the data is decrypted, the original data should not be used, and this can be prevented through the data management platform (10000) of the present invention.

도 62는 본 발명의 데이터 관리 방법이 데이터를 변조하는 실시 예를 설명하는 도면이다. Figure 62 is a diagram explaining an embodiment in which the data management method of the present invention alters data.

단계(S11081)에서, 본 발명의 데이터 관리 방법은 데이터에 대한 변조 요청을 수신할 수 있다. 이때, 데이터 관리 방법은 시스템의 사용자의 요청에 기초하여 데이터에 대한 변조 요청을 수신할 수 있다. 이에 대하여는, 도 1 내지 도 3 및 도 17의 실시 예를 참고하도록 한다. In step S11081, the data management method of the present invention may receive a request to alter data. At this time, the data management method may receive the alteration request based on a request from a user of the system. For this, refer to the embodiments of FIGS. 1 to 3 and FIG. 17.

단계(S11082)에서, 본 발명의 데이터 관리 방법은 데이터를 복호화할 수 있다. 이때, 데이터 관리 방법은 데이터를 복호화하기 위하여 매핑 정보 테이블에 포함된 제 1 암호본을 이용할 수 있다. 본 발명의 데이터 관리 방법이 데이터를 복호화하는 방법은 도 54 내지 도 61의 실시 예를 참고하도록 한다. In step S11082, the data management method of the present invention can decrypt the data. At this time, the data management method can use the first encrypted copy included in the mapping information table to decrypt the data. For how the data management method of the present invention decrypts data, refer to the embodiments of FIGS. 54 to 61.

단계(S11083)에서, 본 발명의 데이터 관리 방법은 복호화된 데이터를 변조할 수 있다. 본 발명의 데이터 관리 방법이 복호화된 데이터를 변조하는 방법은 도 54 내지 도 61의 실시 예를 참고하도록 한다. In step S11083, the data management method of the present invention can alter the decrypted data. For how the data management method of the present invention alters decrypted data, refer to the embodiments of FIGS. 54 to 61.

단계(S11084)에서, 본 발명의 데이터 관리 방법은 변조된 데이터에 대응하는 토큰 값 및 변조된 데이터에 대응하는 제 2 암호본을 매핑 정보 테이블에 저장할 수 있다. 본 발명의 데이터 관리 방법이 토큰 값 및 암호본을 매핑 정보 테이블에 저장하는 방법은 도 44 내지 도 53의 실시 예를 참고하도록 한다. In step S11084, the data management method of the present invention may store the token value corresponding to the altered data and the second encrypted copy corresponding to the altered data in the mapping information table. For how the data management method of the present invention stores token values and encrypted copies in the mapping information table, refer to the embodiments of FIGS. 44 to 53.

본 발명에 따르면, 테스트 환경에서 데이터 변조 기능을 제공할 수 있다는 장점이 있다. 즉, 개발 시스템에 실제 운영 데이터가 들어있을 경우 외부 유출 등 심각한 문제가 발생할 수 있어 위/변조하여 저장관리가 필요하다. 본 발명과 같이 토큰 처리를 통해 데이터를 변조하면, 수천만 내지 수억 건의 데이터가 아닌 중복을 제외한 수천건의 데이터 변조 만으로 모든 값을 변조할 수 있어 업무 효율성이 증가한다는 장점이 있다. According to the present invention, there is an advantage in that a data alteration function can be provided for test environments. That is, if a development system contains actual operational data, serious problems such as external leakage may occur, so the data needs to be altered before it is stored and managed. When data is altered through token processing as in the present invention, all values can be altered by altering only the thousands of distinct entries that remain after excluding duplicates, rather than tens or hundreds of millions of records, which increases work efficiency.
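The efficiency argument can be made concrete with a small Python sketch. It assumes, as in the earlier examples, that each business table value carries its token after the first eight characters; the row counts and the helper name `distinct_tokens` are invented for illustration.

```python
def distinct_tokens(business_rows, keep: int = 8) -> set:
    """Collect the unique tokens referenced by the business table rows."""
    return {row[keep:] for row in business_rows}


# One million business rows that reference only two distinct values.
rows = ["730101-1abcxxf"] * 500_000 + ["720403-1kjWkaq"] * 500_000
tokens = distinct_tokens(rows)

# Only one mapping table entry per token has to be altered;
# the business rows themselves are never rewritten.
print(len(rows), "rows reference", len(tokens), "mapping entries")
# prints: 1000000 rows reference 2 mapping entries
```

The work of alteration therefore scales with the number of distinct values in the mapping table, not with the number of rows that reference them.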

또한, 업무상 백업(back-up)이 필요한 모든 회사를 대상으로, 백업된 오래된 데이터도 변조가 가능하다는 장점이 있다. In addition, for all companies that require backup for business purposes, it has the advantage of being able to modify even old backed-up data.

또한, 토큰 값 및 매핑 정보를 통해 변조 데이터에 대한 신뢰성을 제공할 수 있다는 장점이 있다. Additionally, it has the advantage of being able to provide reliability for altered data through token values and mapping information.

로그를 수집하기 위해 에이전트를 사용할 때 다수의 에이전트에 대한 설정 변경, 형상 관리, 상태 모니터링과 같은 작업들은 많은 수작업을 요구하고 복잡성을 증가시킨다. 이로 인해 업무 효율성이 저하될 수 있다. 본 발명은 데이터 수집을 위한 에이전트에 대한 중앙 관리를 통해 이러한 문제를 해결하고자 한다. 이를 통해 에이전트에 대한 패치 적용, 동작 관리, 설정 관리 등의 작업을 자동화하고 중앙화하여 업무의 효율성을 증가시킬 수 있다. When using agents to collect logs, tasks such as changing settings for multiple agents, configuration management, and status monitoring require a lot of manual work and increase complexity. This may reduce work efficiency. The present invention seeks to solve this problem through central management of agents for data collection. Through this, work efficiency can be increased by automating and centralizing tasks such as patching, operation management, and settings management for agents.

도 63은 본 발명의 데이터 관리 플랫폼에서 에이전트를 관리하는 실시 예를 설명하는 도면이다. Figure 63 is a diagram explaining an embodiment of managing agents in the data management platform of the present invention.

본 발명의 데이터 관리 플랫폼(10000)의 수집 모듈(20001)은 에이전트 관리부(2032)를 통하여 적어도 하나의 에이전트(20008a, 20008b, …)를 관리할 수 있다. 여기에서, 에이전트(Agent)는 컴퓨터 시스템 또는 네트워크 상에서 특정한 작업을 수행하기 위해 설치되고 실행되는 소프트웨어 모듈 또는 프로그램에 대응할 수 있다. 본 발명의 에이전트는 호스트 서버에 설치되어 설정된 기능(예를 들어, 데이터 수집 기능)을 수행할 수 있다. The collection module 20001 of the data management platform 10000 of the present invention can manage at least one agent (20008a, 20008b, …) through the agent management unit 2032. Here, an agent may correspond to a software module or program that is installed and executed to perform a specific task on a computer system or network. The agent of the present invention may be installed on a host server to perform a configured function (for example, a data collection function).

이를 위하여, 데이터 관리 플랫폼(10000)의 수집 모듈(20001)은 에이전트 관리부(2032)를 포함할 수 있다. To this end, the collection module 20001 of the data management platform 10000 may include an agent management unit 2032.

에이전트 관리부(2032)는 에이전트에 대한 형상을 관리할 수 있다. 여기에서, 에이전트에 대한 형상은 에이전트의 버전(version)을 나타낼 수 있다. 에이전트 관리부(2032)는 수집 대상에 설치된 에이전트에 대한 형상 정보를 모니터링하며, 신규 버전의 형상으로 업데이트하는 명령을 수행할 수 있다. 여기에서, 수집 대상은 본 발명의 데이터 관리 플랫폼(10000)에서 로그를 수집하고자 하는 호스트 서버에 해당한다. 또한, 에이전트 관리부(2032)는 적어도 하나의 에이전트(20008a, 20008b, …)의 수집 상태를 모니터링할 수 있다. The agent management unit 2032 can manage the configuration of an agent. Here, the configuration of an agent may indicate the version of the agent. The agent management unit 2032 monitors configuration information about the agents installed on collection targets and can issue a command to update them to a new version. Here, a collection target corresponds to a host server from which the data management platform 10000 of the present invention is to collect logs. Additionally, the agent management unit 2032 may monitor the collection status of at least one agent (20008a, 20008b, …).

일 실시 예에서, 에이전트 관리부(2032)는 적어도 하나의 에이전트(20008a, 20008b, …)별로 수행에 필요한 명령어를 생성하고, 해당 에이전트에 배포할 수 있다. 예를 들어, 에이전트 관리부(2032)는 제 1 에이전트(20008a)을 위한 로그 데이터 수집 명령어를 생성하고, 제 1 에이전트(20008a)에게 로그 데이터 수집 명령어를 배포할 수 있다. In one embodiment, the agent management unit 2032 may generate the commands required for execution for each of at least one agent (20008a, 20008b, …) and distribute them to the corresponding agent. For example, the agent management unit 2032 may generate a log data collection command for the first agent 20008a and distribute the log data collection command to the first agent 20008a.

일 실시 예에서, 에이전트 관리부(2032)는 적어도 하나의 에이전트(20008a, 20008b, …)가 설치된 서버에 대한 목록을 관리할 수 있다. In one embodiment, the agent management unit 2032 may manage a list of servers on which at least one agent (20008a, 20008b, …) is installed.

이때, 에이전트는 데이터 수집 기능을 수행하기 위한 적어도 하나의 모듈을 포함할 수 있다. 일 실시 예에서, 제 1 에이전트(20008a)는 와치독 모듈(Watchdog module, 2033) 및 워커 모듈(Worker module, 2034)을 포함할 수 있다. 여기에서, 워커 모듈은 실질적으로 데이터를 수집하는 기능을 수행하며, 에이전트 모듈이라는 이름으로 지칭될 수 있다. At this time, the agent may include at least one module to perform a data collection function. In one embodiment, the first agent 20008a may include a watchdog module 2033 and a worker module 2034. Here, the worker module actually performs the function of collecting data and may be referred to as an agent module.

여기에서, 와치독 모듈(2033)은 제 1 에이전트(20008a)에서 사용되는 모듈 중 하나로, 에이전트의 예기치 않은 동작 또는 비정상적인 상태를 감지하는 기능을 수행할 수 있다. 일 실시 예에서, 와치독 모듈(2033)은 데이터 관리 플랫폼(10000)과의 통신을 담당하고, 워커 모듈(2034)에 대한 제어를 담당할 수 있다. Here, the watchdog module 2033 is one of the modules used in the first agent 20008a, and may perform a function of detecting an unexpected operation or abnormal state of the agent. In one embodiment, the watchdog module 2033 may be responsible for communication with the data management platform 10000 and may be responsible for controlling the worker module 2034.

워커 모듈(2034)은 제 1 에이전트(20008a)의 핵심적인 기능을 수행하는 모듈로 주로 백그라운드에서 동작하며 특정 작업을 처리하고 결과를 반환하는 역할을 담당할 수 있다. The worker module 2034 is a module that performs the core functions of the first agent 20008a and mainly operates in the background and may be responsible for processing specific tasks and returning results.

특히, 본 발명에서는, 워커 모듈(2034)은 와치독 모듈(2033)로부터 전달받은 명령어를 수행하고 그 수행 결과를 데이터 관리 플랫폼(10000)으로 반환할 수 있다. In particular, in the present invention, the worker module 2034 can execute the command received from the watchdog module 2033 and return the execution result to the data management platform 10000.

이에 따라, 데이터 관리 플랫폼(10000)은 적어도 하나의 에이전트(20008a, 20008b)로부터 수집된 데이터를 데이터베이스(20007)에 저장할 수 있다. Accordingly, the data management platform 10000 may store data collected from at least one agent 20008a and 20008b in the database 20007.

특히, 본 도면에서는 2개의 에이전트만을 예로 들어 설명하였으나, 에이전트가 수십 내지 수백개가 되는 경우에도 동일하게 적용할 수 있다. 즉, 수십 내지 수백개의 에이전트를 동시에 업데이트 하기 위하여 정책을 배포할 수 있어 업무의 효율성이 극대화될 수 있다. In particular, in this drawing, only two agents are used as an example, but the same can be applied even when there are dozens to hundreds of agents. In other words, policies can be distributed to update tens to hundreds of agents simultaneously, thereby maximizing work efficiency.
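The division of labor between the watchdog module and the worker module might be sketched as follows. This is a hypothetical Python model, not the agent described in the drawings: the command dictionaries, the `"collect"` and `"update"` command types, and the idea that a restart is modeled by creating a new `Worker` object are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Worker module: carries out commands and returns the result."""
    version: str = "1.0"
    running: bool = True

    def execute(self, command: dict) -> dict:
        if command["type"] == "collect":
            # Hypothetical collection: echo back the log lines named in the command.
            return {"status": "ok", "data": list(command.get("lines", []))}
        return {"status": "unsupported"}


@dataclass
class Watchdog:
    """Watchdog module: receives platform commands and controls the worker."""
    worker: Worker

    def handle(self, command: dict) -> dict:
        if command["type"] == "update":
            # Restart the worker on the requested version.
            self.worker = Worker(version=command["version"])
            return {"status": "restarted", "version": self.worker.version}
        if not self.worker.running:
            # Abnormal state detected: bring the worker back before continuing.
            self.worker = Worker(version=self.worker.version)
        return self.worker.execute(command)
```

A platform-side caller would only ever talk to `Watchdog.handle`, which mirrors the statement that the watchdog module is responsible for communication and for controlling the worker module.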

도 64는 본 발명의 데이터 관리 방법에서 에이전트를 관리하는 방법을 설명하는 도면이다. Figure 64 is a diagram explaining a method of managing an agent in the data management method of the present invention.

이하의 단계는 호스트 서버에 설치된 에이전트를 관리하기 위하여 본 발명의 데이터 관리 방법이 수행하는 실시 예를 설명한다. 특히, 이하의 방법은 동시에 수행되지 않을 수 있으며, 개별적으로 수행될 수 있음은 물론이다. The following steps describe an example of how the data management method of the present invention is performed to manage the agent installed on the host server. In particular, it goes without saying that the following methods may not be performed simultaneously, but may be performed individually.

단계(S12010)에서, 본 발명의 데이터 관리 방법은 에이전트에 대한 형상을 관리할 수 있다. 일 실시 예에서, 데이터 관리 방법은 관리자의 요청에 따라 에이전트에 대한 형상을 관리할 수 있다. 예를 들어, 관리자는 에이전트 관리부를 통하여 복수 개의 에이전트에 대한 형상 업데이트 등을 요청할 수 있다. 이를 통해, 에이전트의 업데이트, 패치 및 롤백을 용이하게 수행할 수 있다. In step S12010, the data management method of the present invention can manage the configuration of agents. In one embodiment, the data management method can manage the configuration of agents at the administrator's request. For example, the administrator may request a configuration update for a plurality of agents through the agent management unit. Through this, agent updates, patches, and rollbacks can be performed easily.

단계(S12020)에서, 본 발명의 데이터 관리 방법은 에이전트 별로 수행에 필요한 명령어를 생성하고 해당 에이전트에 배포할 수 있다. 일 실시 예에서, 데이터 관리 방법은 관리자의 형상 업데이트 요청에 기초하여 에이전트 별로 업데이트 수행에 필요한 명령어를 생성하고 해당 에이전트에 배포할 수 있다. 예를 들어, 관리자가 에이전트 관리부 상에 패치 파일을 업로드하고, 에이전트의 업데이트를 요청하는 경우, 에이전트 관리부는 에이전트 별로 업데이트 수행에 필요한 명령어를 생성할 수 있다. 이후, 에이전트의 와치독 모듈은 명령어를 수신하여 워커 모듈을 재시작할 수 있다. 에이전트의 동작은 이하에서 자세히 설명하도록 한다. In step S12020, the data management method of the present invention can generate the commands required for execution for each agent and distribute them to the corresponding agents. In one embodiment, the data management method may generate the commands required to perform an update for each agent based on the administrator's configuration update request and distribute them to the corresponding agents. For example, when the administrator uploads a patch file to the agent management unit and requests an agent update, the agent management unit can generate the commands required to perform the update for each agent. The agent's watchdog module can then receive the command and restart the worker module. The agent's operation is described in detail below.

단계(S12030)에서, 본 발명의 데이터 관리 방법은 에이전트의 상태를 모니터링할 수 있다. 보다 상세하게는, 데이터 관리 방법은 에이전트의 동작 상태, 연결 상태 등을 실시간으로 모니터링하고 문제 발생 시 대응할 수 있다. In step S12030, the data management method of the present invention can monitor the status of the agent. More specifically, the data management method can monitor the agent's operating status, connection status, etc. in real time and respond when problems occur.
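Steps S12010 to S12030 can be summarized in a short Python sketch of an agent management unit. The class, its method names, the registry fields, and the `send` callback are all hypothetical; in particular, how commands actually reach an agent is not specified here and is left to the caller.

```python
class AgentManager:
    """Toy model of the agent management unit (names are illustrative only)."""

    def __init__(self):
        self.agents = {}  # agent_id -> {"host", "version", "status"}

    def register(self, agent_id, host, version, status="running"):
        """S12010: record the configuration (version) of an installed agent."""
        self.agents[agent_id] = {"host": host, "version": version, "status": status}

    def build_update_commands(self, target_version):
        """S12020: create one update command per agent that is behind."""
        return {
            agent_id: {"type": "update", "version": target_version}
            for agent_id, info in self.agents.items()
            if info["version"] != target_version
        }

    def distribute(self, commands, send):
        """S12020: hand each command to its agent through a caller-supplied transport."""
        for agent_id, command in commands.items():
            send(agent_id, command)

    def agents_needing_attention(self):
        """S12030: list agents whose reported status is not 'running'."""
        return [a for a, info in self.agents.items() if info["status"] != "running"]
```

For example, after registering agent "20008a" at version "1.0" and agent "20008b" at version "1.1", `build_update_commands("1.1")` yields a single command addressed to "20008a", so an agent that is already current is never touched.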

도 65는 본 발명의 데이터 관리 방법에서 에이전트와의 통신 방법을 설명하는 도면이다. Figure 65 is a diagram explaining a communication method with an agent in the data management method of the present invention.

단계(S12040)에서, 본 발명의 데이터 관리 방법은 에이전트의 와치독 모듈에게 명령어를 전달할 수 있다. 보다 상세하게는, 데이터 관리 방법은 에이전트 별로 동작에 필요한 명령어를 생성하고 에이전트에게 전달할 수 있다. 이에 따라, 에이전트의 와치독 모듈은 외부로부터 명령어를 수신할 수 있다. In step S12040, the data management method of the present invention can transmit a command to the agent's watchdog module. More specifically, the data management method can generate commands necessary for operation for each agent and transmit them to the agent. Accordingly, the agent's watchdog module can receive commands from the outside.

단계(S12050)에서, 본 발명의 데이터 관리 방법은 에이전트의 워커 모듈로부터 수집된 데이터를 저장할 수 있다. 에이전트의 워커 모듈은 에이전트가 설치된 서버에서 데이터를 수집할 수 있다. 이때, 워커 모듈을 통하여 수집된 데이터들은 데이터 관리 플랫폼의 수집 모듈로 직접 전송될 수 있다. 일 실시 예에서, 워커 모듈을 통하여 수집된 데이터들은 에이전트 별로 암호화되어 데이터 관리 플랫폼의 수집 모듈로 전송될 수 있다. 이때, 에이전트 별로 다른 암호화 방법에 기초하여 수집된 데이터가 전달될 수 있다. 예를 들어, 에이전트에서 데이터 관리 플랫폼으로 전송되는 데이터들은 SSL/TLS (Secure Sockets Layer/Transport Layer Security) 암호화 프로토콜, VPN 암호화 프로토콜, 대칭키 또는 비대칭키 암호화 알고리즘을 통해 암호화되어 전달될 수 있다. 이를 통하여, 개별적인 에이전트에서 수집된 데이터들은 안전하게 암호화되어 데이터 관리 플랫폼으로 전달될 수 있다. In step S12050, the data management method of the present invention can store data collected from the agent's worker module. The agent's worker module can collect data from the server where the agent is installed. At this time, data collected through the worker module can be directly transmitted to the collection module of the data management platform. In one embodiment, data collected through the worker module may be encrypted for each agent and transmitted to the collection module of the data management platform. At this time, data collected based on a different encryption method for each agent may be delivered. For example, data transmitted from the agent to the data management platform may be encrypted and transmitted using SSL/TLS (Secure Sockets Layer/Transport Layer Security) encryption protocols, VPN encryption protocols, and symmetric key or asymmetric key encryption algorithms. Through this, data collected from individual agents can be safely encrypted and delivered to the data management platform.

이후, 데이터 관리 플랫폼의 수집 모듈은 에이전트에서 수집된 데이터를 저장할 수 있다. 일 실시 예에서, 데이터 관리 플랫폼의 수집 모듈은 에이전트에서 수집된 데이터를 데이터베이스에 저장할 수 있다. Afterwards, the collection module of the data management platform can store the data collected from the agent. In one embodiment, the collection module of the data management platform may store data collected from agents in a database.

이를 통해, 본 발명의 데이터 관리 방법은 에이전트와의 효율적인 통신, 데이터 수집 및 저장을 수행할 수 있다. Through this, the data management method of the present invention can perform efficient communication with agents, data collection and storage.

도 66은 본 발명의 데이터 관리 방법에서 에이전트의 동작 방법을 설명하는 도면이다. Figure 66 is a diagram explaining the operation method of the agent in the data management method of the present invention.

단계(S12060)에서, 에이전트는 데이터 관리 플랫폼의 수집 모듈의 에이전트 관리부와 통신할 수 있다. 에이전트는 에이전트 관리부로부터 수행에 필요한 명령어를 수신할 수 있다. In step S12060, the agent may communicate with the agent management unit of the collection module of the data management platform. The agent can receive commands required for execution from the agent management unit.

단계(S12070)에서, 에이전트는 에이전트 관리부로부터 수신한 명령어를 워커 모듈에게 전달할 수 있다. In step S12070, the agent may transmit the command received from the agent management unit to the worker module.

단계(S12080)에서, 에이전트의 워커 모듈에서 명령어를 수행할 수 있다. 워커 모듈은 에이전트의 핵심적인 기능을 수행하는 모듈로 명령어를 수행하고, 결과를 반환할 수 있다. In step S12080, a command may be executed in the agent's worker module. The worker module is a module that performs the core functions of the agent and can execute commands and return results.

단계(S12090)에서, 에이전트는 워커 모듈을 통하여 수행 결과에 대응하는 데이터를 수집할 수 있다. 일 실시 예에서, 에이전트가 설치된 호스트 서버에서 데이터를 수집하는 기능을 수행할 것을 명령받은 경우, 워커 모듈은 호스트 서버에서 데이터를 수집할 수 있다. 이후, 에이전트는 수집된 데이터를 암호화하여 데이터 관리 플랫폼에게 전송할 수 있다. In step S12090, the agent can collect data corresponding to the performance results through the worker module. In one embodiment, when instructed to perform a function of collecting data from a host server on which an agent is installed, the worker module may collect data from the host server. Afterwards, the agent can encrypt the collected data and transmit it to the data management platform.

도 67은 본 발명의 데이터 관리 방법에서 에이전트의 동작 방법을 설명하는 도면이다. Figure 67 is a diagram explaining the operation method of the agent in the data management method of the present invention.

단계(S13010)에서, 본 발명의 데이터 관리 방법은 데이터를 수집하는 적어도 하나의 에이전트의 형상 정보를 모니터링할 수 있다. 본 발명의 데이터 관리 방법은 도 1 내지 도 7의 실시 예를 이용하여 데이터를 수집하거나 도 63 내지 도 66의 실시 예를 이용하여 데이터를 수집할 수 있다. 단계(S13010)은 도 63 내지 도 66의 데이터 수집 방법을 이용하기 위하여 호스트 서버에 에이전트를 설치할 수 있다. 이때, 네트워크 상에서 송수신되는 패킷으로부터 데이터를 수집하는 도 1 내지 도 7의 실시 예와 달리 호스트 서버에 에이전트를 설치하기 때문에, 호스트 서버에서 발생하는 모든 로그 데이터를 수집할 수 있다. In step S13010, the data management method of the present invention can monitor configuration information of at least one agent that collects data. The data management method of the present invention can collect data using the embodiments of FIGS. 1 to 7 or the embodiments of FIGS. 63 to 66. In step S13010, an agent can be installed on the host server in order to use the data collection method of FIGS. 63 to 66. Unlike the embodiments of FIGS. 1 to 7, which collect data from packets transmitted and received on the network, the agent is installed on the host server itself, so all log data generated on the host server can be collected.

단계(S13020)에서, 본 발명의 데이터 관리 방법은 형상 정보의 업데이트 요청에 기초하여 업데이트에 필요한 명령어를 생성할 수 있다. 이에 대하여는 도 64의 실시 예를 참고하도록 한다. In step S13020, the data management method of the present invention can generate the commands required for an update based on a request to update the configuration information. For this, please refer to the embodiment of Figure 64.

단계(S13030)에서, 본 발명의 데이터 관리 방법은 생성된 명령어를 적어도 하나의 에이전트에게 배포할 수 있다. 이에 대하여는 도 64의 실시 예를 참고하도록 한다. In step S13030, the data management method of the present invention may distribute the generated command to at least one agent. For this, please refer to the embodiment of Figure 64.
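Steps S13010 to S13030 (monitoring agent configuration, generating update commands, distributing them) might look like the sketch below. The `AgentStub` class, the `update_config` command format, and the `interval` setting are all assumptions made for illustration; the text does not define a command format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentStub:
    """Hypothetical view of an agent as seen by the platform."""
    agent_id: str
    config: Dict[str, str]                      # monitored configuration (S13010)
    inbox: List[dict] = field(default_factory=list)


def plan_updates(agents: List[AgentStub], requested: Dict[str, str]) -> Dict[str, dict]:
    """Generate one update command per agent whose configuration differs (S13020)."""
    plan = {}
    for agent in agents:
        changes = {k: v for k, v in requested.items() if agent.config.get(k) != v}
        if changes:
            plan[agent.agent_id] = {"op": "update_config", "changes": changes}
    return plan


def distribute(plan: Dict[str, dict], agents: List[AgentStub]) -> int:
    """Deliver each generated command to its agent (S13030); returns the delivery count."""
    delivered = 0
    for agent in agents:
        command = plan.get(agent.agent_id)
        if command is not None:
            agent.inbox.append(command)
            delivered += 1
    return delivered


agents = [AgentStub("a1", {"interval": "60"}), AgentStub("a2", {"interval": "30"})]
plan = plan_updates(agents, {"interval": "30"})
delivered = distribute(plan, agents)
```

In this sketch an agent that already matches the requested configuration receives no command.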

일부 사용자들의 경우 로그 데이터로부터 필요한 정보를 추출하기 위한 정규식 입력을 어려워할 수 있다. 본 발명은 로그 데이터의 정규화를 통해 이러한 문제를 해결하고자 한다. 이를 통해 로그 데이터에서 의미 있는 데이터의 일부분을 추출하기 위한 정규식을 포함하는 파서를 손쉽게 생성함으로써 사용자 편의성을 증가시킬 수 있다. Some users may find it difficult to input regular expressions to extract necessary information from log data. The present invention seeks to solve this problem through normalization of log data. Through this, user convenience can be increased by easily creating a parser that includes regular expressions to extract a portion of meaningful data from log data.

도 68은 본 발명의 데이터 관리 플랫폼이 로그 데이터를 정규화하는 실시 예를 설명하는 도면이다. Figure 68 is a diagram explaining an embodiment in which the data management platform of the present invention normalizes log data.

본 발명의 데이터 관리 플랫폼(10000)은 수집 모듈(20001), 분석 모듈(20002) 및 모니터링 모듈(20005)을 통하여 로그 데이터를 정규화할 수 있고, 사용자/클라이언트(1000)의 이벤트 로그 검색 요청에 따라 로그 데이터를 검색할 수 있다. The data management platform 10000 of the present invention can normalize log data through a collection module 20001, an analysis module 20002, and a monitoring module 20005, and can search log data in response to an event log search request from the user/client 1000.

구체적으로, 수집 모듈(20001)은 로그 수집부(2040)를 포함할 수 있다. 로그 수집부(2040)는 외부로부터 전송되는 적어도 하나의 로그 데이터를 획득할 수 있다. Specifically, the collection module 20001 may include a log collection unit 2040. The log collection unit 2040 may acquire at least one log data transmitted from outside.

분석 모듈(20002)은 파서 생성부(2043), 변환 규칙 생성부(2044) 및 수집 규칙 생성부(2045)를 포함할 수 있다. 파서 생성부(2043)는 선택된 텍스트에 일정 빈도 이상으로 사용되는 정규식 패턴 별 매칭 블록을 추출할 수 있다. 여기서, 정규식 패턴은 날짜, 시간, 문자열 등 일정 빈도 이상으로 자주 사용되는 정규식 패턴을 포함할 수 있다. 또한, 매칭 블록은 정규식 패턴과 일치하는 텍스트 정보를 나타낼 수 있다. The analysis module 20002 may include a parser generator 2043, a conversion rule generator 2044, and a collection rule generator 2045. The parser generator 2043 can extract matching blocks for each regular expression pattern that is used more than a certain frequency in the selected text. Here, the regular expression pattern may include regular expression patterns that are frequently used more than a certain frequency, such as dates, times, and strings. Additionally, a matching block can represent text information that matches a regular expression pattern.

파서 생성부(2043)는 추출된 다수의 매칭 블록 중 사용자 입력에 의해 선택된 매칭 블록에 기반하여, 로그 데이터로부터 텍스트를 추출하기 위한 정규식을 포함하는 파서를 생성할 수 있다. 일 실시 예에서, 파서는 사용자 입력에 기반한 필드명을 포함할 수 있다. The parser generator 2043 may generate a parser including a regular expression for extracting text from log data based on a matching block selected by user input among a plurality of extracted matching blocks. In one embodiment, the parser may include field names based on user input.

일 실시 예에서, 파서 생성부(2043)는 로그 데이터에서 정규식을 테스트하여 정규식을 검증할 수 있다. 즉, 파서 생성부(2043)는 정규식을 이용하여 로그 데이터로부터 각 필드에 해당하는 필드값이 제대로 추출되는지 테스트를 진행할 수 있다. In one embodiment, the parser generator 2043 may verify the regular expression by testing the regular expression in log data. That is, the parser generator 2043 can test whether the field value corresponding to each field is properly extracted from log data using regular expressions.

변환 규칙 생성부(2044)는 정규식에 기반한 파서 및 파서에 대응하는 이벤트 유형을 포함하는 변환 규칙을 생성할 수 있다. 변환 규칙 생성부(2044)는 기 생성된 파서 중 사용자 입력에 기반한 로그 데이터가 매칭되는 정규식에 기반한 파서를 스캔할 수 있다. 변환 규칙 생성부(2044)는 해당 로그 데이터에 대하여 사용 가능한 파서 중 사용자 입력에 기반한 파서를 변환 규칙에 적용하여 저장할 수 있다. 즉, 변환 규칙은 로그 데이터 저장 시 추출된 필드를 어떻게 가공하여 저장할지에 대한 규칙을 나타낼 수 있다. The conversion rule generator 2044 may generate a conversion rule including a parser based on a regular expression and an event type corresponding to the parser. The conversion rule generator 2044 may scan for a parser based on a regular expression that matches log data based on user input among previously generated parsers. The conversion rule generator 2044 may apply a parser based on user input among available parsers for the corresponding log data to a conversion rule and store it. In other words, the conversion rule may represent a rule on how to process and store the extracted fields when saving log data.

따라서, 본 발명에 따르면, 정규식 입력을 어려워하는 사용자들에게 부분적으로 정규식 입력을 도와주어 전체 정규식을 완성할 수 있는 편의성을 제공할 수 있다. 또한, 본 발명에 따르면, 등록된 파서의 재사용성을 높일 수 있도록 기 생성된 파서를 스캔하는 편의성을 제공할 수 있다. Therefore, according to the present invention, the convenience of completing the entire regular expression can be provided to users who have difficulty entering regular expressions by partially assisting them in entering regular expressions. Additionally, according to the present invention, it is possible to provide the convenience of scanning a previously created parser to increase the reusability of the registered parser.

수집 규칙 생성부(2045)는 로그 수집 장치의 로그 데이터에 사용할 변환 규칙을 선택하여 수집 경로 규칙을 생성할 수 있다. The collection rule generator 2045 may select a conversion rule to be used for log data of the log collection device and create a collection path rule.

모니터링 모듈(20005)는 검색 요청 처리부(2041), 사용자 입력 처리부(2042) 및 이벤트 로그 검색부(2046)를 포함할 수 있다. The monitoring module 20005 may include a search request processing unit 2041, a user input processing unit 2042, and an event log search unit 2046.

검색 요청 처리부(2041)는 사용자/클라이언트(1000)로부터 이벤트 유형에 대한 이벤트 검색 요청을 수신할 수 있다. 이후, 검색 요청 처리부(2041)는 미리 저장된 적어도 하나의 로그 데이터로부터 추출된 검색 결과를 사용자/클라이언트(1000)에게 송신할 수 있다. 이 경우, 송신된 검색 결과는 사용자/클라이언트(1000)의 화면에 디스플레이될 수 있다. The search request processing unit 2041 may receive an event search request for an event type from the user/client 1000. Thereafter, the search request processing unit 2041 may transmit a search result extracted from at least one pre-stored log data to the user/client 1000. In this case, the transmitted search results may be displayed on the screen of the user/client 1000.

사용자 입력 처리부(2042)는 사용자/클라이언트(1000)로부터 사용자 입력에 기반한 정보를 획득할 수 있다. 즉, 사용자 입력 처리부(2042)는 사용자/클라이언트(1000)를 통해 사용자 입력에 의해 선택된 정보를 획득할 수 있다. The user input processing unit 2042 may obtain information based on user input from the user/client 1000. That is, the user input processing unit 2042 can obtain information selected by user input through the user/client 1000.

일 실시 예에서, 사용자 입력에 기반한 정보는 정규식 생성 및 테스트를 위한 로그 데이터, 로그 데이터에 포함된 텍스트, 텍스트 정보를 나타내는 다수의 매칭 블록 중 선택된 매칭 블록, 텍스트에 대한 필드, 파서(parser)에 대한 이벤트 유형, 로그 수집 장치의 등록 정보 및 수집 경로 규칙 중 적어도 하나를 포함할 수 있다. In one embodiment, the information based on user input may include at least one of: log data for generating and testing regular expressions, text included in the log data, a matching block selected from among a plurality of matching blocks representing text information, a field for the text, an event type for a parser, registration information of a log collection device, and a collection path rule.

이벤트 로그 검색부(2046)는 이벤트 유형에 대한 이벤트 검색 요청을 수신함에 응답하여, 파서, 수집 경로 규칙 및 변환 규칙 중 적어도 하나에 기반한 미리 저장된 적어도 하나의 로그 데이터로부터 추출된 검색 결과를 출력할 수 있다. In response to receiving an event search request for an event type, the event log search unit 2046 may output search results extracted from at least one pre-stored log data based on at least one of a parser, a collection path rule, and a conversion rule.

도 69는 본 발명의 데이터 관리 방법이 파서를 생성하는 일 실시 예를 설명하는 도면이다. Figure 69 is a diagram explaining an embodiment of how the data management method of the present invention generates a parser.

단계(S18010)에서, 데이터 관리 방법은 로그 데이터를 획득할 수 있다. 일 실시 예에서, 데이터 관리 방법은 사용자/클라이언트(1000)를 통해 사용자에 의해 입력된 로그 데이터를 획득할 수 있다. In step S18010, the data management method may acquire log data. In one embodiment, the data management method may obtain log data input by the user through the user/client 1000.

단계(S18020)에서, 데이터 관리 방법은 사용자 입력에 기반한 로그 데이터에 포함된 텍스트를 획득할 수 있다. 즉, 데이터 관리 방법은 원본 로그 데이터에서 정규식으로 변환하고자 사용자에 의해 선택된 텍스트를 획득할 수 있다. 일 실시 예에서, 정규식은 '정규 표현식' 또는 이와 동등한 기술적 의미를 갖는 용어로 지칭될 수 있다. In step S18020, the data management method may obtain text included in log data based on user input. In other words, the data management method can obtain text selected by the user to convert it into a regular expression from the original log data. In one embodiment, a regular expression may be referred to as a 'regular expression' or a term with an equivalent technical meaning.

단계(S18030)에서, 데이터 관리 방법은 텍스트에 대한 텍스트 정보를 나타내는 매칭 블록을 생성할 수 있다. 예를 들어, 텍스트 정보는 1개의 숫자, 공백을 제외한 1개 이상의 숫자, 1개의 글자, 공백을 제외한 1개 이상의 글자 및 쌍따옴표 사이의 모든 글자 등 다양한 텍스트 정보를 나타낼 수 있다. In step S18030, the data management method may generate a matching block representing text information about the text. For example, text information may represent a variety of text information, such as one number, one or more numbers excluding spaces, one letter, one or more letters excluding spaces, and all characters between double quotation marks.
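The matching-block step can be illustrated with a small catalogue of regular expression fragments that are checked against the selected text. The labels and fragments in `MATCHING_BLOCKS` are assumptions chosen to mirror the kinds of text information listed above; the actual catalogue used by the platform is not given in the text.

```python
import re
from typing import List

# Hypothetical catalogue: label of the text information -> regex fragment.
MATCHING_BLOCKS = {
    "one digit": r"\d",
    "one or more digits": r"\d+",
    "one character": r".",
    "one or more non-space characters": r"\S+",
    "everything between double quotes": r'"[^"]*"',
}


def candidate_blocks(text: str) -> List[str]:
    """Return the labels of all blocks whose fragment matches the whole selected text."""
    return [
        label
        for label, fragment in MATCHING_BLOCKS.items()
        if re.fullmatch(fragment, text)
    ]


blocks_for_16 = candidate_blocks("16")
blocks_for_quoted = candidate_blocks('"GET /index"')
```

For the selected text `16`, both "one or more digits" and "one or more non-space characters" qualify, which is why the user is then asked to pick one block.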

단계(S18040)에서, 데이터 관리 방법은 매칭 블록에 기반하여 로그 데이터로부터 텍스트를 추출하기 위한 정규식을 생성할 수 있다. 일 실시 예에서, 로그 데이터에 포함된 각 텍스트에 대하여 정규식을 생성하며, 각 정규식을 포함하는 전체 정규식을 생성할 수 있다. In step S18040, the data management method may generate a regular expression for extracting text from log data based on the matching block. In one embodiment, a regular expression is generated for each text included in log data, and an entire regular expression including each regular expression can be generated.

단계(S18050)에서, 데이터 관리 방법은 사용자 입력에 기반한 텍스트에 대한 필드를 획득할 수 있다. 즉, 데이터 관리 방법은 해당 텍스트에 대한 필드명을 설정하기 위한 필드명 정보를 획득할 수 있다. In step S18050, the data management method may obtain a field for text based on user input. In other words, the data management method can obtain field name information for setting the field name for the corresponding text.

단계(S18060)에서, 데이터 관리 방법은 정규식 및 필드를 포함하는 파서를 생성할 수 있다. At step S18060, the data management method may generate a parser including regular expressions and fields.

도 70은 본 발명의 파서 생성 화면의 일 실시 예를 설명하는 도면이다. Figure 70 is a diagram explaining an example of a parser creation screen of the present invention.

일 실시 예에서, 파서 생성 화면은 사용자/클라이언트(1000)에 의해 디스플레이될 수 있다. 파서 생성 화면은 로그 데이터(1042), 텍스트(1043), 매칭 블록(1044), 정규식(1045) 및 필드(1046)를 포함할 수 있다. In one embodiment, the parser creation screen may be displayed by the user/client 1000. The parser creation screen may include log data 1042, text 1043, matching block 1044, regular expression 1045, and field 1046.

사용자에 의해 로그 데이터(1042)가 입력되고, 로그 데이터(1042)에 포함된 텍스트(1043)가 선택되어 입력될 수 있다. 예를 들어, 텍스트는 문자열로 구성될 수 있으며, 16을 나타낼 수 있다. Log data 1042 is input by the user, and text 1043 included in the log data 1042 can be selected and input. For example, text may consist of a string, which may represent the number 16.

입력된 로그 데이터(1042)와 텍스트(1043)가 데이터 관리 플랫폼(10000)에게 전달되면, 데이터 관리 플랫폼(10000)에 의해 생성된 텍스트(1043)에 대응하는 다수의 매칭 블록들이 사용자/클라이언트(1000)의 파서 생성 화면에 표시될 수 있다. When the input log data 1042 and text 1043 are delivered to the data management platform 10000, a plurality of matching blocks corresponding to the text 1043, generated by the data management platform 10000, may be displayed on the parser creation screen of the user/client 1000.

이 경우, 다수의 매칭 블록들 각각은 색(color)으로 구분될 수 있으며, 각 매칭 블록은 서로 다른 텍스트 정보를 나타낼 수 있다. 이후, 사용자 입력에 의해 다수의 매칭 블록 중 텍스트(1043)의 텍스트 정보를 나타내는 하나의 매칭 블록(1044)이 선택되는 경우, 선택된 매칭 블록(1044)에 대응하는 정규식(1045)이 자동으로 생성될 수 있다. 예를 들어, 정규식(1045)은 (?<REPLACEGROUPNAME>/d+)로 표현될 수 있다. In this case, each of the plurality of matching blocks may be distinguished by color, and each matching block may represent different text information. Afterwards, when one matching block 1044 representing the text information of the text 1043 is selected from among the plurality of matching blocks by user input, a regular expression 1045 corresponding to the selected matching block 1044 may be generated automatically. For example, the regular expression 1045 can be expressed as (?<REPLACEGROUPNAME>/d+).

또한, 사용자에 의해 해당 텍스트(1043)에 해당하는 필드(1046)가 입력되며, 전체 정규식과 필드가 포함된 파서가 생성될 수 있다. Additionally, the field 1046 corresponding to the text 1043 is entered by the user, and a parser including the entire regular expression and field can be created.
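A minimal sketch of how a selected matching block and a user-supplied field name could be combined into a named capture group, and how several such groups form the entire regular expression. It assumes Python's `(?P<name>...)` group syntax and assumes the digit fragment is meant to be `\d+` (the example above prints it as `/d+`). The helper names and the sample line `action=ADMITTED rule=16` are hypothetical.

```python
import re
from typing import List, Optional, Tuple


def named_group(field_name: str, fragment: str) -> str:
    """Wrap a matching-block fragment in a named capture group."""
    return f"(?P<{field_name}>{fragment})"


def build_parser(parts: List[Tuple[Optional[str], str]]) -> "re.Pattern":
    """Join literal text and named groups into one compiled regular expression.

    Each part is (field_name, fragment); a field_name of None marks literal text.
    """
    pieces = []
    for field_name, fragment in parts:
        if field_name is None:
            pieces.append(re.escape(fragment))   # literal text is escaped
        else:
            pieces.append(named_group(field_name, fragment))
    return re.compile("".join(pieces))


group = named_group("RuleID", r"\d+")
parser = build_parser([(None, "rule="), ("RuleID", r"\d+")])
match = parser.search("action=ADMITTED rule=16")
rule_id = match.group("RuleID") if match else None
```

Keeping the field name inside the group means the parser can return extracted values keyed by field without a separate mapping table.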

도 71은 본 발명의 파서 테스트 화면의 일 실시 예를 설명하는 도면이다. Figure 71 is a diagram illustrating an example of a parser test screen of the present invention.

일 실시 예에서, 파서 테스트 화면은 사용자/클라이언트(1000)에 의해 디스플레이될 수 있다. 전체 정규식과 필드가 포함된 파서에 대하여 테스트가 진행될 수 있다. 이 경우, 사용자에 의해 로그 데이터(1042)가 입력되고, 테스트가 진행되면, 각 필드에 대응하는 필드값이 로그 데이터로부터 추출될 수 있다. In one embodiment, the parser test screen may be displayed by the user/client 1000. Testing can be done against the parser, including full regular expressions and fields. In this case, when the log data 1042 is input by the user and the test proceeds, field values corresponding to each field can be extracted from the log data.

예를 들어, 필드의 필드명은 Facility, Severity, Action, Time, RuleID, Proto, Src, SrcPort, Dst, DstPort 및 Direction을 포함할 수 있으나, 이에 제한되지 않고 다양하게 구성될 수 있다. 또한, Dst 필드에 대하여 124.243.76.7의 필드값이 추출된 것을 확인할 수 있으며, 이러한 테스트를 통해 해당 정규식을 포함하는 파서를 검증할 수 있다. For example, the field name of the field may include Facility, Severity, Action, Time, RuleID, Proto, Src, SrcPort, Dst, DstPort, and Direction, but is not limited to this and may be configured in various ways. Additionally, it can be confirmed that the field value of 124.243.76.7 was extracted for the Dst field, and through this test, the parser containing the corresponding regular expression can be verified.
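The parser test described above can be approximated by running the regular expression against a sample line and reporting which expected fields were extracted. The sample log line and the pattern below are invented for illustration; only the field names and the `Dst` value 124.243.76.7 come from the text.

```python
import re
from typing import Dict, List

# Hypothetical firewall log line; only the Dst value appears in the description.
SAMPLE = "ADMITTED proto=TCP src=10.0.0.5:51000 dst=124.243.76.7:443"

PARSER = re.compile(
    r"(?P<Action>\S+) proto=(?P<Proto>\S+) "
    r"src=(?P<Src>[\d.]+):(?P<SrcPort>\d+) "
    r"dst=(?P<Dst>[\d.]+):(?P<DstPort>\d+)"
)


def check_parser(parser: "re.Pattern", log_line: str, expected: List[str]) -> Dict:
    """Report whether the line matched and which expected fields came back empty."""
    m = parser.search(log_line)
    if m is None:
        return {"matched": False, "fields": {}, "missing": list(expected)}
    fields = m.groupdict()
    missing = [name for name in expected if not fields.get(name)]
    return {"matched": True, "fields": fields, "missing": missing}


report = check_parser(
    PARSER, SAMPLE, ["Action", "Proto", "Src", "SrcPort", "Dst", "DstPort"]
)
```

A non-empty `missing` list would indicate that the regular expression needs to be revised before the parser is registered.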

도 72는 본 발명의 데이터 관리 방법이 변환 규칙을 생성하는 일 실시 예를 설명하는 도면이다. Figure 72 is a diagram illustrating an embodiment of how the data management method of the present invention generates a conversion rule.

단계(S19010)에서, 데이터 관리 방법은 로그 데이터를 획득할 수 있다. 일 실시 예에서, 데이터 관리 방법은 사용자/클라이언트(1000)를 통해 사용자에 의해 입력된 로그 데이터를 획득할 수 있다. In step S19010, the data management method may acquire log data. In one embodiment, the data management method may obtain log data input by the user through the user/client 1000.

단계(S19020)에서, 데이터 관리 방법은 로그 데이터에 대응하는 파서를 결정할 수 있다. 즉, 사용자에 의해 입력된 로그 데이터에 대한 스캐닝을 통해 로그 데이터에 대응하는 파서가 결정될 수 있으며, 해당 파서에 대한 파서명, 버전 및 파서 유형이 결정될 수 있다. 이 경우, 상술한 바와 같이 정규식에 기반한 파서인 경우 파서 유형은 정규식을 나타낼 수 있다. In step S19020, the data management method may determine a parser corresponding to the log data. That is, the parser corresponding to the log data can be determined through scanning the log data input by the user, and the parser name, version, and parser type for the parser can be determined. In this case, as described above, if the parser is based on a regular expression, the parser type may represent a regular expression.

단계(S19030)에서, 데이터 관리 방법은 사용자 입력에 기반한 파서에 대한 이벤트 유형을 결정할 수 있다. 예를 들어, 이벤트 유형은 개인정보 검색 로그 및 방화벽 로그를 포함할 수 있나, 이에 제한되지 않고 다양한 유형으로 구성될 수 있다. At step S19030, the data management method may determine an event type for the parser based on user input. For example, event types may include, but are not limited to, personal information search logs and firewall logs, and may consist of various types.

단계(S19040)에서, 데이터 관리 방법은 파서 및 이벤트 유형에 대응하는 적어도 하나의 필드를 결정할 수 있다. 일 실시예에서, 데이터 관리 방법은 변환 규칙의 각 파서의 필드의 일괄 등록을 수행할 수 있다. 예를 들어, 일괄 등록 시 필드, 필드명, 필드 유형(숫자 또는 문자열) 및 데이터 유형이 결정될 수 있다. In step S19040, the data management method may determine at least one field corresponding to the parser and event type. In one embodiment, the data management method may perform batch registration of the fields of each parser of the transformation rule. For example, during batch registration, the field, field name, field type (number or string), and data type can be determined.

단계(S19050)에서, 데이터 관리 방법은 파서, 이벤트 유형 및 적어도 하나의 필드를 포함하는 변환 규칙을 생성할 수 있다. 즉, 본 발명에 따르면, 데이터 관리 방법은 변환 규칙을 통해 실제로 추출된 필드값 중에서 저장할 것과 저장하지 않을 것을 구분해서 저장할 것인지를 결정할 수 있다. 다시 말해, 데이터 관리 방법은 정규식을 포함한 파서를 이용하여 로그 데이터로부터 필드값을 추출하고, 변환 규칙을 통해 추출된 필드값 중 무엇을 어떻게 저장할지를 결정할 수 있다. In step S19050, the data management method may generate a conversion rule including a parser, an event type, and at least one field. That is, according to the present invention, the conversion rule determines which of the actually extracted field values are stored and which are not. In other words, the data management method extracts field values from log data using a parser containing a regular expression, and uses the conversion rule to decide which of the extracted field values to store and how to store them.
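The split between extraction (parser) and storage decisions (conversion rule) can be shown in a few lines. The `select_fields` helper and the `Direction` value are hypothetical; the sketch only illustrates keeping a subset of the extracted field values.

```python
from typing import Dict, Iterable


def select_fields(extracted: Dict[str, str], keep: Iterable[str]) -> Dict[str, str]:
    """Keep only the extracted field values that the conversion rule marks for storage."""
    keep_set = set(keep)
    return {name: value for name, value in extracted.items() if name in keep_set}


# Field values as they might come out of a parser (values are illustrative).
extracted = {"Action": "ADMITTED", "RuleID": "19", "Direction": "IN"}
stored = select_fields(extracted, keep=["Action", "RuleID"])
```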

도 73은 본 발명의 변환 규칙 생성 화면의 일 실시 예를 설명하는 도면이다. Figure 73 is a diagram explaining an embodiment of the conversion rule creation screen of the present invention.

일 실시 예에서, 변환 규칙 생성 화면은 사용자/클라이언트(1000)에 의해 디스플레이될 수 있다. 이 경우, 사용자에 의해 로그 데이터(1042)가 입력되고, 스캔이 진행되면, 로그 데이터에 대응하는 파서(1047)가 결정될 수 있다. In one embodiment, the conversion rule creation screen may be displayed by the user/client 1000. In this case, when log data 1042 is input by the user and scanning proceeds, a parser 1047 corresponding to the log data may be determined.

이후, 파서(1047)에 대한 이벤트 유형(1048)이 사용자에 의한 입력에 기반하여 선택될 수 있다. 예를 들어, 이벤트 유형(1048)은 방화벽 로그로 결정될 수 있다. The event type 1048 for the parser 1047 may then be selected based on input by the user. For example, event type 1048 may be determined to be a firewall log.

일 실시예에서, 이러한 해당 로그 데이터(1042)에 대해 결정된 파서(1047) 및 이벤트 유형(1048)의 구성은 변환 코드에 포함될 수 있으며, 변환 규칙은 적어도 하나의 변환 코드를 포함할 수 있다. In one embodiment, the configuration of the parser 1047 and event type 1048 determined for the corresponding log data 1042 may be included in the conversion code, and the conversion rule may include at least one conversion code.

예를 들어, 변환 규칙은 파서명이 FIREWALL_ADMITTED_PARSER인 파서 및 방화벽로그인 이벤트 유형으로 구성된 제 1 변환 코드와 파서명이 FIREWALL_DENIED_PARSER인 파서 및 방화벽로그인 이벤트 유형으로 구성된 제 2 변환 코드를 포함할 수 있다. For example, the conversion rule may include a first conversion code consisting of a parser named FIREWALL_ADMITTED_PARSER and an event type of firewall log, and a second conversion code consisting of a parser named FIREWALL_DENIED_PARSER and an event type of firewall log.
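One way to picture a conversion rule holding two conversion codes is shown below, together with the scan that finds the parser matching a given log line. The parser names come from the example above; the dataclasses, the regular expressions, and the sample line are assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConversionCode:
    """A parser paired with the event type it feeds."""
    parser_name: str
    pattern: "re.Pattern"
    event_type: str


@dataclass
class ConversionRule:
    codes: List[ConversionCode]

    def scan(self, log_line: str) -> Optional[ConversionCode]:
        """Return the first conversion code whose parser matches the log line."""
        for code in self.codes:
            if code.pattern.search(log_line):
                return code
        return None


rule = ConversionRule([
    ConversionCode(
        "FIREWALL_ADMITTED_PARSER",
        re.compile(r"(?P<Action>ADMITTED) rule=(?P<RuleID>\d+)"),  # hypothetical pattern
        "firewall log",
    ),
    ConversionCode(
        "FIREWALL_DENIED_PARSER",
        re.compile(r"(?P<Action>DENIED) rule=(?P<RuleID>\d+)"),    # hypothetical pattern
        "firewall log",
    ),
])
hit = rule.scan("DENIED rule=1")
```

Both codes map to the same event type, so admitted and denied lines end up searchable together even though different parsers extracted them.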

도 74는 본 발명의 파서필드 일괄등록 화면의 일 실시 예를 설명하는 도면이다. Figure 74 is a diagram illustrating an embodiment of the parser field batch registration screen of the present invention.

일 실시 예에서, 파서필드 일괄등록 화면은 사용자/클라이언트(1000)에 의해 디스플레이될 수 있다. 이 경우, 각 필드(1046)에 대한 일괄 등록이 수행될 수 있으며, 매칭 방법, 필드(1046), 필드명, 필드 유형 및 데이터 유형이 설정될 수 있다. 예를 들어, 매칭 방법이 신규 등록인 필드에 대하여는 사용자의 입력에 의해 필드(1046), 필드명, 필드 유형 및 데이터 유형이 결정될 수 있다. 예를 들어, 필드 유형은 숫자 또는 문자열을 포함할 수 있고, 데이터 유형은 숫자를 나타내는 INT 또는 문자열을 나타내는 STRING을 포함할 수 있다. In one embodiment, the parser field batch registration screen may be displayed by the user/client 1000. In this case, batch registration can be performed for each field 1046, and the matching method, field 1046, field name, field type, and data type can be set. For example, for a field whose matching method is new registration, the field 1046, field name, field type, and data type may be determined by user input. For example, a field type can contain number or string, and a data type can contain INT, which represents a number, or STRING, which represents a string.
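Batch registration of parser fields with their data types could be modelled as below. The `FieldSpec` structure and the display names are hypothetical; only the INT and STRING data types are taken from the description.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class FieldSpec:
    field: str        # capture-group name used by the parser
    field_name: str   # display name chosen by the user
    data_type: str    # "INT" or "STRING"


def register_fields(specs: List[FieldSpec]) -> Dict[str, FieldSpec]:
    """Register all fields at once, rejecting unknown data types."""
    registry: Dict[str, FieldSpec] = {}
    for spec in specs:
        if spec.data_type not in ("INT", "STRING"):
            raise ValueError(f"unsupported data type: {spec.data_type}")
        registry[spec.field] = spec
    return registry


def cast_values(raw: Dict[str, str], registry: Dict[str, FieldSpec]) -> Dict[str, Union[int, str]]:
    """Convert raw extracted strings according to each field's registered data type."""
    out: Dict[str, Union[int, str]] = {}
    for field, value in raw.items():
        spec = registry[field]
        out[spec.field_name] = int(value) if spec.data_type == "INT" else value
    return out


registry = register_fields([
    FieldSpec("SrcPort", "Source Port", "INT"),
    FieldSpec("Src", "Source IP", "STRING"),
])
typed = cast_values({"SrcPort": "443", "Src": "10.0.0.5"}, registry)
```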

일 실시 예에서, 호스트, 소스, 시간, 장비 IP 및 장비명에 해당하는 필드 이외에 추가적인 필드가 일괄 등록될 수 있다. 일 실시 예에서, 변환 규칙에 다수의 변환 코드, 예를 들어, 제 1 변환 코드와 제 2 변환 코드가 포함된 경우, 제 1 변환 코드에 대한 필드 일괄 등록과 제 2 변환 코드에 대한 필드 일괄 등록이 각각 수행될 수 있다. In one embodiment, additional fields other than those corresponding to the host, source, time, device IP, and device name may be registered in a batch. In one embodiment, when a conversion rule includes a plurality of conversion codes, for example, a first conversion code and a second conversion code, batch registration of fields for the first conversion code and batch registration of fields for the second conversion code may each be performed separately.

도 75는 본 발명의 데이터 관리 방법이 수집 경로 규칙을 생성하는 일 실시 예를 설명하는 도면이다. Figure 75 is a diagram illustrating an embodiment of how the data management method of the present invention creates a collection path rule.

단계(S11110)에서, 데이터 관리 방법은 로그 데이터를 수집하기 위한 로그 수집 장치를 등록할 수 있다. 일 실시예에서, 로그 수집 장치의 등록 시 해당 로그 수집 장치에 대한 장비명, 장비 IP, 장비 유형, OS(operating system) 및 수집 유형이 사용자 입력에 의해 사용자/클라이언트(1000)를 통해 설정될 수 있다. In step S11110, the data management method may register a log collection device for collecting log data. In one embodiment, when a log collection device is registered, its device name, device IP, device type, OS (operating system), and collection type may be set by user input through the user/client 1000.

단계(S11120)에서, 데이터 관리 방법은 사용자 입력에 의한 로그 수집 장치의 로그 데이터에 사용할 변환 규칙을 선택하여 수집 경로 규칙을 생성할 수 있다. 일 실시 예에서, 수집 경로 규칙은 수집 경로, 수집 유형 및 장비명 중 적어도 하나를 포함할 수 있다. In step S11120, the data management method may create a collection path rule by selecting a conversion rule to be used for log data of the log collection device based on user input. In one embodiment, the collection path rule may include at least one of a collection path, collection type, and equipment name.

일 실시 예에서, 수집 유형은 에이전트 방식 및 시스템 로그 방식을 포함할 수 있다. 여기서, 에이전트 방식은 에이전트 프로그램을 시스템 및 장비에 설치하여 필요한 로그 데이터를 전송하는 방식을 포함하고, 시스템 로그 방식은 각종 보완 장비와 스위치, 라우터, 방화벽 등 네트워크 장비의 로그를 수집하는 방식을 포함할 수 있다. In one embodiment, collection types may include an agent method and a system log method. Here, the agent method includes installing an agent program on systems and devices to transmit the necessary log data, and the system log method includes collecting logs from various complementary devices and from network devices such as switches, routers, and firewalls.

단계(S11130)에서, 데이터 관리 방법은 수집 경로 규칙에 변환 규칙을 적용할 수 있다. In step S11130, the data management method may apply a conversion rule to the collection path rule.

도 76은 본 발명의 수집 경로 규칙 생성 화면의 일 실시 예를 설명하는 도면이다. Figure 76 is a diagram illustrating an example of a collection path rule creation screen of the present invention.

일 실시 예에서, 수집 경로 규칙 생성 화면은 사용자/클라이언트(1000)에 의해 디스플레이될 수 있다. 이 경우, 사용자 입력에 기반한 장비명(1049)과 수집 유형(1050)이 표시될 수 있다. 장비명(1049)은 로그 수집 장치의 장비명을 포함하며, 로그 데이터의 수집을 위해 연동된 장비가 설정됨에 따라 변경될 수 있다. 수집 유형(1050)은 에이전트 방식과 시스템 로그 방식 중 하나로 설정될 수 있으며, 예를 들어, 시스템 로그 방식인 SYSLOG가 표시될 수 있다. In one embodiment, a collection path rule creation screen may be displayed by the user/client 1000. In this case, the equipment name 1049 and collection type 1050 based on user input may be displayed. The device name 1049 includes the device name of the log collection device and may change as the device linked to collect log data is set. The collection type 1050 can be set to either agent method or system log method. For example, SYSLOG, which is a system log method, may be displayed.

일 실시 예에서, 수집 경로 규칙 생성 화면은 장비명(1049) 및 수집 유형(1050) 이외에, 수집 경로, 반영 시작, 최종 수정 시각 및 반영 필요 여부에 대한 정보가 표시될 수 있다. 예를 들어, 수집 경로는 수집 유형과 장비 IP를 포함할 수 있다. 또한, 반영 필요 여부는 해당 수집 경로 규칙에 포함된 변환 규칙이 변경되어 수집 경로 규칙에 반영이 필요한지 여부를 표시할 수 있다. In one embodiment, the collection path rule creation screen may display information about the collection path, reflection start, last modification time, and whether reflection is necessary, in addition to the equipment name 1049 and collection type 1050. For example, a collection path may include collection type and device IP. In addition, whether reflection is necessary may indicate whether reflection in the collection path rule is necessary due to a change in the conversion rule included in the corresponding collection path rule.

이와 같이 생성된 수집 경로 규칙에 변환 규칙이 적용되는 경우, 로그 수집 장치와 해당하는 수집 유형이 변환 규칙에 연동될 수 있다. When a conversion rule is applied to the collection path rule created in this way, the log collection device and the corresponding collection type may be linked to the conversion rule.
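A collection path rule that links a registered device, its collection type, and a conversion rule might be represented as follows. The class layout, the `TYPE://IP` path format, the version counter used for the "needs to be applied" flag, and the example device are all assumptions; the description only states that the collection path may include the collection type and device IP.

```python
from dataclasses import dataclass


@dataclass
class LogDevice:
    name: str
    ip: str
    collection_type: str  # "AGENT" or "SYSLOG"


@dataclass
class CollectionPathRule:
    device: LogDevice
    conversion_rule_name: str
    applied_version: int   # version of the conversion rule last applied to this path

    @property
    def collection_path(self) -> str:
        # Hypothetical format combining collection type and device IP.
        return f"{self.device.collection_type}://{self.device.ip}"

    def needs_apply(self, current_version: int) -> bool:
        """True when the linked conversion rule changed after it was last applied."""
        return current_version != self.applied_version


path_rule = CollectionPathRule(
    LogDevice("fw-01", "192.0.2.10", "SYSLOG"),  # example device (documentation IP range)
    conversion_rule_name="firewall-rule",
    applied_version=1,
)
```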

도 77은 본 발명의 데이터 관리 방법이 이벤트 로그를 검색하는 일 실시 예를 설명하는 도면이다. Figure 77 is a diagram illustrating an embodiment of the data management method of the present invention searching an event log.

단계(S12110)에서, 데이터 관리 방법은 사용자/클라이언트(1000)로부터 이벤트 유형에 대한 이벤트 로그 검색 요청을 수신할 수 있다. 일 실시예에서, 다수의 이벤트 유형들 중 사용자에 의해 선택된 이벤트 유형에 대한 이벤트 로그 검색 요청이 수신될 수 있다. In step S12110, the data management method may receive an event log search request for an event type from the user/client 1000. In one embodiment, an event log search request may be received for an event type selected by the user among multiple event types.

단계(S12120)에서, 데이터 관리 방법은 이벤트 로그 검색 요청에 응답하여, 수집 경로 규칙, 변환 규칙 및 파서에 기반하여 미리 저장된 적어도 하나의 로그 데이터로부터 추출된 검색 결과를 출력할 수 있다. 일 실시 예에서, 적어도 하나의 로그 데이터는 이벤트 유형에 대응할 수 있으며, 일정 기간 동안 수집되어 미리 저장된 로그 데이터를 포함할 수 있다. In step S12120, in response to an event log search request, the data management method may output a search result extracted from at least one log data previously stored based on a collection path rule, a conversion rule, and a parser. In one embodiment, the at least one log data may correspond to an event type and may include log data collected over a certain period of time and stored in advance.

즉, 본 발명에 따르면, 정규식을 이용하여 로그 데이터에 탑재된 정보들(즉, 필드값)을 파싱시켜 구분시킴으로써, 사용자는 로그 데이터의 분석 시 구분된 로그 데이터에 내재된 정보를 보다 용이하게 확인할 수 있다. That is, according to the present invention, the information carried in the log data (i.e., the field values) is parsed and separated using regular expressions, so that a user analyzing the log data can more easily check the information contained in the separated log data.

단계(S12130)에서, 데이터 관리 방법은 검색 결과를 사용자/클라이언트(1000)에게 송신할 수 있다. 이 경우, 검색 결과는 로그 데이터로부터 파싱된 필드값을 포함할 수 있으며, 사용자에 의한 이벤트 로그 검색 요청에 따라, 파서에 의해 파싱된 필드값이 사용자/클라이언트(1000)의 화면에 대시보드 형태로 표시될 수 있다. In step S12130, the data management method may transmit the search results to the user/client 1000. In this case, the search results may include field values parsed from the log data, and, in response to the user's event log search request, the field values parsed by the parser may be displayed in the form of a dashboard on the screen of the user/client 1000.

도 78은 본 발명의 이벤트 로그 검색 화면의 일 실시 예를 설명하는 도면이다. Figure 78 is a diagram illustrating an example of an event log search screen of the present invention.

일 실시 예에서, 이벤트 로그 검색 화면은 사용자/클라이언트(1000)에 의해 디스플레이될 수 있다. 이 경우, 해당 로그 데이터를 검색하기 위한 이벤트 유형(1048)이 설정될 수 있으며, 예를 들어, 방화벽로그가 이벤트 유형(1048)으로 설정될 수 있다. In one embodiment, an event log search screen may be displayed by the user/client 1000. In this case, an event type 1048 for searching the corresponding log data may be set. For example, a firewall log may be set as the event type 1048.

이후, 검색이 진행될 기간이 설정될 수 있으며, 해당 기간에 대하여 미리 저장된 로그 데이터로부터 검색 결과가 출력될 수 있다. 검색 결과는 각 필드(1049)에 해당하는 필드값이 정리되어 표시될 수 있으며, 예를 들어, 시간, Facility, Severity, 허용/차단, Time, 방화벽정책 ID, 프로토콜, 소스 IP, 소스 Port, 타겟 IP, 타겟 Port 및 방향이 표시될 수 있다. 이와 같이, 각 로그 데이터에 대하여 필드(1049)별 필드값이 구분되기 때문에 사용자는 보다 용이하게 로그 데이터를 분석할 수 있다. Afterwards, the period over which the search will run can be set, and search results can be output from the log data previously stored for that period. The search results can be displayed with the field values organized by field 1049; for example, time, Facility, Severity, allow/block, Time, firewall policy ID, protocol, source IP, source port, target IP, target port, and direction can be displayed. Because the field values of each log data are separated by field 1049 in this way, the user can analyze the log data more easily.

예를 들어, 허용/차단 필드의 경우, 로그 데이터에 따라 ADMITTED 또는 DENIED가 필드값으로 출력될 수 있다. 또한, 다른 예를 들어, 방화벽정책 ID 필드의 경우, 로그 데이터에 따라 19 또는 1이 필드값으로 출력될 수 있다. For example, in the case of allow/block fields, ADMITTED or DENIED may be output as the field value depending on the log data. Additionally, as another example, in the case of the firewall policy ID field, 19 or 1 may be output as the field value depending on the log data.
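The event log search of Figures 77 and 78 amounts to filtering stored, already-parsed records by event type and period. The record layout, timestamps, and the `search_events` helper below are assumptions; the field values ADMITTED/DENIED and 19/1 follow the example above.

```python
from datetime import datetime
from typing import Dict, Iterable, Iterator

# Hypothetical pre-stored records: parsed field values tagged with event type and time.
STORED = [
    {"event_type": "firewall log", "time": datetime(2023, 7, 1, 9, 0),
     "fields": {"Action": "ADMITTED", "RuleID": 19}},
    {"event_type": "firewall log", "time": datetime(2023, 7, 2, 9, 0),
     "fields": {"Action": "DENIED", "RuleID": 1}},
    {"event_type": "personal information search log", "time": datetime(2023, 7, 1, 10, 0),
     "fields": {"User": "u1"}},
]


def search_events(
    records: Iterable[Dict], event_type: str, start: datetime, end: datetime
) -> Iterator[Dict]:
    """Yield the parsed fields of records matching the event type within [start, end)."""
    for record in records:
        if record["event_type"] == event_type and start <= record["time"] < end:
            yield record["fields"]


rows = list(search_events(STORED, "firewall log",
                          datetime(2023, 7, 1), datetime(2023, 7, 2)))
```

Because `search_events` is a generator, rows can be handed to the display layer as they are found rather than after the whole scan completes.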

도 79는 본 발명의 데이터 관리 방법의 일 실시 예를 설명하는 도면이다. Figure 79 is a diagram explaining an embodiment of the data management method of the present invention.

단계(S13110)에서, 데이터 관리 방법은 로그 데이터를 획득할 수 있다. 본 발명의 데이터 관리 방법이 다수의 로그 데이터를 획득하는 방법은 도 2, 5, 68 및 69의 실시 예를 참고하도록 한다. In step S13110, the data management method may acquire log data. Refer to the embodiments of FIGS. 2, 5, 68, and 69 for how the data management method of the present invention acquires multiple log data.

단계(S13120)에서, 데이터 관리 방법은 로그 데이터에 포함된 텍스트에 대한 텍스트 정보를 나타내는 매칭 블록에 기반하여, 로그 데이터로부터 텍스트를 추출하기 위한 정규식을 포함하는 파서를 생성할 수 있다. 본 발명의 데이터 관리 방법이 다수의 로그 데이터를 획득하는 방법은 도 2, 5 및 68 내지 71의 실시 예를 참고하도록 한다. In step S13120, the data management method may generate a parser including a regular expression for extracting text from log data, based on a matching block representing text information about text included in the log data. Refer to the embodiments of FIGS. 2, 5, and 68 to 71 for how the data management method of the present invention acquires multiple log data.

단계(S13130)에서, 데이터 관리 방법은 정규식에 기반한 파서 및 파서에 대응하는 이벤트 유형을 포함하는 변환 규칙을 생성할 수 있다. 본 발명의 데이터 관리 방법이 다수의 로그 데이터를 획득하는 방법은 도 2, 5, 68 및 72 내지 74의 실시 예를 참고하도록 한다. In step S13130, the data management method may generate a conversion rule including a parser based on a regular expression and an event type corresponding to the parser. For details on how the data management method of the present invention acquires multiple log data, refer to the embodiments of FIGS. 2, 5, 68, and 72 to 74.

단계(S13140)에서, 데이터 관리 방법은 이벤트 유형에 대한 이벤트 검색 요청을 수신함에 응답하여, 파서 및 변환 규칙에 기반한 미리 저장된 적어도 하나의 로그 데이터로부터 추출된 검색 결과를 출력할 수 있다. 본 발명의 데이터 관리 방법이 다수의 로그 데이터를 획득하는 방법은 도 2, 5, 16, 68 및 75 내지 78의 실시 예를 참고하도록 한다. In step S13140, in response to receiving an event search request for an event type, the data management method may output a search result extracted from at least one pre-stored log data based on a parser and a conversion rule. Refer to the embodiments of FIGS. 2, 5, 16, 68, and 75 to 78 for how the data management method of the present invention acquires multiple log data.

대용량 로그 검색을 위해서는 검색 결과 전체를 메모리에 올리지 않고도 실시간으로 정렬 및 필터링된 결과를 조회하는 기술이 필요하다. 또한, 검색 중에도 일부 결과를 실시간으로 확인할 수 있어야 한다. For large-capacity log searches, technology is needed to view sorted and filtered results in real time without loading the entire search results into memory. Additionally, you should be able to check some results in real time while searching.

본 발명의 데이터 관리 플랫폼은 대량의 데이터를 효율적으로 처리하고 필요한 결과만을 실시간으로 조회할 수 있도록 한다. The data management platform of the present invention efficiently processes large amounts of data and allows only necessary results to be viewed in real time.
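One common way to return sorted, filtered results without loading everything into memory is a lazy k-way merge over sources that are each already sorted. The sketch below uses Python's `heapq.merge` for this; it is a general technique shown for illustration, not necessarily the mechanism used by the platform, and it assumes each chunk is pre-sorted by `time`. The chunk contents are invented.

```python
import heapq
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List


def read_sorted(chunk: Iterable[Dict]) -> Iterator[Dict]:
    """Stream rows from one source; assumed to be already sorted by 'time'."""
    for row in chunk:
        yield row


def first_results(
    chunks: List[Iterable[Dict]], predicate: Callable[[Dict], bool], limit: int
) -> List[Dict]:
    """Lazily merge sorted chunks, filter, and stop after 'limit' rows."""
    merged = heapq.merge(*(read_sorted(c) for c in chunks), key=lambda r: r["time"])
    filtered = (row for row in merged if predicate(row))
    return list(islice(filtered, limit))


chunks = [
    [{"time": 1, "action": "ADMITTED"}, {"time": 4, "action": "DENIED"}],
    [{"time": 2, "action": "DENIED"}, {"time": 3, "action": "DENIED"}],
]
page = first_results(chunks, lambda r: r["action"] == "DENIED", limit=2)
```

Only as many rows as needed to fill the requested page are pulled from the underlying sources, which is what allows partial results to be shown while the search is still running.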

도 80은 본 발명의 데이터 관리 플랫폼에서 로그를 검색하는 실시 예를 설명하는 도면이다. Figure 80 is a diagram explaining an embodiment of searching a log in the data management platform of the present invention.

본 발명의 데이터 관리 플랫폼(10000)의 모니터링 모듈(20005)은 로그 검색을 지원하기 위해 코디네이터 제어부(2047)를 더 포함할 수 있다. The monitoring module 20005 of the data management platform 10000 of the present invention may further include a coordinator control unit 2047 to support log search.

여기에서, 코디네이터 제어부(2047)는 사용자(1000)의 로그 검색 요청에 기초하여 서버(1002)의 코디네이터(2048)를 제어할 수 있다. Here, the coordinator control unit 2047 can control the coordinator 2048 of the server 1002 based on the log search request of the user 1000.

여기에서, 코디네이터(2048)(coordinator)는 분산 시스템에서 여러 개의 인스턴스(2049a, 2049b, 2049c) 중 사용 가능한 인스턴스(2049a, 2049b, 2049c)의 목록을 관리하고 검색 요청을 분배할 수 있다. Here, the coordinator 2048 may manage a list of the available instances among the multiple instances 2049a, 2049b, and 2049c in the distributed system and distribute search requests to them.

또한, 인스턴스(instance, 2049a, 2049b, 2049c)는 컴퓨팅 환경에서 실행되는 독립적인 단위로 하드웨어 또는 가상화 기술을 통해 프로세서, 메모리, 디스크 등의 자원을 할당받아 동작할 수 있다. 인스턴스(2049a, 2049b, 2049c)는 특정 운영체제와 응용프로그램을 통해 실행될 수 있으며, 일반적으로 서버, 가상 머신, 컨테이너 등의 형태로 구현될 수 있다. Additionally, instances 2049a, 2049b, and 2049c are independent units running in a computing environment and can operate by being allocated resources such as processor, memory, and disk through hardware or virtualization technology. Instances 2049a, 2049b, and 2049c can run through specific operating systems and applications, and can generally be implemented in the form of servers, virtual machines, containers, etc.

이하를 통하여 본 발명을 자세히 설명하도록 한다. The present invention will be described in detail below.

도 81은 본 발명의 데이터 관리 플랫폼에서 로그를 검색하는 실시 예를 설명하는 도면이다. Figure 81 is a diagram explaining an embodiment of searching a log in the data management platform of the present invention.

코디네이터(2048)는 사용 가능한 인스턴스 목록을 관리하고 사용자의 검색 요청을 분배할 수 있다. 사용자는 본 발명의 데이터 관리 플랫폼의 모니터링 모듈을 통하여 특정 기간 동안의 로그 데이터를 검색할 수 있다. 이후, 사용자는 필요에 따라 인스턴스(2049a, 2049b, 2049c)들을 통하여 검색된 로그 데이터를 확인할 수 있다. The coordinator 2048 can manage the list of available instances and distribute users' search requests. A user can search log data for a specific period of time through the monitoring module of the data management platform of the present invention. Afterwards, the user can check the log data retrieved through instances 2049a, 2049b, and 2049c as needed.

적어도 하나의 인스턴스(2049a, 2049b, 2049c)는 서치 헤드와 서치 피어로 나뉠 수 있다. 일 실시 예에서, 가장 최초에 검색 요청을 받은 인스턴스가 서치 헤드(예를 들어, 2049a)가 되어 코디네이터(2048)로부터 사용 가능한 인스턴스 목록을 수집할 수 있다. The at least one instance (2049a, 2049b, 2049c) may be divided into a search head and search peers. In one embodiment, the instance that first receives the search request becomes the search head (e.g., 2049a) and may collect the list of available instances from the coordinator 2048.
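The role split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Coordinator` class, its method names, and the `assign_roles` helper are hypothetical, and treating every other available instance as a search peer is an assumption the text does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical in-memory registry of instance availability."""
    status: dict = field(default_factory=dict)  # instance id -> True if usable

    def register(self, instance_id: str) -> None:
        self.status[instance_id] = True

    def mark_unavailable(self, instance_id: str) -> None:
        self.status[instance_id] = False

    def available_instances(self) -> list:
        # Only instances currently marked usable are handed out.
        return sorted(i for i, up in self.status.items() if up)


def assign_roles(coordinator: Coordinator, first_receiver: str):
    """The instance that first received the request becomes the search head.

    Assumption: the remaining available instances act as search peers.
    """
    instances = coordinator.available_instances()
    if first_receiver not in instances:
        raise ValueError(f"{first_receiver} is not an available instance")
    peers = [i for i in instances if i != first_receiver]
    return first_receiver, peers


coordinator = Coordinator()
for name in ("2049a", "2049b", "2049c"):
    coordinator.register(name)
coordinator.mark_unavailable("2049c")
head, peers = assign_roles(coordinator, "2049a")
```

Here the instance identifiers simply reuse the reference numerals of the figures for readability.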

서치 헤드가 된 인스턴스(2049a)는 검색 요청부(2050), 검색 결과 파일 생성부(2051) 및 검색 결과 반환부(2052)를 포함할 수 있다. The instance 2049a that becomes the search head may include a search request unit 2050, a search result file creation unit 2051, and a search result return unit 2052.

여기에서, 검색 요청부(2050)는 사용자로부터 수신한 검색 요청에 기초하여 서치 ID를 발급하고, 서치 ID와 검색 쿼리 정보를 포함하여 서치 피어가 된 각 인스턴스(2049b)에게 검색을 요청할 수 있다. Here, the search request unit 2050 may issue a search ID based on the search request received from the user and request a search from each instance 2049b that has become a search peer, including the search ID and search query information.

검색 결과 파일 생성부(2051)는 서치 피어(2049b)로부터 수신한 검색 결과를 저장할 수 있다. 보다 상세하게는, 검색 결과 파일 생성부(2051)는 서치 피어(2049b)로부터 수신한 로그 데이터를 검색 결과 데이터 파일에 저장하고, 저장된 로그 데이터의 위치 정보와 저장된 로그 데이터의 길이 정보를 offset 파일 내 저장할 수 있다. The search result file generator 2051 may store the search results received from the search peer 2049b. More specifically, the search result file generator 2051 may store the log data received from the search peer 2049b in a search result data file, and store the position information and the length information of the stored log data in the offset file.
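As an illustration of the data file and offset file pairing, the sketch below appends each received log entry to a data stream and records its position and length in a separate offset stream. The fixed 12-byte record layout (`<QI`: 8-byte position, 4-byte length) and the function name `append_result` are assumptions made for this example; the text does not specify a file format.

```python
import io
import struct

# Assumed layout of one offset record: 8-byte position + 4-byte length.
OFFSET_RECORD = struct.Struct("<QI")


def append_result(data_file, offset_file, log_entry: bytes) -> None:
    """Append one log entry and record where it landed in the data file."""
    data_file.seek(0, io.SEEK_END)
    position = data_file.tell()
    data_file.write(log_entry)
    offset_file.seek(0, io.SEEK_END)
    offset_file.write(OFFSET_RECORD.pack(position, len(log_entry)))


# In-memory streams stand in for the data file and the offset file.
data_file, offset_file = io.BytesIO(), io.BytesIO()
append_result(data_file, offset_file, b"first log")
append_result(data_file, offset_file, b"second")
```

Because every offset record has the same size, the k-th entry can later be located by seeking to `k * OFFSET_RECORD.size` in the offset file without scanning the data file.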

일 실시 예에서, 검색 결과 파일 생성부(2051)는 정렬 필드의 값을 기준으로 익스터널 머지 소트 알고리즘을 수행하여 검색 결과를 정렬할 수 있다. 여기에서, 익스터널 머지 소트 알고리즘에 대하여는 후술하도록 한다. In one embodiment, the search result file generator 2051 may sort the search results by performing an external merge sort algorithm based on the value of the sort field. Here, the external merge sort algorithm will be described later.

검색 결과 반환부(2052)는 서치 피어(2049b)로부터 수신한 결과 중 서치 ID, offset 파일과 limit 정보를 활용하여 검색 결과를 요청하는 사용자에게 실시간으로 반환할 수 있다. 여기에서, limit 정보는 한 페이지 당 출력할 로그 데이터의 개수를 나타낸다. 즉, 검색 결과 반환부(2052)는 offset 파일 내의 로그 데이터의 위치 정보와 로그 데이터의 길이 정보를 참조하여 로그 데이터를 추출하고, limit 정보에 기초하여 지정된 개수만큼 모니터링 모듈에게 반환할 수 있다. The search result return unit 2052 can use the search ID, offset file, and limit information among the results received from the search peer 2049b to return the search results in real time to the requesting user. Here, limit information indicates the number of log data to be output per page. That is, the search result return unit 2052 can extract log data by referring to the location information of the log data and the length information of the log data in the offset file, and return the specified number to the monitoring module based on the limit information.
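A minimal sketch of returning one page of results using the offset file and the limit value follows. It assumes the same hypothetical fixed-size offset record as a storage layout (`<QI`), zero-based page numbers, and the function name `read_page`; none of these come from the source.

```python
import io
import struct

# Assumed layout of one offset record: 8-byte position + 4-byte length.
OFFSET_RECORD = struct.Struct("<QI")


def read_page(data_file, offset_file, page: int, limit: int) -> list:
    """Return up to `limit` log entries for the zero-based `page`."""
    offset_file.seek(page * limit * OFFSET_RECORD.size)
    raw = offset_file.read(limit * OFFSET_RECORD.size)
    usable = len(raw) - len(raw) % OFFSET_RECORD.size  # ignore a partial record
    entries = []
    for start in range(0, usable, OFFSET_RECORD.size):
        position, length = OFFSET_RECORD.unpack_from(raw, start)
        data_file.seek(position)
        entries.append(data_file.read(length))
    return entries


# Build a small data/offset pair in memory for the demonstration.
data_file, offset_file = io.BytesIO(), io.BytesIO()
for entry in (b"a1", b"bb2", b"ccc3"):
    offset_file.write(OFFSET_RECORD.pack(data_file.tell(), len(entry)))
    data_file.write(entry)

page0 = read_page(data_file, offset_file, page=0, limit=2)
page1 = read_page(data_file, offset_file, page=1, limit=2)
page2 = read_page(data_file, offset_file, page=2, limit=2)
```

Only the requested page is read into memory, which matches the stated goal of viewing results without loading the whole result set.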

또한, 서치 피어가 된 인스턴스(2049b)는 검색부(2053)를 포함할 수 있다. Additionally, the instance 2049b that becomes a search peer may include a search unit 2053.

여기에서, 검색부는 서치 헤드(2049a)로부터 수신한 검색 쿼리 정보에 기초하여 검색 대상(Segment/Shard)을 구분하고, 검색 대상을 이용하여 로그 데이터를 조회할 수 있다. 이때, 검색 대상이 되는 Segment와 Shard는 데이터베이스 시스템에서 데이터의 구조와 저장 방식을 조직화하기 위하여 사용되는 개념으로, Segment는 논리적인 데이터 단위를, Shard는 물리적인 데이터 분산을 나타낸다. 즉, Segment는 해당 데이터를 논리적인 단위로 나누고, Shard는 데이터를 분산 저장할 수 있도록 한다. Here, the search unit can classify the search target (Segment/Shard) based on the search query information received from the search head 2049a and search log data using the search target. At this time, Segment and Shard, which are search targets, are concepts used to organize the structure and storage method of data in a database system. Segment represents a logical data unit, and Shard represents physical data distribution. In other words, Segment divides the data into logical units, and Shard allows data to be distributed and stored.

이를 통해 대량의 데이터 처리와 동시에 실시간으로 결과를 확인할 수 있어 효율적인 로그 검색을 지원한다는 장점이 있다. This has the advantage of supporting efficient log search by processing large amounts of data and checking results in real time.

도 82는 본 발명의 데이터 관리 방법이 로그를 검색하는 실시 예를 설명하는 도면이다. Figure 82 is a diagram illustrating an embodiment in which the data management method of the present invention searches logs.

단계(S14110)에서, 본 발명의 데이터 관리 방법은 사용자로부터 검색 요청을 수신할 수 있다. In step S14110, the data management method of the present invention may receive a search request from the user.

사용자로부터 검색 요청을 수신함에 따라, 단계(S14120)에서, 본 발명의 데이터 관리 방법은 서치 ID를 발급할 수 있다. 일 실시 예에서, 서치 ID는 고유한(unique) ID에 대응할 수 있다. 이는, 사용자가 하나의 서치 ID를 통하여 결과를 반복적으로 열람할 수 있기 때문이다. Upon receiving a search request from the user, in step S14120, the data management method of the present invention may issue a search ID. In one embodiment, the search ID may correspond to a unique ID. This is because the user can repeatedly view results through one search ID.

단계(S14130)에서, 본 발명의 데이터 관리 방법은 사용자로부터 검색 결과 요청을 수신할 수 있다. 보다 상세하게는, 데이터 관리 방법은 서치 ID를 가지고 있기 때문에 고유한 검색 결과를 요청할 수 있다. 특히, 검색 결과는 실시간으로 변경될 수 있기 때문에, 데이터 관리 방법은 제 1 검색 요청, 제 2 검색 요청 내지 제 n 검색 요청(여기에서, n은 자연수)에 대응하는 검색 결과를 요청할 수 있다. In step S14130, the data management method of the present invention may receive a search result request from the user. More specifically, the data management method can request unique search results because it has a search ID. In particular, because search results may change in real time, the data management method may request search results corresponding to the first search request, second search request, through nth search request (where n is a natural number).
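The relationship between a unique search ID and repeated result requests can be sketched as below. The `SearchRegistry` class and its methods are hypothetical names for illustration, and using a random UUID as the search ID is an assumption; the text only requires that the ID be unique.

```python
import uuid


class SearchRegistry:
    """Hypothetical holder of per-search results, keyed by search ID."""

    def __init__(self):
        self._results = {}

    def start_search(self) -> str:
        # Assumption: a random UUID serves as the unique search ID.
        search_id = uuid.uuid4().hex
        self._results[search_id] = []
        return search_id

    def add_results(self, search_id: str, rows: list) -> None:
        # Results keep accumulating while the search is still running.
        self._results[search_id].extend(rows)

    def fetch(self, search_id: str, limit: int) -> list:
        # The same ID can be used repeatedly; each call sees the current state.
        return self._results[search_id][:limit]


registry = SearchRegistry()
sid = registry.start_search()
first_view = registry.fetch(sid, limit=10)   # requested before any results arrive
registry.add_results(sid, ["log-1", "log-2"])
second_view = registry.fetch(sid, limit=10)  # same ID, more results available
```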

즉, 단계(S14110)에서, 사용자로부터 검색 요청을 수신하는 경우, 후술하는 검색을 수행하는 단계가 순차적으로 수행될 수 있다. 다른 실시 예에서, 사용자로부터 검색 요청 수신 이전에 주기적으로 검색이 수행된 상태에서, 단계(S14130)에 따라 단계(S14140)을 수행할 수 있다. That is, when a search request is received from the user in step S14110, the search steps described later may be performed sequentially. In another embodiment, where searches have already been performed periodically before a search request is received from the user, step S14140 may be performed in response to step S14130.

단계(S14140)에서, 본 발명의 데이터 관리 방법은 사용자에게 검색 결과를 제공할 수 있다. In step S14140, the data management method of the present invention can provide search results to the user.

도 83은 본 발명의 데이터 관리 방법에서 로그를 검색하는 실시 예를 설명하는 도면이다. Figure 83 is a diagram explaining an embodiment of searching a log in the data management method of the present invention.

단계(S15110)에서, 본 발명의 데이터 관리 방법은 코디네이터로부터 사용 가능한 인스턴스 목록을 수집할 수 있다. 일 실시 예에서, 인스턴스 중 서치 헤드는 코디네이터로부터 인스턴스 목록을 수신할 수 있다. In step S15110, the data management method of the present invention can collect a list of available instances from the coordinator. In one embodiment, a search head among instances may receive a list of instances from a coordinator.

단계(S15120)에서, 서치 헤드는 각 인스턴스에게 서치 ID와 검색 쿼리 정보를 포함하여 로그 검색을 요청할 수 있다. In step S15120, the search head may request a log search including a search ID and search query information from each instance.

단계(S15130)에서, 서치 헤드는 서치 피어로부터 검색 결과를 수신할 수 있다. In step S15130, the search head may receive search results from the search peer.
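Steps S15120 and S15130 amount to a fan-out and gather pattern, which might look like the sketch below. Modeling each search peer as a plain callable and dispatching with a thread pool are assumptions for illustration; the actual transport between instances is not described in the text.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(search_id: str, query: str, peers: dict) -> dict:
    """Send the search ID and query to every peer and gather the results.

    `peers` maps a peer name to a callable(search_id, query) -> list of logs.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(peers))) as pool:
        futures = {name: pool.submit(fn, search_id, query)
                   for name, fn in peers.items()}
        return {name: future.result() for name, future in futures.items()}


# Stand-in peers for the demonstration (hypothetical behavior).
def peer_b(search_id, query):
    return [f"{search_id}:b:{query}"]


def peer_c(search_id, query):
    return []


results = fan_out("s1", "status=500", {"2049b": peer_b, "2049c": peer_c})
```

In this sketch the head waits for every peer before returning; a streaming variant would append each peer's rows to the data file as they arrive.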

단계(S15140)에서, 서치 헤드는 검색 결과를 data 파일에 저장하고, 데이터가 저장된 위치 정보와 데이터의 길이 정보를 offset 파일 내에 저장할 수 있다. data 파일에는 검색 결과가 계속적으로 누적되기 때문에 offset 파일 내에 데이터의 길이 정보를 포함하여, 검색 결과를 구분할 수 있다. In step S15140, the search head stores the search results in a data file, and stores location information where the data is stored and length information of the data in an offset file. Since search results are continuously accumulated in the data file, search results can be distinguished by including data length information in the offset file.

단계(S15150)에서, 서치 헤드는 익스터널 머지 소트 알고리즘을 수행할 수 있다. 자세한 내용은 후술하도록 한다. In step S15150, the search head may perform an external merge sort algorithm. More details will be provided later.

단계(S15160)에서, 서치 헤드는 콘솔(예를 들어, 본 발명의 데이터 관리 플랫폼의 모니터링 모듈)의 검색 결과 요청에 기초하여 검색 결과를 반환할 수 있다. In step S15160, the search head may return search results based on a search result request from a console (e.g., a monitoring module of the data management platform of the present invention).

보다 상세하게는, 서치 헤드는 offset 파일 내에서 offset 항목(예를 들어, 제 1 데이터)에 해당하는 data 파일 내 제 1 데이터의 위치 정보와 제 1 데이터의 길이 정보를 참조하여 data 파일 내 제 1 데이터를 추출하여 검색 결과를 반환할 수 있다. More specifically, the search head may look up the offset entry for the requested data (e.g., the first data) in the offset file, refer to the position information and the length information of the first data in the data file, extract the first data from the data file, and return it as the search result.

일 실시 예에서, 서치 헤드는 검색 결과를 반환할 때, 복수의 데이터를 포함하는 리스트를 반환할 수 있다. 상술한 바와 같이, 사용자는 제 1 검색 요청, 제 2 검색 요청 내지 제 n 검색 요청 등 시간에 기초하여 여러 번 로그 검색을 요청할 수 있다. 서치 헤드는 제 1 검색 요청에 따른 제 1 데이터 내지 제 a 데이터에 대한 결과, 제 2 검색 요청에 따른 제 b 데이터 내지 제 c 데이터에 대한 결과(여기에서, a 내지 c는 자연수)를 리스트로 반환할 수 있다. In one embodiment, when the search head returns search results, it may return a list containing a plurality of data. As described above, the user may request a log search multiple times over time, such as a first search request, a second search request, through an n-th search request. The search head may return, as a list, the results for the first data through the a-th data in response to the first search request and the results for the b-th data through the c-th data in response to the second search request (where a to c are natural numbers).

도 84는 본 발명의 데이터 관리 방법에서 로그를 검색하는 실시 예를 설명하는 도면이다. Figure 84 is a diagram explaining an embodiment of searching a log in the data management method of the present invention.

단계(S15170)에서, 서치 피어는 서치 헤드로부터 검색 요청을 수신할 수 있다. 여기에서, 검색 요청은 서치 ID와 검색 쿼리 정보를 포함할 수 있다. In step S15170, the search peer may receive a search request from the search head. Here, the search request may include a search ID and search query information.

단계(S15180)에서, 서치 피어는 요청에 기초하여 검색 쿼리 정보를 분석하여 검색 대상을 구분하고 인덱스를 사용해서 검색을 수행할 수 있다. 여기에서, 검색 대상은 Segment 또는 Shard로 구분될 수 있다. In step S15180, the search peer may analyze the search query information based on the request, classify the search target, and perform the search using the index. Here, the search target can be divided into Segment or Shard.

단계(S15190)에서, 서치 피어는 검색 결과를 서치 헤드에게 전송할 수 있다. 이때, 서치 피어는 검색 대상(Segment/Shard)을 이용하여 로그 데이터를 조회하여 검색 결과를 서치 헤드에게 전송할 수 있다. 이때, 서치 피어는 검색 대상에 포함된 인덱스를 이용하여 로그 데이터를 조회할 수 있다. In step S15190, the search peer may transmit the search results to the search head. At this time, the search peer can search log data using the search target (Segment/Shard) and transmit the search results to the search head. At this time, the search peer can search log data using the index included in the search target.
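The peer-side steps (selecting search targets, then using their index) can be illustrated with the sketch below. It rests on several assumptions that are not in the source: a segment is modeled as a time-bounded unit holding its own inverted index, the query is a single term plus a time range, and all names (`Segment`, `peer_search`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical search target: logs for [start, end) plus a term index."""
    start: int
    end: int
    logs: list = field(default_factory=list)
    index: dict = field(default_factory=dict)  # term -> positions in `logs`

    def add(self, line: str) -> None:
        position = len(self.logs)
        self.logs.append(line)
        for term in set(line.lower().split()):
            self.index.setdefault(term, []).append(position)


def peer_search(segments: list, term: str, time_from: int, time_to: int) -> list:
    """Pick the segments overlapping the time range, then look up the index."""
    hits = []
    for segment in segments:
        if segment.end <= time_from or segment.start >= time_to:
            continue  # this segment is not a search target for the query
        for position in segment.index.get(term.lower(), []):
            hits.append(segment.logs[position])
    return hits


seg1 = Segment(0, 100)
seg1.add("ERROR disk full")
seg1.add("INFO started")
seg2 = Segment(100, 200)
seg2.add("ERROR timeout")

recent_errors = peer_search([seg1, seg2], "error", 100, 200)
all_errors = peer_search([seg1, seg2], "error", 0, 200)
```

Note that this sketch filters by time only at segment granularity; it does not check the timestamp of each individual log entry.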

도 85는 본 발명의 익스터널 머지 소트 알고리즘 실시 예를 설명하는 도면이다. Figure 85 is a diagram explaining an embodiment of the external merge sort algorithm of the present invention.

익스터널 머지 소트(External Merge Sort) 알고리즘은 대용량의 데이터를 정렬하는데 사용되는 알고리즘으로, 특히, 메모리 용량을 초과하는 데이터를 정렬할 때 효율적으로 작동한다. 익스터널 머지 소트 알고리즘은 주로 디스크나 외부 저장 장치와 같은 보조 메모리를 활용하여 데이터를 정렬할 수 있다. The External Merge Sort algorithm is an algorithm used to sort large amounts of data, and operates particularly efficiently when sorting data that exceeds memory capacity. The external merge sort algorithm can sort data mainly by utilizing auxiliary memory such as disk or external storage device.

단계(S16110)에서, 정렬 필드의 값을 기준으로 데이터의 offset 정보를 저장하는 인덱스 파일을 생성할 수 있다. 여기에서, 정렬 필드의 값은 데이터 안에 포함된 항목을 나타낼 수 있다. 예를 들어, 데이터 안에 포함된 항목은 사용자 이름, 이메일 주소, 시간 등을 포함할 수 있다. 이에 따라, 익스터널 머지 소트 알고리즘은 데이터 안에 포함된 항목을 기준으로 데이터를 정렬할 수 있다. 또한, offset 정보는 정렬 필드의 기준 값, data 파일 내 위치(예를 들어, data 파일 내의 제 1 로그의 위치) 및 길이 정보(예를 들어, data 파일 내의 제 1 로그의 길이 정보)를 포함할 수 있다. 또한, 인덱스 파일은 특정 건수(예를 들어, 1000건) 단위의 offset 목록으로 이미 정렬된 데이터 순서를 가지는 것을 특징으로 한다. In step S16110, an index file that stores offset information of the data may be created based on the value of the sort field. Here, the value of the sort field may indicate an item contained in the data. For example, such items may include a user name, an email address, and a time. Accordingly, the external merge sort algorithm may sort the data based on an item contained in the data. The offset information may include the reference value of the sort field, the position in the data file (e.g., the position of the first log in the data file), and the length information (e.g., the length of the first log in the data file). In addition, the index file is an offset list in units of a specific number of entries (e.g., 1,000), and its entries are already in sorted order.

단계(S16120)에서, 머지 소트에 대한 정렬 결과 파일에 포함된 데이터의 개수가 제 1 개수 이상이면, 머지 소트에 대한 정렬 결과 파일을 합쳐서 생성할 수 있다. In step S16120, if the number of data included in the sort result file for the merge sort is greater than or equal to the first number, the sort result file for the merge sort can be combined and generated.

보다 상세하게는, 인덱스 파일을 생성할 때, 인덱스 파일에 포함된 파일의 개수가 제 1 개수 이상이면, 새로운 인덱스 파일을 생성할 수 있다. 예를 들어, 제 1 정렬 결과 파일에 포함된 파일의 개수가 제 n 개수 이상이면, 제 2 정렬 결과 파일을 생성할 수 있다. More specifically, when creating an index file, if the number of files included in the index file is greater than or equal to the first number, a new index file can be created. For example, if the number of files included in the first sort result file is more than the nth number, a second sort result file can be generated.

단계(S16130)에서, 검색이 종료되거나 머지 소트 (merge sort) 파일에 포함된 파일이 제 n 개수 이상이면, 새롭게 생성된 제 2 정렬 결과 파일을 병합하여 새로운 제 3 정렬 결과 파일을 생성할 수 있다. In step S16130, when the search ends or the number of files included in the merge sort file is greater than or equal to the n-th number, the newly generated second sort result file may be merged to generate a new third sort result file.

즉, 익스터널 머지 소트 알고리즘을 활용하면 데이터를 분할하고 병합하는 과정을 반복하면서 정렬 작업을 수행하기 때문에 전체적인 성능을 최적화할 수 있다는 장점이 있다. In other words, using the external merge sort algorithm has the advantage of optimizing overall performance because the sorting operation is performed while repeating the process of splitting and merging data.
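The general technique can be sketched as follows: sort fixed-size batches of offset entries in memory, spill each batch to disk as a sorted run, then merge the runs lazily. This is a simplified illustration, not the patented procedure; it performs a single k-way merge with `heapq.merge` rather than the threshold-triggered intermediate merges of steps S16120 and S16130, and the tab-separated run format and function names are assumptions.

```python
import heapq
import os
import tempfile

RUN_SIZE = 1000  # entries per sorted run (the example figure used in the text)


def write_run(entries, directory):
    """Sort one batch of (key, position, length) tuples and spill it to disk."""
    fd, path = tempfile.mkstemp(dir=directory, suffix=".run")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for key, position, length in sorted(entries):
            f.write(f"{key}\t{position}\t{length}\n")
    return path


def read_run(path):
    """Stream a sorted run back without loading it fully into memory."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, position, length = line.rstrip("\n").split("\t")
            yield key, int(position), int(length)


def external_merge_sort(entries, directory, run_size=RUN_SIZE):
    """Return a lazy iterator over all entries in sorted order."""
    runs, batch = [], []
    for entry in entries:
        batch.append(entry)
        if len(batch) >= run_size:
            runs.append(write_run(batch, directory))
            batch = []
    if batch:
        runs.append(write_run(batch, directory))
    # heapq.merge keeps only one pending entry per run in memory.
    return heapq.merge(*(read_run(path) for path in runs))


# Each entry: (sort-field value, position in data file, length).
entries = [("c", 0, 5), ("a", 5, 3), ("b", 8, 4), ("d", 12, 2), ("a", 14, 1)]
with tempfile.TemporaryDirectory() as workdir:
    result = list(external_merge_sort(entries, workdir, run_size=2))
```

The sketch assumes sort keys are strings that contain no tab or newline characters; a binary run format would remove that restriction.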

도 86은 본 발명의 검색 결과 파일을 설명하는 도면이다. Figure 86 is a diagram explaining the search result file of the present invention.

일 실시 예에서, 검색 결과 파일은 data 파일(3006)과 offset 파일(3007)을 포함할 수 있다. In one embodiment, the search result file may include a data file 3006 and an offset file 3007.

여기에서, data 파일(3006)은 서치 피어로부터 수신한 검색 결과를 포함할 수 있다. Here, the data file 3006 may include search results received from a search peer.

Offset 파일(3007)은 data 파일(3006) 내 저장된 데이터의 위치 정보 및 데이터의 길이 정보를 포함할 수 있다. 예를 들어, offset 파일(3007)은 data 파일(3006) 내 저장된 제 1 로그 데이터의 위치 정보와 제 1 로그 데이터의 길이 정보를 포함할 수 있다. The offset file 3007 may include location information and data length information of data stored in the data file 3006. For example, the offset file 3007 may include location information of the first log data stored in the data file 3006 and length information of the first log data.

이를 통하여, 사용자로부터 검색 결과 반환 요청이 있는 경우, 본 발명의 데이터 관리 플랫폼은 offset 파일(3007) 내에서 제 1 로그 데이터의 data 파일(3006) 내 위치 정보와 제 1 로그 데이터의 길이 정보에 기초하여 data 파일(3006) 내에서 제 1 로그 데이터를 추출할 수 있다. In this way, when the user requests that search results be returned, the data management platform of the present invention may extract the first log data from the data file 3006 based on the position information of the first log data in the data file 3006 and the length information of the first log data, both of which are recorded in the offset file 3007.

이때, 검색 결과 반환 요청에 제 n 개의 로그 데이터의 반환에 대한 요청이 포함된 경우, 본 발명의 데이터 관리 플랫폼은 offset 파일(3007) 내에서 제 n 개의 로그 데이터의 data 파일 내 위치 정보와 제 n 개의 로그 데이터의 길이 정보에 각각 기초하여 data 파일(3006) 내에서 제 n 개의 로그 데이터를 추출하여 검색 결과를 반환할 수 있다. At this time, if the search result return request includes a request to return n log data entries, the data management platform of the present invention may extract the n log data entries from the data file 3006 based on the position information in the data file and the length information of each of the n entries recorded in the offset file 3007, and return them as the search result.

도 87은 본 발명의 데이터 관리 방법이 로그를 검색하는 실시 예를 설명하는 도면이다. Figure 87 is a diagram illustrating an embodiment in which the data management method of the present invention searches logs.

단계(S16140)에서, 본 발명의 데이터 관리 방법은 데이터 검색 요청을 수신할 수 있다. 본 발명의 데이터 관리 방법이 사용자 또는 클라이언트로부터 데이터 검색 요청을 수신하는 방법은 도 1 내지 도 4, 도 16 및 도 17의 실시 예를 참고하도록 한다. 즉, 데이터 관리 방법은 모니터링 모듈을 통하여 사용자로부터 데이터 검색 요청을 수신하고 호스트 서버에 설치된 코디네이터 및 적어도 하나의 인스턴스를 통하여 분산 데이터를 검색할 수 있다. In step S16140, the data management method of the present invention may receive a data search request. For how the data management method of the present invention receives a data search request from a user or client, refer to the embodiments of FIGS. 1 to 4, 16, and 17. That is, the data management method can receive a data search request from a user through a monitoring module and search distributed data through a coordinator installed on the host server and at least one instance.

단계(S16150)에서, 본 발명의 데이터 관리 방법은 데이터 검색 요청에 기초하여 데이터를 검색할 수 있다. 본 발명의 데이터 관리 방법이 데이터를 검색하는 방법은 도 16, 도 17 및 도 80 내지 도 84의 실시 예를 참고하도록 한다. In step S16150, the data management method of the present invention can search data based on a data search request. For information on how the data management method of the present invention searches data, refer to the embodiments of FIGS. 16, 17, and 80 to 84.

단계(S16160)에서, 본 발명의 데이터 관리 방법은 검색된 결과를 제공할 수 있다. 일 실시 예에서, 데이터 관리 방법은 제 1 데이터가 포함된 검색 결과를 data 파일에 저장하고, 제 1 데이터가 저장된 위치 정보 및 제 1 데이터의 길이 정보를 offset 파일에 저장할 수 있다. 이에 대하여는 도 86의 실시 예를 참고하도록 한다. In step S16160, the data management method of the present invention can provide searched results. In one embodiment, the data management method may store search results including first data in a data file, and store location information where the first data is stored and length information of the first data in an offset file. For this, please refer to the embodiment of Figure 86.

또한, 일 실시 예에서, 데이터 관리 방법은 검색 결과를 익스터널 머지 소트 알고리즘(External Merge Sort Algorithm)에 의해 정렬할 수 있다. 본 발명의 데이터 관리 방법이 데이터를 검색하는 방법은 도 16, 도 17 및 도 85의 실시 예를 참고하도록 한다. Additionally, in one embodiment, the data management method may sort search results using an External Merge Sort Algorithm. For information on how the data management method of the present invention searches data, refer to the embodiments of FIGS. 16, 17, and 85.

로그 데이터를 분석하는 시스템에는 저장된 데이터를 처리하기 위해 사용자를 위한 사용자 인터페이스(User Interface, UI)가 필요하다. 이때, 다양한 분석을 원하는 사용자의 입장에서는 개발사가 제공하는 UI는 한계가 있다. 즉, 보다 복잡한 데이터 분석이나 검색을 수행하거나 스크립트 형태로 쿼리 명령어를 입력하여 데이터를 조작하고 의미 있는 결과를 도출하기 위한 사용자 인터페이스가 필요하다. 이를 위하여, 본 발명에서는 UI 내에서 스크립트를 사용해서 UI 개발에 대한 프로그램 없이 데이터를 조작할 수 있는 기능을 제공하고자 한다. A system that analyzes log data requires a user interface (UI) for users to process the stored data. At this time, from the perspective of users who want various analyses, the UI provided by the developer has limitations. In other words, a user interface is needed to perform more complex data analysis or searches, or to manipulate data and derive meaningful results by entering query commands in script form. To this end, the present invention seeks to provide a function to manipulate data without a program for UI development by using a script within the UI.

도 88은 본 발명의 데이터 관리 플랫폼에서 데이터를 분석하는 실시 예를 설명하는 도면이다. Figure 88 is a diagram explaining an embodiment of analyzing data in the data management platform of the present invention.

본 발명은 분석 작업 편집기(2056)를 통해 스크립트를 이용하여 사용자가 다양한 데이터 분석 및 데이터 조작을 할 수 있도록 한다. The present invention allows users to perform various data analysis and data manipulation using scripts through the analysis task editor 2056.

이를 위하여, 본 발명의 데이터 관리 플랫폼(10000)의 분석 모듈(20002)은 분석 작업 실행부(2054) 및 분석 작업 관리부(2055)를 포함할 수 있다. To this end, the analysis module 20002 of the data management platform 10000 of the present invention may include an analysis task execution unit 2054 and an analysis task management unit 2055.

여기에서, 분석 작업 실행부(2054)는 분석 작업 편집기(2056)에 입력된 정보를 실행하는 분석 엔진을 포함할 수 있다. 일 실시 예에서, 분석 작업 실행부(2054)는 사용자가 분석 작업 편집기(2056)를 통해 입력한 제 1 분석 작업을 실행하는 명령에 기초하여 분석 작업을 실행할 수 있다. 구체적으로, 사용자가 분석 작업 편집기(2056) 내에서 코드 셀을 실행하면, Lua 스크립트가 분석 작업 실행부(2054) 내의 분석 엔진으로 전송되고, 분석 엔진 내에서 Lua 스크립트를 해석하여 실행하고 결과를 반환할 수 있다. 이를 위하여, 분석 작업 실행부(2054)는 데이터베이스(20007)의 로그 데이터 및 통계 데이터를 활용할 수 있다. Here, the analysis task execution unit 2054 may include an analysis engine that executes the information input to the analysis task editor 2056. In one embodiment, the analysis task execution unit 2054 may execute an analysis task based on a command, input by the user through the analysis task editor 2056, to execute the first analysis task. Specifically, when the user executes a code cell within the analysis task editor 2056, the Lua script is transmitted to the analysis engine within the analysis task execution unit 2054, and the analysis engine may interpret and execute the Lua script and return the result. To this end, the analysis task execution unit 2054 may use the log data and statistical data of the database 20007.

분석 작업 편집기(2056)는 애플리케이션, 소프트웨어 또는 웹 브라우저 등을 통하여 실행 가능한 편집기로 적어도 하나의 셀을 포함할 수 있다. 여기에서 셀은 실행 스크립트에 대한 설명 데이터를 추가하기 위한 마크다운 셀과 분석을 위한 Lua 스크립트 정보를 추가하기 위한 코드 셀을 포함한다. The analysis task editor 2056 is an editor executable through an application, software, or web browser and may include at least one cell. Here, the cells include a Markdown cell for adding descriptive data about the execution script and a code cell for adding Lua script information for analysis.

분석 작업 관리부(2055)는 작성된 Lua 스크립트를 등록할 수 있다. 보다 상세하게는, 사용자는 분석 작업 편집기(2056)를 통해 Lua 스크립트를 추가할 수 있고, 추가된 Lua 스크립트를 분석 작업 관리부(2055)에 등록할 수 있다. 등록된 Lua 스크립트는 분석 작업 관리부(2055)에 포함된 분석 작업 스케쥴러에 의해 주기적으로 실행될 수 있다. The analysis task management unit 2055 can register the written Lua script. More specifically, the user can add a Lua script through the analysis task editor 2056 and register the added Lua script in the analysis task management unit 2055. The registered Lua script may be periodically executed by the analysis task scheduler included in the analysis task management unit 2055.

이후, 모니터링 모듈(20005)은 분석 작업 관리부(2055)를 통해 관리된 분석 작업이 주기적으로 실행된 결과를 출력할 수 있다. Thereafter, the monitoring module 20005 may output the results of periodically executing the analysis task managed through the analysis task management unit 2055.

이를 통하여 사용자는 개발사가 제공하는 사용자 인터페이스를 이용하지 않고도 원하는 데이터를 분석할 수 있다. Through this, users can analyze desired data without using the user interface provided by the developer.

도 89는 본 발명의 데이터 관리 방법이 데이터를 분석하는 실시 예를 설명하는 도면이다. Figure 89 is a diagram explaining an embodiment in which the data management method of the present invention analyzes data.

단계(S17110)에서, 본 발명의 데이터 관리 방법은 분석 작업 편집기를 실행할 수 있다. 보다 상세하게는, 사용자는 데이터 관리 플랫폼을 통하여 제공되는 분석 작업 편집기를 실행할 수 있다. 여기에서, 분석 작업 편집기는 애플리케이션, 소프트웨어 또는 웹 브라우저 등을 통하여 실행할 수 있다. In step S17110, the data management method of the present invention may execute an analysis task editor. More specifically, the user can run the analysis task editor provided through the data management platform. Here, the analysis task editor can be executed through an application, software, or web browser.

단계(S17120)에서, 본 발명의 데이터 관리 방법은 분석 작업 편집기에서 입력된 셀을 수신할 수 있다. 보다 상세하게는, 사용자는 실행된 분석 작업 편집기에 포함된 마크다운 셀에 텍스트를 입력할 수 있고, 데이터 관리 방법은 마크다운 셀에 포함된 텍스트를 수신할 수 있다. In step S17120, the data management method of the present invention may receive cells input from the analysis task editor. More specifically, the user can input text into a Markdown cell included in the executed analysis task editor, and the data management method can receive the text included in the Markdown cell.

단계(S17130)에서, 본 발명의 데이터 관리 방법은 분석 엔진을 통하여 분석 작업을 실행할 수 있다. 보다 상세하게는, 사용자는 분석 작업 편집기의 마크다운 셀에 텍스트를 입력하고, 분석 작업 편집기에 포함된 코드 셀에 출력된 코드를 확인한 후 분석 작업을 실행할 수 있다. 여기에서, 코드 셀은 Lua 스크립트 내에서 각종 데이터를 지원하기 위한 기능 및 데이터를 이용하여 차트나 데이터 모델을 생성할 수 있는 기능들을 함수로 제공할 수 있다. 이에 따라, 본 발명의 데이터 관리 방법은 분석 엔진을 실행하여 사용자가 입력한 셀에 대한 분석 작업을 실행할 수 있다. In step S17130, the data management method of the present invention can execute an analysis task through an analysis engine. More specifically, the user can enter text into the markdown cell of the analysis task editor, check the code output in the code cell included in the analysis task editor, and then execute the analysis task. Here, the code cell can provide functions to support various data within the Lua script and functions to create charts or data models using data as functions. Accordingly, the data management method of the present invention can execute an analysis task on cells input by the user by executing an analysis engine.

일 실시 예에서, 데이터 관리 방법은 미리 등록한 제 1 분석 작업을 실행하는 명령에 기초하여 분석 엔진을 실행할 수 있다. 이때, 제 1 분석 작업을 등록한 제 1 사용자와 제 1 분석 작업을 실행하는 제 2 사용자는 동일하거나 상이할 수 있다. In one embodiment, the data management method may execute an analysis engine based on a command to execute a pre-registered first analysis task. At this time, the first user who registered the first analysis task and the second user who executes the first analysis task may be the same or different.

도 90은 본 발명의 데이터 관리 방법이 분석 작업의 스케쥴을 설정하는 실시 예를 설명하는 도면이다. Figure 90 is a diagram illustrating an embodiment of the data management method of the present invention setting a schedule for analysis work.

단계(S17140)에서, 본 발명의 데이터 관리 방법은 분석 작업 관리부를 통해 작성된 Lua 스크립트를 등록할 수 있다. 보다 상세하게는, 사용자는 분석 작업 편집기에 포함된 마크다운 셀 및 코드 셀을 통해 제 1 분석 작업에 대응하는 Lua 스크립트를 등록할 수 있다. In step S17140, the data management method of the present invention can register the Lua script written through the analysis task management unit. More specifically, the user can register a Lua script corresponding to the first analysis task through the markdown cell and code cell included in the analysis task editor.

단계(S17150)에서, 본 발명의 데이터 관리 방법은 등록된 분석 작업의 스케쥴을 설정할 수 있다. 상술한 실시 예를 참고하여 설명하면, 사용자는 제 1 분석 작업을 주기적으로 실행하기 위한 정보를 함께 입력할 수 있다. 예를 들어, 사용자는 제 1 분석 작업을 주기적으로 실행하기 위하여 실행 주기(예를 들어, 1분), 초기 시간(예를 들어, 5분) 및 지연 처리 시간(예를 들어, 1분)을 설정할 수 있다. In step S17150, the data management method of the present invention may set a schedule for the registered analysis task. Referring to the above-described embodiment, the user may also input the information needed to run the first analysis task periodically. For example, to run the first analysis task periodically, the user may set an execution cycle (e.g., 1 minute), an initial time (e.g., 5 minutes), and a delayed processing time (e.g., 1 minute).

단계(S17160)에서, 본 발명의 데이터 관리 방법은 분석 엔진을 통하여 분석 작업을 실행할 수 있다. 보다 상세하게는, 사용자가 제 1 분석 작업에 대한 실행 주기, 초기 시간 및 지연 처리 시간을 설정한 경우, 데이터 관리 방법은 제 1 분석 작업에 대한 스케쥴에 기초하여 분석 작업을 실행할 수 있다. In step S17160, the data management method of the present invention can execute an analysis task through an analysis engine. More specifically, when the user sets the execution cycle, initial time, and delayed processing time for the first analysis task, the data management method may execute the analysis task based on the schedule for the first analysis task.
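How the three schedule settings might combine is sketched below. The interpretation is an assumption, since the text does not define the terms: here the initial time is read as the delay before the first run, and the delayed processing time as a lag so that each run only covers data up to `run_time - processing_delay`. The `Schedule` class and `planned_runs` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Schedule:
    period: timedelta            # execution cycle, e.g. 1 minute
    initial_delay: timedelta     # assumed meaning of "initial time"
    processing_delay: timedelta  # assumed meaning of "delayed processing time"


def planned_runs(schedule: Schedule, registered_at: datetime, count: int) -> list:
    """Return (run_time, data_cutoff) pairs for the first `count` executions."""
    runs = []
    for i in range(count):
        run_time = registered_at + schedule.initial_delay + i * schedule.period
        # Assumption: each run analyzes data received up to this cutoff.
        runs.append((run_time, run_time - schedule.processing_delay))
    return runs


schedule = Schedule(
    period=timedelta(minutes=1),
    initial_delay=timedelta(minutes=5),
    processing_delay=timedelta(minutes=1),
)
runs = planned_runs(schedule, datetime(2024, 1, 1, 0, 0), count=3)
```

Computing the plan as pure data keeps the sketch testable without sleeping; a real scheduler would hand these times to a timer or job queue.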

이후, 데이터 관리 방법은 제 1 분석 작업에 대한 결과를 사용자가 주기적으로 모니터링할 수 있도록 제공할 수 있다. 예를 들어, 데이터 관리 방법은 제 1 분석 작업에 대한 결과를 웹 브라우저의 대쉬보드 또는 사용자가 미리 입력한 이메일 등을 통하여 제공할 수 있다. Thereafter, the data management method may provide the results of the first analysis task so that the user can periodically monitor them. For example, the data management method may provide the results of the first analysis task through a dashboard of a web browser or an email entered in advance by the user.

도 91은 본 발명의 데이터 관리 방법이 분석 작업을 실행하는 실시 예를 설명하는 도면이다. Figure 91 is a diagram illustrating an example in which the data management method of the present invention performs an analysis task.

단계(S17170)에서, 사용자는 분석 작업 편집기 내에서 코드 셀을 실행할 수 있다. At step S17170, the user can execute the code cell within the analysis task editor.

이에 따라, 단계(S17180)에서, 본 발명의 데이터 관리 방법은 코드 셀에 포함된 Lua 스크립트를 분석 엔진으로 전송할 수 있다. Accordingly, in step S17180, the data management method of the present invention can transmit the Lua script included in the code cell to the analysis engine.

단계(S17190)에서, 본 발명의 데이터 관리 방법은 분석 엔진 내에서 Lua 스크립트를 해석하고 실행한 뒤 결과를 반환할 수 있다. In step S17190, the data management method of the present invention can interpret and execute the Lua script within the analysis engine and return the result.

도 92는 본 발명의 데이터 관리 플랫폼에서 데이터를 변조하는 실시 예를 설명하는 도면이다. Figure 92 is a diagram explaining an embodiment of modulating data in the data management platform of the present invention.

본 도면은 분석 작업 편집기 내에서 사용 가능한 함수들을 설명하는 도면이다. This figure describes the functions available within the analysis task editor.

상술한 바와 같이 분석 작업 편집기는 마크다운 셀과 코드 셀을 포함하며, 코드 셀은 분석을 위한 Lua 스크립트를 포함할 수 있다. 이때, 코드 셀은 Lua 스크립트 내에서 각종 데이터를 지원하기 위한 기능 및 데이터를 이용하여 차트나 데이터 모델을 생성할 수 있는 기능들을 함수로 제공할 수 있다. As described above, the analysis task editor includes a markdown cell and a code cell, and the code cell may include a Lua script for analysis. At this time, the code cell can provide functions to support various data within the Lua script and functions to create charts or data models using the data as functions.

이때, 사용 가능한 함수들은 다음과 같다. At this time, the available functions are as follows.

- search: 로그 데이터를 조회 또는 입력하기 위한 함수 (Function to search or input log data)

- analytic: 통계 데이터를 조회 또는 입력하기 위한 함수 (Function for viewing or entering statistical data)

- jdbc: 데이터 베이스에 접근하여 쿼리를 수행하기 위한 함수 (Function to access the database and perform queries)

- weka: 머신러닝 데이터 모델을 저장하거나 불러오기 위한 함수 (Function for saving or loading machine learning data models)

- visualize: 차트를 생성해주는 함수 (Function that creates a chart)

이를 통해 분석 작업 편집기 내에서 코드 셀을 실행하는 경우, Lua 스크립트는 분석 모듈로 전송되고, 분석 모듈 내에서 Lua 스크립트를 해석하여 실행할 수 있고, 실행 결과가 분석 작업 편집기 상으로 반환될 수 있다. 즉, 분석 작업 편집기 내에서 실행 결과를 바로 출력하여 확인할 수 있다. Through this, when executing a code cell within the analysis task editor, the Lua script is transmitted to the analysis module, the Lua script can be interpreted and executed within the analysis module, and the execution result can be returned to the analysis task editor. In other words, you can print and check the execution results directly within the analysis task editor.

FIG. 93 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.

The analysis task editor 2056 of the present invention may include a markdown cell 2057 and a code cell 2058. The analysis task editor 2056 may correspond to a unit that stores a plurality of script blocks and text blocks.

Here, the markdown cell 2057 corresponds to a description block, and the code cell 2058 corresponds to a script block.

In one embodiment, the user may enter text into at least one of the markdown cell 2057 and the code cell 2058. For example, the user may write a description of the code cell 2058 in the markdown cell 2057, and may enter the code to be actually executed into the code cell 2058. The analysis task editor 2056 can then execute the code contained in the code cell 2058.

In one embodiment, when the user presses the "+" button 2059 provided by the analysis task editor 2056, the analysis task editor 2056 may additionally display an add-markdown-cell button and an add-code-cell button 2060a, 2060b. In one embodiment, the analysis task editor 2056 may display the "+" button 2059 both above and below the markdown cell 2057, so that the user can decide where the markdown cell 2057 or code cell 2058 to be added is placed.

Accordingly, the user can add a plurality of markdown cells 2057 and can likewise request a plurality of code cells 2058 from the analysis task editor 2056.
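As a rough illustration of the cell structure described above, the editor can be modeled as an ordered list of cells into which a new cell is inserted above or below an existing one. The `Cell` and `EditorModel` names and the `add_cell` signature are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Literal

CellKind = Literal["markdown", "code"]


@dataclass
class Cell:
    kind: CellKind   # "markdown" = description block, "code" = script block
    text: str = ""


@dataclass
class EditorModel:
    """Hypothetical in-memory model of the analysis task editor."""

    cells: List[Cell] = field(default_factory=list)

    def add_cell(self, kind: CellKind, index: int, where: str = "below") -> Cell:
        # Insert the new cell directly above or below the cell at `index`.
        position = index if where == "above" else index + 1
        cell = Cell(kind)
        self.cells.insert(position, cell)
        return cell


editor = EditorModel([Cell("markdown", "describe the query"), Cell("code", "-- script")])
editor.add_cell("code", index=0, where="above")      # "+" above the first cell
editor.add_cell("markdown", index=2, where="below")  # "+" below the last cell
kinds = [c.kind for c in editor.cells]
print(kinds)
```

Keeping the cells in one ordered list means a description block and its script block stay adjacent simply by position, without any explicit link between them.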

FIG. 94 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.

In one embodiment, when the user requests execution of a displayed code cell 2058, the analysis task editor may execute the analysis task contained in the code cell 2058 through an analysis task execution unit.

For example, the user may enter "Query session statistics data and visualize it as a line chart" in the first markdown cell 2057 and enter the Lua script corresponding to this description block in the first code cell 2058.

Then, when the user requests execution of the displayed first code cell 2058, the analysis task editor may render the first line chart 2061 as shown in the drawing.

As another example, the user may enter "Query the resource utilization table in an external database and visualize it as a line chart" in the second markdown cell and enter the Lua script corresponding to this description block in the second code cell.

Then, when the user requests execution of the displayed second code cell, the analysis task editor may render the second line chart as shown in the drawing.

FIG. 95 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.

Referring to (a) of FIG. 95, the user may then save a first analysis task that includes the first markdown cell, the first code cell, the second markdown cell, and the second code cell. The user may give the first analysis task a name to distinguish it from other tasks.

Referring to (b) of FIG. 95, the saved first analysis task may be loaded by the same user or by a different user. When the user runs the loaded first analysis task in the analysis task editor, the first analysis task may be executed. To this end, the data management platform of the present invention may provide a list 2062 of at least one previously saved analysis task.
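Saving a named analysis task and loading it again from a list can be sketched as follows. The storage format is not specified in the text; JSON, the `AnalysisTaskStore` class, and its method names are assumptions made only for illustration.

```python
import json
from typing import Dict, List


class AnalysisTaskStore:
    """Hypothetical store that keeps saved analysis tasks by name."""

    def __init__(self) -> None:
        self._saved: Dict[str, str] = {}

    def save(self, name: str, cells: List[dict]) -> None:
        # Serialize the ordered cells; JSON is an assumption of this sketch.
        self._saved[name] = json.dumps(cells, ensure_ascii=False)

    def list_names(self) -> List[str]:
        # Corresponds loosely to the list of previously saved analysis tasks.
        return sorted(self._saved)

    def load(self, name: str) -> List[dict]:
        return json.loads(self._saved[name])


store = AnalysisTaskStore()
task_cells = [
    {"kind": "markdown", "text": "Query session statistics and draw a line chart"},
    {"kind": "code", "text": "-- Lua script for the first chart"},
]
store.save("session-overview", task_cells)
names = store.list_names()
loaded = store.load("session-overview")
print(names, len(loaded))
```

Because the task is stored under its name rather than under the user who created it, a different user can load and run the same task, as the paragraph above describes.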

FIG. 96 is a diagram illustrating the user interface provided by the monitoring module of the data management platform of the present invention.

In one embodiment, the data management platform of the present invention may provide the user with monitoring of analysis tasks through the monitoring module. To this end, the data management platform may provide an analysis task scheduler 2063. For example, to use the analysis task scheduler 2063, the user may select the create-task button to open a pop-up window for registering an analysis task. In this way, the user can manage the schedules of registered analysis tasks. For example, through the analysis task scheduler 2063, the user may set the execution period (for example, 1 minute), the initial time (for example, 5 minutes), and the delay processing time (for example, 1 minute) of the first analysis task. Then, when the user runs the analysis task, the first analysis task may be executed according to the set period.
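The text does not define how the three scheduler settings interact, so the sketch below rests on two labeled assumptions: the initial time is treated as a wait before the first run, and the delay processing time is treated as an offset that moves the end of the queried data window back so that late-arriving data is included. The function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def planned_runs(registered_at: datetime, period: timedelta,
                 initial_time: timedelta, count: int) -> List[datetime]:
    # Assumption: the first run happens `initial_time` after registration,
    # and later runs follow at a fixed `period`.
    first = registered_at + initial_time
    return [first + i * period for i in range(count)]


def query_window_end(run_time: datetime, delay_processing: timedelta) -> datetime:
    # Assumption: each run only looks at data older than the delay,
    # giving late log entries time to arrive.
    return run_time - delay_processing


registered = datetime(2023, 7, 4, 12, 0)
runs = planned_runs(registered, period=timedelta(minutes=1),
                    initial_time=timedelta(minutes=5), count=3)
window_end = query_window_end(runs[0], timedelta(minutes=1))
print([r.strftime("%H:%M") for r in runs], window_end.strftime("%H:%M"))
```

With the example values from the text (1 minute, 5 minutes, 1 minute) and a registration time of 12:00, this interpretation yields runs at 12:05, 12:06 and 12:07, with the first run reading data up to 12:04.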

FIG. 97 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.

In one embodiment, the monitoring module of the data management platform may output the monitoring results of an analysis task according to the schedule of that task.

For example, the first analysis task 2064 may be executed according to the task period, execution period, initial period, and delay processing period set as in the above-described embodiment, and the monitoring module may output the execution result of the first analysis task 2064.

In this way, the user can use scripts to analyze and manipulate data that is not available through the default UI. That is, the present invention integrates the monitoring and analysis of data sources so that more complex analysis tasks can be performed.

FIG. 98 is a diagram illustrating an embodiment in which the data management method of the present invention analyzes data.

In step S18110, the data management method of the present invention may receive an input of a first markdown cell. Here, the markdown cell is a cell for adding description data for an execution script, and may be provided to the user through the analysis task editor of the data management platform. For details, refer to the embodiments described above with reference to FIGS. 88, 89, 93, and 94.

In step S18120, the data management method of the present invention may receive an input of a first code cell corresponding to the first markdown cell. Here, the first code cell is a cell for adding Lua script information for analysis, and may be provided to the user through the analysis task editor of the data management platform. For details, refer to the embodiments described above with reference to FIGS. 88, 92, 93, and 94.

In step S18130, the data management method of the present invention may execute a first analysis task based on the first code cell. For details, refer to the embodiments described above with reference to FIGS. 88, 89, 91, and 94.

In one embodiment, the data management method of the present invention may generate a chart and a model corresponding to the first code cell by using the functions included in the first code cell together with the log data and statistical data of a database included in the data management platform. Here, the log data or statistical data in the database may correspond to data collected and analyzed through the embodiments of FIGS. 1 to 25.

10000: Data management platform

Claims

A data management method comprising:
receiving an input of a first markdown cell;
receiving an input of a first code cell for the first markdown cell; and
executing a first analysis task based on the first code cell,
wherein the data used in the first analysis task is extracted based on a parser including a regular expression generated for one matching block, the one matching block representing text information about text selected by user input, from among a plurality of matching blocks each representing different text information about text included in the data,
wherein the one matching block represents text information matching the pattern of the regular expression,
wherein the first markdown cell includes description data for an execution script, and
wherein the first code cell includes first Lua script information for analysis.

delete

The data management method according to claim 1, further comprising:
interpreting the first Lua script information within an analysis engine; and
returning a result for the first code cell.

The data management method according to claim 1, further comprising:
registering a second analysis task in an analysis task scheduler; and
periodically executing the second analysis task based on preset conditions.

A data management device comprising:
a database that stores data; and
a processor that processes the data,
wherein the processor:
receives an input of a first markdown cell,
receives an input of a first code cell for the first markdown cell, and
executes a first analysis task based on the first code cell,
wherein the data used in the first analysis task is extracted based on a parser including a regular expression corresponding to one matching block, the one matching block representing text information about text selected by user input, from among a plurality of matching blocks each representing different text information about text included in the data,
wherein the one matching block represents text information matching the pattern of the regular expression,
wherein the first markdown cell includes description data for an execution script, and
wherein the first code cell includes first Lua script information for analysis.

delete

The data management device according to claim 5,
wherein the processor:
interprets the first Lua script information within an analysis engine, and
returns a result for the first code cell.

The data management device according to claim 5,
wherein the processor:
registers a second analysis task in an analysis task scheduler, and
periodically executes the second analysis task based on preset conditions.

A computer-readable storage medium storing a data management program configured to:
receive an input of a first markdown cell,
receive an input of a first code cell for the first markdown cell, and
execute a first analysis task based on the first code cell,
wherein the data used in the first analysis task is extracted based on a parser including a regular expression corresponding to one matching block, the one matching block representing text information about text selected by user input, from among a plurality of matching blocks each representing different text information about text included in the data,
wherein the one matching block represents text information matching the pattern of the regular expression,
wherein the first markdown cell includes description data for an execution script, and
wherein the first code cell includes first Lua script information for analysis.