KR102345750B1

KR102345750B1 - Method and system for analysing data de-identification risk

Info

Publication number: KR102345750B1
Application number: KR1020200069782A
Authority: KR
Inventors: 양진홍
Original assignee: 주식회사 토브데이터
Priority date: 2020-06-09
Filing date: 2020-06-09
Publication date: 2022-01-03
Also published as: KR20210152825A

Abstract

A method and system for analysing data de-identification risk are disclosed. At the request of a system user, the method provides first-party (1st party) data to which compliance, including information protection laws and regulations, has been applied. It then de-identifies at least a portion of that compliance-applied data, based on the result of analysing the data leakage risk associated with de-identification, and exports it.

Description

Method and system for data de-identification risk analysis

The description below relates to a technology for exporting data in a data distribution environment.

Network transmission performance is increasing, and demand for data transmission is growing rapidly with advances in IT.

Recently, traffic demand has been rising for many kinds of information data and for ultra-low-latency real-time data, such as per-user health care information, traffic data, smart power management data, and environmental sensing data.

Data traffic from autonomous vehicle technology and virtual reality (VR)/augmented reality (AR) services is also increasing.

The flood of big data and the surge in personalized information services are creating distribution environments for many kinds of information data.

In this environment, as services shift from a B2B (business to business) focus toward B2C (business to consumer) and people-centered models, leaks of personal information caused by personalized services are increasing sharply.

Accordingly, an optimal balance is needed between stimulating the data ecosystem, such as the data-based fourth industry, and meeting individuals' demands for privacy protection.

A technique is provided that, when first-party (1st party) data is exported, de-identifies the data on the basis of the compliance applicable to it before export.

A technique is also provided that, when data is exported repeatedly through de-identification, analyzes the data de-identification risk for the system user who wants to export the data.

A computer system is provided that comprises at least one processor configured to execute computer-readable instructions contained in a memory, wherein the at least one processor handles: a process of providing, at the request of a system user, first-party (1st party) data to which compliance, including information protection laws and regulations, has been applied; and a process of performing data de-identification on at least a portion of the compliance-applied data and exporting it.

According to one aspect, the at least one processor may evaluate and provide the reliability (trustworthiness) of the system user based on the degree of personal information collection and information-security-related items, whether the rights of the personal information provider are guaranteed, and the personal information utilization policy and whether information is shared with third parties.

According to another aspect, the at least one processor may perform data de-identification on at least a portion of the data to be exported based on the reliability of the system user.

According to another aspect, the at least one processor may de-identify at least a portion of the compliance-applied data using any one of a masking method, a categorization method, a generalization method, and a method using homomorphic encryption.

According to another aspect, the at least one processor may analyze, for each data characteristic, the data leakage risk associated with de-identification, and then de-identify the compliance-applied data based on the result of that analysis.

According to another aspect, the at least one processor may analyze the data leakage risk associated with de-identification using an unsupervised learning algorithm.

According to another aspect, the at least one processor may derive a guideline for data identification based on the analysis result of the data leakage risk.

According to another aspect, the at least one processor may de-identify the compliance-applied data with a focus on data combinations whose data leakage risk is below a predetermined level.

According to another aspect, for data that the system user requests to export repeatedly, the at least one processor may analyze the data de-identification risk for that system user based on how the previously exported data was de-identified.

According to another aspect, for data that the system user requests to export repeatedly, the at least one processor may analyze the intersection of identifiable data across the data exported to that system user, and may de-identify the compliance-applied data in a manner that avoids data combinations whose data leakage risk is at or above a predetermined level.
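The intersection analysis for repeated exports can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as implemented in the invention: the field names, the fixed list `HIGH_RISK_COMBINATIONS` (standing in for "combinations whose leakage risk is at or above a predetermined level"), and the function name `fields_to_deidentify` are all hypothetical.

```python
# Hypothetical field combinations treated as exceeding the leakage-risk level.
HIGH_RISK_COMBINATIONS = [
    frozenset({"name", "birth_date"}),
    frozenset({"zip_code", "birth_date", "gender"}),
]


def fields_to_deidentify(previous_exports, requested_fields):
    """Return the requested fields that should be de-identified.

    previous_exports: iterable of sets of fields already exported to the same
    requester in identifiable form.
    requested_fields: fields asked for in the new export request.
    """
    requested = set(requested_fields)
    to_mask = set()
    for previous in previous_exports:
        previous = set(previous)
        # Shared identifiable fields act as join keys between the two exports.
        if not (previous & requested):
            continue
        # Joined together, the requester would hold the union of both exports.
        combined = previous | requested
        for combo in HIGH_RISK_COMBINATIONS:
            if combo <= combined:
                # Only the new export can still be changed, so mask there.
                to_mask |= combo & requested
    return to_mask
```

In this sketch an earlier export of `user_id` and `name` followed by a request for `user_id`, `birth_date`, and `city` would flag `birth_date`, because the shared `user_id` lets the requester join the two exports and recover the `name` plus `birth_date` combination.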

A method executed in a computer system is provided, the computer system comprising at least one processor configured to execute computer-readable instructions contained in a memory, the method comprising: providing, by the at least one processor and at the request of a system user, first-party data to which compliance, including information protection laws and regulations, has been applied; and performing, by the at least one processor, data de-identification on at least a portion of the compliance-applied data and exporting it.

According to embodiments of the present invention, an optimal balance can be provided between stimulating the data ecosystem, such as the data-based fourth industry, and the demand for personal privacy protection.

FIG. 1 is a block diagram illustrating an example of the internal configuration of a computer system according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating an example of a data compliance providing method that can be performed by a computer system according to an embodiment of the present invention.
FIGS. 3 to 12 illustrate examples of service screens for providing data compliance according to an embodiment of the present invention.
FIG. 13 illustrates an example of a process of generating a personal information utilization decision matrix according to an embodiment of the present invention.
FIG. 14 is a flowchart illustrating an example of a data de-identification process based on leakage risk according to an embodiment of the present invention.
FIG. 15 is a flowchart illustrating an example of a personal information management process according to an embodiment of the present invention.
FIG. 16 illustrates an example of a process of recording and managing personal data used by a service provider according to an embodiment of the present invention.
FIG. 17 illustrates an example of a real service configuration according to an embodiment of the present invention.
FIG. 18 is a flowchart illustrating an example of a data-privacy-related contract process according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Embodiments of the present invention relate to a method and system for data de-identification risk analysis.

Embodiments, including those specifically disclosed herein, can provide a data compliance management environment in which a decision-making matrix (PCM) for data export is generated by referring to the various laws and guides related to data management in the distribution of first-party data, including personal information, and in which that matrix is used for the final decision.

FIG. 1 is a block diagram illustrating an example of the internal configuration of a computer system according to an embodiment of the present invention. For example, a data compliance providing system according to embodiments of the present invention may be implemented through the computer system 100 of FIG. 1. As shown in FIG. 1, the computer system 100 may include, as components for executing the data compliance providing method, a processor 110, a memory 120, a persistent storage device 130, a bus 140, an input/output interface 150, and a network interface 160.

The processor 110, as a component for providing data compliance, may include or be part of any device capable of processing a sequence of instructions. The processor 110 may include, for example, a computer processor, a processor in a mobile device or other electronic device, and/or a digital processor. The processor 110 may be included in, for example, a server computing device, a server computer, a series of server computers, a server farm, a cloud computer, or a content platform. The processor 110 may be connected to the memory 120 through the bus 140.

The memory 120 may include volatile, persistent, virtual, or other memory for storing information used by or output by the computer system 100. The memory 120 may include, for example, random access memory (RAM) and/or dynamic RAM (DRAM). The memory 120 may be used to store arbitrary information, such as state information of the computer system 100. The memory 120 may also be used to store instructions of the computer system 100, including, for example, instructions for providing data compliance. The computer system 100 may include one or more processors 110 as needed or appropriate.

The bus 140 may include a communication infrastructure that enables interaction between the various components of the computer system 100. The bus 140 may carry data between components of the computer system 100, for example between the processor 110 and the memory 120. The bus 140 may include wireless and/or wired communication media between components of the computer system 100, and may include parallel, serial, or other topological arrangements.

The persistent storage device 130 may include components, such as memory or other persistent storage, that the computer system 100 uses to store data for an extended period of time (for example, longer than the memory 120). The persistent storage device 130 may include non-volatile main memory as used by the processor 110 in the computer system 100. The persistent storage device 130 may include, for example, flash memory, a hard disk, an optical disk, or another computer-readable medium.

The input/output interface 150 may include interfaces to a keyboard, a mouse, voice command input, a display, or other input or output devices. Configuration commands and/or input for providing data compliance may be received through the input/output interface 150.

The network interface 160 may include one or more interfaces to networks such as a local area network or the Internet. The network interface 160 may include interfaces for wired or wireless connections. Configuration commands and/or input for providing data compliance may be received through the network interface 160.

In other embodiments, the computer system 100 may include more components than those shown in FIG. 1. However, most conventional components need not be shown explicitly. For example, the computer system 100 may be implemented to include at least some of the input/output devices connected to the input/output interface 150 described above, or may further include other components such as a transceiver, a GPS (Global Positioning System) module, a camera, various sensors, and a database.

FIG. 2 is a flowchart illustrating an example of a data compliance providing method that can be performed by a computer system according to an embodiment of the present invention.

The processor 110 may load into the memory 120 the program code stored in a program file for the data compliance providing method. For example, the program file for the data compliance providing method may be stored in the persistent storage device 130 described with reference to FIG. 1, and the processor 110 may control the computer system 100 so that the program code is loaded into the memory 120 through the bus 140 from the program file stored in the persistent storage device 130. To execute the data compliance providing method, the processor 110 and its components may directly process operations according to control commands or may control the computer system 100.

Hereinafter, specific embodiments are described using personal information (or personal data) as an example. The embodiments are not limited to personal information, however, and can be extended to all forms of first-party data held by a company that are subject to distribution between companies or between a company and an individual.

The present embodiments provide a new personal information management platform suited to the personal information distribution paradigm. They do so through a personal information management framework that, when personal information is distributed in an Internet of Things environment (for example, smart home, smart office, or wearable healthcare), determines compliance, such as whether General Data Protection Regulation (GDPR) policy is satisfied, and is able to satisfy it.

Referring to FIG. 2, in step S210 the processor 110 may register country-specific information protection laws related to personal information as compliance items on the platform. The processor 110 may register each information protection law, such as the EU GDPR, the Korean Personal Information Protection Act, and the US HIPAA, in the form of a predefined template. By selecting the functions that suit the data collection and utilization environment, the system user can automatically check the compliance of the corresponding data. The processor 110 binds personal information collected in various service environments to the user profile and service contract information (including whether the user has consented), and then provides a function for checking compliance against the information protection law of each collection country.
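Registering laws as predefined templates in step S210 can be pictured with a small registry. The sketch below is an assumption for illustration only: the class `ComplianceTemplate`, the `REGISTRY` dictionary, the country-code subsets, and the check names are hypothetical placeholders and do not represent the actual legal requirements of any of these laws.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplianceTemplate:
    name: str            # label shown on the platform
    regions: frozenset   # country codes where the template applies
    checks: tuple        # placeholder check names, not legal requirements


REGISTRY = {}


def register(template):
    """Store a template under its name, replacing any earlier entry."""
    REGISTRY[template.name] = template


def templates_for(country):
    """Return the names of registered templates that apply to a country code."""
    return sorted(t.name for t in REGISTRY.values() if country in t.regions)


# Illustrative registrations; the region sets are deliberately incomplete.
register(ComplianceTemplate("GDPR", frozenset({"DE", "FR"}), ("consent", "retention_period")))
register(ComplianceTemplate("PIPA", frozenset({"KR"}), ("consent",)))
register(ComplianceTemplate("HIPAA", frozenset({"US"}), ("health_data_scope",)))
```

With templates keyed by collection country, a lookup such as `templates_for("KR")` selects which checks to run against a record collected in that country.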

In step S220, the processor 110 may provide a data set in which the compliance pre-registered on the platform has been applied to the data corresponding to the system user's request. The processor 110 may generate a query for the data request, identify the raw data corresponding to the query, and then provide location-based results according to the country in which the raw data was collected. The processor 110 may apply the pre-registered compliance to the raw data on the platform and provide the resulting compliance-applied data set.

In step S230, the processor 110 provides a function for exporting the compliance-applied data to the outside. The processor 110 provides a data characteristic check function on the platform, so that the data characteristics of each individual data field can be checked when data is exported. The processor 110 may apply data de-identification to individual data fields at export time. As one example, using a data masking method, the processor 110 may de-identify data by masking at least part of it with a symbol that prevents identification (for example, an asterisk). As another example, using a categorization method, the processor 110 may de-identify an exported data field through unit conversion. For instance, an actual age may be converted to an age band such as teens or twenties, or a full name may be reduced to the family name only, such as Kim or Lee, by dropping the given name. In addition to de-identification by categorization, generalization, and data masking, the processor 110 can also de-identify data using homomorphic encryption. The system user can thus obtain data in a compliant form.
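The masking and categorization examples in step S230 can be sketched per field as below. This is a minimal illustration, assuming romanized names written family-name-first with a space separator; the helper names (`mask`, `age_band`, `surname_only`, `deidentify`) and the `RULES` mapping are hypothetical, and homomorphic encryption is left out.

```python
def mask(value, visible=1, symbol="*"):
    """Keep the first `visible` characters and replace the rest with `symbol`."""
    return value[:visible] + symbol * max(len(value) - visible, 0)


def age_band(age):
    """Categorize an exact age into a ten-year band, e.g. 27 -> '20s'."""
    return f"{(age // 10) * 10}s"


def surname_only(full_name):
    """Keep only the family name (assumes 'Family Given' order)."""
    return full_name.split()[0]


# Hypothetical per-field rules; fields without a rule are exported unchanged.
RULES = {
    "name": surname_only,
    "age": age_band,
    "phone": lambda v: mask(v, visible=3),
}


def deidentify(record, rules=RULES):
    """Apply the matching rule to each field of one record."""
    return {k: rules[k](v) if k in rules else v for k, v in record.items()}
```

Keeping the rules in a per-field mapping mirrors the text's point that de-identification is applied to individual data fields at export time.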

When personal information is distributed between companies on the platform, the processor 110 may provide the reliability of each company in the form of a score. As one example, the processor 110 may evaluate reliability based on the ability, benevolence (relationship), integrity (consistency), and inclination of the party being evaluated. Ability covers the degree of personal information collection and information-security-related items; it is the characteristic concerning the evaluated party's personal-information-related technology and competence, and may include the scope of personal information collection and the security of personal information processing. Benevolence covers whether the rights of the personal information provider are guaranteed; it is the characteristic of the evaluated party's attitude toward working or acting together with the personal information provider, and may include whether and how information about the provider's rights, such as the personal information processing or protection policy, is communicated to the provider. Integrity covers the personal information utilization policy and whether information is shared with third parties; it is the characteristic concerning the evaluated party's adherence to personal-information-related principles, and may include how well the personal information collected or requested matches the purpose of collection and how consistently personal information is used. Inclination indicates the importance that the personal information provider assigns to the trust indicators when evaluating reliability. Inclination may be expressed as weights in the reliability calculation, and its values may differ by service domain or by evaluated party according to the provider's experience and the evaluated party's reputation. It is information accumulated from the evaluator's experience and from reputation, and includes personal views. Each item used in the reliability evaluation (ability, benevolence, integrity, inclination) may in practice be evaluated using privacy policy documents and data sets of mobile application access permission information.
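One way to read "inclination expressed as weights" is a weighted mean of the three indicators. The text does not give a formula, so the sketch below is an assumption: the function name `trust_score`, the 0 to 1 indicator scale, and the weighted-mean form are all illustrative choices.

```python
def trust_score(indicators, weights):
    """Weighted mean of indicator scores.

    indicators: mapping such as {"ability": 0.8, ...} with scores in [0, 1].
    weights: mapping with the same keys; plays the role of "inclination".
    """
    total_weight = sum(weights[name] for name in indicators)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(indicators[name] * weights[name] for name in indicators)
    return weighted / total_weight
```

Because the weights are passed in rather than fixed, the same indicator scores can yield different reliability values per service domain or per evaluator, as the paragraph above describes.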

The processor 110 may determine a de-identification guide for the data to be exported based on the reliability of the company requesting the export. For data requested by a particular company, the processor 110 may, after compliance matching, provide a column-by-column de-identification guide for that data when it is exported to the outside.
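A column-by-column guide driven by the requester's reliability could look like the sketch below. The column classification (`SENSITIVITY`), the 0.8 cutoff, and the three recommendation labels are hypothetical values chosen for illustration; the text does not specify them.

```python
# Hypothetical classification of columns by how identifying they are.
SENSITIVITY = {
    "name": "direct",
    "phone": "direct",
    "age": "quasi",
    "city": "quasi",
    "purchase_total": "none",
}


def column_guide(reliability, columns, sensitivity=SENSITIVITY, cutoff=0.8):
    """Recommend a de-identification action for each requested column."""
    guide = {}
    for column in columns:
        # Unknown columns are conservatively treated as quasi-identifiers.
        kind = sensitivity.get(column, "quasi")
        if kind == "direct":
            guide[column] = "mask"
        elif kind == "quasi" and reliability < cutoff:
            guide[column] = "generalize"
        else:
            guide[column] = "export as-is"
    return guide
```

In this sketch direct identifiers are always masked, while quasi-identifiers are generalized only for requesters whose reliability falls below the cutoff.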

FIGS. 3 to 12 illustrate examples of service screens for providing data compliance according to an embodiment of the present invention.

The data compliance providing system provides a web-based user interface.

Service users fall broadly into two groups: data managers, who have the authority to manage internal data directly, and data requesters, who belong to an external or internal organization and want to use the data.

The data compliance providing system may support a web-based ID/password service login.

Referring to FIG. 3, the main functions of the dashboard are as follows.

The dashboard page 300 may include a 'data management' menu 301 and a 'compliance evaluation' menu 302.

The 'data management' menu 301 includes functions for managing internal or external data requests and for exporting data with the approval of the data manager.

The 'compliance evaluation' menu 302 includes a function for managing information on the compliances provided on the platform. Here, compliance means a set of functions that assess conformity against a specific data export standard. Applicable compliances include the EU GDPR, the Korean Personal Information Protection Act, and the US HIPAA, and may also include ISO 27018, ISO 27799, and the like. Privacy-related codes or standards of individual countries or institutions can be registered as additional compliances.

The 'compliance evaluation' menu 302 also includes a function that provides guidelines for safe data export by evaluating the reliability of the company concerned when data is exported.

The logged-in user, the user's authority, and the accessible time are displayed at the top of the dashboard page 300.

On the dashboard page 300, the new data request list 310 can be checked through the 'data export request' menu of the 'data management' menu 301, and the data request processing list 320 can be checked through the 'data export status' menu.

The data manager may check the new data request list 310 on the dashboard page 300 and use the 'select' menu 311 to choose and process a request.

For a new data request, the data requester logs in on the login page with a previously issued user account and proceeds with the new data request process.

FIG. 4 shows an example of an information input screen 400 for a new data request.

The information input screen 400 contains the input fields needed for a data request and may include an interface for entering a title 401, a requester 402, an exporting company 403, data request content 404, a data utilization purpose 405, a data utilization period 406, and a country or region 407. The data request content 404 is a field for a detailed description of the data to be used, and the data utilization purpose 405 is a field for a specific purpose such as research or commercial analysis. The data utilization period 406 is a field for the period during which the data will be used, entered in order to comply with regulations such as the GDPR. The country or region 407 is a mandatory input field, because it is needed to apply the data protection law appropriate to that country or region when data is exported.
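The request form can be modelled as a simple record with validation of the mandatory region field. This is a sketch under assumptions: the class name `DataRequest`, the attribute names, and the split of the utilization period into start and end dates are illustrative and not taken from the screens themselves.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataRequest:
    title: str              # 401
    requester: str          # 402
    exporting_company: str  # 403
    data_description: str   # 404
    purpose: str            # 405
    usage_start: date       # 406, start of utilization period
    usage_end: date         # 406, end of utilization period
    region: str             # 407, mandatory

    def validate(self):
        """Return a list of human-readable problems; empty if the request is valid."""
        errors = []
        if not self.region.strip():
            errors.append("region is required")
        if self.usage_end < self.usage_start:
            errors.append("usage period ends before it starts")
        return errors
```

Returning a list of problems rather than raising on the first one lets a form show every issue to the requester at once.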

데이터 관리자의 신규 데이터 요청 처리 프로세스를 설명하면 다음과 같다.The data manager's new data request processing process will be described as follows.

도 5는 데이터 관리자가 '선택' 메뉴(311)를 이용하여 선택한 요청 건의 처리 화면(500)을 나타내고 있다.5 illustrates a request processing screen 500 selected by the data manager using the 'selection' menu 311 .

프로세서(110)는 정보 입력 화면(400)을 통해 데이터 요청 사항에 기입된 요청 내용을 바탕으로 데이터 추출을 위한 쿼리(501)를 생성하여 데이터 요청 처리 화면(500) 상에 표시한다. 데이터 추출을 위한 검색어를 쿼리 형태가 아닌 워크벤치(workbench) 형태의 UI로 제공하는 것 또한 가능하다.The processor 110 generates a query 501 for data extraction based on the request written in the data request through the information input screen 400 and displays it on the data request processing screen 500 . It is also possible to provide a search term for data extraction in the form of a workbench rather than a query form.

The data request processing screen 500 may include a 'Query' menu 502 for executing the query 501, an 'Apply Compliance' menu 503 for applying a compliance to the raw data retrieved by the query 501, and the like.

When the data manager selects the 'Query' menu 502, the processor 110 may present the data search result 601 corresponding to the query 501 on a map 610, as shown in FIG. 6.

The processor 110 may present the data search result 601 on the map 610 as a per-country (or per-region) data distribution, based on the country (or region) in which each piece of data was collected. The search result 601 is presented on the map 610 because international law and privacy-related regulations are regional in nature, so the data is filtered by region first. Showing the overall data volume per country (or region) also lets the data manager quickly gauge how usable the data is.
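The per-country distribution described above amounts to grouping the search results by collection country. The following is only a minimal sketch under stated assumptions: the record layout and the `country` field name are hypothetical and are not taken from the patent.

```python
from collections import Counter

def distribution_by_country(records):
    """Count search-result records per collection country (or region)."""
    return Counter(r["country"] for r in records)

# Hypothetical query results; the "country" field name is an assumption.
results = [
    {"id": 1, "country": "KR"},
    {"id": 2, "country": "DE"},
    {"id": 3, "country": "KR"},
]
volumes = distribution_by_country(results)
for country, count in volumes.most_common():
    print(country, count)
```

A map view would then only need to shade each country according to its count.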

As a search result screen, the processor 110 may provide a raw data confirmation screen 720 for checking the raw data included in the search result corresponding to the query 501, as shown in FIG. 7.

The raw data confirmation screen 720 may show, field by field, the data retrieved by the query from the raw data. In other words, it lets the data manager directly inspect the raw data from which data will be extracted.

The processor 110 provides a function for applying a compliance to the raw data returned by the basic search.

After checking the basic attributes of the data retrieved through the query 501, the data manager may apply a compliance to the retrieved data in order to actually export it.

When the data manager selects the 'Apply Compliance' menu 503, the processor 110 may provide a list 801 of pre-registered compliances, as shown in FIG. 8. The data manager may create a compliance that conforms to the characteristics of the data to be collected, such as the per-product user terms and the laws of each country in which data is collected. A compliance created in advance can be applied directly to the raw data.

The processor 110 may apply the compliance that the data manager selects from the compliance list 801 to the raw data returned by the basic search.

As shown in FIG. 9, the processor 110 may provide a function for checking the compliance application result 901. As with the basic search results (FIGS. 6 and 7), the compliance application result 901 may be shown on a map or as a list of raw data.

After checking the compliance application result 901, the data manager may proceed with the actual export.

When the data manager selects the 'Apply Compliance' menu 503, an 'Export' menu 1002 may be displayed on the data request processing screen 500 together with the compliance application result 1001, as shown in FIG. 10. If the currently displayed data has been exported before, the corresponding export history 1003 may also be displayed on the data request processing screen 500.

In addition to the export history 1003, a function that presents the characteristics of each field visually as a graph is included.

The processor 110 may provide the result of applying de-identification to each data field included in the compliance application result that is to be exported.

The data manager can apply de-identification to the exported data at the field level and can choose the de-identification type for each field. For example, de-identification may be applied to the 'gender' field while the graph at the top shows how the data is distributed by gender. For the 'age' field, the options include full de-identification that makes the numeric value unreadable, a random increase or decrease by a fixed unit, and conversion into age bands.
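The three options named for the 'age' field can be illustrated as three small functions. This is a sketch only: the function names, the perturbation step of 5, and the band width of 10 are assumptions chosen for illustration and are not specified in the patent.

```python
import random

def mask_age(age):
    """Full de-identification: the numeric value becomes unreadable."""
    return "*" * len(str(age))

def perturb_age(age, step=5, rng=random):
    """Randomly increase or decrease the value by up to `step` (never below 0)."""
    return max(0, age + rng.randint(-step, step))

def to_age_band(age, width=10):
    """Convert the value into an age band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

ages = [23, 37, 41]
print([mask_age(a) for a in ages])      # every digit hidden
print([to_age_band(a) for a in ages])   # values replaced by bands
print([perturb_age(a, rng=random.Random(0)) for a in ages])  # values shifted
```

Masking removes all analytic value, whereas banding keeps the distribution usable at a coarser resolution. Which option suits a field depends on the stated purpose of the export.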

A predefined de-identification function may be applied according to the characteristics of the data field, and the data manager may also enter de-identified data directly.

The data request processing screen 500 includes a function that lets the data manager make a final check of the export information before the data is exported. This check covers map-based data distribution information, requester information, the request date and time, the person responsible for the export, the export processing date, the applied compliance, and key issuance information (the key issued so that the data requester can prove authorization when accessing the data to be exported). The data manager may write and enter an export confirmation statement when issuing the key, and the export may proceed after the key issuance procedure and the export confirmation procedure based on the user's direct input.

A data export carried out through the above process may be added to and displayed in the data request processing history 320 on the dashboard page 300. The processor 110 may highlight the newly added entry in the data request processing history 320 so that it is distinguished from the other entries.

On the dashboard page 300, the 'Compliance Evaluation' menu 302 manages information on the compliances provided on the platform and may provide the functions through which the data compliance solution offers compliances.

FIG. 11 shows the compliance management screen 1100 reached through the 'Compliance Evaluation' menu 302.

Referring to FIG. 11, the data manager can view the compliance list 1102 through the 'Compliance Control' menu 1101 on the compliance management screen 1100 and thereby manage previously created compliances. An 'Add Compliance' menu 1103 may also be included so that the data manager can create a compliance directly.

A pre-registered compliance is a set of laws or regulations to be applied to first-party data. Each compliance includes one or more clauses, and whether each clause is satisfied can be aggregated into an overall score. For each compliance in the pre-registered compliance list 1102, the data items evaluated (confirmed) by the data manager may be shown alongside it.
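The patent does not state how the per-clause results are turned into a score. One simple possibility, assumed here purely for illustration, is the percentage of clauses marked as satisfied. The clause labels below are hypothetical.

```python
def compliance_score(clauses):
    """Return the percentage of clauses marked as satisfied."""
    if not clauses:
        raise ValueError("a compliance must contain at least one clause")
    satisfied = sum(1 for ok in clauses.values() if ok)
    return 100.0 * satisfied / len(clauses)

# Hypothetical evaluation of one compliance; clause labels are illustrative.
evaluation = {
    "clause-1": True,
    "clause-2": False,
    "clause-3": True,
    "clause-4": True,
}
score = compliance_score(evaluation)
print(f"{score:.0f}%")  # prints "75%"
```

A weighted variant, in which mandatory clauses count more than optional ones, would fit the same interface.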

The data manager may add a new compliance using the 'Add Compliance' menu 1103. In that case, an interface for entering the compliance title, an interface for selecting and applying a desired compliance from a predefined list of per-region compliances, and the like may be provided.

As management functions for the pre-registered compliance list 1102, each compliance in the list may be provided with a 'Review Task' menu 1201 for reviewing the compliance items, a 'Rename Group' menu 1202 for changing the title of the compliance, a 'Delete Group' menu 1203 for deleting the compliance, and the like, as shown in FIG. 12.

The 'Review Task' menu 1201 is used to evaluate the individual items of a compliance; the manager who added the compliance can evaluate each of its rules.

The service page reached through the 'Review Task' menu 1201 may provide the terms of use that were in effect when the data was collected. The service or device that collects personal data presents its contract information with the user, and the data manager can judge whether the data collected by that service or device may be used, based on the terms in effect at the time of collection.

Through the service page reached from the 'Review Task' menu 1201, the data manager may request the terms of use related to a compliance and check the terms of use of the target service or device to which the compliance will be applied. Providing the terms of use in effect at the time of collection for the personal data held by the data owner makes it easier to judge compliance.

In addition, a function for checking the full original text of the relevant clauses may be provided when a compliance is applied. The data manager can read the original text of each clause to be applied in the compliance and evaluate it.

The service page reached through the 'Compliance Evaluation' menu 302 may include a 'Custom Compliance' function, which manages information on the compliance items that the compliance management system maintains itself in order to apply and manage the overall PCM. Through the 'Custom Compliance' function, the data manager can define rules within the data compliance solution. In addition, a function for registering per-region data laws and clauses in advance for later use, a function for managing user permissions, a function for managing the list of operators that wish to export data, and the like may be included.

FIG. 13 illustrates an example of a process for generating a personal information utilization decision matrix according to an embodiment of the present invention.

Referring to FIG. 13, the processor 110 may generate a matrix (PCM) for decision-making on the use of personal information, based on various compliances related to personal information management, including a personal information security analysis 1301, and may provide the overall functionality for using the matrix in a final decision 1302.

The processor 110 may construct the matrix by extracting data for personal information utilization decisions from the personal information security analysis 1301 and from the raw data of each region, and may use the matrix to update the final decision 1302.

Because the relevant laws and guidance are continuously updated, the processor 110 analyzes the compliance 1301 using machine learning. As the final decision 1302, this supports the data manager's decision on whether to use personal information. It also makes it possible to verify the suitability of a compliance (the personal information protection guideline settings) by evaluating its individual items, and to calculate a personal data risk level and provide risk management that differs according to that level.

The present embodiments may include an architecture for analyzing the leakage risk of personal information.

FIG. 14 is a flowchart illustrating an example of a data de-identification process based on leakage risk according to an embodiment of the present invention.

Referring to FIG. 14, in step S1401 the processor 110 may analyze, for the data to be exported, the leakage risk of the de-identified data for each feature included in that data.

The processor 110 may analyze the risk of data de-identification using a leakage risk analysis model for personal data. For example, the processor 110 may analyze the leakage risk of each data feature using an unsupervised learning algorithm.

The processor 110 may de-identify personal data using a de-identification method based on categorization, generalization, data masking, or the like, or a de-identification method based on homomorphic encryption.

The leakage risk of a single piece of data among the individual data items included in personal data may be defined as in Equation 1.

[Equation 1]

Here, the quantity defined by Equation 1 is the PII leakage risk of the de-identified data for feature j of user i. I denotes the set of users, M_{i,j} denotes the set of data in which feature j of user i is recorded, and the equation also uses a de-identification function.

In other words, Equation 1 expresses the probability that the given de-identified data is information of user i.
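The expression of Equation 1 itself is not available in this text, so the following is not the patent's formula. It is a sketch of one way to estimate "the probability that a given de-identified value is information of user i": among all recorded values that de-identify to the given value, take the share belonging to user i. The function names, the dictionary layout standing in for the data sets M_{i,j}, and the age-band de-identification function are all assumptions.

```python
def leakage_risk(value, user, recorded, deidentify):
    """Estimate P(value belongs to `user`) among all users' de-identified records.

    recorded: dict mapping user -> list of raw values of one feature
              (standing in for the recorded data sets M_{i,j}).
    """
    def matches(u):
        return sum(1 for raw in recorded[u] if deidentify(raw) == value)

    total = sum(matches(u) for u in recorded)
    return matches(user) / total if total else 0.0

def age_band(age):
    """Example de-identification function: map an age to a 10-year band."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

ages = {"u1": [31], "u2": [35], "u3": [38], "u4": [62]}
print(leakage_risk("30-39", "u1", ages, age_band))  # three users share the band
print(leakage_risk("60-69", "u4", ages, age_band))  # only one user in the band
```

Under this reading, a value shared by many users carries a low risk, while a value that only one user produces identifies that user with certainty.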

Based on the leakage risk for single data, the leakage risk for multiple data comprising two or more features is defined as in Equation 2.

[Equation 2]

Equation 2 denotes the PII leakage risk of the data de-identified with respect to feature j₁ and feature j₂.
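Equation 2 is likewise not reproduced here. As an illustration only, the same share-based estimate can be extended to several features: a record matches when all of its de-identified feature values match, and the risk is the share of matching records that belong to the user. All names and the record layout are assumptions.

```python
def joint_leakage_risk(values, user, records, deidentifiers):
    """Estimate P(record belongs to `user`) given de-identified values of several features.

    values:        dict feature -> de-identified value being assessed
    records:       list of (user, {feature: raw value}) pairs
    deidentifiers: dict feature -> de-identification function
    """
    def matches(row):
        return all(deidentifiers[f](row[f]) == v for f, v in values.items())

    hits = [u for u, row in records if matches(row)]
    return hits.count(user) / len(hits) if hits else 0.0

def age_band(age):
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

def keep(value):
    return value  # identity: the feature is exported unchanged

records = [
    ("u1", {"age": 31, "gender": "F"}),
    ("u2", {"age": 35, "gender": "M"}),
    ("u3", {"age": 38, "gender": "M"}),
]
funcs = {"age": age_band, "gender": keep}
print(joint_leakage_risk({"age": "30-39"}, "u1", records, funcs))
print(joint_leakage_risk({"age": "30-39", "gender": "F"}, "u1", records, funcs))
```

In this toy data the age band alone leaves three candidates, but adding gender singles out one user. This shows why combinations of features have to be assessed, and not only each feature on its own.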

In step S1402, the processor 110 may perform data de-identification based on the result of the leakage risk analysis.

The processor 110 may use unsupervised learning algorithms to find data combinations with a high leakage risk and take them into account in data de-identification. By applying several unsupervised learning algorithms to every data combination and forming clusters, data combinations whose leakage risk is at or above a certain level can be found. Performance is evaluated by comparing the records generated by the same user with the records in each cluster, and the results are then compared to derive, for each feature, the PII combinations with a high leakage risk. By comparing these with the PII combinations produced by feature selection algorithms, the feature selection algorithm best suited to the dataset can be determined.
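The search over data combinations can be sketched as follows. Note that this sketch deliberately swaps the unsupervised clustering described above for a much simpler stand-in: records are grouped by exact value combination, and a combination counts as risky when a large share of records falls into groups that contain a single user. The function names, the threshold, and the sample data are assumptions.

```python
from collections import defaultdict
from itertools import combinations

def combination_risk(records, features):
    """Fraction of records whose value combination maps to a single user."""
    users_by_key = defaultdict(set)
    for user, row in records:
        users_by_key[tuple(row[f] for f in features)].add(user)
    exposed = sum(
        1 for user, row in records
        if len(users_by_key[tuple(row[f] for f in features)]) == 1
    )
    return exposed / len(records)

def high_risk_combinations(records, feature_names, threshold):
    """Return every feature combination whose risk is at or above `threshold`."""
    found = {}
    for size in range(1, len(feature_names) + 1):
        for combo in combinations(feature_names, size):
            risk = combination_risk(records, combo)
            if risk >= threshold:
                found[combo] = risk
    return found

records = [
    ("u1", {"age_band": "30-39", "gender": "F", "region": "KR"}),
    ("u2", {"age_band": "30-39", "gender": "M", "region": "KR"}),
    ("u3", {"age_band": "40-49", "gender": "M", "region": "KR"}),
    ("u4", {"age_band": "40-49", "gender": "F", "region": "KR"}),
]
risky = high_risk_combinations(records, ["age_band", "gender", "region"], threshold=0.5)
print(risky)
```

The exhaustive enumeration grows exponentially with the number of features, so a real system would need to prune the search or, as the text suggests, rely on clustering and feature selection.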

The processor 110 may derive guidelines for data de-identification from the data leakage risk and, following these guidelines, may perform data de-identification centered on data combinations whose leakage risk is below a certain level.

Furthermore, when the same system user requests an export of data again, the processor 110 may analyze the de-identification risk for that user based on how the previously exported data was de-identified. The processor 110 may analyze the identifiable intersection among the data sets exported repeatedly to the same system user and perform de-identification so as to avoid data combinations whose leakage risk is at or above a certain level. For example, the processor 110 may de-identify the data of a repeated export request using the same data combination as the previously de-identified data. As another example, the processor 110 may find a data combination whose leakage risk is below a certain level, taking into account both the previously de-identified data and the additionally de-identified data, and de-identify accordingly.
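The repeated-export check can be illustrated with a small sketch. It assumes, conservatively, that exports to the same requester can be linked, so the features the requester can combine are the union of everything exported so far plus the new request. The function names, the feature names, and the list of risky combinations are hypothetical.

```python
def combined_exposure(previous_exports, requested):
    """Union of features already exported to a requester and newly requested ones."""
    exposed = set(requested)
    for features in previous_exports:
        exposed |= set(features)
    return exposed

def violated_combinations(exposed, risky_combinations):
    """Risky combinations that become fully available to the requester."""
    return [combo for combo in risky_combinations if set(combo) <= exposed]

# Hypothetical history for one requester and a hypothetical risky combination.
history = [["age_band", "region"]]
risky = [("age_band", "gender")]

print(violated_combinations(combined_exposure(history, ["gender"]), risky))
print(violated_combinations(combined_exposure(history, ["region"]), risky))
```

Here a new request for 'gender' completes a risky combination with the earlier export, so that field would need stronger de-identification, while a repeat request for 'region' adds nothing new.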

The present embodiments may include a blockchain architecture that conforms to the compliances for personal information management.

FIG. 15 is a flowchart illustrating an example of a personal information management process according to an embodiment of the present invention.

The computer system 100 according to the present embodiment may implement a controller node on a blockchain network.

Referring to FIG. 15, in step S1510 the computer system 100, acting as a controller node on the blockchain network, may separate personal data received from a user node into personally identifiable information (PII) and potential personally identifiable information (PPII).

In step S1520, the computer system 100 may store the PII in a local database. Of the personal data, the PII is stored in the local database of the controller node, while the hash of the PII and the PPII may be stored on the blockchain.

In step S1530, the computer system 100 may generate a hash value of the PII. Because only the hash value of the PII is stored on the blockchain, the hash value stored on the blockchain becomes useless once the PII is deleted from the local database.

In step S1540, when consensus is reached among the nodes on the blockchain network, the computer system 100 may generate a block that includes the smart contract created by the consensus, the PPII, the generated hash value, a user identifier corresponding to the user node, and a controller identifier corresponding to the controller node, and store the block on the blockchain. At this time, the computer system 100 may publish the list of personal data to the nodes on the blockchain network. The conditions for consensus may be based on a compliance applicable to the personal data, for example the GDPR.
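Steps S1510 to S1540 can be sketched with the standard library alone. This is an illustrative assumption, not the patent's implementation: the choice of SHA-256, the JSON serialization, the field names, and the in-memory `local_db` and `chain` objects are all hypothetical, and consensus among nodes is not modelled.

```python
import hashlib
import json

def split_personal_data(data, pii_fields):
    """Separate personal data into PII and PPII by field name."""
    pii = {k: v for k, v in data.items() if k in pii_fields}
    ppii = {k: v for k, v in data.items() if k not in pii_fields}
    return pii, ppii

def sha256_hex(obj):
    """Hash a JSON-serialisable object deterministically."""
    payload = json.dumps(obj, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def make_block(prev_hash, contract_id, ppii, pii_hash, user_id, controller_id):
    """Assemble a block holding the PPII and only the hash of the PII."""
    body = {
        "prev_hash": prev_hash,
        "smart_contract": contract_id,
        "ppii": ppii,
        "pii_hash": pii_hash,
        "user_id": user_id,
        "controller_id": controller_id,
    }
    return {**body, "block_hash": sha256_hex(body)}

local_db = {}   # stand-in for the controller node's local database
chain = []      # stand-in for the blockchain

data = {"name": "Alice", "email": "alice@example.com", "age_band": "30-39"}
pii, ppii = split_personal_data(data, pii_fields={"name", "email"})  # S1510
local_db["user-1"] = pii                                             # S1520
pii_hash = sha256_hex(pii)                                           # S1530
block = make_block("0" * 64, "contract-1", ppii, pii_hash, "user-1", "controller-1")
chain.append(block)                                                  # S1540
```

One design caveat: a plain hash of low-entropy PII can be guessed by hashing candidate values, so a practical design would likely add a secret salt or a keyed hash. The patent text does not address this point.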

Meanwhile, when the computer system 100 shares personal data with a processor node on the blockchain network, it may notify the user node that the personal data is being shared and for what purpose. The processor node may be implemented to separate the personal data into PII and PPII, store the PII in the local database of the processor node, generate a hash value of the PII, and, when consensus is reached among the nodes on the blockchain network, generate a block that includes the smart contract created by the consensus, the PPII, the generated hash value, the user identifier corresponding to the user node, the controller identifier corresponding to the controller node, and a processor identifier corresponding to the processor node, and store the block on the blockchain.

In addition, when a request to delete personal data is received from the user node, each node on the blockchain network may check the smart contract stored on the blockchain and delete the user node's personal data stored in its own local database.

Likewise, when a request to modify personal data is received from the user node, each node on the blockchain network may check the smart contract stored on the blockchain and modify the user node's personal data stored in its own local database. A node that has modified the user node's personal data in its local database may generate a new block containing the hash of the modified PII and add it to the blockchain.
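The deletion and modification behaviour can be sketched as follows. The `Node` class, its method names, and the simplified block layout are hypothetical, and the smart contract check that precedes each operation is omitted for brevity.

```python
import hashlib
import json

def pii_hash(pii):
    """SHA-256 over a deterministic JSON encoding of the PII (an assumed scheme)."""
    payload = json.dumps(pii, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

class Node:
    """Hypothetical node holding PII locally and sharing an append-only chain."""

    def __init__(self, chain):
        self.local_db = {}
        self.chain = chain

    def store(self, user_id, pii):
        self.local_db[user_id] = pii
        self.chain.append({"user_id": user_id, "pii_hash": pii_hash(pii)})

    def modify(self, user_id, changes):
        # Update the local copy, then append a block with the new hash.
        self.local_db[user_id] = {**self.local_db[user_id], **changes}
        new_hash = pii_hash(self.local_db[user_id])
        self.chain.append({"user_id": user_id, "pii_hash": new_hash})

    def delete(self, user_id):
        # Only the local PII is removed; the chain itself is left untouched.
        self.local_db.pop(user_id, None)

chain = []
node = Node(chain)
node.store("user-1", {"email": "old@example.com"})
node.modify("user-1", {"email": "new@example.com"})
node.delete("user-1")
print(len(chain), "user-1" in node.local_db)  # prints "2 False"
```

After deletion the two hashes remain on the chain, but no stored PII corresponds to them any more. This is the sense in which the on-chain hash becomes useless once the local PII is removed.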

The present embodiments may include a Blockchain Controller for Privacy (BCP) architecture that can provide personal data through a contract with a blockchain controller that holds access rights.

The computer system 100 according to the present embodiment may implement a BCP as a personal information providing system.

When a service provider, such as a social, search, mail, or media service, wishes to use personal data, it accesses the data through a contract with the BCP that holds the access rights to that data, so that the usage records and utilization information for the personal data can be verified.

FIG. 16 illustrates an example of a process for recording and managing the personal data used by service providers according to an embodiment of the present invention.

As shown in FIG. 16, as an example, the actual user's data moves to the cloud platform through social service provider #B. The other social service providers #A and #C may collect personal data through BCP #2.

In this case, the BCPs (#1, #2) authenticated by the user can be used to verify how the services have used personally identifiable information (PII), and the service providers must likewise use BCP-type functionality.

FIG. 17 shows an example of an actual service configuration according to an embodiment of the present invention: the user delegates the access rights to his or her data to a BCP, and a service provider may use the personal data through a contract with the BCP to which the user delegated those rights.

The functional characteristics of the BCP according to the present invention are as follows.

1. The BCP executes data privacy contracts with the user.

2. The BCP provides its own personal data storage (storage or vault).

3. The BCP concludes contracts with the online data storage used by the user.

4. The BCP provides an interworking interface with the online data storage used by the user.

5. The BCP provides a BCP agent for monitoring the user's service usage environment (the environment in which personal-information-related data is generated).

First, the data privacy contract process between the BCP and the user is as follows.

The BCP provides contracts on data privacy as a means of clearly monitoring personal data at the various points where it is exposed online.

For the user's key online data that may be exposed, and depending on the contract with the user, the BCP may either (1) expose, at the main exposure points, only the portion that can be provided in the form of an SDK (Software Development Kit) or API (Application Programming Interface), or (2) identify personal information through full monitoring at the network monitoring level.

Web- or app-based servers fall under case (1), and the corresponding service providers must also be able to provide information on the permissions for data monitoring.

FIG. 18 is a flowchart illustrating the data privacy contract process between the BCP and the user.

Referring to FIG. 18, for a smart contract, the user may connect to the BCP to which he or she wishes to delegate the access rights to his or her data (S1801), and the BCP may then receive a contract request from the connected user.

The BCP may receive the user's selection of a service in use (an online data storage or a service provider) (S1802).

The BCP determines whether it is linked with the service selected by the user (S1803). If the service used by the user is not linked with the BCP, the blockchain-based personal information provision service cannot be provided.

If the service used by the user is linked with the BCP, the BCP may carry out user authentication for data interworking with that service (S1804). For example, the BCP may request access to the service used by the user in the manner of OAuth (Open Authorization).

Once user authentication is complete, the BCP may set the scope of the contract rights (S1805). The scope of the contract rights is the scope of the access rights and may include, for example, the data items and the monitoring and tracking period. The service provider may provide information on the scope of data provision to which the user consented at sign-up, the time of consent, and so on. The BCP may set the data accessibility, such as access through an API or direct file access (DFA). The BCP may also set the data coverage, which specifies whether data may be accessed according to its grade, such as potential personally identifiable information (PPII) or personally identifiable information (PII). In addition, the BCP may configure data anonymization according to the user's request.
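The contract scope set in step S1805 can be pictured as a small record with an access check. This is a sketch under stated assumptions: the class and field names, the default values, and the example dates are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum

class AccessMethod(Enum):
    API = "api"
    DIRECT_FILE_ACCESS = "dfa"

class DataGrade(Enum):
    PPII = "ppii"
    PII = "pii"

@dataclass
class ContractScope:
    """Hypothetical record of the access rights a user grants to a BCP."""
    data_items: list
    monitoring_start: date
    monitoring_end: date
    access_method: AccessMethod = AccessMethod.API
    accessible_grades: set = field(default_factory=lambda: {DataGrade.PPII})
    anonymize: bool = True

    def allows(self, item, grade, on):
        """Check whether an item of a given grade may be accessed on a date."""
        return (
            item in self.data_items
            and grade in self.accessible_grades
            and self.monitoring_start <= on <= self.monitoring_end
        )

scope = ContractScope(
    data_items=["location", "purchase_history"],
    monitoring_start=date(2024, 1, 1),
    monitoring_end=date(2024, 12, 31),
)
print(scope.allows("location", DataGrade.PPII, date(2024, 6, 1)))  # True
print(scope.allows("location", DataGrade.PII, date(2024, 6, 1)))   # False
```

Defaulting to PPII-only access with anonymization enabled is a conservative choice made for this example; the patent leaves these settings to the contract between the user and the BCP.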

When the service used by the user is a new service, or when data is collected on the user terminal, information for connecting to the BCP may be set in the user environment (terminal, application, IoT, etc.), and the BCP may accordingly receive the connection information for the user environment (S1806). To this end, the BCP may provide a separate API gateway address to which the user's data is uploaded.

For an online-based service provider, for example when the personal data already exists online, the setup process with the terminal (S1806) may be omitted.

The BCP may store the information set through the above processes (S1801 to S1806) for the service used by the user as contract information with the user (S1807).
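Steps S1803 to S1807 above can be sketched as a single function. This is an illustrative outline under stated assumptions, not the implementation of the embodiment: the function name `register_contract`, the in-memory `store` dictionary, and the `authenticate` callback (standing in for an OAuth-style flow) are all hypothetical.

```python
from typing import Callable, Dict, Optional, Set, Tuple


def register_contract(
    user_id: str,
    service: str,
    linked_services: Set[str],
    authenticate: Callable[[str, str], bool],
    scope: dict,
    store: Dict[Tuple[str, str], dict],
    connection_info: Optional[dict] = None,
) -> bool:
    """Return True if contract information was stored, False otherwise."""
    # S1803: the service must be linked with the BCP.
    if service not in linked_services:
        return False
    # S1804: user authentication for data interworking with the service.
    if not authenticate(user_id, service):
        return False
    # S1805: the contract right scope is supplied by the caller.
    contract = {"scope": dict(scope)}
    # S1806 (optional): connection information for the user environment.
    if connection_info is not None:
        contract["connection"] = dict(connection_info)
    # S1807: store the settings as contract information with the user.
    store[(user_id, service)] = contract
    return True
```

Passing `connection_info=None` corresponds to the online-based case in which the setup process with the terminal (S1806) is omitted.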

Through various user environments, a user may sign up in a browser for a specific web service that the user wants to use.

For example, the web service may ask during sign-up whether the user intends to use a BCP. If the user indicates an intention to use a BCP, the web service may present a BCP login process, through which the user can log in to the BCP that the user uses.

The BCP may present the user with a list of data that can be monitored among the personal information requested by the web service the user is signing up for, and may then set the data items the user selects from the list as monitoring targets.

The BCP may itself provide the function of a personal data store.

For example, an existing personal cloud data storage operator, or an operator that provides productivity tools, can extend into the BCP form by adding to its data management environment a function for monitoring blockchain-based data input/output and a function for controlling permissions on, and monitoring the provision of, personal data input/output to smart-contract-based third-party services. It is also possible to enter into a contract with a separate BCP and have only that BCP manage the points of exposure to third-party services, thereby providing its own BCP function through exclusive linkage.

As described above, embodiments of the present invention can provide not only a data management environment that complies with the compliance requirements related to personal data but also a data distribution environment in which privacy is guaranteed. In particular, when data is exported during data distribution, the leakage risk can be analyzed for each data characteristic and de-identification can be applied to the data to be exported.
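The idea of analyzing leakage risk per data characteristic and then de-identifying before export can be illustrated with a small sketch. The risk measure used here, the fraction of records that are unique on the chosen columns, is an assumption made for illustration; the specification does not fix a particular metric. All function names and the sample records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def uniqueness_risk(records, columns):
    """Fraction of records whose values on `columns` occur exactly once.

    Assumes `records` is a non-empty list of dicts.
    """
    keys = [tuple(r[c] for c in columns) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)


def analyze(records, columns):
    """Risk for each single characteristic and for each pair of characteristics.

    A record that is unique on one column is also unique on any combination
    containing that column, so each single risk is a lower bound on the risk
    of the combinations it takes part in.
    """
    single = {c: uniqueness_risk(records, [c]) for c in columns}
    multi = {
        pair: uniqueness_risk(records, list(pair))
        for pair in combinations(columns, 2)
    }
    return single, multi


def generalize_decade(records, column):
    """Generalization: replace a numeric value with its 10-unit band."""
    return [{**r, column: (r[column] // 10) * 10} for r in records]


def mask(value, keep=2):
    """Masking: keep the first `keep` characters and hide the rest."""
    return value[:keep] + "*" * (len(value) - keep)


# Hypothetical sample data.
records = [
    {"age": 23, "zip": "04501"},
    {"age": 27, "zip": "04501"},
    {"age": 31, "zip": "04502"},
    {"age": 38, "zip": "04502"},
]
single, multi = analyze(records, ["age", "zip"])
safer = generalize_decade(records, "age")
```

In this sample every exact age is unique, so both the `age` column and the `(age, zip)` combination carry the maximum risk under this measure, while generalizing age into 10-year bands leaves no record unique on the combination.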

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described, but a person of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, or computer storage medium or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored in one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The medium may store a computer-executable program permanently, or may store it temporarily for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or several pieces of hardware combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like configured to store program instructions. Other examples include recording or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute various other software.

Although the embodiments have been described above with reference to limited embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the described components of the system, structure, apparatus, circuit, etc. are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims also fall within the scope of the claims below.

Claims

A computer system comprising:
at least one processor configured to execute computer-readable instructions contained in a memory,
wherein the at least one processor processes:
a process of providing, at the request of a system user, data to which compliance including information protection laws or regulations is applied, as first-party data (1st party data); and
a process of performing data de-identification on at least a portion of the data to which the compliance is applied and exporting the data,
wherein the at least one processor processes:
a process of evaluating the reliability of the system user based on reliability indicators for a provider, including matters related to the personal information collected and the level of information security, whether the provider guarantees privacy rights, the personal information utilization policy, and whether personal information is shared with third parties (3rd party); and
a process of performing data de-identification on at least a portion of the data to be exported based on the reliability of the system user,
wherein the at least one processor,
in order to analyze the risk of data leakage due to de-identification for each characteristic included in the first-party data, processes:
a process of analyzing the leakage risk for single data including one characteristic; and
a process of analyzing the leakage risk for multiple data including at least two characteristics based on the leakage risk for the single data,
and wherein the at least one processor,
in order to perform data de-identification on the data to which the compliance is applied based on the analysis result of the data leakage risk,
for data that the system user wants to export in duplicate, performs data de-identification on the data to which the compliance is applied, centering on data combinations, including previously de-identified data and additionally de-identified data, for which the data leakage risk is below a certain level.

delete

The computer system according to claim 1,
wherein the at least one processor
de-identifies at least a portion of the data to which the compliance is applied through any one of a masking method, a categorization method, a generalization method, and a method using homomorphic encryption.

delete

The computer system according to claim 1,
wherein the at least one processor
analyzes the risk of data leakage due to the de-identification using an unsupervised learning algorithm.

The computer system according to claim 1,
wherein the at least one processor
derives a guideline for data identification based on the analysis result of the data leakage risk.

delete

The computer system according to claim 1,
wherein the at least one processor,
for data for which there is a duplicate export request from the system user, analyzes the intersection of identifiable data among the data exported in duplicate to the system user, and performs data de-identification on the data to which the compliance is applied in such a way as to avoid data combinations whose data leakage risk is at or above a certain level.

A method executed on a computer system,
the computer system comprising at least one processor configured to execute computer-readable instructions contained in a memory,
the method comprising:
providing, by the at least one processor, at the request of a system user, data to which compliance including information protection laws or regulations is applied, as first-party data (1st party data); and
performing, by the at least one processor, data de-identification on at least a portion of the data to which the compliance is applied, and exporting the data,
wherein the exporting step comprises:
evaluating the reliability of the system user based on reliability indicators for a provider, including matters related to the personal information collected and the level of information security, whether the provider guarantees privacy rights, the personal information utilization policy, and whether personal information is shared with third parties (3rd party); and
performing data de-identification on at least a portion of the data to be exported based on the reliability of the system user,
wherein the exporting step further comprises:
analyzing the risk of data leakage due to de-identification for each characteristic included in the first-party data; and
performing data de-identification on the data to which the compliance is applied based on the analysis result of the data leakage risk,
wherein the analyzing step comprises:
analyzing the leakage risk for single data including one characteristic; and
analyzing the leakage risk for multiple data including at least two characteristics based on the leakage risk for the single data,
and wherein the step of performing data de-identification on the data to which the compliance is applied comprises,
for data that the system user wants to export in duplicate, performing data de-identification on the data to which the compliance is applied, centering on data combinations, including previously de-identified data and additionally de-identified data, for which the data leakage risk is below a certain level.

delete

The method according to claim 9,
wherein the exporting step comprises
de-identifying at least a portion of the data to which the compliance is applied through any one of a masking method, a categorization method, a generalization method, and a method using homomorphic encryption.

delete

The method according to claim 9,
wherein the step of performing data de-identification on the data to which the compliance is applied comprises,
for data for which there is a duplicate export request from the system user, analyzing the intersection of identifiable data among the data exported in duplicate to the system user, and performing data de-identification on the data to which the compliance is applied in such a way as to avoid data combinations whose data leakage risk is at or above a certain level.