KR20180031332A

KR20180031332A - METHOD AND SYSTEM FOR k-NN QUERY PROCESSING BASED ON GARBLED CIRCUIT

Info

Publication number: KR20180031332A
Application number: KR1020160119809A
Authority: KR
Inventors: 김형일; 신영성; 김형진; 신재환; 장재우; 김현태; 신광식
Original assignee: 전북대학교산학협력단; (주)아이엠시티
Priority date: 2016-09-20
Filing date: 2016-09-20
Publication date: 2018-03-28
Also published as: KR101916228B1

Abstract

The present invention relates to a k-NN query processing algorithm over an encrypted database outsourced to a cloud. According to an embodiment of the present invention, a garbled circuit-based k-NN query processing method comprises the steps of: building a first cloud and a second cloud that is independent of the first cloud; maintaining, in the first cloud, an encrypted database in which the data stored in an original database is encrypted, and an encryption public key generated in association with the encryption; maintaining, in the second cloud, a decryption private key corresponding to the encryption public key; and, when a k-NN query is issued from a user terminal to which the encryption public key has been distributed, performing multiparty computation between the first cloud and the second cloud based on the encryption public key and the decryption private key, deriving result data for the k-NN query from the encrypted database, and providing the result data to the user terminal.

Description

TECHNICAL FIELD [0001] The present invention relates to a k-NN query processing method and a k-NN query processing system.

The present invention relates to a k-NN query processing algorithm over an encrypted database outsourced to a cloud, and is intended to protect the data access patterns that may otherwise be exposed while user queries on the encrypted database are processed.

As research on cloud computing has become more active in recent years, interest has grown in database outsourcing, in which the management and operation of a database are entrusted to an external provider.

However, outsourced databases in the prior art have security problems: the cloud or an attacker can extract meaningful information from the database, and personal information such as a user's tendencies or preferences can be inferred from the queries the user sends to the cloud.

Therefore, technology for protecting information in an outsourced database environment is required. To address this problem, research on k-NN query processing over encrypted data using homomorphic encryption has become active.

However, most existing work on k-NN query processing over encrypted databases is weak in terms of security. For example, data access patterns may be exposed, and the schemes may be vulnerable to chosen-plaintext attacks. In the prior art, techniques that support data protection, user query protection, and protection against data access pattern exposure all together incur a very high query processing cost.

Accordingly, there is a need for a system and method that support data protection, user query protection, and data access pattern protection in a database environment outsourced to a cloud, while also providing efficient query processing performance.

The present invention has been devised to solve the above problems, and an object thereof is to provide cryptographic operation protocols based on a garbled circuit and a data packing technique, thereby reducing the number of operations and providing efficient query processing performance.

Another object of the present invention is to provide a k-NN query processing algorithm that supports encrypted index search based on the improved cryptographic operation protocols and data access pattern protection on the encrypted database, thereby preventing the exposure of additional information and supporting not only data protection and user query protection but also data access pattern protection during query processing.

A garbled circuit-based k-NN query processing method according to an embodiment of the present invention may comprise: building a first cloud and a second cloud that is independent of (non-colluding with) the first cloud; maintaining, in the first cloud, an encrypted database in which the data stored in an original database is encrypted, and an encryption public key generated in association with the encryption; maintaining, in the second cloud, a decryption secret key corresponding to the encryption public key; and, when a kNN (k nearest neighbor) query is issued from a user terminal to which the encryption public key has been distributed, performing secure multiparty computation (SMC) between the first cloud and the second cloud based on the encryption public key and the decryption secret key, deriving result data for the kNN query from the encrypted database, and providing the result data to the user terminal.

In addition, a garbled circuit-based k-NN query processing system according to an embodiment of the present invention may build a first cloud and a second cloud that is independent of the first cloud; maintain, in the first cloud, an encrypted database in which the data stored in an original database is encrypted, and an encryption public key generated in association with the encryption; maintain, in the second cloud, a decryption secret key corresponding to the encryption public key; and, when a kNN query is issued from a user terminal to which the encryption public key has been distributed, perform multiparty computation between the first cloud and the second cloud based on the encryption public key and the decryption secret key, derive result data for the kNN query from the encrypted database, and provide the result data to the user terminal.

According to an embodiment of the present invention, performing at least one cryptographic operation protocol among the ESSED protocol, the GSCMP protocol, and the GSPE protocol, each based on a garbled circuit and a data packing technique, reduces the number of operations and provides efficient query processing performance.

In addition, according to an embodiment of the present invention, providing a k-NN query processing algorithm that supports encrypted index search based on the improved cryptographic operation protocols and data access pattern protection on the encrypted database prevents the exposure of additional information, so that data protection, user query protection, and data access pattern protection during query processing are all supported.

FIG. 1 is a diagram illustrating a garbled circuit-based k-NN query processing system according to an embodiment of the present invention.
FIG. 2 is a diagram showing data held in the first cloud according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an encrypted database according to an embodiment of the present invention.
FIGS. 4A to 4D are diagrams for explaining a kNN query processing algorithm according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the point-region relationship in a one-dimensional space according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating the sequence of a garbled circuit-based k-NN query processing method according to an embodiment of the present invention.

Hereinafter, embodiments according to the present invention are described in detail with reference to the accompanying drawings. However, the present invention is not restricted or limited by the embodiments. Like reference numerals in the drawings denote like elements.

In the present invention, a "k-NN query" may refer to a query that retrieves the k data items closest to the query point. The garbled circuit-based k-NN query processing method and k-NN query processing system described herein process a k-NN query on an encrypted database by using efficient cryptographic operation protocols, based on a garbled circuit and a data packing technique, to compute the Euclidean distance between two points, to compare two data values, and to determine whether an encrypted region contains an encrypted point.
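For orientation, the plaintext semantics of a k-NN query can be sketched as follows. This is only an unencrypted baseline for comparison, not the protocol of the invention; the function names and the sample points are hypothetical.

```python
import heapq

def squared_distance(a, b):
    # Squared Euclidean distance; the square root is unnecessary for ranking.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_plaintext(data, query, k):
    """Return the IDs of the k items closest to `query` (no encryption involved)."""
    closest = heapq.nsmallest(
        k, data.items(), key=lambda item: squared_distance(item[1], query))
    return [data_id for data_id, _ in closest]

# Hypothetical sample data: ID -> 2-D point.
points = {"t1": (1, 2), "t2": (2, 7), "t3": (3, 4), "t4": (9, 8)}
nearest = knn_plaintext(points, query=(2, 3), k=2)
```

The encrypted protocols described in this document aim to produce the same answer while the clouds see neither the points, the query, nor which records were touched.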

FIG. 1 is a diagram illustrating a garbled circuit-based k-NN query processing system according to an embodiment of the present invention.

The garbled circuit-based k-NN query processing system 100 (hereinafter, the k-NN query processing system) of the present invention may include a database (T) 110, a kd-tree 120, an encryption public key (pk) 130, a decryption secret key (sk) 140, a first cloud (C_A) 150, an encrypted database 160, an encrypted kd-tree 170, and a second cloud (C_B) 180.

The k-NN query processing system 100 partitions the data stored in the database 110 into units of a predetermined number (for example, F items) and builds a kd-tree 120 having a plurality of leaf nodes, each containing one partition of the data.

For example, the k-NN query processing system 100 may build, from the database 110, a kd-tree 120 of level h with a total of 2^(h-1) leaf nodes, where each leaf node stores at most F (fan-out) data items.

Each leaf node of the kd-tree 120 may store, in plaintext, region information about the node region it covers and the data IDs of the data contained in that region. Here, the region information may include, for each attribute (m), a lower bound point (lb_{z,m}) and an upper bound point (ub_{z,m}) of the node region (1≤z≤num_node, 1≤j≤m).

For example, referring to FIG. 2, the k-NN query processing system 100 may store eight two-dimensional data items (for example, with x and y dimensions) and partition them based on a kd-tree. The resulting kd-tree may contain four leaf nodes in total, and each leaf node may store the lower bound (lb_x and lb_y) and upper bound (ub_x and ub_y) of the node together with the IDs of the data contained in the node. That is, the leaf node 'node 1' of the kd-tree 120 may store the lower bound point '(lb_{1,0}, lb_{1,1})' and the upper bound point '(ub_{1,0}, ub_{1,1})' of the node region associated with 'node 1', and the data IDs 't_1' and 't_2' of the data contained in that node region.

To make the containment relationship between data and nodes unambiguous, it is assumed herein that no data lies on a node boundary of the kd-tree; however, the present invention is not limited thereto.
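A minimal sketch of how such leaf nodes could be produced is given below. The splitting rule (median split, cycling through the dimensions, with the cut placed midway between the two middle points) is an assumption chosen so that no point lies on a boundary; the document does not specify the exact construction, and the function name, the domain bounds, and the sample coordinates are hypothetical.

```python
def build_leaves(points, fanout, lo, hi, depth=0):
    """Split `points` (list of (data_id, coords)) into kd-tree leaf nodes.

    Each leaf records its region (lower/upper bound per attribute) and the
    IDs of the data it contains. Assumes distinct coordinates per axis.
    """
    if len(points) <= fanout:
        return [{"lb": tuple(lo), "ub": tuple(hi),
                 "ids": [data_id for data_id, _ in points]}]
    axis = depth % len(lo)
    ordered = sorted(points, key=lambda p: p[1][axis])
    mid = len(ordered) // 2
    # Cut halfway between the two middle points so neither lies on the boundary.
    split = (ordered[mid - 1][1][axis] + ordered[mid][1][axis]) / 2
    left_hi, right_lo = list(hi), list(lo)
    left_hi[axis] = split
    right_lo[axis] = split
    return (build_leaves(ordered[:mid], fanout, lo, left_hi, depth + 1)
            + build_leaves(ordered[mid:], fanout, right_lo, hi, depth + 1))

# Eight hypothetical 2-D points with fan-out F = 2, giving four leaf nodes.
data = [("t1", (1, 2)), ("t2", (2, 7)), ("t3", (3, 4)), ("t4", (4, 9)),
        ("t5", (6, 1)), ("t6", (7, 6)), ("t7", (8, 3)), ("t8", (9, 8))]
leaves = build_leaves(data, fanout=2, lo=(0, 0), hi=(10, 10))
```

In the system described here, the bounds and IDs of each leaf would subsequently be encrypted attribute by attribute before being handed to the first cloud.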

The k-NN query processing system 100 encrypts the kd-tree 120 built from the database 110 to generate the encrypted kd-tree 170.

Specifically, the k-NN query processing system 100 may further encrypt, attribute by attribute, the data IDs of the data contained in the node region associated with each leaf node to generate the encrypted kd-tree 170.

For example, the k-NN query processing system 100 may encrypt the data IDs 't_1' and 't_2' contained in the leaf node 'node 1' of the kd-tree 120 for each attribute m, and likewise encrypt the data IDs contained in the remaining leaf nodes for each attribute m, to generate the encrypted kd-tree 170. For example, the encrypted kd-tree 170 contains four leaf nodes (node 1, node 2, node 3, node 4), and each leaf node (210, 220, 230, 240) may have a node region defined by a lower bound point and an upper bound point.

Here, the k-NN query processing system 100 may generate the encrypted kd-tree 170 by encrypting each leaf node of the kd-tree 120 with the same encryption public key 130 that was used to generate the encrypted database 160.

For example, the k-NN query processing system 100 may generate an encrypted database as shown in FIG. 3. The system may perform the encryption dimension by dimension and deliver the result to the first cloud 150. That is, the system may encrypt the region information of each leaf node of the kd-tree 120 attribute by attribute, using the encryption public key of the encrypted database 160 as shown in FIG. 3.

The k-NN query processing system 100 may provide a first cloud 150 and a second cloud 180 that are independent of each other (non-colluding).

In the present invention, when the clouds 150 and 180 execute a cryptographic protocol to process a user query, neither cloud colludes with the other to exchange data or information in order to gain additional information from what it obtained during query processing.

The k-NN query processing system 100 stores the encrypted kd-tree 170 in the first cloud 150, which holds the encrypted database 160 and the encryption public key 130 (step 101).

In addition, the k-NN query processing system 100 stores the decryption secret key 140 corresponding to the encryption public key 130 in the second cloud 180, which is different from the first cloud 150 (step 102).

In other words, the k-NN query processing system 100 may store the encrypted kd-tree 170 in the first cloud 150 together with the encrypted database 160 and the encryption public key 130, and store the decryption secret key 140, generated as the secret key, in the separate second cloud 180.

In addition, the k-NN query processing system 100 may provide the same encryption public key 130 that was used to encrypt the database 110 to a user terminal (AU; Authorized User) 190 (step 103). When the terminal 190 sends a query to the first cloud 150 to obtain data, it may encrypt the user query with the provided encryption public key 130 (step 104).

For example, the terminal 190 may issue a user query by encrypting the query point with the encryption public key 130, for example as 'E(q_j) (1≤j≤m)'.

In this way, the k-NN query processing system 100 provides its service on the basis of encrypted queries and can thereby support both data protection and user query protection.
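The attribute-wise encryption of a query point can be sketched with a toy additively homomorphic scheme. The document only says that homomorphic encryption is used; the choice of a Paillier-style scheme here is an assumption, the function names are hypothetical, and the two Mersenne primes are far too small for real use (requires Python 3.8+ for the modular inverse via `pow`).

```python
import math
import random

def keygen(p=2**31 - 1, q=2**61 - 1):
    # Toy Paillier-style key pair with generator g = n + 1.
    n = p * q
    phi = (p - 1) * (q - 1)
    return n, (phi, pow(phi, -1, n))   # pk = n, sk = (phi, phi^-1 mod n)

def encrypt(n, m):
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (1 + (m % n) * n) * pow(r, n, n2) % n2

def decrypt(n, sk, c):
    phi, mu = sk
    return (pow(c, phi, n * n) - 1) // n * mu % n

pk, sk = keygen()          # pk goes to the user terminal and the first cloud
query = (5, 12)            # hypothetical 2-D query point q
enc_query = [encrypt(pk, q_j) for q_j in query]   # E(q_j), 1 <= j <= m
```

Only the holder of `sk` (the second cloud in this design) could recover the query attributes from `enc_query`.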

When a user query is received from the user terminal 190, the k-NN query processing system 100 may process the kNN query by performing secure multiparty computation (SMC) between the first cloud 150 and the second cloud 180 based on the selected cryptographic operation protocol (step 105).

In addition, the k-NN query processing system 100 may transmit to the terminal 190 the result of processing the user query jointly with the first cloud 150 and the second cloud 180 (step 106).

Here, multiparty computation may refer to securely carrying out protocols and operations through other entities (the first cloud 150 and the second cloud 180) without exposing the original data held by the data owner.

To this end, the k-NN query processing system 100 stores the decryption secret key 140 in the second cloud 180, which is separate from the first cloud 150 holding the encrypted kd-tree 170, the encrypted database 160, and the encryption public key 130, and on this basis can process the kNN query through multiparty computation between the first cloud 150 and the second cloud 180.

In this way, the k-NN query processing system 100 can securely process user queries on the encrypted database 160, based on the encrypted kd-tree 170, while supporting data protection. In addition, the system may have the advantage that no information related to data privacy, query privacy, or data access patterns is exposed while a query is processed by the kNN query processing algorithm on the encrypted database. The kNN query processing algorithm is described in more detail below with reference to FIGS. 4A to 4D. Before that, the protocols used herein are described.

The k-NN query processing system 100 may select, as the cryptographic operation protocol, at least one of the ESSED (Enhanced Secure Squared Euclidean Distance) protocol, the GSCMP (Garbled Circuit based Secure Compare) protocol, and the GSPE (Garbled Circuit based Secure Point Enclosure) protocol.

For example, the k-NN query processing system 100 may use the ESSED protocol to compute the encrypted squared distance E(|X-Y|²) between the vectors E(X) and E(Y), where X and Y may be m-dimensional vectors.

First, the k-NN query processing system 100 may have the first cloud 150 generate random numbers and then perform data packing according to Equation 1 to compute R.

Here, σ may be the bit length used to represent one data item. The k-NN query processing system 100 may also have the first cloud 150 encrypt R to produce E(R).
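Data packing in this sense can be sketched on plain integers. Equation 1 itself is not reproduced in this text, so the layout below, R = r_1·2^(σ(m-1)) + … + r_m·2^0 with one σ-bit slot per value, is an assumption consistent with the surrounding description; the function names are hypothetical.

```python
def pack(values, sigma):
    """Concatenate non-negative values into one integer, sigma bits per slot."""
    packed = 0
    for value in values:
        if not 0 <= value < (1 << sigma):
            raise ValueError("value does not fit into a sigma-bit slot")
        packed = (packed << sigma) | value   # first value ends up in the top slot
    return packed

def unpack(packed, sigma, m):
    """Recover the m slot values from a packed integer."""
    mask = (1 << sigma) - 1
    return [(packed >> (sigma * (m - j))) & mask for j in range(1, m + 1)]

R = pack([3, 200, 17], sigma=8)      # three hypothetical random numbers r_j
recovered = unpack(R, sigma=8, m=3)
```

Packing lets several per-dimension values travel in a single ciphertext, which is what allows the later steps to replace m encryptions and decryptions with one.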

Next, in the first cloud 150, the k-NN query processing system 100 may compute the encrypted per-dimension difference E(x_j - y_j) (1≤j≤m) between X and Y and then perform data packing according to Equation 2 to compute E(v).

Next, in the first cloud 150, the k-NN query processing system 100 may compute E(v) = E(v) × E(R) and then send E(v) to the second cloud 180. The second cloud 180 may then decrypt the received E(v) to obtain [x_1 - y_1 + r_1 | … | x_m - y_m + r_m]. The second cloud 180 may further unpack v to obtain x_j - y_j + r_j (1≤j≤m) and then sum the per-dimension squares (x_j - y_j + r_j)² (1≤j≤m) into d (the initial value of d may be set to 0, but is not limited thereto).

In this way, by summing the per-dimension distances in plaintext, the k-NN query processing system 100 can reduce the number of operations on encrypted data compared with the prior-art DPSSED protocol.

Next, the k-NN query processing system 100 may have the second cloud 180 encrypt d and send E(d) to the first cloud 150. Through this step, the system can reduce the m data encryptions required at the second cloud 180 in the DPSSED protocol to a single encryption.

Finally, in the first cloud 150, the k-NN query processing system 100 removes, for each dimension, the value added by the random-number insertion using Equation 3, thereby computing the encrypted squared distance E(|X-Y|²) between the two vectors X and Y.
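The ESSED steps above can be traced end to end in a small simulation. Equations 1 to 3 are not reproduced in this text, so the packing layout (slot j weighted by 2^(σ(m-j))) and the unmasking term (removing 2·r_j·(x_j - y_j) + r_j² per dimension) are assumptions derived from the surrounding description. The Paillier-style scheme, the function names, and the parameter sizes are likewise illustrative; the masks are bounded for readability and are not chosen for real security (Python 3.8+).

```python
import math
import random

def keygen(p=2**31 - 1, q=2**61 - 1):
    n, phi = p * q, (p - 1) * (q - 1)
    return n, (phi, pow(phi, -1, n))

def enc(n, m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (1 + (m % n) * n) * pow(r, n, n * n) % (n * n)

def dec(n, sk, c):
    phi, mu = sk
    return (pow(c, phi, n * n) - 1) // n * mu % n

def essed(n, sk, enc_x, enc_y, sigma=16, value_bits=8):
    """Simulate both clouds; `sk` is used only in the C_B section."""
    n2, m = n * n, len(enc_x)
    # First cloud C_A: masks r_j >= 2^value_bits keep every slot non-negative.
    r = [random.randrange(1 << value_bits, 1 << (sigma - 1)) for _ in range(m)]
    big_r = sum(rj << (sigma * (m - 1 - j)) for j, rj in enumerate(r))
    diffs = [cx * pow(cy, n - 1, n2) % n2 for cx, cy in zip(enc_x, enc_y)]  # E(x_j - y_j)
    enc_v = 1
    for j, cd in enumerate(diffs):                      # pack the differences
        enc_v = enc_v * pow(cd, 1 << (sigma * (m - 1 - j)), n2) % n2
    enc_v = enc_v * enc(n, big_r) % n2                  # E(v) = E(v) x E(R)
    # Second cloud C_B: decrypt once, unpack, sum the squares in plaintext.
    v = dec(n, sk, enc_v)
    mask = (1 << sigma) - 1
    slots = [(v >> (sigma * (m - 1 - j))) & mask for j in range(m)]
    enc_d = enc(n, sum(s * s for s in slots))           # single encryption of d
    # First cloud C_A: strip 2*r_j*(x_j - y_j) + r_j^2 for every dimension.
    for cd, rj in zip(diffs, r):
        enc_d = enc_d * pow(cd, (-2 * rj) % n, n2) % n2
    return enc_d * enc(n, -sum(rj * rj for rj in r)) % n2

pk, sk = keygen()
X, Y = (3, 7, 1), (6, 2, 9)      # hypothetical 8-bit attribute values
enc_dist = essed(pk, sk, [enc(pk, x) for x in X], [enc(pk, y) for y in Y])
```

The sketch relies on the identity (a + r)² - 2ra - r² = a², applied per dimension, so the second cloud only ever sees masked differences while the first cloud only ever handles ciphertexts.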

Depending on the embodiment, the k-NN query processing system 100 may process the kNN query using the GSCMP protocol.

Given E(u) and E(v) at the first cloud 150, the k-NN query processing system 100 may use the GSCMP protocol to return E(1) if u<v and E(0) if u>v. As with CMP-S in the prior art, the system may execute the GSCMP protocol through a garbled circuit consisting of two ADD gates and one CMP gate. GSCMP may differ from CMP-S, however, in that data containing random numbers is exchanged between the first cloud 150 and the second cloud 180 while the protocol runs.

The overall procedure of the GSCMP protocol is described below. First, the k-NN query processing system 100 may generate two random numbers r_u and r_v in the first cloud 150. Second, in the first cloud 150, the system may encrypt r_u and r_v and then compute E(m_1) = E(u) × E(r_u)² and E(m_2) = E(v)² × E(1) × E(r_v). Third, the first cloud 150 may randomly select one of the two F (F_0: u>v, F_1: v>u). Which of F_0 and F_1 was selected need not be disclosed to the second cloud 180. Depending on the selected F, the first cloud 150 may then proceed as follows.

If the first cloud 150 selects F_0: u>v, the first cloud 150 may send the encrypted data to the second cloud 180 in the order <E(m_2), E(m_1)>.

If the first cloud 150 selects F_1: u<v, the first cloud 150 may send the encrypted data to the second cloud 180 in the order <E(m_1), E(m_2)>.

Fourth, the second cloud 180 may decrypt the received data. If the first cloud 150 selected F_0: u>v, the second cloud 180 obtains <m_2, m_1>; if the first cloud 150 selected F_1: u<v, the second cloud 180 obtains <m_1, m_2>.

Fifth, the k-NN query processing system 100 may have the first cloud 150 generate a garbled circuit consisting of two ADD gates and one CMP gate. If F_0 was selected, the first cloud 150 may pass -r_v and -r_u to the first ADD gate and the second ADD gate, respectively; if F_1 was selected, it may pass -r_u and -r_v to the first ADD gate and the second ADD gate, respectively.

Sixth, the second cloud 180 may pass the first of the received data items to the first ADD gate and the second to the second ADD gate. Accordingly, the second cloud 180 passes m_2 and m_1 to the first and second ADD gates, respectively, if F_0 was selected, and m_1 and m_2 to the first and second ADD gates, respectively, if F_1 was selected.

Seventh, the first ADD gate adds -r_v and m_2 = v + r_v if F_0 was selected, or -r_u and m_1 = u + r_u if F_1 was selected, and passes the result "result_1" to the CMP gate.

Eighth, the second ADD gate adds -r_u and m_1 = u + r_u if F_0 was selected, or -r_v and m_2 = v + r_v if F_1 was selected, and passes the result "result_2" to the CMP gate. Because the output of each ADD gate is passed on in encoded form owing to the nature of the garbled circuit, no information need be exposed at this point.

Ninth, the k-NN query processing system 100 may cause the CMP gate to return α = 1 if result₁ < result₂, and α = 0 otherwise.

Finally, the output α of the garbled circuit can be read by the second cloud 180, and the k-NN query processing system 100 may have the second cloud 180 encrypt it and send it to the first cloud 150. However, because the second cloud 180 does not know which F the first cloud 150 selected, it cannot determine the outcome of u < v. If F₀ was selected, the first cloud 150 changes the value of E(α) through the SBN protocol, and it ends the GSCMP protocol by returning E(α). Here, E(α) = E(1) means that u < v, but neither the first cloud 150 nor the second cloud 180 can learn the actual value of E(α).
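The gate-level steps of GSCMP can be followed more easily in a plaintext walk-through. The sketch below is only an illustration: it uses ordinary integers instead of ciphertexts and garbled wires, the function name `gscmp_plain` and the 16-bit mask range are hypothetical, SBN is modeled as a plain bit negation (an assumption), and the inputs are assumed to be distinct so that the tie case does not arise.

```python
import random

def gscmp_plain(u, v, rng):
    """Plaintext walk-through of the masked comparison; returns 1 iff u < v.

    Assumes u != v. No encryption or garbling is performed here.
    """
    f = rng.choice([0, 1])                  # C1 secretly selects F0 or F1
    r_u = rng.randrange(1, 2 ** 16)         # random masks known only to C1
    r_v = rng.randrange(1, 2 ** 16)
    m1, m2 = u + r_u, v + r_v               # masked values held by C2

    if f == 0:                              # F0: operands enter in swapped order
        result1 = m2 + (-r_v)               # first ADD gate  -> v
        result2 = m1 + (-r_u)               # second ADD gate -> u
    else:                                   # F1: operands enter in natural order
        result1 = m1 + (-r_u)               # first ADD gate  -> u
        result2 = m2 + (-r_v)               # second ADD gate -> v

    alpha = 1 if result1 < result2 else 0   # CMP gate output, visible to C2
    if f == 0:
        alpha = 1 - alpha                   # C1 negates the bit (stand-in for SBN)
    return alpha
```

Because the second cloud sees only the CMP output and not the choice of F, the bit it observes is equally consistent with u < v and v < u; only the first cloud's final correction restores the intended meaning.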

Depending on the embodiment, the k-NN query processing system 100 may use the GSPE protocol. Given, at the first cloud 150, an m-dimensional point E(p) and encrypted region information range expressed by lower bound points E(lb_j) and upper bound points E(ub_j) (1≤j≤m), the k-NN query processing system 100 may use the GSPE protocol to return E(1) if the point p is contained in the region range, and E(0) if the point p does not overlap the region range. The overall procedure of the GSPE protocol may be as follows.

First, the k-NN query processing system 100 may have the first cloud 150 generate two arrays of random numbers ra_j, rb_j (1≤j≤2m) and then perform data packing through Equation 4 and Equation 5 to compute RA and RB, respectively.

Here, σ may denote the bit length used to represent one data item. The k-NN query processing system 100 may also have the first cloud 150 encrypt RA and RB to generate E(RA) and E(RB).

Next, the k-NN query processing system 100 may have the first cloud 150 multiply the per-dimension lower bound values of the point and the region by 2 and store them in E(μ_j) and E(ξ_j) (1≤j≤m), respectively. This may be done through E(μ_j) ← E(range₁.lb_j)² and E(ξ_j) ← E(range₂.lb_j)². The first cloud 150 may likewise multiply the per-dimension upper bound values of the point and the region by 2, add 1, and store them in E(δ_j) and E(φ_j) (1≤j≤m), respectively. This may be done through E(δ_j) ← E(range₁.ub_j)² × E(1) and E(ρ_j) ← E(range₂.ub_j)² × E(1). In this way, the k-NN query processing system 100 can decide containment correctly even when the two numbers being compared are equal.
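The squaring and the multiplication by E(1) in this step rely on the additive homomorphism of the Paillier cryptosystem, which the description names as the underlying scheme: multiplying two ciphertexts adds their plaintexts. The following toy implementation is a sketch for checking that arithmetic only. The primes, the helper names, and the choice g = N + 1 are assumptions made for illustration, and the parameters are far too small to be secure.

```python
import math
import random

P, Q = 293, 433                       # toy primes, illustration only
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)                  # valid for generator g = N + 1 (Python 3.8+)

def encrypt(m, rng=random):
    """Paillier encryption with g = N + 1: c = (1 + m*N) * r^N mod N^2."""
    while True:
        r = rng.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) * pow(r, N, N2) % N2

def decrypt(c):
    """Paillier decryption: L(c^lambda mod N^2) * mu mod N, with L(x) = (x - 1) / N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def double(c):
    """E(x)^2 = E(2x): used for the lower bound values."""
    return pow(c, 2, N2)

def double_plus_one(c):
    """E(x)^2 * E(1) = E(2x + 1): used for the upper bound values."""
    return pow(c, 2, N2) * encrypt(1) % N2
```

With these encodings, two equal values x become 2x and 2x + 1, so a strict comparison 2x < 2x + 1 still succeeds; this is how equality is treated as containment.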

Next, the k-NN query processing system 100 may have the first cloud 150 randomly select one of the two functionalities F (F₀: u > v, F₁: v > u). The k-NN query processing system 100 need not disclose to the second cloud 180 which of F₀ and F₁ was selected. Depending on the selected F, the first cloud 150 may then perform data packing, dimension by dimension, for the value of p and the upper bound of range₂, as follows.

If F₀: u > v is selected, the packing is given by a pair of expressions (equation images not reproduced in this text).

If F₁: v > u is selected, the packing is given by a corresponding pair of expressions (equation images not reproduced in this text).

That is, when F₀ is selected, the k-NN query processing system 100 may pack the per-dimension values of p with E(RB) and the per-dimension upper bound values of range with E(RA). Conversely, when F₁ is selected, it may pack the per-dimension values of p with E(RA) and the per-dimension upper bound values of range with E(RB). In addition, depending on the selected F, the first cloud 150 may perform data packing, dimension by dimension, for the value of p and the lower bound of range, as follows.

If F₀: u > v is selected, the packing is given by a pair of expressions (equation images not reproduced in this text).

If F₁: v > u is selected, the packing is given by a corresponding pair of expressions (equation images not reproduced in this text).

That is, when F₀ is selected, the k-NN query processing system 100 may pack the per-dimension lower bound values of range with E(RB) and the per-dimension values of p with E(RA). Conversely, when F₁ is selected, it may pack the per-dimension lower bound values of range with E(RA) and the per-dimension values of p with E(RB). Next, the first cloud 150 may send E(RA) and E(RB) to the second cloud 180.

Next, the k-NN query processing system 100 may have the second cloud 180 decrypt E(RA) and E(RB) to obtain RA and RB. The second cloud 180 may unpack RA, through an unpacking expression (equation image not reproduced in this text), to obtain ra_j + u_j (1≤j≤2m). Likewise, the second cloud 180 may unpack RB, through a corresponding expression (equation image not reproduced in this text), to obtain rb_j + v_j (1≤j≤2m). Here, u_j and v_j may denote the values of p and the lower bound and upper bound values of range. These values include random numbers, and the second cloud 180 does not know which F the first cloud 150 selected, so no additional information exposure may occur.
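Since Equations 4 and 5 and the unpacking expressions are not reproduced in this text, the exact slot layout is unknown. The sketch below assumes one common layout, in which the j-th value occupies bits σ·j through σ·(j+1)-1 of a single integer, and assumes every masked sum still fits in σ bits. The names `pack` and `unpack` are hypothetical. It shows why adding two packed integers, which is what a homomorphic addition of E(RA) and a packed operand amounts to, yields the slot-wise sums ra_j + u_j.

```python
def pack(values, sigma):
    """Pack non-negative integers into one integer, sigma bits per slot (assumed layout)."""
    packed = 0
    for j, value in enumerate(values):
        if not 0 <= value < (1 << sigma):
            raise ValueError("value does not fit in a sigma-bit slot")
        packed |= value << (sigma * j)
    return packed

def unpack(packed, sigma, count):
    """Recover `count` slot values from a packed integer."""
    mask = (1 << sigma) - 1
    return [(packed >> (sigma * j)) & mask for j in range(count)]

# Slot-wise masking: adding packed integers adds the slots, provided no slot overflows.
ra = [11, 5, 9, 2]          # random masks chosen by the first cloud
u = [3, 7, 1, 4]            # operand values (plaintext stand-ins)
sigma = 8
masked = unpack(pack(ra, sigma) + pack(u, sigma), sigma, len(ra))
```

Packing lets one ciphertext carry all 2m masked values, so the second cloud performs a single decryption per array instead of one per value.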

Next, the k-NN query processing system 100 may have the first cloud 150 generate a CMP-S circuit. After generating the CMP-S circuit, the first cloud 150 may derive -ra_j, -rb_j (1≤j≤2m) from the ra_j, rb_j (1≤j≤2m) it holds and feed them, in order, as inputs to CMP-S. Meanwhile, the second cloud 180 may feed the ra_j + u_j, rb_j + v_j (1≤j≤2m) it holds, in order, as inputs to CMP-S. The second cloud 180 can read the CMP-S outputs α'_j (1≤j≤2m), and it may encrypt them and send them to the first cloud 150.

Next, the k-NN query processing system 100 may have the first cloud 150 convert the values E(α'_j) (1≤j≤2m) through SBN only if F₀ was selected. The k-NN query processing system 100 may then use the SM (Secure Multiplication) protocol to multiply E(α) by each E(α'_j), with E(α) initially set to E(1).

Finally, the k-NN query processing system 100 may have the first cloud 150 return E(α), ending the GSPE protocol. Here, E(α) = E(1) means that the point p is contained in the region range. However, because neither the first cloud 150 nor the second cloud 180 can learn the actual value of E(α), neither can tell whether the point lies inside the region.
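The logic of GSPE, stripped of encryption, is a conjunction of 2m comparisons accumulated by multiplication. The sketch below is a plaintext illustration under stated assumptions: the name `gspe_plain` is hypothetical, and the particular doubling and add-one encoding is chosen here so that boundary points count as inside, which may differ in detail from the encoding in the omitted equations.

```python
def gspe_plain(p, lb, ub):
    """Return 1 if lb_j <= p_j <= ub_j in every dimension, else 0 (plaintext stand-in)."""
    alpha = 1                                            # E(alpha) starts as E(1)
    for p_j, lb_j, ub_j in zip(p, lb, ub):
        # Doubling one side and adding one turns "<=" into a strict "<",
        # which is the only comparison the CMP circuit provides.
        below_ub = 1 if 2 * p_j < 2 * ub_j + 1 else 0    # p_j <= ub_j
        above_lb = 1 if 2 * lb_j < 2 * p_j + 1 else 0    # lb_j <= p_j
        alpha = alpha * below_ub * above_lb              # stand-in for the SM protocol
    return alpha
```

A single failed comparison forces the product to 0, so the returned bit is 1 only when all 2m comparisons succeed.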

FIGS. 4A to 4D are diagrams illustrating a k-NN query processing algorithm according to an embodiment of the present invention.

As its k-NN query processing algorithm, the k-NN query processing system 100 may process a k-NN query through an encrypted index search step 410, a k-NN search step 420, and a query result verification step 430.

First, the k-NN query processing system 100 may search the encrypted index through the following process (410).

Referring to FIG. 4B, in step 411, the k-NN query processing system 100 may have the first cloud 150 perform the GSPE protocol on E(q) and E(node_z) (1≤z≤num_node) to find the nodes that contain the query point. A node for which the returned E(α_z) equals E(1) may be a node containing the query point. However, neither the first cloud 150 nor the second cloud 180 can tell which node overlaps the query region. This is because the k-NN query processing system 100 may encrypt the database with the Paillier cryptosystem, which provides semantic security.

In step 412, the k-NN query processing system 100 may have the first cloud 150 generate a permutation function π, reorder E(α) with it, and send the result to the second cloud 180.

In step 413, the k-NN query processing system 100 may have the second cloud 180 decrypt E(α), count the number c of ones, and create c node groups Group. The second cloud 180 may assign to each node group one node whose α value is 1 and (num_node/c)-1 nodes whose α value is 0. It may then randomly shuffle the order of the nodes assigned to each node group and send the groups to the first cloud 150.

In step 414, the k-NN query processing system 100 may have the first cloud 150 use the inverse permutation function π^-1 to restore the identification numbers of the nodes in each node group.

In step 415, the k-NN query processing system 100 may have the first cloud 150 perform the SM protocol on the data stored in the nodes of each node group and the E(α) of each node returned by the GSPE protocol, and, using the homomorphic encryption property, store the data held in the nodes relevant to the query in E(cand).

In step 416, the k-NN query processing system 100 may end the encrypted index search by returning E(cand).
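Steps 411 to 416 can be traced with plain values to see how the grouping hides which node matched while still extracting its data. The sketch below is a simplified illustration: the name `search_index_plain` is hypothetical, node contents are scalars rather than encrypted m-dimensional points, every node is assumed to hold the same number of items, and it is assumed that at least one node matches and that num_node is divisible by c.

```python
import random

def search_index_plain(alpha, node_data, rng):
    """Plaintext trace of the index search; returns the data of all matching nodes."""
    num_node = len(alpha)

    # Step 412 (first cloud): permute the containment bits before sending them out.
    perm = list(range(num_node))
    rng.shuffle(perm)                         # perm[i] = original id at position i
    permuted = [alpha[perm[i]] for i in range(num_node)]

    # Step 413 (second cloud): one matching node plus (num_node/c)-1 decoys per group.
    ones = [i for i, a in enumerate(permuted) if a == 1]
    zeros = [i for i, a in enumerate(permuted) if a == 0]
    rng.shuffle(zeros)
    decoys = num_node // len(ones) - 1
    groups = []
    for g, one in enumerate(ones):
        members = [one] + zeros[g * decoys:(g + 1) * decoys]
        rng.shuffle(members)                  # hide the matching node's position
        groups.append(members)

    # Step 414 (first cloud): map permuted positions back to real node ids.
    groups = [[perm[i] for i in members] for members in groups]

    # Step 415 (first cloud): per slot, sum alpha * data over the group.
    # Decoys contribute 0, so only the matching node's items survive.
    cand = []
    for members in groups:
        fanout = len(node_data[members[0]])
        for s in range(fanout):
            cand.append(sum(alpha[z] * node_data[z][s] for z in members))
    return cand
```

In the encrypted protocol the multiplication by α is carried out with the SM protocol and the sum with homomorphic addition, so the first cloud never learns which member of a group contributed the data.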

Returning to FIG. 4A, the k-NN query processing system 100 may perform the k-NN search step through the following process (420). In the k-NN search step, the k-NN query processing system 100 may search, among the data extracted in the encrypted index search step 410, for the k data items closest to the query. It may do so by partially using the SkNN_m algorithm, performing the k-NN search over the cnt data items returned by the encrypted index search step 410. In place of the computationally expensive SBD, SMIN and SMIN_n protocols, the k-NN query processing system 100 may use efficient protocols based on data packing and garbled circuits (namely, ESSED and SMS_n).

Referring to FIG. 4C, in step 421, the k-NN query processing system 100 may have the first cloud 150 use the ESSED protocol to compute the squared Euclidean distances E(d_i) (1≤i≤cnt) between the query E(q) and the cnt encrypted data items E(cand_i) returned by the encrypted index search.

In step 422, the k-NN query processing system 100 may have the first cloud 150 find, through SMS_n, the minimum value E(d_min) among the encrypted distances (E(d_i) | 1≤i≤cnt). The first cloud 150 may compute the difference between E(d_min) and E(d_i) as E(d_min) × E(d_i)^(N-1) (1≤i≤cnt) and store the result in E(τ_i). The first cloud 150 may then multiply E(τ_i) by an encrypted random number to produce a blinded E(τ_i). In addition, the first cloud 150 may apply an arbitrary permutation function π to E(τ) to generate E(β) and send it to the second cloud 180.

In step 423, the k-NN query processing system 100 may have the second cloud 180 decrypt each element of the received E(β) and set E(U_i) = E(1) if the value of D(β_i) is 0, and E(U_i) = E(0) otherwise. The second cloud 180 may then send E(U) to the first cloud 150.

In step 424, the k-NN query processing system 100 may have the first cloud 150 reverse the permutation of the E(U) received from the second cloud 180 using π^-1 and store the result in E(V). The k-NN query processing system 100 may then perform the SM protocol on E(V_i) (1≤i≤cnt) and the E(cand_i,j) (1≤i≤cnt, 1≤j≤m) returned by the encrypted index search, and store the result in E(V_i,j). Next, using the homomorphic encryption property, the first cloud 150 may sum the values of E(V_i,j) for each dimension through Equation 6.

In step 425, if the k query results requested by the user have not yet all been found, the k-NN query processing system 100 must prevent the E(t_s) selected as a k-NN result from being selected again in subsequent rounds. To this end, the first cloud 150 may perform Equation 7 to update the value of each E(d_i) (1≤i≤cnt).

Here, E(max) may denote the maximum value of the data domain. Because a data item selected as a k-NN result has E(V_i) = E(1), Equation 7 changes its distance to E(d_i) = E(max). The remaining data items have E(V_i) = E(0), so their E(d_i) values are left unchanged. In this way, the k-NN query processing system 100 can prevent encrypted data already selected as a k-NN result from being selected again.

The k-NN query processing system 100 may repeat the above process until k data items have been found, and in step 426 the first cloud 150 may return the k query results found, ending the algorithm.
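The selection loop of steps 421 to 426 can be sketched in plaintext to show how a one-hot vector picks out the nearest candidate without revealing its index. This is an illustration under several assumptions: the name `knn_select_plain` is hypothetical, candidate distances are assumed distinct, the blinding is modeled as multiplication of the difference by a nonzero random number (so that a zero difference stays zero), and the update d_i ← V_i·max + (1-V_i)·d_i is one form consistent with the described behavior of Equation 7, which is not reproduced in this text.

```python
import random

def knn_select_plain(query, cand, k, max_value, rng):
    """Plaintext trace of the k-NN selection loop; returns the k nearest candidates."""
    cnt, m = len(cand), len(query)
    # Step 421: squared Euclidean distances (stand-in for ESSED).
    d = [sum((a - b) ** 2 for a, b in zip(query, c)) for c in cand]
    results = []
    for _ in range(k):
        d_min = min(d)                                        # stand-in for SMS_n
        # Step 422: blinded differences, then a fresh permutation.
        tau = [rng.randrange(1, 1000) * (d_min - d_i) for d_i in d]
        perm = list(range(cnt))
        rng.shuffle(perm)
        beta = [tau[perm[i]] for i in range(cnt)]
        # Step 423 (second cloud): mark the positions that decrypt to zero.
        U = [1 if b == 0 else 0 for b in beta]
        # Step 424 (first cloud): undo the permutation, then select by weighted sum.
        V = [0] * cnt
        for i in range(cnt):
            V[perm[i]] = U[i]
        results.append(tuple(sum(V[i] * cand[i][j] for i in range(cnt))
                             for j in range(m)))
        # Step 425: push the selected item's distance to the domain maximum.
        d = [V[i] * max_value + (1 - V[i]) * d[i] for i in range(cnt)]
    return results
```

The second cloud sees only blinded, permuted differences, so it learns that one position is the minimum but not which candidate that position corresponds to.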

Returning to FIG. 4A, the k-NN query processing system 100 may perform the query result verification step, by means of a node expansion search, through the following process (430).

The result of the k-NN search may have been obtained from the data of only some of the nodes produced by the kd-tree partitioning. A process may therefore be required to verify whether a neighboring kd-tree node holds data closer to the query. To this end, the k-NN query processing system 100 may search for nodes that lie closer to the query point than dist_k, the distance to the k-th result among the results E(t) returned by the k-NN search step 420. That is, the k-NN query processing system 100 may search for nodes whose shortest distance from the query point is smaller than dist_k. For this purpose, it may use the shortest point of Definition 1.

Definition 1: given a point p and a region, the shortest point sp may be the point that has the shortest distance to p among all points in the region.

The properties of the shortest point are described with reference to FIG. 5. FIG. 5 is a diagram illustrating point-region relationships in a one-dimensional space according to an embodiment of the present invention.

As shown in FIG. 5, given a one-dimensional point p = 3 and three regions (range₁, range₂, range₃), the positional relationship between the point and a region falls into three broad cases.

i) When, as with range₁, both the lower bound value (for example, 0) and the upper bound value (for example, 2) of the region are smaller than the value of the point p (for example, 3), the shortest point of range₁ with respect to p is the upper bound of the region. ii) When, as with range₂, both the lower bound value (for example, 4) and the upper bound value (for example, 6) of the region are larger than the value of the point p (for example, 3), the shortest point of range₂ with respect to p is the lower bound of the region. iii) When, as with range₃, the value of the point p (for example, 3) lies between the lower bound value (for example, 2) and the upper bound value (for example, 4) of the region, the shortest point of range₃ with respect to p is p itself. In the present application, this property is extended to multidimensional space.
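The three cases amount to clamping each coordinate of p to the interval spanned by the region, and the multidimensional extension applies the same rule per dimension. The following plaintext sketch reproduces the FIG. 5 examples; the function names are hypothetical and no encryption is involved.

```python
def shortest_point(p, lb, ub):
    """Clamp each coordinate of p to [lb_j, ub_j]; this is the region's closest point to p."""
    return tuple(min(max(p_j, lb_j), ub_j) for p_j, lb_j, ub_j in zip(p, lb, ub))

def min_sq_dist(p, lb, ub):
    """Squared Euclidean distance from p to the region's shortest point."""
    sp = shortest_point(p, lb, ub)
    return sum((a - b) ** 2 for a, b in zip(p, sp))

# FIG. 5 examples with p = 3:
case_i = shortest_point((3,), (0,), (2,))     # range1 lies below p -> its upper bound
case_ii = shortest_point((3,), (4,), (6,))    # range2 lies above p -> its lower bound
case_iii = shortest_point((3,), (2,), (4,))   # range3 contains p   -> p itself
```

Comparing `min_sq_dist` for a node against the squared distance to the current k-th result is what decides whether that node could hold a closer data item.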

Based on this property, the k-NN query processing system 100 may find the shortest point of each node with respect to a query over data encrypted with the Paillier cryptosystem, and verify the query result, through the following process.

In step 431, the k-NN query processing system 100 may have the first cloud 150 use the ESSED protocol to compute the distance E(dist_k) between the query E(q) and E(t_k).

In step 432, the k-NN query processing system 100 may have the first cloud 150 perform the GSCMP protocol between E(q_j) and the lower bound of each node E(node_z.lb_j) (1≤z≤num_node, 1≤j≤m) and store the result in E(ψ₁). It may likewise perform the GSCMP protocol between E(q_j) and the upper bound of each node E(node_z.ub_j) (1≤z≤num_node, 1≤j≤m) and store the result in E(ψ₂). When E(q_j) is less than or equal to the lower bound or the upper bound of the node, the corresponding E(ψ) takes the value E(1).

In step 433, the k-NN query processing system 100 may perform the SBXOR (Secure Bit-XOR) protocol on E(ψ₁) and E(ψ₂) and store the result in E(ψ₃).

In step 434, the k-NN query processing system 100 may perform Equation 8 and Equation 9 to compute the shortest point E(sp_z,j) in each dimension.
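Equations 8 and 9 are not reproduced in this text, so their exact form is unknown. The sketch below shows one arithmetic selection that is consistent with steps 432 to 434 and that uses only additions and multiplications by bits, which are the operations available through homomorphic addition and the SM protocol. The formula and the name `shortest_coord_bits` are assumptions made for illustration.

```python
def shortest_coord_bits(q, lb, ub):
    """Select the shortest coordinate from comparison bits, using only + and *."""
    psi1 = 1 if q <= lb else 0          # step 432: q compared with the lower bound
    psi2 = 1 if q <= ub else 0          # step 432: q compared with the upper bound
    psi3 = psi1 ^ psi2                  # step 433: 1 exactly when lb < q <= ub
    # Outside the interval, psi1 says which bound is nearer:
    # psi1 = 1 -> q is at or below lb, psi1 = 0 -> q is above ub.
    outside = psi1 * lb + (1 - psi1) * ub
    return psi3 * q + (1 - psi3) * outside
```

For every q this returns the same value as clamping q to [lb, ub], so the bit-based selection agrees with the three cases of FIG. 5.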

In step 435, once the shortest point E(sp_z) (1≤z≤num_node) of every node with respect to E(q) has been found, the k-NN query processing system 100 may have the first cloud 150 use the ESSED protocol to compute the squared Euclidean distance between E(q) and each E(sp_z) and store it in E(spdist_z) (1≤z≤num_node). The first cloud 150 may also use Equation 10 to securely change the distance to the shortest point of each node that has already been searched to E(max), the maximum value in the domain.

Here, E(α_z) is the value returned by the GSPE protocol in step 411; nodes that have already been searched have E(α_z) = E(1), and all other nodes have E(α_z) = E(0).

Thereafter, the k-NN query processing system 100 may have the first cloud 150 perform the GSCMP protocol on E(spdist_z) and E(dist_k) and store the result in E(α_z). A node whose E(spdist_z) is smaller than E(dist_k) may be a node that requires further search, and for such a node the GSCMP protocol returns E(α_z) = E(1). For nodes that have already been searched, E(spdist_z) holds the maximum value in the domain, so they are not selected as expansion nodes for query result verification.
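The masking and comparison in step 435 can be sketched with plain values. Equation 10 is not reproduced in this text; the update spdist_z ← α_z·max + (1-α_z)·spdist_z used below is an assumed form that matches the described behavior, and the name `nodes_to_expand` is hypothetical.

```python
def nodes_to_expand(spdist, searched, dist_k, max_value):
    """Return one bit per node: 1 if the node must be searched during verification."""
    # Already-searched nodes (bit 1) get the domain maximum, so they can never qualify.
    masked = [a * max_value + (1 - a) * d for d, a in zip(spdist, searched)]
    # Stand-in for GSCMP: a node qualifies when its shortest distance is below dist_k.
    return [1 if d < dist_k else 0 for d in masked]
```

For example, with shortest distances `[0, 4, 9, 25]`, the first node already searched, and `dist_k = 10`, only the second and third nodes are marked for expansion.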

In step 436, the k-NN query processing system 100 may re-execute the encrypted index search step 410 to extract all data belonging to the nodes that lie within distance dist_k of E(q) and add them to E(t). The k-NN query processing system 100 may then re-execute the k-NN search step 420 over E(t) to obtain the final query result E(result_i) (1≤i≤k).

In step 437, the k-NN query processing system 100 may have the first cloud 150 generate random numbers r_i,j and then compute E(γ_i,j) = E(result_i,j) × E(r_i,j) (1≤i≤k, 1≤j≤m). The first cloud 150 may then send E(γ_i,j) to the second cloud 180 and r_i,j to the user terminal 190.

In step 438, the k-NN query processing system 100 may have the second cloud 180 decrypt the received E(γ_i,j) to obtain γ_i,j and send it to the user terminal 190.

In step 439, the k-NN query processing system 100 may have the user terminal 190 compute γ_i,j - r_i,j (1≤i≤k, 1≤j≤m) from the γ_i,j and r_i,j received from the second cloud 180 and the first cloud 150, respectively, thereby obtaining the actual query result.
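Steps 437 to 439 form an additive blinding: the second cloud sees only result + r, the first cloud knows only r, and the user combines the two. The sketch below models this with plain integers. The function names are hypothetical, and reducing modulo n is an assumption that mirrors the plaintext space of a Paillier-style scheme, where the homomorphic sum wraps around the modulus.

```python
import random

def blind(result, n, rng):
    """First cloud: add a fresh random r_ij to every coordinate (plaintext stand-in)."""
    r = [[rng.randrange(n) for _ in row] for row in result]
    gamma = [[(x + r_ij) % n for x, r_ij in zip(row, r_row)]
             for row, r_row in zip(result, r)]
    return gamma, r          # gamma goes to the second cloud, r to the user

def unblind(gamma, r, n):
    """User terminal: subtract r_ij from gamma_ij to recover the result."""
    return [[(g - r_ij) % n for g, r_ij in zip(g_row, r_row)]
            for g_row, r_row in zip(gamma, r)]
```

Since each r_ij is uniform over the plaintext space, the value the second cloud decrypts is statistically independent of the actual result coordinate.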

Hereinafter, the workflow of the k-NN query processing system 100 according to embodiments of the present invention is described in detail with reference to FIG. 6.

FIG. 6 is a flowchart illustrating the sequence of a garbled-circuit-based k-NN query processing method according to an embodiment of the present invention.

The garbled-circuit-based k-NN query processing method according to this embodiment may be performed by the k-NN query processing system 100 described above.

First, the k-NN query processing system 100 builds a first cloud and a second cloud that is independent of (non-colluding with) the first cloud (610). That is, in step 610, the k-NN query processing system 100 may ensure that, when executing the cryptographic protocols to process a user query, neither cloud colludes with the other to exchange data and information in order to derive additional information from what it has learned during query processing.

Next, the k-NN query processing system 100 maintains, in the first cloud, an encrypted database in which the data stored in the original database is encrypted, together with an encryption public key generated in association with the encryption (620). That is, in step 620, the k-NN query processing system 100 may maintain in the first cloud the encrypted database and the encryption public key associated with the encrypted database.

Next, the k-NN query processing system 100 maintains, in the second cloud, a decryption private key corresponding to the encryption public key (630). That is, in step 630, the k-NN query processing system 100 may maintain in the second cloud the decryption private key corresponding to the encryption public key.

Next, the k-NN query processing system 100 determines whether a kNN (k Nearest Neighbor) query is generated at a user terminal to which the encryption public key has been distributed (640). That is, in step 640, the k-NN query processing system 100 may determine whether a kNN query, encrypted with the encryption public key that was used to encrypt the database, is received from the user terminal. For example, the terminal may request a user query by encrypting the query point with the encryption public key 130, for example as E(q_j) (1 ≤ j ≤ m).

Next, the k-NN query processing system 100 performs multiparty computation between the first cloud and the second cloud based on the encryption public key and the decryption private key, derives result data for the kNN query from the encrypted database, and provides the result data to the user terminal (650). That is, in step 650, when a user query is received from the user terminal, the k-NN query processing system 100 may process the kNN query by performing multiparty computation between the first cloud and the second cloud based on a selected cryptographic operation protocol.

Here, multiparty computation may refer to securely executing protocols and operations through other entities (the first cloud and the second cloud) without exposing the original data held by the data owner.

Depending on the embodiment, the k-NN query processing system 100 may construct a kd tree by partitioning the data stored in the original database along a plurality of attributes and dimensions (columns), and may maintain, in the first cloud, an encrypted kd tree obtained by encrypting the kd tree. That is, the k-NN query processing system 100 may partition the data stored in the database into units of a predetermined number (for example, F) and build a kd tree having a plurality of terminal (leaf) nodes, each containing partitioned data. The k-NN query processing system 100 may also maintain the encrypted kd tree in the first cloud.

For example, the k-NN query processing system 100 may construct from the database a kd tree that has h levels and a total of 2^(h-1) leaf nodes, where each leaf node can store at most F (fan-out) data items.

Each leaf node of the kd tree may store, in plaintext form, region information about the node region it covers and the data IDs of the data contained in that node region. Here, the region information may include, for each of the m attributes, a lower bound lb_{z,j} and an upper bound ub_{z,j} of the node region (1 ≤ z ≤ num_node, 1 ≤ j ≤ m).
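
A minimal plaintext sketch of such a leaf layout follows, in Python. The median split that cycles through the attributes, the `Leaf` class, and the function name `build_leaves` are assumptions made for illustration; the source does not state how split positions are chosen, and the encryption of the tree is omitted.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    lb: list   # lower bound of the node region, one value per attribute
    ub: list   # upper bound of the node region, one value per attribute
    ids: list  # plaintext IDs of the data items inside the region


def build_leaves(points, fanout, lb=None, ub=None, ids=None, depth=0):
    """Recursively split the data until every leaf holds at most `fanout` items."""
    m = len(points[0])
    if ids is None:
        # Root call: start from the bounding box of the whole data set.
        ids = list(range(len(points)))
        lb = [min(p[j] for p in points) for j in range(m)]
        ub = [max(p[j] for p in points) for j in range(m)]
    if len(ids) <= fanout:
        return [Leaf(list(lb), list(ub), sorted(ids))]
    axis = depth % m  # assumed policy: cycle through the attributes
    ordered = sorted(ids, key=lambda i: points[i][axis])
    mid = len(ordered) // 2
    split = points[ordered[mid]][axis]
    left_ub = list(ub)
    left_ub[axis] = split
    right_lb = list(lb)
    right_lb[axis] = split
    return (build_leaves(points, fanout, lb, left_ub, ordered[:mid], depth + 1)
            + build_leaves(points, fanout, right_lb, ub, ordered[mid:], depth + 1))


# Hypothetical data: 8 two-dimensional points, fan-out F = 2.
points = [(1, 1), (2, 5), (3, 3), (4, 8), (5, 2), (6, 7), (7, 4), (8, 6)]
leaves = build_leaves(points, fanout=2)
```

With this sample data the sketch produces four leaves of two items each, which corresponds to a tree of h = 3 levels.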

In this case, in step 650, when deriving the result data and providing it to the user terminal, the k-NN query processing system 100 may process the kNN query over the encrypted database using the encrypted kd tree, based on the selected cryptographic operation protocol. That is, the k-NN query processing system 100 may process the kNN query by searching the encrypted index based on the encrypted kd tree and by searching adjacent kd tree nodes.

Also, in step 650, the k-NN query processing system 100 may select any one of the ESSED (Enhanced Secure Squared Euclidean Distance) protocol, the GSCMP (Garbled Circuit based Secure Compare) protocol, and the GSPE (Garbled Circuit based Secure Point Enclosure) protocol as the cryptographic operation protocol. For example, the k-NN query processing system 100 may use the ESSED protocol to compute the squared distance E(|X-Y|²) between the vectors E(X) and E(Y). The k-NN query processing system 100 may use the GSCMP protocol so that, when E(u) and E(v) are given to the first cloud 150, E(1) is returned if u < v and E(0) is returned if u > v. The k-NN query processing system 100 may use the GSPE protocol so that, when the first cloud 150 is given an m-dimensional point E(p) and encrypted region information 'range' expressed by lower bounds E(lb_j) and upper bounds E(ub_j) (1 ≤ j ≤ m), E(1) is returned if the point p is contained in the region range.
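
The three protocols operate on ciphertexts, but the functions they evaluate can be stated on plaintext values. The Python sketch below shows only that plaintext functionality; it performs no encryption and builds no garbled circuit. The function names are hypothetical, and two details are assumptions not fixed by the text: the comparison returns 0 when u equals v, and the region bounds are treated as inclusive.

```python
def essed_plain(x, y):
    """Squared Euclidean distance |X - Y|^2, the value ESSED yields in encrypted form."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def gscmp_plain(u, v):
    """1 if u < v, otherwise 0 (the u == v case is an assumption)."""
    return 1 if u < v else 0


def gspe_plain(p, lb, ub):
    """1 if point p lies inside the region [lb_j, ub_j] in every dimension, else 0."""
    inside = all(lo <= x <= hi for x, lo, hi in zip(p, lb, ub))
    return 1 if inside else 0
```

For instance, `essed_plain([1, 2], [4, 6])` evaluates to 25, and `gspe_plain([2, 3], [1, 1], [4, 4])` evaluates to 1.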

Depending on the embodiment, when the ESSED protocol is selected as the cryptographic operation protocol, in step 650 the k-NN query processing system 100 may perform the summation of the per-dimension distances for an arbitrary data pair in the encrypted kd tree in two dimensions, thereby reducing the number of operations on encrypted data required to derive the result data. That is, the k-NN query processing system 100 may use the ESSED protocol to compute the squared Euclidean distances E(d_i) (1 ≤ i ≤ cnt) between the query E(q) and the cnt encrypted data items E(cand_i) returned by the encrypted index search. The k-NN query processing system 100 may use the ESSED protocol both in the kNN search and in verifying the query result.

Depending on the embodiment, when the GSCMP protocol is selected as the cryptographic operation protocol, in step 650 the k-NN query processing system 100 may exchange random numbers between the first cloud and the second cloud and determine the value of the returned data according to a magnitude comparison for an arbitrary data pair in the encrypted kd tree. For example, in the query result verification process, the k-NN query processing system 100 may perform the GSCMP protocol between E(q_j) and the node's lower bound E(node_z.lb_j), and between E(q_j) and the node's upper bound E(node_z.ub_j), and store the results in E(ψ₁) and E(ψ₂), respectively. Here, the k-NN query processing system 100 may return E(1) if ψ₁ < ψ₂ and E(0) if ψ₁ > ψ₂. In doing so, the k-NN query processing system 100 may exchange data that contains random numbers.

Depending on the embodiment, when the GSPE protocol is selected as the cryptographic operation protocol, in step 650 the k-NN query processing system 100 may return 'E(1)' as the result of the GSPE protocol if the m-dimensional data E(p) in the encrypted kd tree is contained in the query region associated with the kNN query, and may return 'E(0)' as the result of the GSPE protocol if the data E(p) is not contained in the query region. For example, while searching the encrypted index, the k-NN query processing system 100 may find the node that contains the query point by performing the GSPE protocol on E(q) and E(node_z) (1 ≤ z ≤ num_node). Here, a node for which the value E(α_z) returned by GSPE is E(1) may be a node that contains the query point. In doing so, the k-NN query processing system 100 may prevent both the first cloud and the second cloud from learning which node's region overlaps the query region.

Depending on the embodiment, in step 650 the k-NN query processing system 100 may, in the first cloud, search the encrypted database for a plurality of data E(a) related to the point of the kNN query based on the GSPE protocol and transmit them to the second cloud; in the second cloud, decrypt each of the plurality of data E(a) into a plurality of data E'(a) using the decryption private key and create node groups each containing the plurality of data E'(a); and, in the first cloud, receive the node groups from the second cloud in a predetermined order and perform multiparty computation between the first cloud and the second cloud based on the SM protocol, using the data E'(a) stored in the node groups and the data E(a).

That is, for the encrypted index search, the k-NN query processing system 100 may find the node E(a) that contains the query point by performing the GSPE protocol in the first cloud. The k-NN query processing system 100 may then permute the order of E(α), transmit it to the second cloud 180, decrypt it in the second cloud, and create c node groups (Group). The k-NN query processing system 100 may also randomly shuffle the order of the nodes assigned to each node group and then transmit the groups to the first cloud 150. In the first cloud, the k-NN query processing system 100 may invert the permutation of the identification numbers of the nodes belonging to each node group, and perform the SM protocol using the data stored in the nodes of each node group and the E(α) of each node. The k-NN query processing system 100 may then finish the encrypted index search by returning E(cand).

Depending on the embodiment, when a plurality of result data are derived, in step 650 the k-NN query processing system 100 may select, within the kd tree, a predetermined number of result data from among the plurality of result data in order of increasing distance from the kNN query, and provide them to the user terminal. That is, through the kNN search process, the k-NN query processing system 100 may find the k data items closest to the query among the data extracted in the encrypted index search step and provide them to the user terminal.
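
The selection of the k closest candidates can be sketched on plaintext values as follows. This Python example only illustrates the selection criterion; in the described system the distances are computed and compared on encrypted data, which is not reproduced here. The function name `k_nearest` and the sample points are hypothetical.

```python
import heapq


def k_nearest(query, candidates, k):
    """Return the k candidates closest to `query`, nearest first."""
    def sq_dist(p):
        # Squared Euclidean distance preserves the ordering of true distances.
        return sum((a - b) ** 2 for a, b in zip(query, p))
    return heapq.nsmallest(k, candidates, key=sq_dist)


# Hypothetical candidates returned by the index search.
candidates = [(3, 4), (1, 1), (0, 2), (5, 5)]
nearest_two = k_nearest((0, 0), candidates, 2)
```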

Depending on the embodiment, the k-NN query processing system 100 may construct a kd tree by partitioning the data stored in the original database along a plurality of attributes and dimensions, and may verify the result data by checking whether the kd tree contains data whose distance from the kNN query is shorter than the distance to the result data.

That is, the k-NN query processing system 100 may verify the query result through a node expansion search. The result of the kNN query search may have been found using only the data of some of the nodes partitioned by the kd tree. A process may therefore be required to verify whether data closer to the query exists in adjacent kd tree nodes. To address this, the k-NN query processing system 100 may search for nodes that lie closer to the query point than the distance dist_k to the k-th result among the results E(t) returned by the k-NN search step. That is, the k-NN query processing system 100 may perform a search to find nodes whose shortest distance from the query point is smaller than dist_k. For this purpose, the k-NN query processing system 100 may use the shortest-distance point of Definition 1: given a point p and a region, the point that has the shortest distance to p among all points in the region.
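
For an axis-aligned node region, the shortest-distance point can be obtained by clamping each coordinate of p to the region bounds. The Python sketch below shows this on plaintext values, together with the resulting test for which nodes need to be expanded. The function names are hypothetical, and comparing squared distances against a squared dist_k is an assumption chosen to match the squared distances produced by ESSED; the encrypted version of this step is not reproduced.

```python
def nearest_point_in_region(p, lb, ub):
    """Point of the region [lb, ub] closest to p: clamp each coordinate."""
    return [min(max(x, lo), hi) for x, lo, hi in zip(p, lb, ub)]


def min_sq_dist(p, lb, ub):
    """Squared distance from p to the closest point of the region."""
    nearest = nearest_point_in_region(p, lb, ub)
    return sum((a - b) ** 2 for a, b in zip(p, nearest))


def nodes_to_expand(q, nodes, dist_k_sq):
    """Indices of nodes that may hold data closer to q than the k-th result."""
    return [z for z, (lb, ub) in enumerate(nodes)
            if min_sq_dist(q, lb, ub) < dist_k_sq]


# Hypothetical node regions given as (lower bound, upper bound) pairs.
nodes = [((1, 1), (2, 2)), ((3, 0), (5, 1)), ((-1, -1), (1, 1))]
expand = nodes_to_expand((0, 0), nodes, dist_k_sq=4)
```

A node whose minimum distance is not smaller than dist_k cannot contain a closer data item, so only the returned nodes need to be examined.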

Depending on the embodiment, in step 650 the k-NN query processing system 100 may transfer to the second cloud a node found in the first cloud by performing a cryptographic operation protocol (first step), and may transfer to the first cloud the result of performing, in the second cloud, a cryptographic operation protocol on the data obtained from the node using the decryption private key (second step). The k-NN query processing system 100 may process the kNN query by repeating the first and second steps until a result for the kNN query is derived. That is, the k-NN query processing system 100 may search for a node using a cryptographic operation protocol, perform a cryptographic operation protocol on the data contained in the node, and verify whether closer data exists in the kd tree nodes, repeating these steps until a meaningful result for the kNN query is derived.

This garbled circuit based k-NN query processing method performs at least one cryptographic operation protocol among the ESSED, GSCMP, and GSPE protocols, which are based on garbled circuits and a data packing technique, thereby reducing the number of operations and providing efficient query processing performance.

In addition, the garbled circuit based k-NN query processing method provides a k-NN query processing algorithm that supports an encrypted index search based on the improved cryptographic operation protocols and protection of data access patterns on the encrypted database. It thereby prevents the exposure of additional information and supports not only data protection and user query protection but also protection of data access patterns during query processing.

The method according to an embodiment of the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described above with reference to limited embodiments and drawings, those of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims set forth below.

100: Garbled circuit based k-NN query processing system
110: Database 120: kd tree
130: Encryption public key 140: Decryption private key
150: First cloud 160: Encrypted database
170: Encrypted kd tree 180: Second cloud
190: User terminal

Claims

A method for processing a k-NN query based on a garbled circuit, the method comprising:
constructing a first cloud and a second cloud that is non-colluding with the first cloud;
maintaining, in the first cloud, an encrypted database in which data stored in an original database is encrypted and an encryption public key generated in association with the encryption;
maintaining, in the second cloud, a decryption private key corresponding to the encryption public key; and
when a kNN (k Nearest Neighbor) query is generated at a user terminal to which the encryption public key has been distributed, performing secure multiparty computation (SMC) between the first cloud and the second cloud based on the encryption public key and the decryption private key, thereby deriving result data for the kNN query from the encrypted database and providing the result data to the user terminal.

The method of claim 1, further comprising:
constructing a kd tree by partitioning the data stored in the original database along a plurality of attributes and dimensions (columns); and
maintaining, in the first cloud, an encrypted kd tree obtained by encrypting the kd tree,
wherein deriving the result data and providing the result data to the user terminal comprises:
processing the kNN query over the encrypted database using the encrypted kd tree, based on a selected cryptographic operation protocol.

The method of claim 2,
wherein deriving the result data and providing the result data to the user terminal further comprises:
selecting any one of the ESSED (Enhanced Secure Squared Euclidean Distance) protocol, the GSCMP (Garbled Circuit based Secure Compare) protocol, and the GSPE (Garbled Circuit based Secure Point Enclosure) protocol as the cryptographic operation protocol.

The method of claim 2,
wherein, when the ESSED protocol is selected as the cryptographic operation protocol,
deriving the result data and providing the result data to the user terminal further comprises:
performing the summation of the per-dimension distances for an arbitrary data pair in the encrypted kd tree in two dimensions, thereby reducing the number of operations on encrypted data required to derive the result data.

The method of claim 2,
wherein, when the GSCMP protocol is selected as the cryptographic operation protocol,
deriving the result data and providing the result data to the user terminal further comprises:
exchanging random numbers between the first cloud and the second cloud to determine the value of the returned data according to a magnitude comparison for an arbitrary data pair in the encrypted kd tree.

The method of claim 2,
wherein, when the GSPE protocol is selected as the cryptographic operation protocol,
deriving the result data and providing the result data to the user terminal further comprises:
returning 'E(1)' as the result of performing the GSPE protocol if the m-dimensional data E(p) in the encrypted kd tree is contained in the query region associated with the kNN query; and
returning 'E(0)' as the result of performing the GSPE protocol if the data E(p) is not contained in the query region.

The method of claim 1,
wherein deriving the result data and providing the result data to the user terminal comprises:
searching, in the first cloud, the encrypted database for a plurality of data E(a) related to the point of the kNN query based on the GSPE protocol, and transmitting the plurality of data E(a) to the second cloud;
decrypting, in the second cloud, each of the plurality of data E(a) into a plurality of data E'(a) using the decryption private key, and creating node groups each containing the plurality of data E'(a); and
receiving, in the first cloud, the node groups from the second cloud in a predetermined order, and performing multiparty computation between the first cloud and the second cloud based on the SM protocol, using the data E'(a) stored in the node groups and the data E(a).

The method of claim 1, further comprising:
constructing a kd tree by partitioning the data stored in the original database along a plurality of attributes and dimensions,
wherein, when a plurality of result data are derived,
deriving the result data and providing the result data to the user terminal comprises:
selecting, within the kd tree, a predetermined number of result data from among the plurality of result data in order of increasing distance from the kNN query, and providing the selected result data to the user terminal.

The method of claim 1, further comprising:
constructing a kd tree by partitioning the data stored in the original database along a plurality of attributes and dimensions; and
verifying the result data by checking whether the kd tree contains data whose distance from the kNN query is shorter than the distance to the result data.

The method of claim 1,
wherein deriving the result data and providing the result data to the user terminal comprises:
a first step of transferring, to the second cloud, a node found in the first cloud by performing a cryptographic operation protocol;
a second step of transferring, to the first cloud, a result of performing, in the second cloud, a cryptographic operation protocol on data obtained from the node using the decryption private key; and
processing the kNN query by repeating the first and second steps until a result for the kNN query is derived.