KR20190063204A

KR20190063204A - Method and system for similarity search over encrypted data in cloud computing

Info

Publication number: KR20190063204A
Application number: KR1020170162121A
Authority: KR
Inventors: 한창희; 허준범
Original assignee: 고려대학교 산학협력단
Priority date: 2017-11-29
Filing date: 2017-11-29
Publication date: 2019-06-07
Also published as: KR102050888B1

Abstract

Disclosed is a similarity search method over encrypted data in a cloud computing environment that guarantees data confidentiality. The similarity search method comprises a setup step, a key generation step, an encryption step, a transmission step performed by a data uploader device, a trapdoor generation step, a transmission step performed by a user device, and a search step.

Description

Method and system for similarity search over encrypted data in cloud computing

The present invention relates to a similarity search method and system for encrypted data in a cloud computing environment. More particularly, in an environment where multiple data uploaders upload encrypted data to a cloud server, a user generates a similarity search query without sharing a secret key with any data uploader, and the cloud server determines the similarity between the data and the query without decrypting either.

The present invention proposes a similarity search technique for encrypted data in a cloud computing environment in which multiple parties encrypt documents and multiple users encrypt keywords.

Searchable encryption (SE) is a cryptographic technique that allows desired data to be searched without decrypting the encrypted data. It has been studied extensively as a solution to the problems that arise when personal information is kept in external storage.

In most searchable encryption schemes, a user encrypts a specific keyword to be searched over encrypted documents. The scheme then determines whether the encrypted keyword is contained in a given document. Because neither the keyword nor the document needs to be decrypted, data privacy is preserved.
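The keyword test described above can be illustrated with a minimal sketch. It assumes a single shared key (the S/S setting) and uses HMAC-SHA256 tokens as a stand-in for the encrypted keyword; this is a common textbook simplification and not the construction of the present invention. All function names are hypothetical.

```python
import hashlib
import hmac


def keyword_token(key: bytes, keyword: str) -> bytes:
    """Deterministic keyed token for a keyword (HMAC-SHA256)."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).digest()


def build_index(key: bytes, keywords: list[str]) -> set[bytes]:
    """Index stored next to the encrypted document: one token per keyword."""
    return {keyword_token(key, w) for w in keywords}


def server_contains(index: set[bytes], token: bytes) -> bool:
    """Server-side test; the server sees only tokens, never keywords or the key."""
    return any(hmac.compare_digest(token, t) for t in index)


# Assumption: writer and reader hold the same key, which is exactly what the
# multi-writer/multi-reader setting discussed in this document avoids.
key = b"demo-key-shared-by-writer-and-reader"
index = build_index(key, ["cloud", "encryption", "search"])
found = server_contains(index, keyword_token(key, "Encryption"))
missing = server_contains(index, keyword_token(key, "image"))
```

The sketch only works because both parties hold the same key; tokens produced under different keys would never match, which is the limitation that multi-key schemes have to overcome.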

SE schemes fall into four classes: single-writer/single-reader (S/S) SE, in which a single user encrypts both the documents and the keywords; single-writer/multi-reader (S/M) SE, in which one party encrypts the documents and multiple users encrypt keywords; multi-writer/single-reader (M/S) SE, in which multiple parties encrypt documents and a single user encrypts keywords; and multi-writer/multi-reader (M/M) SE, in which multiple parties encrypt documents and multiple users encrypt keywords. Of the four, M/M SE is regarded as the most complex and the hardest to implement.

Most research on searchable encryption concerns keyword search over documents. Similarity search over data such as images has been actively studied only relatively recently, and most of that work focuses on designing schemes that operate in the S/M setting.

In 2016, a scheme supporting similarity search in the M/M setting was introduced; it works by re-encrypting the user's query. This approach is inefficient because a re-encryption key must be managed for each data uploader and because re-encryption requires additional computation.

In theory, optimal efficiency for a similarity search scheme in the M/M setting presupposes that similarity can be measured without any additional operation on the encrypted data or the query. However, no secure similarity search scheme that satisfies this requirement exists yet.

US 9311494 B2
KR 1489876 B1
KR 1661549 B1
JP 5948060 B2

C. Wang, N. Cao, J. Li, K. Ren, and W. Lou, "Secure ranked keyword search over encrypted cloud data," in Distributed Computing Systems (ICDCS), 2010.
A. Kiayias, O. Oksuz, A. Russell, Q. Tang, and B. Wang, "Efficient encrypted keyword search for multi-user data sharing," in European Symposium on Research in Computer Security, pp. 173-195, 2016.

The technical problem addressed by the present invention is to provide a similarity search method and system that can determine the similarity between encrypted data and a query in a cloud computing environment without decrypting either.

According to an embodiment of the present invention, a similarity search method is performed in a similarity search system for encrypted data in a cloud computing environment that includes a plurality of user devices, a plurality of data uploader devices, a cloud server, and a trusted authority server. The method comprises: a setup step in which the trusted authority server takes a security parameter as input, selects a random generator, selects a random number, and generates public parameters; a key generation step in which the trusted authority server takes the number of subfeatures (l) as input and generates l random values as a secret key; an encryption step in which a data uploader device takes as input its secret key (k), the public parameters, the data to be encrypted (I), and the identifier of the data uploader device, and generates a first ciphertext and an identifier; a step in which the data uploader device transmits the first ciphertext and the identifier to the cloud server; a trapdoor generation step in which a user device takes as input its secret key (k'), the public parameters, and the search target data (I'), and generates a second ciphertext; a step in which the user device transmits the second ciphertext to the cloud server; and a search step in which the cloud server takes as input the first ciphertext, the identifier, and the second ciphertext received from the data uploader device and the user device, together with a threshold, and returns the identifier if the first ciphertext and the second ciphertext are determined to be similar.

According to an embodiment of the present invention, a similarity search system for encrypted data in a cloud computing environment comprises: a trusted authority server that takes a security parameter as input, selects a random generator and a random number, computes a value from them to generate public parameters, and takes the number of subfeatures l as input to generate a secret key; a data uploader device that takes as input its secret key (k), the public parameters, the data to be encrypted (I), and the identifier of the data uploader device, returns a first ciphertext and an identifier, and transmits the first ciphertext and the identifier to a cloud server; a user device that takes as input its secret key (k'), the public parameters, and the search target data (I'), returns a second ciphertext, and transmits the second ciphertext to the cloud server; and the cloud server, which takes as input the first ciphertext, the identifier, and the second ciphertext received from the data uploader device and the user device, together with a threshold, and returns the identifier if the first ciphertext and the second ciphertext are determined to be similar.

With the similarity search method and system for encrypted data in a cloud computing environment according to an embodiment of the present invention, multiple data uploaders belonging to different trust domains can upload encrypted data to the cloud server while retaining data confidentiality.

In addition, multiple users belonging to different trust domains can obtain information about similar data through a similarity search over the encrypted data while the privacy of their search requests is preserved.

Furthermore, similarity search remains possible in an environment with multiple data uploaders and users even though each entity encrypts its data with its own unique secret key.

A brief description of each drawing is provided so that the drawings cited in the detailed description of the present invention may be more fully understood.
FIG. 1 is a schematic diagram of a similarity search system for encrypted data in a cloud computing environment according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a similarity search method for encrypted data in a cloud computing environment according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating the overall flow of a system to which the similarity search method for encrypted data in a cloud computing environment according to an embodiment of the present invention is applied.
FIG. 4 shows the setup step of FIG. 2 in more detail, and FIG. 5 shows the key generation step of FIG. 2 in more detail.
FIG. 6 shows the encryption step of FIG. 2 in more detail, and FIG. 7 shows the trapdoor generation step of FIG. 2 in more detail.
FIG. 8 shows the search step of FIG. 2 in more detail.

The specific structural or functional descriptions of the embodiments according to the concept of the present invention disclosed herein are given only for the purpose of describing those embodiments. The embodiments may be implemented in various forms and are not limited to the embodiments described herein.

Because the embodiments according to the concept of the present invention may be modified in various ways and may take various forms, the embodiments are illustrated in the drawings and described in detail herein. This is not intended to limit the embodiments to the particular forms disclosed; the embodiments include all modifications, equivalents, and alternatives falling within the spirit and technical scope of the present invention.

Terms such as "first" and "second" may be used to describe various elements, but the elements are not limited by these terms. The terms are used only to distinguish one element from another. For example, without departing from the scope of rights according to the concept of the present invention, a first element may be called a second element, and similarly a second element may be called a first element.

The terminology used herein is used only to describe particular embodiments and is not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" specify the presence of the stated features, numbers, steps, operations, elements, parts, or combinations thereof, and do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries are to be interpreted as having a meaning consistent with their meaning in the context of the relevant art and, unless explicitly defined herein, are not to be interpreted in an idealized or overly formal sense.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

First, all algorithms used in the present invention are built on the mathematical properties of a bilinear map. The detailed properties of a bilinear map are as follows.
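For reference, the standard textbook definition of a symmetric bilinear map is sketched below. The symbols G, G_T, p, g, and e are generic notation assumed here for illustration and are not necessarily the notation used in this specification.

```latex
Let $G$ and $G_T$ be cyclic groups of prime order $p$, and let $g$ be a generator of $G$.
A map $e : G \times G \to G_T$ is a bilinear map if it satisfies
\begin{align*}
&\text{(bilinearity)}    && e(u^{a}, v^{b}) = e(u, v)^{ab}
                            \quad \text{for all } u, v \in G,\ a, b \in \mathbb{Z}_p, \\
&\text{(non-degeneracy)} && e(g, g) \neq 1_{G_T}, \\
&\text{(computability)}  && e(u, v) \text{ is efficiently computable for all } u, v \in G.
\end{align*}
```

Bilinearity is the property that lets exponents contributed by different parties be moved across the two arguments of the map, which is what makes comparisons between values encrypted under different keys conceivable.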

Hereinafter, a similarity search system for encrypted data in a cloud computing environment according to an embodiment of the present invention is described in detail with reference to FIG. 1.

FIG. 1 illustrates a similarity search system 10 for encrypted data in a cloud computing environment according to an embodiment of the present invention. Referring to FIG. 1, the similarity search system 10 includes a plurality of user devices 100, a plurality of data uploader devices 200, a cloud server 300, and a trusted authority server 400.

The trusted authority server 400 takes a security parameter as input, selects a random generator and a random number, computes a value from them (S130), and generates the public parameters.

In addition, the trusted authority server 400 receives l as input and returns l random values as a secret key (k).

Here, l is the number of subfeatures, and every entity obtains its secret key k by running the key generation (KeyGen) algorithm.

A data uploader device 200 that wishes to upload data (I) to the cloud server 300 takes as input its secret key, the public parameters, the data to be encrypted (I), and the identifier of the data uploader device, returns a first ciphertext and an identifier, and transmits the first ciphertext and the identifier to the cloud server 300.

A user device 100 takes as input its secret key (k'), the public parameters, and the search target data I', returns a second ciphertext, and transmits the second ciphertext to the cloud server 300.

The cloud server 300 takes as input the values received from the data uploader device 200 and the user device 100, together with a threshold, and returns the identifier if the first ciphertext and the second ciphertext are determined to be similar. The threshold is a predetermined value and may be changed as needed.

Hereinafter, a similarity search method for encrypted data in a cloud computing environment according to an embodiment of the present invention is described in detail with reference to FIGS. 2 to 8.

FIG. 2 is a flowchart illustrating the similarity search method for encrypted data in a cloud computing environment according to an embodiment of the present invention, and FIG. 3 is a diagram illustrating the overall flow of a system to which the method is applied.

The present invention relates to a similarity search method in a cloud computing environment. In an environment where multiple data uploader devices 200 upload encrypted data to the cloud server 300, a user device 100 generates a similarity search query without sharing a secret key with any data uploader device 200, and the cloud server 300 determines the similarity between the encrypted data and the query without decrypting either.

FIGS. 4 to 8 show the setup step, the key generation step, the encryption step, the trapdoor generation step, and the search step of FIG. 3 in more detail.

Setup (Setup) step

First, the trusted authority server 400 generates the public parameters to be shared by the data uploader devices 200 and the user devices 100 (S100). The public parameters returned by the Setup algorithm are shared by every entity participating in the system.

Referring to FIG. 4, the trusted authority server 400 takes a security parameter as input and selects a random generator (S110). Next, it selects a random number (S120), computes a value from them (S130), and sets the public parameters (S140).

Key generation (KeyGen) step

The trusted authority server 400 receives l as input and returns l random values as a secret key (k) (S200, see FIG. 5).

Here, l is the number of subfeatures, and every entity obtains its secret key k by running the key generation (KeyGen) algorithm.
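A minimal sketch of this step follows. The text states only that KeyGen returns l random values; the range they are drawn from is an assumption here (nonzero integers below a hypothetical group order), and the names `keygen` and `GROUP_ORDER` are illustrative.

```python
import secrets

# Hypothetical group order for illustration only; a real deployment would use
# the prime order of the group fixed by the Setup step.
GROUP_ORDER = 2**127 - 1


def keygen(l: int, order: int = GROUP_ORDER) -> list[int]:
    """Return l independent random values in the range [1, order - 1]."""
    if l <= 0:
        raise ValueError("l must be positive")
    # secrets.randbelow(n) is uniform on [0, n), so shift by one to skip zero.
    return [1 + secrets.randbelow(order - 1) for _ in range(l)]


k = keygen(8)        # secret key of one data uploader
k_prime = keygen(8)  # secret key of one user, generated independently
```

Each entity runs the algorithm separately, so k and k' are unrelated; no key material is shared between data uploaders and users.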

Encryption (BuildIndex) step

This step is executed by a data uploader device 200 that wishes to upload data to the cloud server 300. It takes as input the secret key of the data uploader device 200, the public parameters, the data to be encrypted (I), and the identifier of the data uploader device, and returns a first ciphertext and an identifier (S300).

Referring to FIG. 6, the encryption (BuildIndex) algorithm extracts l subfeatures from the data to be uploaded (I) (S310). An existing locality sensitive hashing (LSH) algorithm may be used to extract the subfeatures.
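The text does not fix a particular LSH family. As one possible illustration, the sketch below uses random-hyperplane (sign) hashing to turn a feature vector into l subfeatures; the function names, the parameters `bits` and `dim`, and the shared seed are all assumptions made for this example.

```python
import random


def make_hyperplanes(l: int, bits: int, dim: int, seed: int = 0) -> list[list[list[float]]]:
    """Build l hash tables, each made of `bits` random hyperplanes in `dim` dimensions."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]
            for _ in range(l)]


def extract_subfeatures(vector: list[float], planes: list[list[list[float]]]) -> list[int]:
    """Map a feature vector to l subfeatures (one bucket label per hash table)."""
    subfeatures = []
    for table in planes:
        label = 0
        for plane in table:
            dot = sum(p * x for p, x in zip(plane, vector))
            # One bit per hyperplane: which side of the plane the vector lies on.
            label = (label << 1) | (1 if dot >= 0 else 0)
        subfeatures.append(label)
    return subfeatures


# Assumption: the hyperplanes are public and identical for every participant,
# so that similar inputs from different parties land in the same buckets.
planes = make_hyperplanes(l=6, bits=4, dim=3)
a = extract_subfeatures([1.0, 2.0, 3.0], planes)
b = extract_subfeatures([2.0, 4.0, 6.0], planes)  # same direction as the first vector
```

Vectors pointing in the same direction receive identical labels, and nearby vectors agree on most of the l labels, which is what the later per-subfeature matching relies on.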

Next, the encryption (BuildIndex) algorithm generates a ciphertext for every subfeature (S330).

Finally, the encryption (BuildIndex) algorithm returns the first ciphertext and the identifier (S340), and the data uploader device 200 transmits the returned result to the cloud server 300 (S350).

Trapdoor generation (Trapdoor) step

The user device 100 inputs its secret key (k'), the public parameters, and the search target data I' into the trapdoor (Trapdoor) algorithm.

The algorithm then operates in the same way as the encryption (BuildIndex) algorithm and returns a second ciphertext (S400, see FIG. 7).

The user device 100 transmits the returned result to the cloud server 300.

Search (Search) step

In the search (Search) step, the cloud server 300 takes as input the values received from the data uploader device 200 and the user device 100, together with a threshold, and returns the identifier if the first ciphertext and the second ciphertext are determined to be similar (S500). The threshold is a predetermined value and may be changed as needed.

Specifically, referring to FIG. 8, the cloud server 300 takes the received ciphertexts as input and evaluates a matching expression on them.

After evaluating the expression, the cloud server increases count by 1 each time the matching condition is satisfied.

If count is greater than th, the algorithm determines that the first ciphertext and the second ciphertext are similar and returns the corresponding identifier.
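The counting logic of the search step can be sketched as follows. The actual matching expression on ciphertexts is not reproduced here, so it is abstracted as a caller-supplied predicate `match`; that predicate, the function name `search`, and the toy values are assumptions for illustration only.

```python
from typing import Callable, Optional, Sequence


def search(
    c: Sequence[object],
    identifier: str,
    c_prime: Sequence[object],
    th: int,
    match: Callable[[object, object], bool],
) -> Optional[str]:
    """Return `identifier` if more than `th` subfeature positions match, else None."""
    if len(c) != len(c_prime):
        raise ValueError("ciphertexts must have the same number of subfeatures")
    count = 0
    for c_i, c_prime_i in zip(c, c_prime):
        if match(c_i, c_prime_i):
            count += 1
    return identifier if count > th else None


def toy_match(x: object, y: object) -> bool:
    """Toy stand-in for the ciphertext test: plain equality on placeholder values."""
    return x == y


hit = search([3, 7, 1, 9], "doc-42", [3, 7, 1, 0], th=2, match=toy_match)
miss = search([3, 7, 1, 9], "doc-42", [3, 0, 0, 0], th=2, match=toy_match)
```

Raising th makes the search stricter, since more of the l subfeatures must agree before an identifier is returned; this is the knob the text refers to when it says the threshold may be changed as needed.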

Although the present invention has been described with reference to the embodiments shown in the drawings, these are merely illustrative, and a person of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be determined by the technical idea of the appended claims.

10: Similarity search system for encrypted data in a cloud computing environment
100: User device
200: Data uploader device
300: Cloud server
400: Trusted authority server

Claims

1. A similarity search method in a similarity search system for encrypted data in a cloud computing environment including a plurality of user devices, a plurality of data uploader devices, a cloud server, and a trusted authority server, the method comprising:
a setup step in which the trusted authority server takes a security parameter as input, selects a random generator, selects a random number, and generates public parameters;
a key generation step in which the trusted authority server takes the number of subfeatures (l) as input and generates l random values as a secret key;
an encryption step in which a data uploader device takes as input the secret key (k) of the data uploader device, the public parameters, data to be encrypted (I), and an identifier of the data uploader device, and generates a first ciphertext and an identifier;
a step in which the data uploader device transmits the first ciphertext and the identifier to the cloud server;
a trapdoor generation step in which a user device takes as input the secret key (k') of the user device, the public parameters, and search target data (I'), and generates a second ciphertext;
a step in which the user device transmits the second ciphertext to the cloud server; and
a search step in which the cloud server takes as input the first ciphertext, the identifier, and the second ciphertext received from the data uploader device and the user device, together with a threshold, and returns the identifier if the first ciphertext and the second ciphertext are determined to be similar.

2. The method of claim 1,
wherein the encryption step comprises:
extracting l subfeatures from the data to be uploaded (I) using a locality sensitive hashing (LSH) algorithm;
generating a ciphertext for each of the l subfeatures; and
returning the first ciphertext and the identifier.

3. The method of claim 2,
wherein the trapdoor generation step comprises:
extracting l subfeatures from the search target data (I') using a locality sensitive hashing (LSH) algorithm;
generating a ciphertext for each of the l subfeatures; and
returning the second ciphertext.

4. The method of claim 3,
wherein the search step comprises:
evaluating a matching expression on the first ciphertext and the second ciphertext;
increasing count by 1 if the matching condition is satisfied; and
if count is greater than the threshold, determining that the first ciphertext and the second ciphertext are similar and returning the corresponding identifier.

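5. A similarity search system for encrypted data in a cloud computing environment, the system comprising:
a trusted authority server that takes a security parameter as input, selects a random generator and a random number, computes a value from them to generate public parameters, and takes the number of subfeatures l as input to generate a secret key;
a data uploader device that takes as input the secret key (k) of the data uploader device, the public parameters, data to be encrypted (I), and an identifier of the data uploader device, returns a first ciphertext and an identifier, and transmits the first ciphertext and the identifier to a cloud server;
a user device that takes as input the secret key (k') of the user device, the public parameters, and search target data (I'), returns a second ciphertext, and transmits the second ciphertext to the cloud server; and
the cloud server, which takes as input the first ciphertext, the identifier, and the second ciphertext received from the data uploader device and the user device, together with a threshold, and returns the identifier if the first ciphertext and the second ciphertext are determined to be similar.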

6. The system of claim 5,
wherein the data uploader device:
extracts l subfeatures from the data to be uploaded (I) using a locality sensitive hashing (LSH) algorithm,
generates a ciphertext for each of the l subfeatures, and
returns the first ciphertext and the identifier.

7. The system of claim 6,
wherein the user device:
extracts l subfeatures from the search target data (I') using a locality sensitive hashing (LSH) algorithm,
generates a ciphertext for each of the l subfeatures, and
returns the second ciphertext.