KR102292246B1

KR102292246B1 - Device and method for scalable and secure similarity search for multiple entities

Info

Publication number: KR102292246B1
Application number: KR1020190157284A
Authority: KR
Inventors: 허준범; 한창희
Original assignee: 고려대학교 산학협력단
Priority date: 2019-11-29
Filing date: 2019-11-29
Publication date: 2021-08-24
Also published as: KR20210067538A

Abstract

A scalable and secure similarity search system and method for multiple entities are disclosed. The similarity search system includes a user terminal, an owner terminal that is the terminal of a data owner, and a server. The server receives encrypted data of data M from the owner terminal, receives a query including a trapdoor from the user terminal, determines the similarity between data M' of the user terminal and the data M, and transmits the similarity determination result to the user terminal. The owner terminal generates an index for searching the data M through communication with the user terminal and transmits the index to the server. The user terminal generates the trapdoor for determining the similarity between the data M' and the data M, and the query includes the trapdoor. The server determines the similarity between the data M and the data M' using the index and the trapdoor.

Description

DEVICE AND METHOD FOR SCALABLE AND SECURE SIMILARITY SEARCH FOR MULTIPLE ENTITIES

The present invention relates to a scalable and secure similarity search technique for multiple entities.

When data owners outsource data to a server in order to share it with other users, data privacy issues can arise. Encrypting the data before outsourcing is effective, but it complicates some important functions, such as letting users search for the data they are interested in. As a solution, secure similarity search allows a single user to retrieve similar data from encrypted databases.

Cui et al. proposed a method of supporting multiple entities in a scalable manner using multi-key searchable encryption (MKSE) (see Non-Patent Documents [15] and [24]). MKSE enables data search even when data owners and users encrypt their data and queries, respectively, under different, individually generated keys. A single query can therefore search data belonging to different data owners. However, this functionality leaks the fact that data owners share similar data items.

Moreover, data search in a multi-user environment is vulnerable to collusion attacks. In a collusion attack, a server colluding with some users can leak the queries of other users. Recently, Hamlin et al. proposed a method for preventing collusion attacks (see Non-Patent Document [26]). However, it has been observed that the approach of Hamlin et al. can give rise to a selective unauthorized search problem.

Therefore, this specification proposes a scalable and secure similarity search technique for multiple entities.

Republic of Korea Patent Publication No. 2019-0063204 (published June 7, 2019)
Republic of Korea Patent Publication No. 2009-0031079 (published March 25, 2009)
US Patent Publication US 2018/0157703 A1 (published June 7, 2018)

[1] K. Ren, C. Wang, and Q. Wang, “Security Challenges for the Public Cloud,” IEEE Internet Computing, vol. 16, no. 1, pp. 69-73, 2012.
[2] D. Song, E. Shi, I. Fischer, and U. Shankar, “Cloud Data Protection for the Masses,” Computer, vol. 45, no. 1, pp. 39-45, 2012.
[3] Bosch, C., Hartel, P., Jonker, W., and Peter, A., “A survey of provably secure searchable encryption,” ACM Computing Surveys (CSUR), 47(2), 2014.
[4] Zhang, Y., Katz, J., and Papamanthou, C., “All your queries are belong to us: the power of file-injection attacks on searchable encryption,” In Usenix Security, 2016.
[5] Boneh, D., Di Crescenzo, G., Ostrovsky, R., and Persiano, G., “Public key encryption with keyword search,” In International Conference on the Theory and Applications of Cryptographic Techniques, pp. 506-522, 2004.
[6] Song, D. X., Wagner, D., and Perrig, A., “Practical techniques for searches on encrypted data,” In Security and Privacy, IEEE Symposium on, pp. 44-55, 2000.
[7] Kiayias, A., Oksuz, O., Russell, A., Tang, Q., and Wang, B., “Efficient encrypted keyword search for multi-user data sharing,” In European Symposium on Research in Computer Security, pp. 173-195, 2016.
[8] Sun, S. F., Liu, J. K., Sakzad, A., Steinfeld, R., and Yuen, T. H., “An efficient non-interactive multi-client searchable encryption with support for boolean queries,” In European Symposium on Research in Computer Security, pp. 154-172, 2016.
[9] Patel, S., Persiano, G., and Yeo, K., “Symmetric searchable encryption with sharing and unsharing,” In European Symposium on Research in Computer Security, pp. 207-227, 2018.
[10] Li, J., Wang, Q., Wang, C., Cao, N., Ren, K., and Lou, W., “Fuzzy keyword search over encrypted data in cloud,” In INFOCOM, 2010 Proceedings IEEE, pp. 1-5, 2010.
[11] Kuzu, M., Islam, M. S., and Kantarcioglu, M., “Efficient similarity search over encrypted data,” In Data Engineering (ICDE), 2012 IEEE 28th International Conference on, pp. 1156-1167, 2012.
[12] X. Yuan, X. Wang, C. Wang, A. Squicciarini, and K. Ren, “Enabling privacy-preserving image-centric social discovery,” In Distributed Computing Systems (ICDCS), pp. 198-207, 2014.
[13] X. Yuan, X. Wang, C. Wang, and C. Yu, “Privacy-Preserving Similarity Joins Over Encrypted Data,” IEEE Transactions on Information Forensics and Security, pp. 2763-2775, 2017.
[14] Strizhov, M. and Ray, I., “Multi-keyword similarity search over encrypted cloud data,” In IFIP International Information Security Conference, pp. 52-65, Springer, Berlin, Heidelberg, 2014.
[15] H. Cui, X. Yuan, Y. Zheng, and C. Wang, “Enabling secure and effective near-duplicate detection over encrypted in-network storage,” In the 35th International Conference on Computer Communications (INFOCOM), 2016.
[16] http://lear.inrialpes.fr/people/jegou/data.php#copydays
[17] Y. Ke, R. Sukthankar, and L. Huston, “Efficient near-duplicate detection and sub-image retrieval,” In ACM Multimedia, 2004.
[18] Gionis, A., Indyk, P., and Motwani, R., “Similarity search in high dimensions via hashing,” In VLDB, pp. 518-529, 1999.
[19] Python Software Foundation, https://pypi.python.org/pypi/ImageHash.
[20] Garg, S., Mohassel, P., and Papamanthou, C., “TWORAM: Round-Optimal Oblivious RAM with Applications to Searchable Encryption,” IACR Cryptology ePrint Archive, 2015, 1010.
[21] Stefanov, E., Papamanthou, C., and Shi, E., “Practical Dynamic Searchable Encryption with Small Leakage,” In NDSS, Vol. 14, pp. 23-26, 2014.
[22] Bost, R., “Forward Secure Searchable Encryption,” In ACM CCS, Vol. 16, pp. 1143-1154, 2016.
[23] Bost, R., Minaud, B., and Ohrimenko, O., “Forward and backward private searchable encryption from constrained cryptographic primitives,” In ACM CCS, 2017.
[24] Popa, R. A. and Zeldovich, N., “Multi-Key Searchable Encryption,” IACR Cryptology ePrint Archive, 2013.
[25] Grubbs, P., McPherson, R., Naveed, M., Ristenpart, T., and Shmatikov, V., “Breaking web applications built on top of encrypted data,” In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 1353-1364, 2016.
[26] Hamlin, A., Shelat, A., Weiss, M., and Wichs, D., “Multi-Key Searchable Encryption, Revisited,” In IACR International Workshop on Public Key Cryptography, pp. 95-124, 2018.
[27] Van Rompay, C., Molva, R., and Onen, M., “A leakage-abuse attack against multi-user searchable encryption,” Proceedings on Privacy Enhancing Technologies, pp. 168-178, 2017.
[28] Cui, B., Liu, Z., and Wang, L., “Key-aggregate searchable encryption (KASE) for group data sharing via cloud storage,” IEEE Transactions on Computers, 65(8), pp. 2374-2385, 2016.
[29] Liu, Z., Li, T., Li, P., Jia, C., and Li, J., “Verifiable searchable encryption with aggregate keys for data sharing system,” Future Generation Computer Systems, 78, pp. 778-788, 2018.
[30] A. De Caro and V. Iovino, “jPBC: Java pairing based cryptography,” Computers and Communications (ISCC), IEEE Symposium on, pp. 850-855, 2011.
[31] Uzunkol, O. and Kiraz, M. S., “Still wrong use of pairings in cryptography,” Applied Mathematics and Computation, 333, pp. 467-479, 2018.
[32] Stein, B. and Zu Eissen, S. M., “Near similarity search and plagiarism analysis,” In From data and information analysis to knowledge engineering, pp. 430-437, Springer, Berlin, Heidelberg, 2006.
[33] Lv, Q., Josephson, W., Wang, Z., Charikar, M., and Li, K., “Multiprobe LSH: efficient indexing for high-dimensional similarity search,” In Proceedings of the 33rd international conference on Very large data bases, pp. 950-961, VLDB Endowment, 2007.
[34] Enron email dataset, https://www.cs.cmu.edu/ ./enron/. Accessed: 2019-05-30.

The technical problem to be solved by the present invention is to provide a scalable and secure similarity search apparatus and method for multiple entities.

A similarity search system according to an embodiment of the present invention includes a user terminal, an owner terminal that is the terminal of a data owner, and a server. The server receives encrypted data of data M from the owner terminal, receives a query including a trapdoor from the user terminal, determines the similarity between data M' of the user terminal and the data M, and transmits the similarity determination result to the user terminal. The owner terminal generates an index for searching the data M through communication with the user terminal and transmits the index to the server. The user terminal generates the trapdoor for determining the similarity between the data M' and the data M, and the query includes the trapdoor. The server determines the similarity between the data M and the data M' using the index and the trapdoor.

A similarity search method according to an embodiment of the present invention is performed on a similarity search system including a user terminal, an owner terminal that is the terminal of a data owner, and a server. The method includes: receiving, by the server, encrypted data of data M from the owner terminal; generating, by the owner terminal, an index for searching the data M through communication with the user terminal, and transmitting the index to the server; generating, by the user terminal, a trapdoor for determining the similarity between data M' and the data M, and transmitting a query including the trapdoor to the server; and determining, by the server, the similarity between the data M and the data M' using the index and the trapdoor.

The similarity search technique according to an embodiment of the present invention enables scalable and secure similarity search while guaranteeing the privacy of the data owner.

A detailed description of each drawing is provided so that the drawings cited in the detailed description of the present invention can be more fully understood.
FIG. 1 shows a table explaining the symbols used in this specification.
FIG. 2 illustrates a similarity search system according to an embodiment of the present invention.

The specific structural or functional descriptions of the embodiments according to the concept of the present invention disclosed herein are given only for the purpose of describing those embodiments. Embodiments according to the concept of the present invention may be implemented in various forms and are not limited to the embodiments described herein.

Because embodiments according to the concept of the present invention may be modified in various ways and may take various forms, the embodiments are illustrated in the drawings and described in detail herein. This is not intended to limit the embodiments to the specific disclosed forms; they include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

Terms such as first and second may be used to describe various elements, but the elements should not be limited by these terms. The terms are used only to distinguish one element from another. For example, without departing from the scope of rights according to the concept of the present invention, a first element may be termed a second element, and similarly a second element may be termed a first element.

When an element is said to be 'connected' or 'coupled' to another element, it may be directly connected or coupled to that other element, but intervening elements may also be present. In contrast, when an element is said to be 'directly connected' or 'directly coupled' to another element, it should be understood that no intervening element is present. Other expressions describing the relationship between elements, such as 'between' and 'directly between', or 'adjacent to' and 'directly adjacent to', should be interpreted in the same way.

The terms used herein are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as 'include' or 'have' specify the presence of the stated features, numbers, steps, operations, elements, parts, or combinations thereof, and should be understood as not precluding the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined herein.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. However, the scope of the patent application is not restricted or limited by these embodiments. The same reference numerals in the drawings denote the same members.

The background knowledge used in the present invention is described below.

LSH (Locality Sensitive Hashing)

LSH is an approximation algorithm that extracts identifiable features from data in order to measure the similarity between different items (see Non-Patent Document [18]). The basic idea of LSH is to reduce the dimensionality of high-dimensional data using hash functions that map similar items to the same values with high probability. For a data item, the LSH functions extract a set of subfeatures. Equality testing is then performed to measure the similarity between the feature sets of two items, and this process determines the search accuracy of LSH-based similarity search. In the proposed technique, users and data owners are assumed to use existing LSH functions to extract data features (see Non-Patent Documents [18] and [19]). However, the scope of the present invention is clearly not limited to any particular kind or type of LSH function.
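As an illustration only, the extraction of a subfeature set can be sketched with bit-sampling LSH over binary vectors, a simple LSH family for the Hamming distance. The function names `make_tables` and `subfeatures`, the parameters, and the choice of bit sampling are all assumptions made for this sketch; the specification does not prescribe a particular LSH function.

```python
import random
from typing import List


def make_tables(dim: int, num_hashes: int, bits_per_hash: int, seed: int = 0) -> List[List[int]]:
    """Pick, for each (hypothetical) hash function, the bit positions it samples."""
    rng = random.Random(seed)
    return [rng.sample(range(dim), bits_per_hash) for _ in range(num_hashes)]


def subfeatures(vec: List[int], tables: List[List[int]]) -> List[str]:
    """Return one subfeature per hash function: its index plus the sampled bits.

    Two vectors that differ in only a few bits agree on most subfeatures,
    because most hash functions do not sample the differing positions.
    """
    return [
        f"{i}:" + "".join(str(vec[p]) for p in positions)
        for i, positions in enumerate(tables)
    ]


if __name__ == "__main__":
    tables = make_tables(dim=64, num_hashes=8, bits_per_hash=4, seed=1)
    rng = random.Random(42)
    item = [rng.randint(0, 1) for _ in range(64)]
    near_duplicate = list(item)
    near_duplicate[5] ^= 1  # flip a single bit
    same = sum(a == b for a, b in zip(subfeatures(item, tables),
                                      subfeatures(near_duplicate, tables)))
    print(f"{same} of {len(tables)} subfeatures match")
```

Prefixing each subfeature with the index of its hash function keeps values produced by different hash functions from being confused during equality testing.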

Similarity search using LSH

The core idea of similarity search is to check whether the number of LSH matches reaches a predefined threshold. Given a set of searchable indices for M and a set of trapdoors for M', it can be determined whether M and M' are similar. To this end, a counter variable is initialized to 0, and for each index and trapdoor compared, the counter is incremented by 1 if the two are equal. Finally, if the counter is greater than or equal to a predetermined threshold, M and M' are considered similar. In other words, M and M' may be determined to be similar when the number of trapdoors that have the same value as the corresponding index is greater than or equal to the threshold.
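The counting procedure described above can be sketched as follows. This is a minimal sketch that assumes the indices and trapdoors are compared position by position and are represented as byte strings; the name `is_similar` and its parameters are hypothetical.

```python
from typing import Sequence


def is_similar(indices: Sequence[bytes], trapdoors: Sequence[bytes], threshold: int) -> bool:
    """Count equal index/trapdoor pairs and compare the count with the threshold."""
    if len(indices) != len(trapdoors):
        raise ValueError("indices and trapdoors must have the same length")
    counter = 0  # counter variable, initialized to 0
    for index_value, trapdoor_value in zip(indices, trapdoors):
        if index_value == trapdoor_value:
            counter += 1  # one more LSH match
    # M and M' are considered similar when enough values match.
    return counter >= threshold
```

For example, with four compared values of which three are equal, the items are considered similar for a threshold of 3 but not for a threshold of 4.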

It has been reported that similarity search can be realized with multi-keyword search by replacing keywords with LSH values (see Non-Patent Document [15]), and the converse also holds. The proposed technique is therefore applicable to keyword search as well.

Bilinear Maps

Let $\mathbb{G}_1$, $\mathbb{G}_2$, and $\mathbb{G}_T$ be multiplicative cyclic groups of prime order $p$, and let $\mathbb{Z}_p^*$ be the group under multiplication modulo $p$. In addition, let $e: \mathbb{G}_1 \times \mathbb{G}_2 \rightarrow \mathbb{G}_T$ be a bilinear map with the following properties.

·Bilinearity: for all $g_1 \in \mathbb{G}_1$, $g_2 \in \mathbb{G}_2$, and $a, b \in \mathbb{Z}_p^*$, $e(g_1^a, g_2^b) = e(g_1, g_2)^{ab}$ holds.

·Nondegeneracy: when $g_1$ and $g_2$ are generators of $\mathbb{G}_1$ and $\mathbb{G}_2$, respectively, $e(g_1, g_2) \neq 1$ holds.

If the group operations in $\mathbb{G}_1$ and $\mathbb{G}_2$ and the bilinear map $e$ are efficiently computable, $\mathbb{G}_1$ and $\mathbb{G}_2$ are called bilinear groups.
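The two properties can be checked numerically on a toy map. The following sketch is an assumption-laden illustration, not a pairing suitable for cryptography: both source groups are taken to be the integers modulo a small prime `P` written additively (so raising an element to a power `a` becomes multiplying it by `a`), the target group is the order-`P` subgroup of the integers modulo `Q = 2P + 1`, and all names and constants are hypothetical.

```python
P = 1019       # prime order shared by the toy groups
Q = 2 * P + 1  # 2039, also prime, so Z_Q^* contains a subgroup of order P
H = 4          # a square other than 1 modulo Q, hence a generator of that subgroup


def pairing(x: int, y: int) -> int:
    """Toy bilinear map e(x, y) = H^(x*y) mod Q on (Z_P, +) x (Z_P, +).

    Discrete logarithms in the source groups are trivial here, so this map
    offers no security; it only exhibits the algebraic properties.
    """
    return pow(H, (x * y) % P, Q)


if __name__ == "__main__":
    a, b, x, y = 123, 456, 7, 11
    # Bilinearity: scaling the inputs by a and b raises the output to the power a*b.
    print(pairing(a * x % P, b * y % P) == pow(pairing(x, y), a * b, Q))
    # Nondegeneracy: the generators (here 1 and 1) do not map to the identity.
    print(pairing(1, 1) != 1)
```

Real deployments use pairings over elliptic curves, where the source groups are hard to invert; the toy above replaces them only to keep the example self-contained.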

Algorithm Definitions of Similarity Search

The symbols used in this specification are explained in the table shown in FIG. 1, and the similarity search algorithm is configured as follows.

1) Index generation algorithm (which may also be called the index generation step). This is an interactive algorithm performed by a user and an owner. It takes as input the user's secret key, the owner's secret key, and the subfeatures extracted from M, and outputs an encrypted searchable index.

2) Trapdoor generation algorithm (which may also be called the trapdoor generation step). It takes as input the user's secret key, the owner's public key, and the subfeatures extracted from M', which is held by the user, and outputs a trapdoor.

3) Similarity search algorithm (which may also be called the similarity search step). It takes as input the index, the trapdoor, and a similarity threshold, and uses them to test the similarity between M and M'. It outputs true if M and M' are similar, and outputs the empty set if M and M' do not match. That is, the similarity search algorithm uses the given inputs to determine whether data M and data M' are similar and then outputs the determination result.
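To make the inputs and outputs of the three algorithms concrete, the following sketch wires them together with a stand-in construction. It is not the construction of the present invention, whose details are not given in this passage: the toy Diffie-Hellman parameters, the HMAC-based encoding, and every function name are assumptions chosen only so that the interfaces can be exercised. The interactive index generation is collapsed into a single call, and `None` stands for the empty set.

```python
import hashlib
import hmac
from typing import List, Optional

Q = 2039  # toy prime modulus (far too small for real use)
G = 4     # generator of the order-1019 subgroup of Z_Q^*


def public_key(secret_key: int) -> int:
    """Hypothetical key derivation: pk = G^sk mod Q."""
    return pow(G, secret_key, Q)


def _encode(shared: int, feats: List[bytes]) -> List[bytes]:
    """Stand-in encoding: HMAC each subfeature under a key derived from `shared`."""
    key = hashlib.sha256(str(shared).encode()).digest()
    return [hmac.new(key, f, hashlib.sha256).digest() for f in feats]


def index_gen(sk_user: int, sk_owner: int, feats_m: List[bytes]) -> List[bytes]:
    """Inputs: user's secret key, owner's secret key, subfeatures of M."""
    return _encode(pow(public_key(sk_user), sk_owner, Q), feats_m)


def trapdoor_gen(sk_user: int, pk_owner: int, feats_m2: List[bytes]) -> List[bytes]:
    """Inputs: user's secret key, owner's public key, subfeatures of M'."""
    return _encode(pow(pk_owner, sk_user, Q), feats_m2)


def search(index: List[bytes], trapdoor: List[bytes], threshold: int) -> Optional[bool]:
    """Return True if enough values match, otherwise None (the empty set)."""
    matches = sum(hmac.compare_digest(i, t) for i, t in zip(index, trapdoor))
    return True if matches >= threshold else None
```

In this stand-in, a trapdoor produced with a different user secret key derives a different HMAC key, so it matches none of the index values; this mirrors, in a simplified way, the requirement that only the designated user can search a given index.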

FIG. 2 illustrates a similarity search system according to an embodiment of the present invention.

Referring to FIG. 2, the proposed similarity search system consists of data owners (which may refer to the terminals of the respective data owners), users (which may refer to the terminals of the respective users), and a server (which may also be called a similarity search server). The role of each entity is as follows.

1) Data owners. Data owners are the owners of the data uploaded to a publicly accessible server. Each owner holds a data set. Each owner can selectively share its data with specific users of its choosing. After performing an interactive sharing process with a user, the owner outputs searchable indices over which only the designated user can perform searches.

2) Users. Users are clients that generate search queries. Users interact with owners to generate searchable indices. In addition, to protect privacy, users encrypt their plain query data under their secret keys and the public key of a specific owner.

3) Server. The server is a storage service provider that stores data uploaded by a plurality of data owners. The server holds the encrypted corpus received from n owners. In addition, the server receives search queries from users and performs a similarity search over the encrypted data. The result of the similarity search may be transmitted to the user who sent the query.

Each of the above-described owners, users, and server may include at least one electronic device capable of various arithmetic processing and signal generation. Here, the at least one electronic device may include a processor and/or a computing device in which a processor is installed, and at least some of the above-described operations may refer to operations of the processor. The processor may include a central processing unit (CPU), a microcontroller unit (MCU), a microcomputer (Micom), an application processor (AP), an electronic control unit (ECU), a graphics processing unit (GPU), and/or any other processing device capable of various arithmetic processing and control signal generation. These processing devices may be implemented using, for example, one or more semiconductor chips and related components. The computing device may include, for example, a desktop computer, a laptop computer, a server computer, a smartphone, a tablet PC, a smart watch, a head-mounted display (HMD) device, a portable game console, a navigation device, a personal digital assistant (PDA), an artificial intelligence speaker device, a digital television, a set-top box, a robot, a home appliance, a mechanical device, and/or at least one other electronic device capable of performing information processing functions.

Hereinafter, the proposed similarity search technique is described in detail. First, let the group generator (symmetric pairing group generator) be an algorithm that takes a security parameter as input and generates three groups (multiplicative cyclic groups) of prime order equipped with a bilinear map. The group generator also outputs a generator for each of two of these groups. A collision-resistant hash function is also defined. In addition, the set of sub-features extracted from M is generated by a defined rule, under which the probability that a specified condition holds is negligible, and the number of sub-features may be a natural number greater than or equal to 2.

A pseudo-random function is also defined. Each owner has a master key for deriving its secret key and public key, where the master key is a randomly sampled seed. Each user has a master key for deriving a secret key pair and a permuted order, where the components of the master key are randomly sampled seeds. The owner defines a similarity threshold and sends it to the server. In addition, among the variables and parameters described above, those required for the computations may be shared among the entities in advance. The specific algorithms are as follows.
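The text above states that a user derives a permuted order from a randomly sampled seed using a pseudo-random function. The following is a minimal sketch of one way such a derivation could look. It assumes HMAC-SHA256 as the pseudo-random function and sorts positions by their PRF output; the names `prf`, `derive_permutation`, `shuffle`, and the seed value are hypothetical and are not taken from the patent.

```python
import hashlib
import hmac


def prf(seed: bytes, message: bytes) -> bytes:
    """Keyed pseudo-random function (assumption: HMAC-SHA256 as a stand-in)."""
    return hmac.new(seed, message, hashlib.sha256).digest()


def derive_permutation(seed: bytes, n: int) -> list[int]:
    """Return a permutation of range(n) that depends only on the seed."""
    # Sort positions by their PRF tag; the position itself breaks ties,
    # so the result is always a valid permutation.
    return sorted(range(n), key=lambda i: (prf(seed, i.to_bytes(4, "big")), i))


def shuffle(items: list, order: list[int]) -> list:
    """Rearrange items into the derived order."""
    return [items[i] for i in order]


# Hypothetical seed; in the described system it would be sampled at random.
order = derive_permutation(b"hypothetical-user-seed", 5)
shuffled = shuffle(["a", "b", "c", "d", "e"], order)
```

Because the order is a deterministic function of the seed, the same user can re-derive it later and apply the same rearrangement to another list of the same length without storing the permutation itself.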

1) Searchable index generation. The searchable index generation algorithm (which may also be called the searchable index generation step) is performed interactively between the user and the owner. The details of the algorithm are as follows.

a) The owner selects a number of random values. For each of them, the owner computes two values. The owner transmits the result to the user. This result may be called the first index to distinguish it from the index.

b) The user selects a number of random values. For each of them, the user computes two values. The user shuffles the resulting pairs according to the permuted order and returns the shuffled result to the owner. The pairs shuffled according to the permuted order may be called the second index to distinguish them from the index.

c) The owner removes two terms from the second index received from the user, leaving a result in the form of a power. The owner uploads this result to the server as the index. That is, the owner generates the index from the second index received from the user and may transmit the generated index to the server. The owner's data M may, of course, have been encrypted and uploaded to the server for storage in advance.

2) Trapdoor generation. The user selects a number of random values. For each of them, the user computes two values. The user shuffles the results according to the permuted order and transmits the shuffled trapdoor to the server.

3) Similarity search. The server sets a temporary variable to 0. For every position, the server checks whether the search equation holds. If the equation holds, the server increases the temporary variable by 1. Finally, if the temporary variable reaches the similarity threshold, the server returns the output indicating that M and M' are similar; otherwise, it returns the output indicating that M and M' are not similar. That is, the server may transmit the similarity search result (similar or dissimilar) to the user who sent the query.
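The counting logic of the search step can be sketched as follows. This is only an illustration of the counter and threshold comparison: the per-position test is passed in as a hypothetical `equation_holds` callable, because the actual equation is not reproduced here, and plain equality is used as a stand-in in the usage lines. Returning `None` for the dissimilar case is likewise an assumption.

```python
from typing import Callable, Optional, Sequence


def similarity_search(
    index: Sequence,
    trapdoor: Sequence,
    equation_holds: Callable[[object, object], bool],
    threshold: int,
) -> Optional[bool]:
    """Count the positions where the test holds and compare with the threshold."""
    count = 0  # temporary counter, initialised to 0
    for index_element, trapdoor_element in zip(index, trapdoor):
        if equation_holds(index_element, trapdoor_element):
            count += 1
    # Similar when the count is equal to or greater than the threshold;
    # None stands in for the "not similar" output.
    return True if count >= threshold else None


# Stand-in predicate: plain equality instead of the real search equation.
same = lambda a, b: a == b
similar = similarity_search([1, 2, 3, 4], [1, 2, 9, 4], same, threshold=3)
not_similar = similarity_search([1, 2, 3, 4], [1, 2, 9, 4], same, threshold=4)
```

Here three of the four positions match, so the first call reports similarity at threshold 3 and the second does not at threshold 4.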

Correctness

The search correctness of the proposed technique can be shown as follows. Assuming that the two i-th sub-features are identical, then given the i-th searchable index and the i-th trapdoor, the equality check performed by the search algorithm for that position holds.

Efficient revocation

To revoke a user's ability to search over a data item M owned by an owner, the owner can request removal of the index corresponding to M for that user. Even after the index has been removed, the user can still generate a valid trapdoor for M. However, because the index has been removed, the trapdoor cannot be used for the similarity test on M. Removing the index does not affect the other users in the user set, which means that the other users can still retrieve M through search while their own indices remain valid.
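The revocation behaviour described above can be illustrated with a small bookkeeping sketch. It assumes the server keeps one stored index per (owner, user, data item) triple; the class `IndexStore`, its methods, and the identifiers used below are hypothetical and only model which entries exist, not the cryptography.

```python
class IndexStore:
    """Toy model of the server-side index storage (hypothetical layout)."""

    def __init__(self) -> None:
        # Maps (owner, user, item) to the stored index for that user.
        self._indices: dict[tuple[str, str, str], object] = {}

    def upload(self, owner: str, user: str, item: str, index: object) -> None:
        self._indices[(owner, user, item)] = index

    def revoke(self, owner: str, user: str, item: str) -> None:
        # Removing one entry leaves the entries of other users untouched.
        self._indices.pop((owner, user, item), None)

    def can_search(self, owner: str, user: str, item: str) -> bool:
        # A trapdoor is useful only while a matching index is stored.
        return (owner, user, item) in self._indices


store = IndexStore()
store.upload("owner1", "alice", "M", index="index-for-alice")
store.upload("owner1", "bob", "M", index="index-for-bob")
store.revoke("owner1", "alice", "M")
alice_ok = store.can_search("owner1", "alice", "M")
bob_ok = store.can_search("owner1", "bob", "M")
```

After the revocation only the entry for the revoked user is gone, so no re-keying or re-indexing for the remaining users is modelled.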

Accelerating search via pre-computation

An important feature of the proposed technique is that trapdoor generation time and search time can be sharply reduced through pre-computation. First, a method for pre-computing the trapdoor is described. The trapdoor consists of two parts. The first part is the encryption of the query data, and the second part is the owner's public-key part. The two parts are linked by a random value. To accelerate trapdoor generation, the user can select the random value and compute the public-key part in advance. The user can store the random value and the pre-computed public-key part in its local storage (meaning a storage device or storage unit included in the user terminal). Later, when the user wants to search for some data, the user can use the stored random value to compute the query-data part, and can send the resulting query-data part together with the pre-computed public-key part to the server as the trapdoor. Such pre-computation can reduce trapdoor generation time by about 89%.
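The offline/online split described above can be sketched as follows. This is a toy under loudly stated assumptions: the concrete formulas are not reproduced here, so the sketch uses plain modular exponentiation in a tiny group (`P = 2039`, `Q = 1019`, `G = 4`), a hash-to-exponent helper `h`, and invented formulas for both trapdoor parts. None of these names or formulas come from the patent; only the structure (pick the random value early, compute the public-key part early, finish the query part later) follows the text.

```python
import hashlib
import secrets

# Toy parameters (assumption): G generates the order-Q subgroup of Z_P^*.
P, Q, G = 2039, 1019, 4


def h(data: bytes) -> int:
    """Hash a byte string to an exponent modulo Q (hypothetical helper)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def precompute(owner_public_key: int) -> tuple[int, int]:
    """Offline phase: pick the random value and compute the public-key part."""
    r = secrets.randbelow(Q - 1) + 1           # random value in [1, Q-1]
    public_key_part = pow(owner_public_key, r, P)
    return r, public_key_part                  # kept in local storage


def finish_trapdoor(stored: tuple[int, int], query: bytes) -> tuple[int, int]:
    """Online phase: only the query-dependent part is computed."""
    r, public_key_part = stored
    query_part = pow(G, (r * h(query)) % Q, P)  # invented formula, illustration only
    return query_part, public_key_part


owner_public_key = pow(G, 7, P)                # hypothetical owner key
stored = precompute(owner_public_key)
trapdoor = finish_trapdoor(stored, b"query sub-feature")
```

The point of the split is that the exponentiation involving the owner's public key happens before any query exists, so the online cost is limited to the query-dependent part.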

Next, a method for reducing the search time is described. To this end, the user can pre-compute the owner's public-key part and transmit it to the server in advance. Upon receiving it, the server can pre-compute the left-hand side of the similarity search equation and store it in its local storage (meaning a storage device or storage unit included in the server). Later, when the user transmits the trapdoor to the server, the server can compute the right-hand side of the similarity search equation. The server can then perform the similarity search by comparing the stored pre-computed value with the newly computed value. According to experiments, about 50% of the total operations required to perform a search can be computed in advance.
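The server-side caching described above can be sketched as follows. The class `PrecomputingServer` and the two stand-in functions `toy_left` and `toy_right` are hypothetical; they replace the real left-hand and right-hand sides of the search equation, which are not reproduced here. The sketch only shows that the left-hand sides are computed once when the public-key part arrives and are then reused at query time.

```python
from typing import Callable, Optional, Sequence


class PrecomputingServer:
    """Caches the left-hand sides so a query only needs the right-hand sides."""

    def __init__(
        self,
        index: Sequence[int],
        left_side: Callable[[int, int], int],
        right_side: Callable[[int], int],
    ) -> None:
        self._index = list(index)
        self._left_side = left_side
        self._right_side = right_side
        self._cached_left: Optional[list[int]] = None

    def precompute(self, public_key_part: int) -> None:
        # Done once, before any query arrives.
        self._cached_left = [self._left_side(e, public_key_part) for e in self._index]

    def search(self, trapdoor: Sequence[int], threshold: int) -> bool:
        if self._cached_left is None:
            raise RuntimeError("precompute() must be called before search()")
        matches = sum(
            1
            for left, element in zip(self._cached_left, trapdoor)
            if left == self._right_side(element)
        )
        return matches >= threshold


left_calls = 0


def toy_left(index_element: int, public_key_part: int) -> int:
    """Stand-in for the left-hand side; also counts how often it is evaluated."""
    global left_calls
    left_calls += 1
    return (index_element * public_key_part) % 97


def toy_right(trapdoor_element: int) -> int:
    """Stand-in for the right-hand side."""
    return trapdoor_element % 97


server = PrecomputingServer([3, 5, 7], toy_left, toy_right)
server.precompute(public_key_part=10)          # cached left sides: 30, 50, 70
first = server.search([30, 50, 1], threshold=2)
second = server.search([30, 50, 1], threshold=3)
```

Both searches reuse the same cached values, so the stand-in left-hand side is evaluated three times in total, once per index element.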

The device described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications executed on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device is sometimes described as a single device, but those of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may command the processing device independently or collectively. Software and/or data may be embodied permanently or temporarily in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored in one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiment, and vice versa.

Although the present invention has been described with reference to the embodiment shown in the drawings, this is merely exemplary, and those of ordinary skill in the art will understand that various modifications and equivalent other embodiments are possible. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents. Accordingly, the true scope of technical protection of the present invention should be determined by the technical spirit of the appended claims.

Claims

1. A similarity search system comprising a first terminal, a second terminal, and a server, wherein:
the server receives encrypted first data (M) from the first terminal, receives from the second terminal a query including a trapdoor generated based on second data (M'), determines the similarity between the second data (M') and the first data (M), and transmits the similarity determination result to the second terminal;
the first terminal generates an index for the first data (M) through communication with the second terminal and transmits the index to the server;
the second terminal generates the trapdoor;
the server determines the similarity using the index and the trapdoor;
the first terminal selects a plurality of arbitrary values, the number of which is a natural number greater than or equal to 2, computes two values for each of them to generate a first index, and transmits the first index to the second terminal;
the computation uses sub-features extracted from the first data (M); and
the computation uses generators of respective multiplicative cyclic groups.

2. (Deleted)

3. The similarity search system of claim 1, wherein the second terminal selects a plurality of arbitrary values, computes two values for each of them to generate a second index, and transmits the second index to the first terminal, the computation using a secret key pair of the second terminal.

4. The similarity search system of claim 3, wherein the first terminal generates the index from the second index using a secret key of the first terminal.

5. The similarity search system of claim 4, wherein the second terminal selects a plurality of arbitrary values and computes two values for each of them to generate the trapdoor, the computation using sub-features extracted from the second data (M').

6. The similarity search system of claim 5, wherein the server determines, for every position, whether a predetermined equation holds, and determines that the first data (M) and the second data (M') are similar when the number of times the equation holds is equal to or greater than a predetermined threshold.

7. A similarity search method performed in a similarity search system including a first terminal, a second terminal, and a server, the method comprising:
receiving, by the server, encrypted first data (M) from the first terminal;
generating, through the first terminal and the second terminal, an index for the first data (M) and transmitting the index to the server;
generating, by the second terminal, a trapdoor based on second data (M') and transmitting a query including the trapdoor to the server; and
determining, by the server, a degree of similarity between the first data (M) and the second data (M') using the index and the trapdoor,
wherein transmitting the index to the server comprises:
selecting, by the first terminal, a plurality of arbitrary values, the number of which is a natural number greater than or equal to 2, computing two values for each of them to generate a first index, and transmitting the first index to the second terminal;
selecting, by the second terminal, a plurality of arbitrary values, computing two values for each of them to generate a second index, and transmitting the second index to the first terminal; and
generating, by the first terminal, the index from the second index,
wherein the computations use sub-features extracted from the first data (M), a secret key pair of the second terminal, generators of respective multiplicative cyclic groups, and a secret key of the first terminal.

8. (Deleted)

9. The similarity search method of claim 7, wherein transmitting the query to the server comprises selecting, by the second terminal, a plurality of arbitrary values and computing two values for each of them to generate the trapdoor, the computation using sub-features extracted from the second data (M').

10. The similarity search method of claim 9, wherein determining the similarity comprises determining, by the server, for every position, whether a predetermined equation holds, and the server determines that the first data (M) and the second data (M') are similar when the number of times the equation holds is equal to or greater than a predetermined threshold.