KR20240022203A

KR20240022203A - Communication system for private information retrieval using user preference based cache and its operation method

Info

Publication number: KR20240022203A
Application number: KR1020220100578A
Authority: KR
Inventors: 최완; 강준혁; 이호중
Original assignee: 서울대학교산학협력단; 한국과학기술원
Priority date: 2022-08-11
Filing date: 2022-08-11
Publication date: 2024-02-20

Abstract

A technique for achieving private information retrieval (PIR) using preference-based caching at a user terminal is disclosed. The user terminal stores caching data for each file held in a plurality of databases based on its preference for that file, transmits query information for requesting a target file to the plurality of databases based on the caching data, and detects the target file from the response strings received from the plurality of databases.

Description

Communication system for private information retrieval using a user-preference-based cache, and operating method thereof

The following description relates to a private information retrieval (PIR) technique that guarantees the privacy of data requested from multiple databases.

A database is an essential network element that stores the data individuals need. However, a publicly accessible database can infer information about a user from the history of the data that user has requested, which puts personal privacy at risk. Private information retrieval (PIR) is a communication technique that prevents the database from inferring which data an individual requested. It is emerging as one way to receive data from a database quickly while keeping the identity of the requested data private.

Research on PIR has mainly proposed schemes in which an individual hides the identity of the desired data by requesting it from the database together with other data. To this end, the user sends a query that requests both the desired data and other data. Because this increases the amount of information in the query, the cost of downloading data from the database may also increase. A method is therefore needed that reduces the download cost while still letting the user obtain the desired data without revealing which data it is.

Recent work has established the minimum amount of query information needed to avoid revealing the desired data when a terminal without a cache requests data from the databases. Study [1] presents a technique that generates queries with a greedy iterative algorithm. Study [2] presents a technique that, when the query is sufficiently long, uses stochastic generation to quickly request data from the databases with the minimum amount of query information and without revealing the desired data. Study [3] addresses the case in which caching data is stored in the user terminal without regard to preference, and presents a communication technique that uses this cached data for more efficient PIR.

[1] H. Sun, S. Jafar, “The capacity of private information retrieval,” IEEE Trans. Inf. Theory, vol. 63, no. 7, pp. 4075-4088, Jul. 2017.

[2] H. Seo, W. Choi, “A Stochastic Approach in Private Information Retrieval,” IEEE WCNC, pp. 1-6, Apr. 2018.

[3] H. Seo, H. Lee, W. Choi, “Fundamental Limits of Private Information Retrieval with Unknown Cache Prefetching,” IEEE Trans. Comm., vol. 69, no. 12, pp. 8132-8144, Dec. 2021.

The present disclosure provides a communication system technique for efficient private information retrieval that uses data cached according to the preferences of a personal terminal.

By storing caching data based on the user's preferences, private information retrieval can be implemented with more efficient use of communication resources.

Provided is a method performed in a user terminal, wherein the user terminal includes at least one processor configured to execute computer-readable instructions stored in a memory, the method comprising: storing, by the at least one processor, caching data for each file stored in a plurality of databases based on the preference at the user terminal; transmitting, by the at least one processor, query information for requesting a target file to the plurality of databases based on the caching data; and detecting, by the at least one processor, the target file from response strings received from the plurality of databases.

According to one aspect, the storing may include storing the caching data on the basis of the preference, in a manner that depends on the caching data capacity of the user terminal.

According to another aspect, the storing may include storing the caching data starting from the file with the highest preference when the caching data capacity of the user terminal is equal to or greater than a certain size determined based on the number of databases and the number of files.

According to yet another aspect, when the caching data capacity (z) of the user terminal is equal to or greater than [...] (where N is the number of databases and K is the number of files), the storing may include storing an equal amount [...] of each of the K files in a portion [...] of the total data capacity (where M_f is the size of a single file), and storing data in the remaining capacity [...] preferentially from the file with the highest preference.

According to yet another aspect, the storing may further include storing the caching data through linear combination when the caching data capacity of the user terminal is less than the certain size.

According to yet another aspect, the transmitting may include generating a different query for each of the plurality of databases and transmitting it as the query information for requesting the target file.

According to yet another aspect, the query information may include query vectors, each a binary sequence of a predetermined length, and an index information vector, a tuple of the predetermined length. The query vector indicates which information bits are to be linearly combined at the database, and the index information vector indicates the starting point from which bits participate in the linear combination.

According to yet another aspect, each database may receive the query information from the user terminal and generate, as the response bits to the query information, a response string that is a function of the files stored in that database.

A computer program stored on a computer-readable recording medium is provided to cause a computer device to execute the method.

Provided is a computer-implemented user terminal comprising at least one processor configured to execute computer-readable instructions stored in a memory, wherein the at least one processor performs: storing caching data for each file stored in a plurality of databases based on the preference at the user terminal; transmitting query information for requesting a target file to the plurality of databases based on the caching data; and detecting the target file from response strings received from the plurality of databases.

According to embodiments of the present invention, caching data is stored based on the user's preferences and the data the user wants is retrieved using the cached data, so that private information retrieval can be implemented with more efficient use of communication resources.

FIG. 1 shows an example of a communication network configuration in one embodiment of the present invention.
FIG. 2 shows a user's preference for each file in one embodiment of the present invention.
FIG. 3 shows database data and terminal caching data in one embodiment of the present invention.
FIG. 4 shows experimental results for the normalized download cost as a function of the normalized caching capacity in one embodiment of the present invention.
FIG. 5 shows an example of a preference-based data caching technique in one embodiment of the present invention.
FIG. 6 shows an example of a caching data storage technique in one embodiment of the present invention.
FIG. 7 is a flowchart illustrating a preference-based data retrieval method in one embodiment of the present invention.
FIG. 8 is a block diagram showing an example of a computer system according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

Embodiments of the present invention relate to a technique that guarantees the privacy of data an individual requests from a database.

Embodiments, including those specifically disclosed herein, can provide a communication system technique for efficient private information retrieval using data cached according to the preferences of a personal terminal, and can thereby achieve significant advantages in cost reduction, privacy protection, data processing speed, and efficiency.

Communication network and data

FIG. 1 shows an example of a communication network configuration in one embodiment of the present invention.

Referring to FIG. 1, the present invention assumes a network consisting of one user terminal 110 and N databases 120 that transmit the data the user requests. The databases 120 do not exchange information with each other (they are non-colluding), and each database 120 stores the same set of K distinct files. All K files have the same length of M_f bits. The number of files is assumed to satisfy [...], so that the bit strings of the files do not overlap, and the bit string of each file is expressed as in Equation 1.

[Equation 1]

FIG. 2 shows the user's preference for each file in one embodiment of the present invention, and FIG. 3 shows database data and terminal caching data in one embodiment of the present invention.

In this embodiment, the preference may be calculated based on, for example, the number of accesses to a file, the access frequency, and the access period.

For example, as shown in FIG. 2, the user has a different preference for each of the K files (F₁, F₂, …, F_K), and the probability that the user terminal 110 requests a file from the databases differs from file to file according to that preference. The files are assumed to be numbered in descending order of preference.
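As a minimal sketch of how such preferences might be turned into request probabilities, the Python snippet below normalizes per-file access counts and renumbers the files in descending order of preference. The proportional rule, the function name `request_probabilities`, and the sample counts are assumptions made for illustration; the text only states that the preference may be based on access count, frequency, or period.

```python
def request_probabilities(access_counts):
    """Turn per-file access counts into request probabilities, most preferred first.

    Assumption: the request probability is proportional to the access count.
    """
    total = sum(access_counts)
    if total <= 0:
        raise ValueError("at least one access is needed to estimate preferences")
    probs = [count / total for count in access_counts]
    # Order the original file indices so that the most preferred file comes first.
    order = sorted(range(len(probs)), key=lambda k: probs[k], reverse=True)
    return order, [probs[k] for k in order]


# Hypothetical access counts for K = 4 files.
order, p = request_probabilities([5, 20, 10, 15])
print(order)  # original indices, most preferred first
print(p)      # request probabilities in descending order
```

With this ordering in place, file 1 in the renumbered list is the one the terminal is most likely to request.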

Communication between the user terminal 110 and the databases 120 takes place over a noiseless channel, and two-way communication is possible.

The user terminal 110 can store caching data, and the caching data capacity is M_c = z·M_f bits, where z is the normalized caching capacity, that is, the ratio of the cached data size to the size of a single file (z = M_c / M_f). The caching data Z stored in the user terminal 110 is expressed as in Equation 2. Once stored, the caching data remains fixed until reception of the response strings from the databases 120 is complete, and the databases 120 do not know what caching data is stored in the user terminal 110.

[Equation 2]

The user terminal 110 decides which data to store in its cache before deciding which file it wants, that is, before the target file is determined. Depending on the value of z, the data stored in Z takes one of two forms: data bits formed as random linear combinations of arbitrary single bits from different files selected with a certain probability, or single bits of individual files stored in a certain proportion.

Query configuration

The present invention addresses the situation in which the user terminal 110 uses the stored caching data as side information and requests the file the user wants from the databases 120 while preserving privacy. The index of the wanted file is determined, among the K files, by the request probabilities described above. To obtain the wanted file, the user terminal 110 generates a different query for each of the N databases 120 and transmits it. The queries transmitted to the databases 120 to request the wanted file are expressed as in Equation 3.

[Equation 3]

The query sent to an arbitrary database n contains [...] query vectors, each a K-length binary sequence, and an index information vector, a K-length tuple. The query vectors and the index information vector can be expressed as in Equation 4.

[Equation 4]

The role of the query vector is to tell the database 120 which messages to take information bits from for the linear combination. For example, if the database holds two messages, sending the query vector [1 0] requests an information bit from the first message, sending [0 1] requests an information bit from the second message, and sending [1 1] requests a response bit formed by linearly combining one information bit of the first message with one information bit of the second message. Query vectors are generated taking the cached data into account, so that the user terminal 110 can request data from the databases 120 without revealing which data it wants. The generation procedure is described in detail below.
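The two-message example can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: it assumes the linear combination is taken over single bits, so that combining amounts to an XOR, and the function name `response_bit` and the sample bit values are hypothetical.

```python
def response_bit(query_vector, file_bits):
    """XOR one information bit from every message whose query entry is 1.

    Assumption: "linear combination" of bits means addition modulo 2 (XOR).
    """
    bit = 0
    for selected, b in zip(query_vector, file_bits):
        if selected:
            bit ^= b
    return bit


# Hypothetical current bits of message 1 and message 2.
bits = [1, 0]
only_first = response_bit([1, 0], bits)   # bit of the first message
only_second = response_bit([0, 1], bits)  # bit of the second message
combined = response_bit([1, 1], bits)     # XOR of both bits
```

A terminal that already knows one of the two bits, for instance from its cache, can recover the other from `combined` by XORing the known bit back out.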

The index information vector specifies, among the M_f bits of each file, the starting point from which bits participate in the combination. Producing a response bit requires knowing not only which messages' information bits are to be linearly combined but also which bit positions are to be combined. To minimize the upload cost, this technique uses each message's information bits in their original order when forming response bits, so the user must send each database the initial message bit index that marks the start of that order.

Configuration of the response string

Upon receiving the query, database n generates a response string of [...] bits, expressed as in Equation 5. The response string is a function of the query and the files stored in the database.

[Equation 5]

The next function calls the information bits sequentially, starting from the data bit at the initial file index.
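One possible reading of this database-side procedure is sketched below in Python. It is not the formula of Equation 5: the sketch assumes XOR as the linear combination, 0-based start indices, a per-file cursor that advances only when the file takes part in a combination, and wrap-around at the end of a file. The function name `build_response` and the sample data are hypothetical.

```python
def build_response(files, query_vectors, start_indices):
    """Return one response bit per query vector.

    files          -- K bit lists, all of the same length M_f
    query_vectors  -- list of K-length 0/1 lists
    start_indices  -- K-length tuple of starting positions (assumed 0-based)
    """
    cursors = list(start_indices)
    response = []
    for q in query_vectors:
        bit = 0
        for k, selected in enumerate(q):
            if selected:
                # "next": read the file's current bit, then advance its cursor.
                # Wrapping around at the end of the file is an assumption.
                bit ^= files[k][cursors[k] % len(files[k])]
                cursors[k] += 1
        response.append(bit)
    return response


# Hypothetical database with K = 2 files of M_f = 4 bits each.
files = [[1, 0, 1, 1], [0, 1, 1, 0]]
answer = build_response(files, [[1, 0], [1, 1], [0, 1]], (1, 0))
```

Because the bits of each file are consumed in their original order, the terminal only has to upload one start index per file instead of one index per requested bit.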

Privacy safety

Suppose that, for deterministic sequences [...], the tuple of sequences is expressed as in Equation 6.

[Equation 6]

In addition, an empirical probability mass function may be defined as in Equation 7.

[Equation 7]

Under this definition, a generated query is said to be privacy-safe if it belongs to the set of sequence tuples expressed in Equation 8.

[Equation 8]

The performance of this technique can be measured by the normalized download cost: the ratio of the total amount of response-string data that must be downloaded from all databases to receive an arbitrary file the user wants, to the amount of data in a single file, M_f. The normalized download cost over all files is expressed as in Equation 9.

[Equation 9]

The performance of this technique can be measured using the metric in Equation 9.
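The metric can be illustrated with a short Python sketch. The per-file cost follows the definition in the text (total downloaded response bits divided by M_f). Averaging the per-file costs with the request probabilities is an assumption about how the cost over all files is formed, and the function names and numbers are hypothetical.

```python
def normalized_download_cost(response_lengths, file_size_bits):
    """Total response bits received from all databases, divided by M_f."""
    return sum(response_lengths) / file_size_bits


def expected_cost(request_probs, per_file_costs):
    """Assumed overall metric: per-file costs weighted by request probability."""
    return sum(p * d for p, d in zip(request_probs, per_file_costs))


# Hypothetical case: N = 2 databases each return 750,000 bits for a 1 Mbit file.
cost = normalized_download_cost([750_000, 750_000], 1_000_000)

# Hypothetical two-file case: the first file is fully cached (cost 0).
average = expected_cost([0.5, 0.5], [0.0, cost])
```

A cost of 1 would mean that exactly one file's worth of data is downloaded; values above 1 are the price paid for hiding which file was requested.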

When queries are generated taking the caching data into account, the normalized download cost as a function of the normalized caching capacity z, the ratio of the storage capacity to the size of a single file, is given by Equation 10.

[Equation 10]

Here, [...].

FIG. 4 compares, by simulation, the above result with an existing technique that does not take user preferences into account. The simulations cover two settings: 2 databases with 4 requestable files, and 3 databases with 7 requestable files. Each file is assumed to be 1 Mbit in size. The normalized download cost is compared as a function of z, the ratio of the user terminal's caching data capacity to the size of a single file.

Unlike the proposed technique, which caches data according to the preference for each file, the existing technique caches data without considering preferences, under the assumption that every file is equally likely to be requested. The results show that, in the range [...] where data is stored as single bits according to preference, the proposed technique reduces the normalized download cost by up to 30% (0.5 -> 0.35) and 27% (0.64 -> 0.47).
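The quoted percentages can be checked from the cost pairs with a few lines of Python. The helper name `relative_reduction` is hypothetical; the four cost values are the ones quoted in the text.

```python
def relative_reduction(baseline, proposed):
    """Fraction by which the proposed cost is lower than the baseline cost."""
    return (baseline - proposed) / baseline


# Normalized download costs quoted in the text: (existing, proposed).
pairs = [(0.5, 0.35), (0.64, 0.47)]
reductions = [relative_reduction(b, p) for b, p in pairs]
for (b, p), r in zip(pairs, reductions):
    print(f"{b} -> {p}: {r:.1%}")
```

The first pair gives 30%, and the second gives about 26.6%, which the text rounds to 27%.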

In the present invention, the user stores caching data and generates queries in different ways depending on the user's caching data capacity. The capacity regime is determined by z, the ratio of the caching data capacity to the size M_f of a single file.

(i) When the caching data capacity is [...]

Of the total data capacity, a portion [...] is used to store an equal amount [...] of each of the K files. The remaining capacity [...] is then filled with data starting from the file with the highest preference.

Referring to FIG. 5, suppose there are K = 9 files in total, the file size is M_f = 6 Mb, and there are 2 databases. The technique first stores an equal 6/3 = 2 Mb of each of the 9 files. Any remaining cache capacity is then filled starting from the file with the highest preference (the leftmost file). In the illustrated case, the 4th file ends up with a total of 4 Mb cached, and each subsequent file keeps only the 2 Mb stored initially.
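The two-stage allocation of case (i) can be sketched in Python as below. The equal share per file is passed in as a parameter rather than derived, and the total capacity of 32 Mb is an assumption inferred from the described outcome (three files fully cached, 4 Mb of the 4th file, 2 Mb of each remaining file); the function name `preference_cache_allocation` is hypothetical.

```python
def preference_cache_allocation(num_files, file_size, equal_share, capacity):
    """Return the cached amount per file, files ordered from most to least preferred."""
    base_total = num_files * equal_share
    if capacity < base_total:
        raise ValueError("capacity is below the equal-share stage; case (i) does not apply")
    # Stage 1: every file receives the same share.
    cached = [equal_share] * num_files
    remaining = capacity - base_total
    # Stage 2: top up files in descending order of preference.
    for k in range(num_files):
        top_up = min(file_size - cached[k], remaining)
        cached[k] += top_up
        remaining -= top_up
        if remaining == 0:
            break
    return cached


# FIG. 5 setting in Mb: K = 9, M_f = 6, equal share 2; capacity 32 is assumed.
allocation = preference_cache_allocation(num_files=9, file_size=6, equal_share=2, capacity=32)
print(allocation)
```

Under these assumptions the result is `[6, 6, 6, 4, 2, 2, 2, 2, 2]`, so a request for any of the three most preferred files would need no download at all.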

After all caching data has been stored, the user decides which file is wanted and generates the queries for requesting it. If the wanted file is already fully cached, the required download cost is 0. Otherwise, the following procedure is used.

As described above, the query to be sent to an arbitrary database n is formed by bundling [...] query vectors with an index information vector, each generated as follows.

The query vectors are generated according to the following rule.

Every element of each of the [...] query vectors is 1. That is, data is requested in the form of a linear combination of data from all files. In this case the same amount of data is requested from every file, so it can be shown that the query vectors generated in this way satisfy [...]; in other words, the query is safe.

For the index information vector, every element except the one corresponding to the file index the user wants to request carries the index of the currently cached data, and is sent identically to all databases. If b is the index of the first bit of the wanted file that is not cached in the terminal, the corresponding element of the index information vector sent to the first database is b, and the corresponding element of the index information vector sent to the i-th database is [...].

Caching for the region where the capacity is below [...] is as follows.

(ii) When the caching data capacity is [...]

As shown in FIG. 6, of the user's total storage capacity of M_c bits, M' bits are filled with bits formed by linear combination. The bits to be stored through linear combination are determined in the following order.

First, some of the K files are selected with probability [...]. In FIG. 6, the files selected for the first storage slot Z₁ are the 1st and 4th files. Next, one bit to be stored is selected uniformly at random from each selected file.

Once the bits have been selected, the bits selected for each storage slot are linearly combined and stored in that slot of the storage space.

As shown in FIG. 6, when the total number of files is K = 4, M_c = 5, and M' = 3, the first stored bit Z₁ is formed from bits chosen uniformly at random from the files selected with a certain probability among the K = 4 files (k = 1, 4). The two chosen bits are linearly combined and the result is stored in Z₁. The capacity beyond the M' bits stored as linear combinations holds, in each slot, one uniformly selected bit of one uniformly selected file. In the example, an arbitrary bit of the 2nd file and an arbitrary bit of the 3rd file are stored in Z₄ and Z₅, respectively.
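The cache construction of case (ii) can be sketched in Python as follows, under several stated assumptions: the linear combination is an XOR, each file is selected independently with a hypothetical probability `select_prob` (the actual selection probability is not given here), and an empty selection is replaced by one uniformly chosen file. The function name `cache_linear_combinations` and the random test data are also hypothetical.

```python
import random


def cache_linear_combinations(files, num_coded, num_uncoded, select_prob, rng):
    """Build cached bits: `num_coded` XOR combinations, then `num_uncoded` plain bits.

    Each cache entry records which (file, position) pairs it was built from.
    """
    num_files, file_len = len(files), len(files[0])
    cache = []
    for _ in range(num_coded):
        chosen = [k for k in range(num_files) if rng.random() < select_prob]
        if not chosen:
            # Assumption: never store an empty combination.
            chosen = [rng.randrange(num_files)]
        # One uniformly chosen bit position per selected file.
        positions = {k: rng.randrange(file_len) for k in chosen}
        bit = 0
        for k, pos in positions.items():
            bit ^= files[k][pos]
        cache.append({"bits": positions, "value": bit})
    for _ in range(num_uncoded):
        # One uniformly chosen bit of one uniformly chosen file.
        k = rng.randrange(num_files)
        pos = rng.randrange(file_len)
        cache.append({"bits": {k: pos}, "value": files[k][pos]})
    return cache


# FIG. 6 setting: K = 4 files, M_c = 5 slots, M' = 3 coded slots.
rng = random.Random(0)
files = [[rng.randrange(2) for _ in range(8)] for _ in range(4)]
cache = cache_linear_combinations(files, num_coded=3, num_uncoded=2, select_prob=0.5, rng=rng)
```

The first three entries play the role of Z₁ to Z₃ and the last two the role of Z₄ and Z₅; which files and positions are drawn depends on the random seed.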

For each of the [...] query vectors, the elements other than the one corresponding to the wanted file index are set as follows: in the k-th query vector, the element of a file that is included in the user's k-th cached data is set to 1, and the others are set to 0.

The [...] elements not yet determined (the element of each query vector corresponding to the wanted file index) are set as follows. Among the user's cached data, only the indices whose cached data contains information from the wanted file are collected; call this set of indices [...]. Starting sequentially from the query vector of the [...]-th index in that set, only the wanted-file element of the query vector of the [...]-th index is set to 0, and the others are set to 1. This is expressed as in Equation 11.

[Equation 11]

It can be shown that the query vectors generated in this way satisfy [...]; in other words, the query is safe when the file size is sufficiently large.

The index information vector for the first database is generated by drawing K values uniformly at random from the file length M_f. The index information vector for the [...]-th database is then generated by adding [...] to each of the K elements of the index information vector for the first database.

(iii) When the caching data capacity is [...]

Data is stored in the cache in the same way as in case (ii). This time, of the total storage capacity of M_c bits, only [...] bits are filled, all with bits generated by linear combination. The rest of the capacity is not considered.

As described above, the query to be sent to an arbitrary database n is formed from [...] query vectors and an index information vector, each generated as follows.

전체 전송하는 큐어리 벡터의 개수는 일 때 개의 큐어리 벡터를 보내며 큐어리 벡터 은 다음 규칙에 따라서 생성된다.The total number of transmitted Quri vectors is when Sending four curary vectors, curary vector is created according to the following rules.

First, the query vectors of the 1st through M''-th indices are determined in the following manner, as in case (i).

The elements that have not yet been determined (one element position in each query vector) are set as follows. Among the user's cached data, only the indices that contain information about the corresponding file are selected, and these indices form a set. Proceeding sequentially through the query vectors indexed by that set, only the element at the corresponding position is set to 0 and all other elements are set to 1. This can be expressed as Equation 12.

[Equation 12]

The remaining query vectors, those of the (M''+1)-th through M'''-th indices, are generated in the same way as in technique [2].

The index information vector is generated through the same process as described in (ii).

That is, including the system configuration described above, the algorithm of the overall communication technique proposed in the present invention can be expressed as shown in FIG. 7 for the user terminal 110 and the database 120, respectively.

Referring to FIG. 7, the user terminal 110 may store preference-based caching data (S11). When data to be retrieved from the database 120 arises in response to a user request, the user terminal 110 generates a query vector and an index information vector (S12 to S14).

The user terminal 110 transmits the query information for the desired data (that is, the query vector and the index information vector) to the database 120 (S15).

After receiving from the user terminal 110 the query information for the data desired by the user terminal 110 (that is, the query vector and the index information vector) (S21), the database 120 constructs a response string that is a function of the query and the files stored in the database, and transmits it to the user terminal 110 (S22 to S23).

The user terminal 110 then receives the response string for the query information from the database 120 and detects the desired data from the received response string (S16 to S17).
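The query, response, and detection flow described above (S15, S21 to S23, S16 to S17) can be illustrated with the classical two-database XOR scheme for private information retrieval. This is a deliberately simplified stand-in, not the cache-aided construction of this disclosure: it uses no caching data and no index information vector, and all function names are hypothetical. It assumes that both databases hold identical copies of equal-length files and do not collude.

```python
import secrets

def make_queries(num_files, target):
    """User side: a random binary query vector, and a copy with the target bit flipped."""
    q1 = [secrets.randbelow(2) for _ in range(num_files)]
    q2 = list(q1)
    q2[target] ^= 1  # the two queries differ only at the target file
    return q1, q2

def answer(files, query):
    """Database side: XOR (linear combination over GF(2)) of the selected files."""
    acc = bytes(len(files[0]))  # all-zero string; files are assumed equal length
    for f, bit in zip(files, query):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, f))
    return acc

def recover(a1, a2):
    """User side: XOR of the two responses leaves only the target file."""
    return bytes(x ^ y for x, y in zip(a1, a2))

files = [b"file-A..", b"file-B..", b"file-C.."]
q1, q2 = make_queries(len(files), target=1)
result = recover(answer(files, q1), answer(files, q2))
```

Each query on its own is a uniformly random binary vector, so neither database can tell which file was requested. The cost is that each database returns a full file-length response, which is the download overhead that caching-based schemes aim to reduce.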

These embodiments can be used as a technology for guaranteeing personal privacy in mobile or PC communication when a terminal device capable of storing data requests data from a server that holds the data.

FIG. 8 is a block diagram showing an example of a computer system according to an embodiment of the present invention. The user terminal 110 and the database 120 described above may be implemented by a computer system 800 configured as shown in FIG. 8.

As shown in FIG. 8, the computer system 800 may include, as components for executing the preference-based data retrieval method according to embodiments of the present invention, a memory 810, a processor 820, a communication interface 830, and an input/output interface 840.

The memory 810 is a computer-readable recording medium and may include random access memory (RAM), read only memory (ROM), and a permanent mass storage device such as a disk drive. A permanent mass storage device such as ROM or a disk drive may also be included in the computer system 800 as a separate permanent storage device distinct from the memory 810. An operating system and at least one program code may be stored in the memory 810. These software components may be loaded into the memory 810 from a computer-readable recording medium separate from the memory 810. Such a separate computer-readable recording medium may include a floppy drive, disk, tape, DVD/CD-ROM drive, memory card, or the like. In another embodiment, the software components may be loaded into the memory 810 through the communication interface 830 rather than from a computer-readable recording medium. For example, the software components may be loaded into the memory 810 of the computer system 800 based on a computer program installed from files received over the network 860.

The processor 820 may be configured to process instructions of a computer program by performing basic arithmetic, logic, and input/output operations. Instructions may be provided to the processor 820 by the memory 810 or the communication interface 830. For example, the processor 820 may be configured to execute received instructions according to program code stored in a recording device such as the memory 810.

The communication interface 830 may provide a function that allows the computer system 800 to communicate with other devices through the network 860. For example, requests, commands, data, files, and the like that the processor 820 of the computer system 800 generates according to program code stored in a recording device such as the memory 810 may be transmitted to other devices through the network 860 under the control of the communication interface 830. Conversely, signals, commands, data, files, and the like from other devices may be received by the computer system 800 through the communication interface 830 via the network 860. Signals, commands, and data received through the communication interface 830 may be passed to the processor 820 or the memory 810, and files may be stored in a storage medium (the permanent storage device described above) that the computer system 800 may further include.

The communication method is not limited and may include not only communication over the networks that the network 860 may comprise (for example, a mobile communication network, wired Internet, wireless Internet, or broadcasting network) but also short-range wired or wireless communication between devices. For example, the network 860 may include any one or more of a personal area network (PAN), a local area network (LAN), a campus area network (CAN), a metropolitan area network (MAN), a wide area network (WAN), a broadband network (BBN), and the Internet. The network 860 may also include any one or more network topologies, including but not limited to a bus network, star network, ring network, mesh network, star-bus network, and tree or hierarchical network.

The input/output interface 840 may be a means for interfacing with the input/output device 850. For example, input devices may include a microphone, keyboard, camera, or mouse, and output devices may include a display or speaker. As another example, the input/output interface 840 may be a means for interfacing with a device in which input and output functions are integrated, such as a touch screen. The input/output device 850 may be configured as a single device together with the computer system 800.

In other embodiments, the computer system 800 may include fewer or more components than those shown in FIG. 8. However, most prior-art components need not be shown explicitly. For example, the computer system 800 may be implemented to include at least some of the input/output devices 850 described above, or may further include other components such as a transceiver and various databases.

As described above, according to embodiments of the present invention, caching data is stored based on the user's preferences and the data desired by the user is retrieved based on the cached data, so that private information retrieval can be implemented with more efficient use of communication resources.
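The preference-based storing step can be sketched as a greedy fill that caches files in descending order of preference until the capacity is exhausted. This is only an illustration of the idea of storing data starting from the file with the highest preference: the function name, the use of numeric preference scores, and the choice to cache a partial file when capacity runs out are assumptions, and the capacity thresholds that select between the storing modes in the disclosure are not modeled.

```python
def fill_cache_by_preference(preferences, file_bits, capacity_bits):
    """Return {file_index: cached_bits}, filling highest-preference files first.

    preferences   -- one numeric score per file (hypothetical representation)
    file_bits     -- size of a single file in bits (all files equal size)
    capacity_bits -- caching data capacity of the user terminal in bits
    """
    order = sorted(range(len(preferences)),
                   key=lambda k: preferences[k], reverse=True)
    cached = {}
    remaining = capacity_bits
    for k in order:
        if remaining <= 0:
            break
        take = min(file_bits, remaining)  # partial caching is an assumption
        cached[k] = take
        remaining -= take
    return cached

# Three 8-bit files with a 12-bit cache: file 1 is cached fully, file 2 partially.
plan = fill_cache_by_preference([0.1, 0.6, 0.3], file_bits=8, capacity_bits=12)
```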

The device described above may be implemented with hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described as being used, but those skilled in the art will understand that a processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, a processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may command the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiments may be implemented in the form of program instructions executable through various computer means and recorded on a computer-readable medium. The medium may store a computer-executable program permanently, or store it temporarily for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or several pieces of hardware combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and other devices configured to store program instructions. Further examples include recording or storage media managed by app stores that distribute applications, and by sites or servers that supply or distribute various other software.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, those skilled in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

A method performed on a user terminal,
the user terminal including at least one processor configured to execute computer-readable instructions contained in a memory,
the method comprising:
storing, by the at least one processor, caching data based on preferences at the user terminal for each file stored in a plurality of databases;
transmitting, by the at least one processor, query information for requesting a target file, based on the caching data, to the plurality of databases; and
detecting, by the at least one processor, the target file from a response string received from the plurality of databases.

The method of claim 1,
wherein the storing step comprises
storing the caching data based on the preference according to the caching data capacity of the user terminal.

The method of claim 1,
wherein the storing step comprises,
when the caching data capacity of the user terminal is greater than a predetermined size determined based on the number of databases and the number of files, storing the caching data starting from the file with the highest preference.

The method of claim 1,
wherein, when the caching data capacity (z) of the user terminal is (where N is the number of databases and K is the number of files), the storing step comprises:
storing, in a portion of the total data capacity (where M_f is the size of a single file), an equal amount of each of the K files; and
storing, in the remaining data capacity, data starting with the file having the highest preference.

The method of claim 3,
wherein the storing step further comprises,
when the caching data capacity of the user terminal is less than the predetermined size, storing the caching data through linear combination.

The method of claim 1,
wherein the transmitting step comprises
generating a different query for each of the plurality of databases as the query information for requesting the target file, and transmitting the generated queries.

The method of claim 6,
wherein the query information includes a query vector, which is a binary sequence of a predetermined length, and an index information vector, which is a tuple of the predetermined length,
the query vector represents information about the information bits to be linearly combined in the database, and
the index information vector represents information about the starting point participating in the linear combination.

The method of claim 6,
wherein each of the databases receives the query information from the user terminal and generates, as response bits to the query information, a response string composed of a function of the files stored in the database.

A computer program stored in a computer-readable recording medium for executing the method of any one of claims 1 to 8 on a computer device.

A user terminal implemented by a computer,
comprising at least one processor configured to execute computer-readable instructions contained in a memory,
wherein the at least one processor performs:
a process of storing caching data based on preferences at the user terminal for each file stored in a plurality of databases;
a process of transmitting query information for requesting a target file, based on the caching data, to the plurality of databases; and
a process of detecting the target file from a response string received from the plurality of databases.

The user terminal of claim 10,
wherein the at least one processor
stores the caching data based on the preference according to the caching data capacity of the user terminal.

The user terminal of claim 10,
wherein the at least one processor,
when the caching data capacity of the user terminal is greater than a predetermined size determined based on the number of databases and the number of files, stores the caching data starting from the file with the highest preference.

The user terminal of claim 10,
wherein, when the caching data capacity (z) of the user terminal is (where N is the number of databases and K is the number of files), the at least one processor:
stores, in a portion of the total data capacity (where M_f is the size of a single file), an equal amount of each of the K files; and
stores, in the remaining data capacity, data starting with the file having the highest preference.

The user terminal of claim 12,
wherein the at least one processor,
when the caching data capacity of the user terminal is less than the predetermined size, stores the caching data through linear combination.

The user terminal of claim 10,
wherein the at least one processor
generates a different query for each of the plurality of databases as the query information for requesting the target file, and transmits the generated queries.

The user terminal of claim 15,
wherein the query information includes a query vector, which is a binary sequence of a predetermined length, and an index information vector, which is a tuple of the predetermined length,
the query vector represents information about the information bits to be linearly combined in the database, and
the index information vector represents information about the starting point participating in the linear combination.

The user terminal of claim 15,
wherein each of the databases receives the query information from the user terminal and generates, as response bits to the query information, a response string composed of a function of the files stored in the database.