CN116506226B - Private data processing system and method - Google Patents


Info

Publication number
CN116506226B
Authority
CN
China
Prior art keywords
vector
data
party
storage
inquiring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310767247.5A
Other languages
Chinese (zh)
Other versions
CN116506226A (en
Inventor
刘纪海
巫锡斌
陈超超
郑小林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Jinzhita Technology Co ltd
Original Assignee
Hangzhou Jinzhita Technology Co ltd
Filing date
Publication date
Application filed by Hangzhou Jinzhita Technology Co ltd filed Critical Hangzhou Jinzhita Technology Co ltd
Priority to CN202310767247.5A priority Critical patent/CN116506226B/en
Publication of CN116506226A publication Critical patent/CN116506226A/en
Application granted granted Critical
Publication of CN116506226B publication Critical patent/CN116506226B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

Embodiments of the present disclosure provide a private data processing system and method, wherein the private data processing system includes: the data inquiring party determines preset public parameters, encodes the inquiring element set based on the preset public parameters to obtain an inquiring encoding vector, calculates a first pseudo-random related vector according to the preset public parameters, calculates an intermediate vector according to the first pseudo-random related vector and the inquiring encoding vector, and sends the intermediate vector to the data storing party; the data storage party calculates a second pseudo-random correlation vector according to preset public parameters, obtains encrypted storage elements according to the intermediate vector and the second pseudo-random correlation vector, encodes a storage element set and a storage data set based on the encrypted storage elements, obtains a storage encoding vector, and sends the storage encoding vector to the data inquiry party; the data query party obtains an encryption query element based on the first pseudo-random correlation vector, and decodes the stored encoding vector based on the encryption query element to obtain a query result.

Description

Private data processing system and method
Technical Field
Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a private data processing system.
Background
With the advent of the internet big data era, data is generated and stored in a distributed manner. Mining the potential value of such data often leads to problems such as privacy leakage. Making data usable without making it visible, and thereby ensuring data security and privacy protection, is therefore an important problem to be solved. Private information retrieval (PIR) is a solution to the problem of protecting the privacy of a user's query. Its main aim is to complete a query submitted to a database on a server without leaking the private information of the target user, i.e., the server cannot learn the user's specific query information or the retrieved data item from the query process and its result. Most existing private information retrieval schemes are built on homomorphic encryption protocols. Their main performance metric is the communication cost of the querying party, while the computation cost of executing the PIR protocol is ignored, so query efficiency is low. How to provide an efficient query method in the private information retrieval scenario is therefore a problem that currently needs to be solved.
Disclosure of Invention
In view of this, the present description embodiments provide a private data processing system. One or more embodiments of the present specification also relate to a method of processing private data, a computing device, a computer-readable storage medium, and a computer program to solve the technical drawbacks existing in the prior art.
According to a first aspect of embodiments of the present specification, there is provided a private data processing system, the system comprising a data querying party and a data storage party, wherein the data querying party holds a query element set, and the data storage party holds a storage element set and a storage data set corresponding to the storage element set. In the case that the data querying party performs a query task with the data storage party:
the data querying party determines preset public parameters corresponding to the data storage party, encodes the query element set based on the preset public parameters to obtain a query encoding vector, calculates a first pseudorandom correlation vector according to the preset public parameters, calculates an intermediate vector according to the first pseudorandom correlation vector and the query encoding vector, and sends the intermediate vector to the data storage party;
the data storage party calculates a second pseudorandom correlation vector according to the preset public parameters, encrypts the storage element set according to the intermediate vector and the second pseudorandom correlation vector to obtain encrypted storage elements, encodes the storage element set and the storage data set based on the encrypted storage elements to obtain a storage encoding vector, and sends the storage encoding vector to the data querying party;
and the data querying party encrypts the query element set based on the first pseudorandom correlation vector to obtain encrypted query elements, and decodes the storage encoding vector based on the encrypted query elements to obtain a query result corresponding to the query task.
According to a second aspect of embodiments of the present specification, there is provided a private data processing method, the method being applied to a data querying party and a data storage party, wherein the data querying party holds a query element set, and the data storage party holds a storage element set and a storage data set corresponding to the storage element set. In the case that the data querying party performs a query task with the data storage party, the method comprises:
the data querying party determines preset public parameters corresponding to the data storage party, encodes the query element set based on the preset public parameters to obtain a query encoding vector, calculates a first pseudorandom correlation vector according to the preset public parameters, calculates an intermediate vector according to the first pseudorandom correlation vector and the query encoding vector, and sends the intermediate vector to the data storage party;
the data storage party calculates a second pseudorandom correlation vector according to the preset public parameters, encrypts the storage element set according to the intermediate vector and the second pseudorandom correlation vector to obtain encrypted storage elements, encodes the storage element set and the storage data set based on the encrypted storage elements to obtain a storage encoding vector, and sends the storage encoding vector to the data querying party;
and the data querying party encrypts the query element set based on the first pseudorandom correlation vector to obtain encrypted query elements, and decodes the storage encoding vector based on the encrypted query elements to obtain a query result corresponding to the query task.
According to a third aspect of embodiments of the present specification, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions that, when executed by the processor, perform the steps of the privacy data processing method described above.
According to a fourth aspect of embodiments of the present specification, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the above-described privacy data processing method.
According to a fifth aspect of embodiments of the present specification, there is provided a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the above-described privacy data processing method.
According to the embodiments of this specification, an oblivious data structure is constructed from a pseudorandom correlation generator and an oblivious key-value store to generate pseudorandom vectors for fixed inputs, which guarantees the security and privacy of both parties' data. The data querying party encrypts the query element set to obtain encrypted query elements, and the data storage party encrypts the storage element set and the storage data set to obtain encrypted storage elements. Because the storage element set of the data storage party corresponds to the storage data set, the data querying party can obtain the corresponding storage data by decoding the vector with its query element set, thereby realizing a private (trace-hiding) query.
Drawings
FIG. 1 is a schematic diagram of a private data processing system according to one embodiment of the present disclosure;
FIG. 2 is a process flow diagram of a private data processing system provided in one embodiment of the present disclosure;
FIG. 3 is a flow chart of a method of processing private data provided in one embodiment of the present disclosure;
FIG. 4 is a block diagram of a computing device provided in one embodiment of the present description.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many other forms than described herein and similarly generalized by those skilled in the art to whom this disclosure pertains without departing from the spirit of the disclosure and, therefore, this disclosure is not limited by the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
Furthermore, it should be noted that, user information (including, but not limited to, user equipment information, user personal information, etc.) and data (including, but not limited to, data for analysis, stored data, presented data, etc.) according to one or more embodiments of the present disclosure are information and data authorized by a user or sufficiently authorized by each party, and the collection, use, and processing of relevant data is required to comply with relevant laws and regulations and standards of relevant countries and regions, and is provided with corresponding operation entries for the user to select authorization or denial.
First, terms related to one or more embodiments of the present specification will be explained.
PIR: Private information retrieval (Private Information Retrieval), also called private query, is a special protocol in the field of secure multi-party computation. It involves two types of entities, a data party (sender) and a querying party (receiver). The querying party provides the query ID, and the data party provides the data ID and the data label. PIR requires that the query result be returned to the querying party without revealing the querying party's query ID to the data party and without the data party revealing any information other than the query result.
OPRF: An oblivious pseudorandom function (Oblivious Pseudo Random Function) is a cryptographic protocol in which the sender selects a random seed (key) and the receiver selects an input and obtains the output of the pseudorandom function on that input, while the sender learns nothing about the input.
OKVS: An oblivious key-value store (Oblivious Key-Value Store) is a data structure that preserves a key-value mapping while hiding the keys. Given a set of key-value pairs {(x1, y1), (x2, y2), (x3, y3)}, there is an OKVS function f such that f(x1) = y1, f(x2) = y2, f(x3) = y3, and for any other key f(x_other) is a random number.
PIR has a wide range of applications (e.g., medical, financial, government, civil, etc.) and is of great academic and industrial interest. At present, related researches on PIR have been successful in guaranteeing data privacy and protocol correctness, but a great amount of optimization space still exists when the PIR protocol is deployed in an actual application scene. We need not only to achieve the desired functionality of PIR, but also consider the efficiency of PIR.
The technical means for realizing PIR fall mainly into two types. The first type is based on a labeled unbalanced private set intersection protocol (Labeled Unbalanced Private Set Intersection, L-U-PSI) and uses two cryptographic primitives, namely an oblivious pseudorandom function (Oblivious Pseudo Random Function, OPRF) and homomorphic encryption (Homomorphic Encryption, HE). The idea of the most representative L-U-PSI scheme (Seal-PIR) is as follows. The data party and the querying party first execute the OPRF protocol to blind the set elements, so that the noise flooding operation of the HE stage can be omitted, which improves the computation performance of the HE stage and reduces the communication overhead. The HE stage is then executed: the querying party homomorphically encrypts the query ID and sends it to the data party; the data party performs homomorphic subtraction and multiplication on its data IDs and the query ID ciphertext, randomizes the result, and sends it to the querying party. The querying party obtains the label by decrypting the ciphertext. This type of scheme is also called keyword-PIR because it protects the query location of the data party.
The second type is based on the cryptographic primitive of oblivious transfer (Oblivious Transfer, OT). The idea is as follows: the scheme requires that the position of the queried element be known. The querying party queries position t, the two parties execute a 1-out-of-n OT, the querying party obtains the label at the t-th position, and the data party obtains no information. This type of scheme is called Index-PIR because it requires knowledge of the query location of the data party.
An analysis of these basic ideas and cryptographic primitives shows that keyword-PIR is the more commonly needed variant in real scenarios. However, the main performance metric of existing keyword-PIR schemes is the communication cost of the querying party, while the preprocessing cost of the data party and the computation cost of both parties in executing the PIR protocol are ignored, even though the preprocessing cost on the data side is the main run-time cost of the PIR protocol. Such schemes are therefore only suitable for scenarios in which the private set sizes of the data party and the querying party differ greatly and the data set of the data party is fixed. They are not suitable for scenarios in which the performance metric is the total running time (offline time plus online time), or in which a specified data set or an arbitrary data set is to be queried.
Based on this, in the present specification, a private data processing system, a private data processing method are provided, and the present specification relates to a computing device, and a computer-readable storage medium, and is described in detail in the following embodiments one by one.
Referring to FIG. 1, FIG. 1 shows a schematic diagram of a private data processing system provided in accordance with one embodiment of the present specification. The system includes a data querying party 102 and a data storage party 104. The data querying party holds a query element set, and the data storage party holds a storage element set and a storage data set corresponding to the storage element set. In the case where the data querying party performs a query task with the data storage party:
the data querying party 102 determines preset public parameters corresponding to the data storage party, encodes the query element set based on the preset public parameters to obtain a query encoding vector, calculates a first pseudorandom correlation vector according to the preset public parameters, calculates an intermediate vector according to the first pseudorandom correlation vector and the query encoding vector, and sends the intermediate vector to the data storage party.
The data querying party can be understood as a party with a requirement for private data querying, and the data storing party can be understood as a party with stored private data, for example, a certain enterprise queries a bank for credit information of each employee, a certain school queries an education department for school data of each student, and the like. In the private data query process, the data between the two parties are not revealed, namely the query data of the data query party is not revealed to the data storage party, the data storage party returns the query result to the data query party under the condition that no other information is revealed except the query result, for example, an enterprise queries credit investigation data of each employee to a bank, employee data used in the enterprise query is not revealed to the bank, and the bank can not reveal other personal private data of the employee to the enterprise except the credit investigation data queried by the enterprise, so that the private information retrieval is realized. The data querying party is provided with a query element set, query elements in the query element set are query elements related in the query process, such as the name, ID and the like of each employee, privacy data corresponding to each object can be queried through the query elements, the data storing party is provided with a storage element set and a storage data set, and storage elements in the storage element set correspond to storage data in the storage data set, so that corresponding storage elements and storage data can be matched according to the query elements, and data searching is realized.
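The privacy requirement described above can be stated as an ideal functionality: the output that a trusted third party would hand to the data querying party if both sides gave it their inputs. The following minimal sketch shows only that target behavior, not the protocol itself; the function name and the sample records are hypothetical.

```python
def ideal_keyword_pir(query_ids, store):
    """Return only the stored data whose identifiers were queried.

    The data storage party would receive nothing, and the data querying
    party would receive nothing about records it did not query.
    """
    return {q: store[q] for q in query_ids if q in store}

# Hypothetical bank records keyed by customer identifier.
store = {"alice": "credit: A", "bob": "credit: B", "carol": "credit: C"}
result = ideal_keyword_pir(["bob", "dave"], store)
```

Here `result` contains the entry for "bob" only: "dave" is not a customer, and the entries for "alice" and "carol" stay hidden from the querying party.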
In practical application, when the data querying party needs to perform a data query, that is, when the data querying party and the data storage party execute a query task, it negotiates parameters with the data storage party and determines the preset public parameters according to the negotiation result. The preset public parameters can be understood as parameters agreed in advance by the two parties. The preset public parameters may include: a statistical security parameter; a computational security parameter; a finite field F and a subfield of it; the random value parameter of the oblivious key-value store (OKVS) encoding; the generator matrix G of the pseudorandom correlation generator (PCG); the vector length; and a collision-resistant hash function.
in the implementation, in the OKVS stage, the data querying party encodes the query element set to obtain a coded vector with a hidden key function, namely, a query coded vector, and then in the pseudo-random correlation vector generation stage, the data querying party and the data storing party interactively generate a pseudo-random correlation vector, wherein the pseudo-random correlation vector is used for blinding the coded vector and is used as a secret sharing vector.
In a specific embodiment of the present disclosure, the data querying party is an enterprise and the data storage party is a bank. The enterprise wants to obtain the credit information of each of its employees. The enterprise holds an employee identifier set, and the bank holds the customer identifiers of all its customers and the credit information corresponding to each customer. When the enterprise queries the employees' credit information, it must therefore be guaranteed that the bank cannot learn the employee identifiers and that the enterprise cannot learn the private data of any customer other than the credit information of its own employees. The enterprise and the bank negotiate to determine the preset public parameters. The enterprise encodes the employee identifier set based on the preset public parameters to obtain a query encoding vector, calculates a first pseudorandom correlation vector according to the preset public parameters, calculates an intermediate vector according to the first pseudorandom correlation vector and the query encoding vector, and sends the intermediate vector to the bank.
The data storage party 104 calculates a second pseudorandom correlation vector according to the preset public parameters, encrypts the storage element set according to the intermediate vector and the second pseudorandom correlation vector to obtain encrypted storage elements, encodes the storage element set and the storage data set based on the encrypted storage elements to obtain a storage encoding vector, and sends the storage encoding vector to the data querying party.
After the data storage party interacts with the data query party to generate the pseudorandom related vector, the data storage party can encrypt the storage element set and the storage data set based on the second pseudorandom related vector, the encryption can be understood as performing blinding processing on the storage element set and the storage data set, and the second pseudorandom related vector is used as a blinding factor.
In the implementation, the data storage party firstly encrypts the storage element set and the storage data set according to the second pseudo-random related vector to obtain an encrypted storage element, the encrypted storage element comprises the encrypted storage element and the storage data, the encrypted storage element and the storage data are encoded to obtain an encoded vector, namely the storage encoded vector, and the stored encoded vector is sent to the data inquiry party, so that the data inquiry party can inquire corresponding inquiry content according to the inquiry element.
In a specific embodiment of the present disclosure, continuing the above example, the bank calculates a second pseudorandom correlation vector according to the preset public parameters, encrypts and blinds the customer identifiers and the customer data according to the intermediate vector and the second pseudorandom correlation vector to obtain encrypted storage elements, encodes the customer identifiers and the customer data based on the encrypted storage elements to obtain a storage encoding vector, and sends the storage encoding vector to the enterprise, so that the enterprise can perform the data query based on the storage encoding vector.
The data querying party 102 encrypts the query element set based on the first pseudo-random related vector to obtain an encrypted query element, and decodes the stored encoded vector based on the encrypted query element to obtain a query result corresponding to the query task.
The data querying party can take the first pseudorandom correlation vector as a blinding factor and encrypt and blind the query element set to obtain the encrypted query elements. It then decodes the storage encoding vector based on the encrypted query elements, so that inputting a query element into the storage encoding vector outputs the queried content, i.e., the query result corresponding to the query task.
In a specific embodiment of the present disclosure, the enterprise encrypts and blinds the employee identifier set according to the first pseudorandom correlation vector to obtain encrypted employee identifiers, and decodes the storage encoding vector based on the encrypted employee identifiers to obtain the query result corresponding to the employee credit data query task. The query result is the credit data corresponding to each employee.
Further, for the data querying party to perform oblivious key-value encoding on the query element set, the object to be encoded that participates in the encoding must first be determined. Specifically, the data querying party determines the object to be encoded according to the preset public parameters, and performs key-value encoding on the object to be encoded and the query element set to obtain the query encoding vector.
The object to be encoded can be understood as a parameter object that participates in the encoding. When the data querying party encodes the query element set in the OKVS stage, the object to be encoded is selected according to the set size relative to the vector length in the preset public parameters. The key-value encoding can be understood as oblivious key-value encoding, which first requires the key objects and the value objects to be determined.
In particular, in the case that the set size n is larger than m, the data querying party selects the query element set and n-m random elements as the key objects of the OKVS, and selects the hash values as the value objects of the OKVS. The objects to be encoded are then the n-m random elements and the collision-resistant hash function values determined from the preset public parameters, and the OKVS encoding algorithm is applied to obtain the OKVS vector, i.e., the query encoding vector. In the other case, where the set size n is smaller than m, the query element set is selected as the key objects of the OKVS and the hash values as the value objects of the OKVS. The object to be encoded is then the collision-resistant hash function value determined from the preset public parameters, and the OKVS encoding algorithm is applied to obtain the OKVS vector, i.e., the query encoding vector.
Further, the object to be encoded needs to be selected according to the set size relative to the vector length in the preset public parameters. Specifically, the data querying party determines a random element set and a hash value set according to the preset public parameters and uses them as the object to be encoded, and performs key-value encoding on the random element set, the hash value set and the query element set to obtain the query encoding vector; or it determines a hash value set according to the preset public parameters and uses it as the object to be encoded, and performs key-value encoding on the hash value set and the query element set to obtain the query encoding vector.
The method comprises the steps of selecting a random element set and a hash value set as objects to be encoded under the condition that the set size n is larger than m, and encoding the random element set and the query element set as key objects and the hash value set as value objects when key value pair encoding is carried out subsequently, so as to obtain a query encoding vector; and under the condition that the set size n is smaller than m, selecting the hash value set as an object to be encoded, and encoding the query element set as a key object and the hash value set as a value object when encoding key value pairs subsequently, so as to obtain a query encoding vector.
Based on the method, the key value pair coding modes of the data inquirer under two different conditions are determined according to the vector length, so that the data inquirer can accurately obtain the OKVS vector.
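The two cases above can be sketched as a helper that assembles the key-value pairs handed to the OKVS encoder. This is a minimal sketch under assumptions: SHA-256 stands in for the collision-resistant hash function, the dummy keys are 16 random bytes, and the value of each key (including each dummy key) is taken to be its hash. The names `build_query_pairs` and `hash_value` are hypothetical, and the case n = m, which the text does not cover, is treated here as needing no padding.

```python
import hashlib
import secrets

def hash_value(key):
    """Stand-in for the collision-resistant hash function (assumption: SHA-256)."""
    return hashlib.sha256(key).digest()

def build_query_pairs(query_set, n):
    """Assemble (key, value) pairs for OKVS encoding.

    query_set: the m query elements, as bytes.
    n: the target set size taken from the public parameters.
    """
    m = len(query_set)
    keys = list(query_set)
    if n > m:
        # Case n > m: pad the key objects with n - m random elements.
        keys += [secrets.token_bytes(16) for _ in range(n - m)]
    # Case n < m: the query element set alone forms the key objects.
    return [(k, hash_value(k)) for k in keys]

padded = build_query_pairs([b"emp-001", b"emp-002"], 5)
unpadded = build_query_pairs([b"emp-001", b"emp-002"], 1)
```

Padding to a fixed size means that the length of the resulting encoding vector does not depend on how many elements the data querying party actually holds.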
Further, in order to ensure the security and privacy of the data, a pseudorandom correlation generator may be used to generate the pseudorandom correlation vectors. Specifically, the data storage party generates an input scalar according to the preset public parameters, calculates a second correlation long vector according to the input scalar and sends it to the data querying party, and performs a matrix product operation on the second correlation long vector to obtain the second pseudorandom correlation vector; the data querying party generates a sampling vector according to the preset public parameters, calculates a first correlation long vector according to the sampling vector and the second correlation long vector, and performs a matrix product operation on the first correlation long vector to obtain the first pseudorandom correlation vector.
The input scalar can be understood as a scalar drawn from the finite field F. The data storage party also inputs a PPRF key, after which it can output the second correlation long vector. The sampling vector can be understood as a vector randomly sampled from the finite field F whose entries at the selected index positions are required to be non-zero. The data querying party can output the PPRF punctured key according to the sampling vector, and the two correlation long vectors are obtained through calculation. Once the data querying party has obtained the first correlation long vector and the data storage party has obtained the second correlation long vector, both parties can locally perform the LPN matrix-vector product operation, thereby obtaining the oblivious pseudorandom correlation vectors.
In practical applications, the LPN matrix-vector product operations performed locally by the two parties are as follows. The data storage party locally applies LPN expansion to its correlation long vector to obtain a pseudorandom correlation vector. The data querying party locally applies LPN expansion to its correlation long vector and to the sampling vector to obtain pseudorandom correlation vectors.
Based on the above, LPN expansion is applied to the first correlation long vector and the second correlation long vector to obtain the oblivious pseudorandom correlation vectors, so that the pseudorandom correlation vectors are generated through interaction between the data querying party and the data storage party and can subsequently be used as blinding factors for element encryption.
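The reason a local matrix product preserves the correlation can be shown with a small numeric sketch. It assumes the linear correlation commonly used in pseudorandom correlation generators, w = v + delta * e, where the data storage party holds the scalar delta and the long vector v and the data querying party holds the sparse sampling vector e and the long vector w. The formulas of the source are not available, so this correlation, the toy field modulus `P`, the sizes, and the trusted-dealer setup that replaces the interactive PPRF step are all assumptions.

```python
import secrets

P = 2**61 - 1  # toy prime field (assumption)

def vec_mat(vec, mat):
    """Row vector of length N times an N-by-n matrix, modulo P."""
    n_cols = len(mat[0])
    return [sum(x * row[j] for x, row in zip(vec, mat)) % P for j in range(n_cols)]

N, n, t = 16, 4, 3  # long length, output length, non-zero positions (toy sizes)

# Trusted-dealer stand-in for the interactive generation of the long vectors.
delta = secrets.randbelow(P)                   # storage party: input scalar
v = [secrets.randbelow(P) for _ in range(N)]   # storage party: long vector
e = [0] * N                                    # querying party: sparse sampling vector
for pos in secrets.SystemRandom().sample(range(N), t):
    e[pos] = 1 + secrets.randbelow(P - 1)      # non-zero at the chosen positions
w = [(vi + delta * ei) % P for vi, ei in zip(v, e)]  # querying party: long vector

# Public matrix G taken from the public parameters; random here for illustration.
G = [[secrets.randbelow(P) for _ in range(n)] for _ in range(N)]

# Each party multiplies its own vectors by G locally, with no communication.
v_short = vec_mat(v, G)
e_short = vec_mat(e, G)
w_short = vec_mat(w, G)
```

Because the matrix product is linear, `w_short` equals `v_short + delta * e_short` entry by entry, so the short vectors still form a valid correlation. Under the LPN assumption, the product of the sparse vector with G is pseudorandom.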
Furthermore, in order to ensure the security and privacy of the data, the local elements may be encrypted and blinded by means of an oblivious pseudorandom function. Specifically, the data storage party calculates a key vector according to the intermediate vector and the second pseudo-random correlation vector, and encrypts the storage element set with the key vector to obtain the encrypted storage elements.
The intermediate vector can be understood as a vector obtained by the data query party from the first pseudo-random correlation vector and the query encoding vector; it is sent to the data storage party for key calculation. After receiving the intermediate vector, the data storage party can calculate a key vector from the intermediate vector and the second pseudo-random correlation vector, and then encrypt the storage element set with the key vector to obtain the encrypted storage elements.
In practical application, the data query party calculates the intermediate vector and sends it to the data storage party, and then encrypts the query element set based on the first pseudo-random correlation vector to obtain the encrypted query elements. After receiving the intermediate vector, the data storage party calculates the OPRF key vector and encrypts the storage element set with the key vector to obtain the encrypted storage elements.
In this way, the set is packed into a linear vector so that the set plaintext is hidden, and pseudo-random vectors are used for encryption blinding, so that the security and privacy of the data are ensured during private data retrieval.
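The blinding step can be sketched in Python under explicit assumptions. The algebra below follows the well-known vector-OLE based OPRF pattern (intermediate vector t = a + p1, key vector k = b + delta * t), which is an assumption because the formulas are missing from this text. The linear key-value encoding uses a dense hashed row per key and Gaussian elimination, chosen for brevity rather than efficiency. All names (h, row, encode, decode, oprf_demo) are hypothetical, the correlation is sampled directly instead of being generated by two parties, and the final hashing of the outputs used in practice is omitted. Python 3.8 or later is assumed for the modular inverse via pow.

```python
import hashlib
import random

P = 2**61 - 1  # toy prime field (assumption)


def h(label, item, j=0):
    """Domain-separated hash into the field."""
    data = f"{label}|{item}|{j}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P


def row(key, m):
    return [h("row", key, j) for j in range(m)]


def decode(vec, key):
    """Linear decoding: inner product of the key's row with the vector."""
    return sum(r * v for r, v in zip(row(key, len(vec)), vec)) % P


def encode(pairs, m):
    """Find vec with decode(vec, k) == v for every (k, v), by Gauss-Jordan."""
    rows = [row(k, m) + [v % P] for k, v in pairs]
    pivots = []
    for col in range(m):
        r = len(pivots)
        if r == len(rows):
            break
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][col], -1, P)
        rows[r] = [x * inv % P for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(x - f * y) % P for x, y in zip(rows[i], rows[r])]
        pivots.append(col)
    if len(pivots) < len(rows):
        raise ValueError("encoding failed; retry with a larger m")
    vec = [0] * m  # free positions stay 0 in this toy version
    for i, col in enumerate(pivots):
        vec[col] = rows[i][m]
    return vec


def oprf_demo(query_set, store_set, m=8, seed=1):
    rng = random.Random(seed)
    delta = rng.randrange(1, P)                             # storage party
    a = [rng.randrange(P) for _ in range(m)]                # query party
    b = [rng.randrange(P) for _ in range(m)]                # storage party
    c = [(bi + delta * ai) % P for ai, bi in zip(a, b)]     # query party

    p1 = encode([(x, h("val", x)) for x in query_set], m)   # query encoding vector
    t = [(ai + pi) % P for ai, pi in zip(a, p1)]            # intermediate vector
    k = [(bi + delta * ti) % P for bi, ti in zip(b, t)]     # key vector

    enc_store = {y: (decode(k, y) - delta * h("val", y)) % P for y in store_set}
    enc_query = {x: decode(c, x) for x in query_set}
    return enc_query, enc_store
```

For an element held by both parties the two values coincide, since decode(k, y) equals decode(c, y) + delta * decode(p1, y) and decode(p1, y) equals h("val", y) for every encoded query element; for any other element the values differ except with negligible probability.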
Further, in order to ensure the security and privacy of the data, the data storage party may perform oblivious key-value encoding on the storage element set and the storage data set. Specifically, the data storage party performs key-value encoding based on the encrypted storage elements, the storage element set and the storage data set to obtain a storage encoding vector.
The storage encoding vector can be understood as the storage element set and the storage data set after oblivious key-value encoding, so that the data query party cannot obtain other private data after receiving the storage encoding vector, which ensures the security of the data.
In practical application, the data storage party uses the storage element set as the key objects of the OKVS and the exclusive OR of the stored data and the encrypted storage elements as the value objects of the OKVS, and applies the OKVS encoding algorithm to obtain the OKVS vector, that is, the storage encoding vector P2. The data storage party may then send the storage encoding vector P2 to the data query party, which can subsequently obtain the queried content based on P2.
In this way, the storage element set and the storage data set are encoded by oblivious key-value encoding, so that the storage elements of the data storage party are associated with its stored data while security and privacy are preserved. The data query party can then conveniently obtain the corresponding stored data by decoding the vector with its query elements, which realizes the hidden trace query function.
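A minimal Python sketch of this encoding step follows, assuming an XOR-based linear key-value store over GF(2): each key is hashed to a bit mask, and decoding XORs the selected slots. This is one simple way to instantiate an OKVS and is not claimed to be the encoding algorithm of this specification. The function mask_for is a hypothetical stand-in for the encrypted storage element (in the protocol it would come from the OPRF), and bits, okvs_encode, okvs_decode and encode_storage are illustrative names. Stored values are limited to 8 bytes in this toy version.

```python
import hashlib
import secrets

VALUE_BITS = 64  # toy value width (assumption)


def bits(key, m):
    """Hash a key to an m-bit row mask (hypothetical row mapping)."""
    digest = hashlib.shake_256(f"row|{key}".encode()).digest((m + 7) // 8)
    return int.from_bytes(digest, "big") & ((1 << m) - 1)


def okvs_decode(vec, key):
    """XOR together the slots selected by the key's row mask."""
    mask, out = bits(key, len(vec)), 0
    for j, slot in enumerate(vec):
        if mask >> j & 1:
            out ^= slot
    return out


def okvs_encode(pairs, m):
    """Solve the XOR system so that okvs_decode(vec, k) == v for all pairs."""
    basis = {}  # pivot bit -> (row mask, value), pivot is the highest set bit
    for key, value in pairs:
        mask = bits(key, m)
        while mask:
            pivot = mask.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (mask, value)
                break
            bmask, bval = basis[pivot]
            mask ^= bmask
            value ^= bval
        else:
            raise ValueError("linearly dependent rows; retry with a larger m")
    # Free slots are random; pivot slots are fixed by back-substitution.
    vec = [secrets.randbits(VALUE_BITS) for _ in range(m)]
    for pivot in sorted(basis):
        mask, value = basis[pivot]
        acc = value
        for j in range(pivot):
            if mask >> j & 1:
                acc ^= vec[j]
        vec[pivot] = acc
    return vec


def mask_for(element, secret=b"demo-key"):
    """Stand-in for the encrypted storage element of `element`."""
    digest = hashlib.sha256(secret + element.encode()).digest()
    return int.from_bytes(digest[:VALUE_BITS // 8], "big")


def encode_storage(records, m):
    """Keys: storage elements. Values: stored data XOR encrypted element."""
    pairs = []
    for element, data in records.items():
        raw = data.encode()
        if len(raw) > VALUE_BITS // 8:
            raise ValueError("toy version supports values up to 8 bytes")
        pairs.append((element, int.from_bytes(raw, "big") ^ mask_for(element)))
    return okvs_encode(pairs, m)
```

Choosing m somewhat larger than the number of records (for example, the record count plus 40) makes a failed encoding very unlikely in this construction; practical OKVS schemes use sparser rows to encode much faster.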
Further, so that the data query party cannot obtain other private data, the data query party needs to perform vector decoding according to the query elements. Specifically, the data query party decodes the storage encoding vector based on the query element set to obtain an initial query result, and calculates a target query result from the initial query result and the encrypted query element.
Decoding the storage encoding vector based on the query element set can be understood as follows: the data query party takes a query element in the query element set as the hidden trace query value and decodes the storage encoding vector with it to obtain the initial query result. An exclusive OR of the initial query result and the encrypted query element then yields the target query result.
In practical application, in order to prevent the data query party from obtaining stored data for which it has no query permission, the target query result can further be judged according to the hidden trace query value: if the hidden trace query value belongs to the storage element set, the target query result is the correct query result; otherwise, the target query result is a random value.
In this way, the data query party obtains the query result by decoding, which prevents it from obtaining private data for which it has no query permission and ensures the security and privacy of the data.
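The final exclusive OR can be shown with a short Python sketch. It isolates the unmasking step only: toy_mask is a hypothetical stand-in for the encrypted element (a hash is used here purely so the example runs on its own), and xor_bytes and target_result are illustrative names. When the query element matches a storage element the two masks cancel and the stored data appears; otherwise the output is unrelated bytes.

```python
import hashlib


def xor_bytes(left, right):
    """Byte-wise exclusive OR of two equal-length byte strings."""
    if len(left) != len(right):
        raise ValueError("operands must have the same length")
    return bytes(a ^ b for a, b in zip(left, right))


def toy_mask(element, length):
    """Stand-in for the encrypted element derived from `element`."""
    return hashlib.shake_256(element.encode()).digest(length)


def target_result(initial_result, encrypted_query_element):
    """Target query result = initial query result XOR encrypted query element."""
    return xor_bytes(initial_result, encrypted_query_element)


# Storage side: value placed in the encoding for storage element "s-17".
data = b"grade:A"
initial = xor_bytes(data, toy_mask("s-17", len(data)))

# Query side: a matching element recovers the data, a non-matching one does not.
matched = target_result(initial, toy_mask("s-17", len(data)))
unmatched = target_result(initial, toy_mask("s-99", len(data)))
```

In the full protocol the initial result comes from decoding the storage encoding vector, so a query element outside the storage element set already decodes to an unrelated value before the exclusive OR is applied.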
The private data processing system provided by the specification comprises a data query party and a data storage party, wherein the data query party comprises a query element set, the data storage party comprises a storage element set and a storage data set corresponding to the storage element set, and under the condition that the data query party and the data storage party execute query tasks, the private data processing system comprises: the data inquiring party determines preset disclosure parameters corresponding to the data storing party, encodes the inquiring element set based on the preset disclosure parameters to obtain an inquiring encoding vector, calculates a first pseudo-random related vector according to the preset disclosure parameters, calculates an intermediate vector according to the first pseudo-random related vector and the inquiring encoding vector, and sends the intermediate vector to the data storing party; the data storage side calculates a second pseudo-random correlation vector according to the preset public parameters, encrypts the storage element set according to the intermediate vector and the second pseudo-random correlation vector to obtain encrypted storage elements, encodes the storage element set and the storage data set based on the encrypted storage elements to obtain storage encoding vectors, and sends the storage encoding vectors to the data inquiry side; and the data inquiring party encrypts the inquiring element set based on the first pseudo-random related vector to obtain an encrypted inquiring element, and decodes the stored encoding vector based on the encrypted inquiring element to obtain an inquiring result corresponding to the inquiring task. 
An oblivious pseudorandom function is constructed from the pseudo-random correlation generator and the oblivious data structure, and together they generate a pseudo-random vector that maps a fixed input to a fixed output: the oblivious pseudorandom function ensures the pseudo-randomness of the vector, and the oblivious data structure encoding ensures that a fixed input yields a fixed output. The vector generated in this way associates the storage elements of the data storage party with its stored data while preserving security and privacy. The data query party can finally obtain the corresponding stored data by decoding the vector with its query elements, which realizes the hidden trace query function; retrieval efficiency is higher, and the data set to be queried can be specified arbitrarily.
The application of the private data processing system provided in the present specification to querying user data is taken as an example, and the private data processing system is further described below with reference to fig. 2. Fig. 2 is a flowchart of a processing procedure of a private data processing system according to an embodiment of the present disclosure, where the system includes a data querying party and a data storing party, the data querying party includes a query element set, the data storing party includes a storage element set, and a storage data set corresponding to the storage element set, and in a case where the data querying party and the data storing party execute a query task, the method specifically includes the following steps.
Step 202: the data inquiring party determines a preset disclosure parameter corresponding to the data storing party, determines an object to be encoded according to the preset disclosure parameter, and performs key value pair encoding according to the object to be encoded and the inquiring element set to obtain an inquiring encoding vector.
For example, the data query party is a school and the data storage party is an educational administration system. The data storage party holds an academic data set corresponding to each student of all schools in the city and an identity data set corresponding to each student, and the data query party holds a number data set corresponding to all students in the school. The preset public parameters are public parameters determined in advance through negotiation between the school and the educational administration system. The object to be encoded is determined according to the preset public parameters, and key-value encoding is performed on the object to be encoded and the number data in the query element set.
Specifically, the data query party determines a random element set and a hash value set as the objects to be encoded according to the vector length in the preset public parameters, and performs key-value encoding on the random element set, the hash value set and the number data set to obtain the query encoding vector; or it determines a hash value set as the object to be encoded according to the preset public parameters, and performs key-value encoding on the hash value set and the number data set to obtain the query encoding vector.
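Preparing the objects to be encoded can be sketched in Python as follows. The sketch assumes that the hash value set supplies the value paired with each query element, and that the random element set pads the input up to the agreed vector length; both readings are interpretations of the text above rather than details it states. The names hash_value and objects_to_encode, the "query|" prefix and the "pad-" prefix are hypothetical. The resulting key-value pairs would then be passed to whatever key-value encoding algorithm the parties have agreed on.

```python
import hashlib
import secrets


def hash_value(element, out_bytes=16):
    """Hash a query element to a fixed-length value."""
    return hashlib.sha256(b"query|" + element.encode()).digest()[:out_bytes]


def objects_to_encode(query_elements, vector_length=None, out_bytes=16):
    """Build (key, value) pairs for key-value encoding.

    Without vector_length: only the hash value set is used.
    With vector_length: random dummy elements pad the list to that length.
    """
    pairs = [(x, hash_value(x, out_bytes)) for x in sorted(set(query_elements))]
    if vector_length is not None:
        if len(pairs) > vector_length:
            raise ValueError("more query elements than the agreed vector length")
        while len(pairs) < vector_length:
            dummy_key = "pad-" + secrets.token_hex(16)      # random element
            pairs.append((dummy_key, secrets.token_bytes(out_bytes)))
    return pairs


# Example: two student numbers, padded to an agreed length of five pairs.
pairs = objects_to_encode(["2023001", "2023002"], vector_length=5)
```

Padding to a fixed length is one plausible reason for including random elements, since it keeps the size of the encoded input independent of the number of real query elements.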
Step 204: the data storage side generates an input scalar according to the preset disclosure parameters, calculates a second correlation long vector according to the input scalar and sends the second correlation long vector to the data inquiry side.
Step 206: the data query party generates a sampling vector according to a preset public parameter, and calculates a first correlation long vector according to the sampling vector and a second correlation long vector.
Step 208: the data query party performs matrix product operation on the first correlation long vector to obtain a first pseudo-random correlation vector.
Step 210: the data storage side performs matrix product operation on the second correlation long vector to obtain a second pseudo-random correlation vector.
Step 212: the data inquirer calculates an intermediate vector according to the first pseudo-random correlation vector and the inquiry code vector, and sends the intermediate vector to the data storage party.
Step 214: the data storage party calculates a key vector according to the intermediate vector and the second pseudo-random correlation vector, and encrypts the identity data set through the key vector to obtain an encrypted storage element.
Step 216: the data storage party carries out key value pair coding based on the encryption storage element, the identity data set and the academic data set to obtain a storage coding vector, and sends the storage coding vector to the data inquiry party.
Step 218: the data query party encrypts the number data set based on the first pseudo-random correlation vector to obtain an encrypted query element, decodes the stored encoding vector based on the number data set to obtain an initial query result, and calculates a target query result according to the initial query result and the encrypted query element.
The data inquiry party obtains a target inquiry result, namely the academic data corresponding to each student in the school.
According to the private data processing system provided by this specification, an oblivious pseudorandom function is constructed from the pseudo-random correlation generator and the oblivious data structure to generate a pseudo-random vector that maps a fixed input to a fixed output, which guarantees the security and privacy of both parties' data. The data query party encrypts the query element set to obtain the encrypted query elements, and the data storage party encrypts the storage element set to obtain the encrypted storage elements and uses them to encode the storage element set and the storage data set, so that the storage element set of the data storage party corresponds to the storage data set. The data query party can then obtain the corresponding stored data by vector decoding with the query element set, which realizes the hidden trace query function.
Referring to fig. 3, fig. 3 shows a flowchart of a method for processing private data according to an embodiment of the present disclosure, including:
Step 302: the data inquiring party determines preset public parameters corresponding to the data storing party, encodes the inquiring element set based on the preset public parameters to obtain an inquiring encoding vector, calculates a first pseudorandom correlation vector according to the preset public parameters, calculates an intermediate vector according to the first pseudorandom correlation vector and the inquiring encoding vector, and sends the intermediate vector to the data storing party.
Step 304: the data storage side calculates a second pseudo-random correlation vector according to the preset public parameters, encrypts the storage element set according to the intermediate vector and the second pseudo-random correlation vector to obtain encrypted storage elements, encodes the storage element set and the storage data set based on the encrypted storage elements to obtain storage encoding vectors, and sends the storage encoding vectors to the data inquiry side.
Step 306: encrypting the query element set based on the first pseudorandom correlation vector to obtain an encrypted query element, and decoding the stored coding vector based on the encrypted query element to obtain a query result corresponding to the query task.
The method for processing privacy data provided by the specification comprises a data query party and a data storage party, wherein the data query party comprises a query element set, the data storage party comprises a storage element set and a storage data set corresponding to the storage element set, and under the condition that the data query party and the data storage party execute query tasks, the method comprises the following steps: the data inquiring party determines preset disclosure parameters corresponding to the data storing party, encodes the inquiring element set based on the preset disclosure parameters to obtain an inquiring encoding vector, calculates a first pseudo-random related vector according to the preset disclosure parameters, calculates an intermediate vector according to the first pseudo-random related vector and the inquiring encoding vector, and sends the intermediate vector to the data storing party; the data storage side calculates a second pseudo-random correlation vector according to the preset public parameters, encrypts the storage element set according to the intermediate vector and the second pseudo-random correlation vector to obtain encrypted storage elements, encodes the storage element set and the storage data set based on the encrypted storage elements to obtain storage encoding vectors, and sends the storage encoding vectors to the data inquiry side; and the data inquiring party encrypts the inquiring element set based on the first pseudo-random related vector to obtain an encrypted inquiring element, and decodes the stored encoding vector based on the encrypted inquiring element to obtain an inquiring result corresponding to the inquiring task. 
The data query party encrypts the query element set to obtain the encrypted query elements, and the data storage party encrypts the storage element set to obtain the encrypted storage elements and uses them to encode the storage element set and the storage data set, so that the storage element set of the data storage party corresponds to the storage data set. The data query party can then obtain the corresponding stored data by vector decoding with the query element set, which realizes the hidden trace query function.
Fig. 4 illustrates a block diagram of a computing device 400 provided in accordance with one embodiment of the present description. The components of the computing device 400 include, but are not limited to, a memory 410 and a processor 420. Processor 420 is coupled to memory 410 via bus 430 and database 450 is used to hold data.
Computing device 400 also includes access device 440, access device 440 enabling computing device 400 to communicate via one or more networks 460. Examples of such networks include public switched telephone networks (PSTN, public Switched Telephone Network), local area networks (LAN, local Area Network), wide area networks (WAN, wide Area Network), personal area networks (PAN, personal Area Network), or combinations of communication networks such as the internet. The access device 440 may include one or more of any type of network interface, wired or wireless, such as a network interface card (NIC, network interface controller), such as an IEEE802.11 wireless local area network (WLAN, wireless Local Area Network) wireless interface, a worldwide interoperability for microwave access (Wi-MAX, worldwide Interoperability for Microwave Access) interface, an ethernet interface, a universal serial bus (USB, universal Serial Bus) interface, a cellular network interface, a bluetooth interface, near field communication (NFC, near Field Communication).
In one embodiment of the present description, the above-described components of computing device 400, as well as other components not shown in FIG. 4, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device shown in FIG. 4 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 400 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or personal computer (PC, personal Computer). Computing device 400 may also be a mobile or stationary server.
Wherein the processor 420 is configured to execute computer-executable instructions that, when executed by the processor, perform the steps of the private data processing method described above.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the above private data processing method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the above private data processing method.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the above-described privacy data processing method.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the above private data processing method belong to the same concept, and details of the technical solution of the storage medium, which are not described in detail, can be referred to the description of the technical solution of the above private data processing method.
An embodiment of the present specification also provides a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the above-mentioned private data processing method.
The above is an exemplary version of a computer program of the present embodiment. It should be noted that, the technical solution of the computer program and the technical solution of the above private data processing method belong to the same concept, and details of the technical solution of the computer program, which are not described in detail, can be referred to the description of the technical solution of the above private data processing method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code that may be in source code form, object code form, executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium can be increased or decreased appropriately according to the requirements of the patent practice, for example, in some areas, according to the patent practice, the computer readable medium does not include an electric carrier signal and a telecommunication signal.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the embodiments are not limited by the order of actions described, as some steps may be performed in other order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the embodiments described in the specification.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the teaching of the embodiments. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (8)

1. A private data processing system, the system comprising a data querying party and a data storing party, wherein the data querying party comprises a query element set, the data storing party comprises a storage element set, and a storage data set corresponding to the storage element set, and in case that the data querying party and the data storing party execute a query task, the system comprises:
the data inquiring party determines preset disclosure parameters corresponding to the data storing party, encodes the inquiring element set based on the preset disclosure parameters to obtain an inquiring encoding vector, generates a sampling vector according to the preset disclosure parameters, calculates a first correlation long vector according to the sampling vector and a second correlation long vector, performs matrix product operation on the first correlation long vector to obtain a first pseudo-random correlation vector, calculates an intermediate vector according to the first pseudo-random correlation vector and the inquiring encoding vector, and sends the intermediate vector to the data storing party, wherein the second correlation long vector generates an input scalar according to the preset disclosure parameters through the data storing party, calculates a second correlation long vector according to the input scalar and sends the second correlation long vector to obtain the data storing party;
The data storage side performs matrix product operation on the second correlation long vector to obtain a second pseudo-random correlation vector, encrypts the storage element set according to the intermediate vector and the second pseudo-random correlation vector to obtain encrypted storage elements, encodes the storage element set and the storage data set based on the encrypted storage elements to obtain a storage encoding vector, and sends the storage encoding vector to the data inquiry side;
the data inquiring party encrypts the inquiring element set based on the first pseudo-random related vector to obtain an encrypted inquiring element, decodes the stored encoding vector based on the inquiring element set to obtain an initial inquiring result, and calculates a target inquiring result according to the initial inquiring result and the encrypted inquiring element.
2. The system of claim 1, wherein the data querying party determines an object to be encoded according to the preset disclosure parameter, and performs key value pair encoding on the object to be encoded and the query element set to obtain a query encoding vector.
3. The system of claim 2, wherein the data querying party determines a random element set and a hash value set according to the preset public parameter and takes the random element set, the hash value set and the query element set as objects to be encoded, and performs key value pair encoding on the random element set, the hash value set and the query element set to obtain query encoding vectors; or determining a hash value set according to the preset public parameter and taking the hash value set as an object to be encoded, and performing key value pair encoding on the hash value set and the query element set to obtain a query encoding vector.
4. The system of claim 1, wherein the data store calculates a key vector from the intermediate vector and the second pseudo-random correlation vector and encrypts the set of storage elements with the key vector to obtain encrypted storage elements.
5. The system of claim 1, wherein the data store performs key value pair encoding on the encrypted storage element, the set of storage elements, and the set of storage data to obtain a storage encoding vector.
6. A method for processing private data, the method comprising a data querying party and a data storing party, wherein the data querying party comprises a query element set, the data storing party comprises a storage element set, and a storage data set corresponding to the storage element set, and when the data querying party and the data storing party execute a query task, the method comprises:
the data inquiring party determines preset disclosure parameters corresponding to the data storing party, encodes the inquiring element set based on the preset disclosure parameters to obtain an inquiring encoding vector, generates a sampling vector according to the preset disclosure parameters, calculates a first correlation long vector according to the sampling vector and a second correlation long vector, performs matrix product operation on the first correlation long vector to obtain a first pseudo-random correlation vector, calculates an intermediate vector according to the first pseudo-random correlation vector and the inquiring encoding vector, and sends the intermediate vector to the data storing party, wherein the second correlation long vector generates an input scalar according to the preset disclosure parameters through the data storing party, calculates a second correlation long vector according to the input scalar and sends the second correlation long vector to obtain the data storing party;
The data storage side performs matrix product operation on the second correlation long vector to obtain a second pseudo-random correlation vector, encrypts the storage element set according to the intermediate vector and the second pseudo-random correlation vector to obtain encrypted storage elements, encodes the storage element set and the storage data set based on the encrypted storage elements to obtain a storage encoding vector, and sends the storage encoding vector to the data inquiry side;
the data inquiring party encrypts the inquiring element set based on the first pseudo-random related vector to obtain an encrypted inquiring element, decodes the stored encoding vector based on the inquiring element set to obtain an initial inquiring result, and calculates a target inquiring result according to the initial inquiring result and the encrypted inquiring element.
7. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions which, when executed by the processor, perform the steps of the method of claim 6.
8. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the steps of the method of claim 6.
CN202310767247.5A 2023-06-27 Private data processing system and method Active CN116506226B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310767247.5A CN116506226B (en) 2023-06-27 Private data processing system and method

Publications (2)

Publication Number Publication Date
CN116506226A CN116506226A (en) 2023-07-28
CN116506226B true CN116506226B (en) 2023-09-19

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109889541A (en) * 2019-03-25 2019-06-14 郑州轻工业学院 The mobile device authentication method for having anonymous reward distribution and privacy of identities protection
CN113468219A (en) * 2021-06-30 2021-10-01 建信金融科技有限责任公司 Data query and matching method, device and system
WO2022015948A1 (en) * 2020-07-15 2022-01-20 Georgia Tech Research Corporation Privacy-preserving fuzzy query system and method
CN114036565A (en) * 2021-11-19 2022-02-11 上海勃池信息技术有限公司 Private information retrieval system and private information retrieval method
CN114287001A (en) * 2019-08-26 2022-04-05 皇家飞利浦有限公司 Restricted full privacy conjunctive database queries for protecting user privacy and identity
CN114417068A (en) * 2022-01-20 2022-04-29 三未信安科技股份有限公司 Large-scale graph data matching method with privacy protection function
CN116010678A (en) * 2022-12-30 2023-04-25 北京火山引擎科技有限公司 Method, device and equipment for inquiring trace

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
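An efficient private information retrieval scheme; 祁堃; 黄刘生; 罗永龙; 荆巍巍; Journal of Chinese Computer Systems (小型微型计算机系统), No. 07; full text *
An Internet of Things privacy protection scheme based on trusted identity retrieval; 赵亮; 王学良; 陈声涛; Software Guide (《软件导刊》); full text *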

Similar Documents

Publication Publication Date Title
Paulet et al. Privacy-preserving and content-protecting location based queries
US10635824B1 (en) Methods and apparatus for private set membership using aggregation for reduced communications
EP3024169B1 (en) System and method for matching data sets while maintaining privacy of each data set
US20090138698A1 (en) Method of searching encrypted data using inner product operation and terminal and server therefor
Liang et al. Research on neural network chaotic encryption algorithm in wireless network security communication
CN116502276B (en) Method and device for inquiring trace
CN116502254B (en) Method and device for inquiring trace capable of searching statistics
Feng et al. Privacy-preserving computation in cyber-physical-social systems: A survey of the state-of-the-art and perspectives
CN111026788A (en) Homomorphic encryption-based multi-keyword ciphertext sorting and retrieving method in hybrid cloud
CN110445797B (en) Two-party multidimensional data comparison method and system with privacy protection function
CN116112168B (en) Data processing method and system in multiparty privacy exchange
Palmieri et al. Spatial bloom filters: Enabling privacy in location-aware applications
Mao et al. Public key encryption with conjunctive keyword search secure against keyword guessing attack from lattices
Rayappan et al. Lightweight Feistel structure based hybrid-crypto model for multimedia data security over uncertain cloud environment
CN117077209B (en) Large-scale data hiding trace query method
Wang et al. Fast and secure location-based services in smart cities on outsourced data
CN116502732B (en) Federal learning method and system based on trusted execution environment
CN116506226B (en) Private data processing system and method
Cheng et al. A High‐Security Privacy Image Encryption Algorithm Based on Chaos and Double Encryption Strategy
CN116506226A (en) Private data processing system and method
Wu et al. Compressed sensing based visually secure multi-secret image encryption-sharing scheme
Lian et al. Efficient Privacy‐Preserving Protocol for k‐NN Search over Encrypted Data in Location‐Based Service
Varghese et al. Secure data transmission using optimized cryptography and steganography using syndrome-trellis coding
CN115408451B (en) Confidential trace query method and storage medium
Bongale et al. Hybrid International Data Encryption Algorithm for Digital Image Encryption

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant