WO2024066015A1 - Implementing private information retrieval - Google Patents

Implementing private information retrieval

Info

Publication number
WO2024066015A1
Authority
WO
WIPO (PCT)
Prior art keywords
client
server
records
encrypted
encryption
Prior art date
Application number
PCT/CN2022/135408
Other languages
English (en)
French (fr)
Inventor
吴炜
魏长征
陆林鹏
吴行行
闫莺
张辉
Original Assignee
蚂蚁区块链科技(上海)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 蚂蚁区块链科技(上海)有限公司
Publication of WO2024066015A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/30 Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols

Definitions

  • the embodiments of this specification belong to the field of privacy computing technology, and in particular, relate to a method, system, server, and client for implementing privacy information retrieval.
  • Privacy-Preserving Computing is a collection of technologies that implement data analysis and computing on the premise of protecting the data itself from external disclosure, making the data available but invisible. Through privacy-preserving computing technology, the value of data can be transformed and released while fully protecting data and privacy security.
  • The mainstream technologies for privacy-preserving computing fall into three categories: the first is cryptography-based privacy computing, represented by Secure Multi-Party Computation (SMPC); the second combines artificial intelligence with privacy protection, represented by Federated Learning (FL); the third is Confidential Computing (CC) based on trusted hardware, represented by the Trusted Execution Environment (TEE).
  • SMPC: Secure Multi-Party Computation
  • FL: Federated Learning
  • DP: Differential Privacy
  • Differential Privacy (DP) actually protects the calculation results, not the calculation process
  • Federated Learning, Secure Multi-Party Computation and Confidential Computing protect the calculation process and the intermediate results of the calculation process.
  • Secure multi-party computation, the first category, builds on four basic technologies: Garbled Circuit (GC), Secret Sharing, Oblivious Transfer and Homomorphic Encryption (HE).
  • GC: Garbled Circuit
  • HE: Homomorphic Encryption
  • Homomorphic encryption is a special class of encryption algorithm that computes directly on ciphertext; decrypting the result gives the same value as performing the computation on the plaintext. It includes partially homomorphic encryption (PHE) and fully homomorphic encryption (FHE).
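  • As a minimal illustration of computing on ciphertext, the sketch below uses textbook (unpadded) RSA, which is multiplicatively homomorphic and is therefore one instance of partially homomorphic encryption. The tiny primes, the exponent and the function names are assumptions chosen for readability; none of them come from this specification.

```python
# Toy textbook RSA (no padding, tiny primes): for illustration only.
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent (requires Python 3.8+)


def encrypt(m: int) -> int:
    return pow(m, e, n)


def decrypt(c: int) -> int:
    return pow(c, d, n)


a, b = 7, 12
# Multiply the two ciphertexts without ever decrypting them.
cipher_product = (encrypt(a) * encrypt(b)) % n
# Decrypting the product of ciphertexts yields the product of the plaintexts.
plain_product = decrypt(cipher_product)
```

  • The product is only recovered exactly while it stays below the modulus n, which is why the sample values are small.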
  • Secure multi-party computing provides privacy protection for input secret data with its solid security theoretical foundation, thus achieving the security of the privacy-preserving computing process.
  • Secure multi-party computation can be divided into general-purpose secure multi-party computation and problem-specific secure multi-party computation.
  • The former can solve a wide range of computing problems, but this "universal" technical route usually means a large system and high overhead; the latter designs special protocols for specific problems, such as Private Set Intersection (PSI) and Privacy Information Retrieval (PIR). Such protocols can often obtain results at a lower cost than general secure multi-party computation protocols, but they must be carefully designed by domain experts for each application scenario, generally do not carry over to other scenarios, and are expensive to design.
  • PSI: Private Set Intersection
  • PIR: Privacy Information Retrieval
  • Private set intersection is a method for two parties to obtain the intersection of their data without revealing any additional information. Additional information refers to any information other than the intersection of the data of both parties. Private set intersection is very useful in real-world scenarios, such as data alignment in vertical federated learning, or friend discovery through address books in social software.
  • Privacy information retrieval is a method by which a client retrieves information from a database. During the retrieval process, the querying party hides the query target identifier, and the data service provider provides matching query results but cannot know the specific query object.
  • the purpose of this specification is to provide a method, system, server and client for implementing private information retrieval.
  • A method for implementing private information retrieval, wherein the server encrypts the database to obtain a query base and sends the query base to the client; the encryption/decryption performed by the client and the server on the same target uses an encryption/decryption algorithm whose order is interchangeable.
  • A retrieval process includes: S210, the client sends a sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server; S220, the client searches the query base for the sensitive field encrypted by the server and obtains the identifier of the matching record; S230, the server transmits to the client, by oblivious transfer, the records in the database (or the values of the field of interest in those records) that correspond to a set of identifiers of predetermined size which includes the matching identifier.
  • a system for implementing private information retrieval includes a server and a client.
  • the encryption/decryption performed by the client and the server on the same target adopts an encryption/decryption algorithm with an interchangeable order
  • the server is configured with a database, and obtains a query base after encrypting the database, and sends the query base to the client.
  • The client sends a sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server; the client searches the query base according to the sensitive field encrypted by the server, and obtains the identifier of the matching record; the server transmits to the client, by oblivious transfer, the records in the database (or the values of the field of interest in those records) that correspond to a set of identifiers of predetermined size which includes the matching identifier.
  • A server for implementing private information retrieval, wherein the encryption/decryption performed by the server and the client on the same target uses an encryption/decryption algorithm whose order is interchangeable; the server is configured with a database, encrypts the database to obtain a query base, and sends the query base to the client.
  • The server receives from the client the sensitive field encrypted by the client, encrypts it again and returns it to the client; the server also transmits to the client, by oblivious transfer, the records in the database (or the values of the field of interest in those records) that correspond to a set of identifiers of predetermined size which includes the matching identifier.
  • The client sends a sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server; the client also searches the query base according to the sensitive field encrypted by the server to obtain the identifier of the matching record; the client also obtains, by oblivious transfer with the server, the record in the database corresponding to the matching identifier (or the value of the field of interest in that record).
  • The client can locate the identifier of the field to be queried through interaction with the server and the query base, without the plaintext of the database being exposed, and can then initiate a query to the server based on that identifier. This preserves the server's privacy protection of the database and supports structured query statements.
  • The record corresponding to the matching identifier (or the value of the field of interest in that record) is transmitted between the server and the client by oblivious transfer, so the client's matching identifier is not exposed to the server and the client's privacy is protected.
  • FIG. 1 is a flow chart of an embodiment.
  • FIG. 2 is a flow chart of an embodiment.
  • PIR is a method for clients to retrieve information from a database.
  • the PIR scheme was proposed by Chor B et al. in 1995 to protect the privacy of user queries.
  • the main purpose of the PIR scheme is to ensure that the query request submitted by the query user to the database on the server is completed without leaking the user's private information, that is, during the retrieval process, the server does not know the user's specific query information and the retrieved data items.
  • the application scenarios of privacy information retrieval include: patients want to query the treatment drugs for their diseases through the medical system. If the disease name is used as the query condition, the medical system will know that the patient may have such a disease, and the patient's privacy will be leaked. Such leakage problems can be avoided through privacy information query.
  • a simple implementation is that the database sends all data to the client, but it cannot protect the database security, that is, it cannot guarantee the privacy of the server.
  • PIR that can guarantee the privacy security of both the client and the database
  • APIR: asymmetric PIR
  • SPIR: symmetric PIR
  • CPIR: PIR with computational security
  • the client often searches based on keywords (without knowing the specific location of the keyword in the database), and hopes to retrieve a string (multiple bits).
  • a practical PIR usually needs to meet multiple conditions such as symmetry, single copy, keyword search, and string return at the same time, and achieve a balance between computational efficiency and communication efficiency.
  • the above conditions can be met or partially met through cryptographic techniques such as homomorphic encryption, oblivious transfer (OT), and one-way trapdoor function.
  • This specification provides an embodiment of a method for implementing private information retrieval.
  • the server may encrypt the database in advance to obtain a query base, and send the query base to the client.
  • the server has a local database that can be queried by the client.
  • the local database of the server is as follows:
  • the server can encrypt the database to obtain the query base.
  • The encryption can use RSA (a widely used asymmetric encryption algorithm proposed by Ronald Rivest, Adi Shamir and Leonard Adleman in 1977) or ECC (Elliptic Curve Cryptography).
  • The server can use an RSA private key/ECC private key α to encrypt the data, that is, to encrypt each field except the ID column (the data in each cell).
  • The server can generate a secret value α and store it securely.
  • The secret value α is also the ECC private key.
  • The server can convert the value of the name field into a point on the elliptic curve through a hash function, which can be expressed as Hash(C) or H(C).
  • αH(C) is easy to calculate by scalar multiplication on the elliptic curve, but it is difficult to deduce the value of α from αH(C) and H(C).
  • It is also difficult to recover H(C) from αH(C) alone.
  • The above hash function not only converts the original input into an output of fixed length and format, but also converts that output into the x-coordinate of a point on the elliptic curve.
  • any 256-bit data can be used as a legal x-coordinate on this elliptic curve.
  • sha256 or sha3-256 can be used, or 256 bits can be taken from the output of sha384, sha512, sha3-384 or sha3-512.
  • Alternatively, any hash value (not limited to 256-bit results) can be reduced modulo the order of the elliptic curve; the scalar multiplication of the generator by the reduced value is a point on the elliptic curve.
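  • The last construction (hash, reduce modulo the order, multiply the generator) and the server-side encryption αH(C) can be sketched as follows. The curve (secp256k1), the sample field value and the secret are assumptions made for illustration, since the text names no specific curve; the point arithmetic is a plain affine implementation written for readability, not for production use.

```python
import hashlib

# secp256k1 domain parameters (an assumed curve; the text does not name one).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def ec_add(p1, p2):
    """Add two affine points on y^2 = x^3 + 7; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def ec_mul(k, point):
    """Double-and-add scalar multiplication k * point."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def hash_to_point(value: bytes):
    """H(C): hash the field value, reduce modulo the order, multiply G."""
    h = int.from_bytes(hashlib.sha256(value).digest(), "big") % N
    return ec_mul(h, G)


def on_curve(point) -> bool:
    x, y = point
    return (y * y - x * x * x - 7) % P == 0


alpha = 0x1F2E3D4C5B6A79887766554433221100   # hypothetical server secret
h_c = hash_to_point(b"Alice")                # H(C) for a sample Name value
encrypted_cell = ec_mul(alpha, h_c)          # alpha * H(C), as stored in the query base
```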
  • the server can send the query base to the client that needs to perform the search.
  • the server can directly send the query base to the client, such as directly to the client's device, or to the client's proxy server; in another way, the server can publish the query base on a Uniform Resource Locator (URL), and then the client can obtain the query base from the URL.
  • URL: Uniform Resource Locator
  • the client can receive the query base and save the received query base locally.
  • The server can generate a secret value α and store it securely.
  • This secret value is the RSA private key.
  • the server can convert the value of the name field into a point on the elliptic curve through a hash function, which can be expressed as Hash(C) or H(C).
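  • For the RSA-style variant, building the query base can be sketched as below: every cell except the ID column is hashed and raised to the server's secret α modulo a shared prime. The prime `Q`, the helper names and the two sample rows are hypothetical; the specification's own table is not reproduced in the text.

```python
import hashlib
import math
import secrets

Q = 2**127 - 1   # a known prime, standing in for the "large prime number q"


def keygen() -> int:
    """Pick a secret exponent that has an inverse modulo Q - 1."""
    while True:
        k = secrets.randbelow(Q - 3) + 2
        if math.gcd(k, Q - 1) == 1:
            return k


def h(value: str) -> int:
    """Hash a cell value into the range [1, Q - 1]."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def build_query_base(rows: dict, alpha: int) -> dict:
    """Encrypt every field except the ID as H(cell)^alpha mod Q."""
    return {
        row_id: {name: pow(h(str(cell)), alpha, Q) for name, cell in fields.items()}
        for row_id, fields in rows.items()
    }


# Hypothetical sample table: IDs stay in the clear, cells are encrypted.
database = {
    "id_1": {"Name": "Alice", "Age": 25},
    "id_2": {"Name": "Bob", "Age": 30},
}
alpha = keygen()                                  # server secret, never sent out
query_base = build_query_base(database, alpha)    # this is what the client receives
```

  • Because the encryption is deterministic, equal cell values map to equal ciphertexts, which is what later lets the client match a server-encrypted search value against the query base.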
  • S110 The client sends the sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server.
  • the client's search condition is that the value of the Age field is 25, but 25 is a sensitive field, that is, the client does not want to let the other party know.
  • the client can encrypt the 25.
  • RSA/ECC private key encryption is used, and the client uses the same encryption algorithm that the server used to generate the query base.
  • When RSA private key encryption is used, the client generates a secret β and stores it securely. The client can then encrypt 25 with its own private key β. Specifically, either 25 itself or the hash value of 25 can be encrypted.
  • Encrypting the hash of 25 is used as the example here.
  • Encrypting 25 directly is similar. The client and the server use the same hash algorithm; for example, the client uses the same large prime number q as the server as the modulus.
  • The client can perform RSA encryption on the hash value of 25 using β to obtain (H(25))^β.
  • The sensitive field sent by the client to the server can be (H(25))^β, where (H(25))^β is the ciphertext of the value 25 of the sensitive field.
  • the client can also construct a search statement, encrypt the sensitive fields in the search statement to obtain the private fields, replace the sensitive fields with the private fields, and send the replaced private search statement to the server.
  • the result is as follows:
  • ? represents the search statement after replacement.
  • the client can encrypt 25 with an RSA private key.
  • The client can use the same hash function as the server to hash 25, and then use β to perform RSA encryption on the hash value of 25 to obtain (H(25))^β.
  • the query statement sent by the client to the server is, for example, as follows:
  • (H(25))^β is the ciphertext, which is the content represented by "?" in the above search statement. After obtaining it, the server can learn neither β nor 25.
  • the client uses the same elliptic curve as the server, that is, it has the same elliptic curve parameters and generators.
  • The client generates the secret β itself and stores it securely.
  • The client can use its own private key β to encrypt 25.
  • Specifically, the hash value of 25 can be encrypted, with the client and the server using the same hash algorithm.
  • The client can use β to perform ECC encryption on the hash value of 25 to obtain βH(25).
  • The sensitive field sent by the client to the server can be βH(25), where βH(25) is the ciphertext of the value 25 of the sensitive field.
  • the client can also construct a search statement, encrypt the sensitive fields in the search statement to obtain the privacy fields, replace the sensitive fields with the privacy fields, and send the replaced privacy search statement to the server.
  • the result is as follows:
  • ? represents the search statement after replacement.
  • the client can encrypt 25 with an ECC private key.
  • the client uses the same elliptic curve as the server, that is, it has the same elliptic curve parameters and generators.
  • the client can replace the sensitive fields in the search statement after encrypting it with its own ECC private key, and send the replaced privacy search statement to the server.
  • The client generates a secret β itself and stores it securely.
  • The client can hash 25 using the same hash function as the server, and then use β to perform ECC encryption on the hash value of 25 to obtain βH(25).
  • the query statement sent by the client to the server is, for example, as follows:
  • βH(25) is the ciphertext, which is the content represented by "?" in the above search statement. After obtaining it, the server can learn neither β nor 25.
  • the client obtains the same sensitive field encrypted by the server through interaction with the server, which may include the server using its own key to encrypt the sensitive field encrypted by the client again and then sending it to the client, and the client using its own key to decrypt the sensitive field encrypted twice to obtain the sensitive field encrypted by the server.
  • the core of this content is to find an encryption algorithm that can exchange the order of decryption for two consecutive encryption operations (two parties encrypt successively).
  • With ECC, the two parties agree to use the same elliptic curve, that is, the same elliptic curve parameters and generator, and hold private keys α and β respectively; the encryption operation is scalar multiplication by α (or β).
  • Whether α is applied first and then β, or β first and then α, the encryption result is the same and can be decrypted in either order.
  • With RSA, both parties agree to use the same large prime number q and primitive root g, and hold private keys α and β respectively.
  • The encryption operation is exponentiation by α (or β) modulo q. Whether α is applied first and then β, or β first and then α, the encryption result can be decrypted in the same or a different order.
  • the encryption/decryption performed by the client and the server on the same target uses an encryption/decryption algorithm with interchangeable order.
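  • The order-interchangeable property can be checked with a short sketch of the modular-exponentiation variant. The prime `Q`, the key generator and the sample plaintext are assumptions for illustration; decryption raises to the inverse of the key modulo Q − 1.

```python
import math
import secrets

Q = 2**127 - 1   # a known prime, standing in for the agreed large prime q


def keygen() -> int:
    """Pick a secret exponent that has an inverse modulo Q - 1."""
    while True:
        k = secrets.randbelow(Q - 3) + 2
        if math.gcd(k, Q - 1) == 1:
            return k


def encrypt(x: int, key: int) -> int:
    return pow(x, key, Q)


def decrypt(c: int, key: int) -> int:
    return pow(c, pow(key, -1, Q - 1), Q)


alpha, beta = keygen(), keygen()   # server key and client key
x = 123456789                      # sample plaintext in [1, Q - 1]

alpha_then_beta = encrypt(encrypt(x, alpha), beta)
beta_then_alpha = encrypt(encrypt(x, beta), alpha)
order_is_interchangeable = alpha_then_beta == beta_then_alpha

# The two layers can also be removed in either order.
recovered_1 = decrypt(decrypt(alpha_then_beta, beta), alpha)
recovered_2 = decrypt(decrypt(alpha_then_beta, alpha), beta)
```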
  • the server may encrypt the privacy field again and return it to the client, or after receiving the sensitive field sent by the client and encrypted by the client itself, the server may encrypt the encrypted sensitive field again with the server's own key and return it to the client. Then, the client uses its own key to decrypt the twice encrypted sensitive field to obtain the sensitive field encrypted by the server.
  • Case 1: the server receives (H(25))^β sent by the client.
  • The server can re-encrypt the encrypted sensitive field (i.e., the privacy field) and return it to the client. Specifically, the server can re-encrypt the privacy field (H(25))^β using its own RSA private key α to obtain ((H(25))^β)^α.
  • If the client sent a privacy search statement instead, the server re-encrypts the privacy field in it in the same way; the process is similar to the above and is not repeated here.
  • Case 2: the server receives βH(25) sent by the client.
  • The server can re-encrypt the privacy field and return it to the client. Specifically, the server can re-encrypt the privacy field βH(25) using its own ECC private key α to obtain αβH(25).
  • If the client sent a privacy search statement instead, the server re-encrypts the privacy field in it in the same way; the process is similar to the above and is not repeated here.
  • After the server re-encrypts the sensitive field encrypted by the client (i.e., the privacy field) with its own key and sends it to the client, the client can use its own key to decrypt the twice-encrypted privacy field and obtain the sensitive field encrypted by the server alone.
  • The client can use the inverse element of its own private key β to decrypt the twice-encrypted sensitive field as follows: (((H(25))^β)^α)^(β^-1) = (H(25))^α mod q, where β^-1 is the inverse of β modulo q − 1. In this way, the client obtains the same sensitive field encrypted by the server, namely (H(25))^α.
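  • The three messages of this exchange (client encrypts, server re-encrypts, client strips its own layer) can be traced in a short sketch for the RSA-style case. The prime `Q`, the hash helper `h` and the key generator are assumptions for illustration, not part of the specification.

```python
import hashlib
import math
import secrets

Q = 2**127 - 1   # a known prime, standing in for the shared modulus q


def keygen() -> int:
    """Pick a secret exponent that has an inverse modulo Q - 1."""
    while True:
        k = secrets.randbelow(Q - 3) + 2
        if math.gcd(k, Q - 1) == 1:
            return k


def h(value: str) -> int:
    """Hash a value into the range [1, Q - 1]."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


alpha = keygen()   # server's private key
beta = keygen()    # client's private key

# Client -> server: the sensitive value under the client's key only.
client_encrypted = pow(h("25"), beta, Q)            # (H(25))^beta

# Server -> client: the same value with the server's layer added.
twice_encrypted = pow(client_encrypted, alpha, Q)   # ((H(25))^beta)^alpha

# Client: remove its own layer with the inverse of beta modulo Q - 1.
beta_inverse = pow(beta, -1, Q - 1)
server_encrypted = pow(twice_encrypted, beta_inverse, Q)   # (H(25))^alpha
```

  • The server only ever sees (H(25))^β, and the client ends with exactly the value the server would have stored in the query base for a cell containing 25.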
  • S120 The client searches the query base according to the sensitive field encrypted by the server, obtains the identifier of the matching record, and returns the identifier to the server.
  • the client can obtain the same sensitive field encrypted by the server.
  • the client queries in the query base based on the privacy field encrypted by the server, and after matching the record, it can locate the identifier of the matching record (also referred to as the matching identifier).
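  • On the client side, this lookup is an equality search over the encrypted column of the locally stored query base. The sketch below assumes a hypothetical layout (a dictionary keyed by record identifier) and uses small integers as stand-ins for real ciphertexts.

```python
def find_matching_ids(query_base: dict, field: str, server_encrypted_value: int) -> list:
    """Return the identifiers of all records whose encrypted field equals the target."""
    return [
        record_id
        for record_id, fields in query_base.items()
        if fields.get(field) == server_encrypted_value
    ]


# Stand-in ciphertexts; in practice these are the server-encrypted cell values.
query_base = {
    "id_1": {"Name": 111, "Age": 902},
    "id_2": {"Name": 222, "Age": 417},
    "id_3": {"Name": 333, "Age": 902},
}

matches = find_matching_ids(query_base, "Age", 902)
no_matches = find_matching_ids(query_base, "Age", 5)
```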
  • the client returns the identifier of the matching record to the server, which may include two situations.
  • the client constructs a search statement, encrypts the sensitive fields in the search statement to obtain the private fields, replaces the sensitive fields with the private fields, and sends the replaced private search statement to the server.
  • the client can directly return the identifier of the matching record to the server.
  • the client sends the value of the sensitive field encrypted by itself to the server.
  • the client can construct a search statement, for example, the search statement is:
  • The client may send the constructed search statement to the server; the statement includes the identifier of the matching record and indicates that the field of interest is Name, that is, the field name immediately following select.
  • S130 The server returns the value of the field of interest in the record corresponding to the identifier in the database to the client.
  • the client can locate the identifier of the field to be queried in the query base by interacting with the server and the query base without exposing the plain text of the database, and further initiate a query to the server according to the identifier to obtain the value of the field of interest in the record corresponding to the identifier.
  • This embodiment does not need to know the specific position (bit position) of the keyword to be retrieved in the database, can return strings, and can support Structured Query Language (SQL).
  • SQL: Structured Query Language
  • the database is still kept on the server, and the query base obtained by encrypting the database is configured to the client, so that the client can locate the data based on the query base to obtain the identifier of the record when searching.
  • the encryption characteristics of the query base prevent the client from obtaining the content of the database, ensuring the privacy protection of the database by the server.
  • the form of the database and query base in this embodiment can be called "asymmetric dual copies" when a server configures the database and a client configures the query base, and can be called "asymmetric multiple copies" when multiple clients configure the query base.
  • the client can initiate a query on the field of interest, such as the Name field to be queried in the above select Name... This exposes the client's fields of interest to a certain extent.
  • you can query the records that meet the conditions, that is, the entire row of data that meets the conditions, which can protect the privacy of the client, but requires the server to return the entire record, which exposes the entire row of data on the server to a certain extent.
  • the result returned by the server can be the record of id_1, for example as follows:
  • the server can encrypt and return the record corresponding to the identifier in the database/the value of the field of interest in the corresponding record to the client.
  • The server can use a symmetric key negotiated with the client to encrypt the record corresponding to the identifier in the database (or the value of the field of interest in that record) and return it to the client; or it can encrypt with the public key of the client's asymmetric key pair, so that the client can decrypt with its own private key; a digital envelope or a similar method can also be used.
  • In the embodiment above, the client returns the matched ID directly to the server.
  • Although the record corresponding to the ID, or the field of interest in that record, can then be obtained from the server as in S130, this exposes the client's privacy to a certain extent: the server learns that the identifier the client wants to query is id_1.
  • To protect the privacy of the client, the following embodiment can be used.
  • S210 The client sends the sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server.
  • the client's search condition is that the value of the Age field is 25, but 25 is a sensitive field, that is, the client does not want to let the other party know.
  • the client can encrypt the 25.
  • RSA/ECC private key encryption is used, and the encryption algorithm used by the client is the same as the encryption algorithm used by the server to generate the query base.
  • the client when RSA private key encryption is used, the client generates a secret ⁇ and keeps it properly. Then, the client can encrypt 25 with its own private key ⁇ . Specifically, 25 or the hash value of 25 can be encrypted.
  • the hash encryption of 25 is used as an example.
  • the direct encryption of 25 is similar, and the client and the server use the same hash algorithm. For example, the client uses the same large prime number q as the server as the modulus.
  • the client can perform RSA encryption on the hash value of 25 using ⁇ to obtain (H(25)) ⁇ .
  • the sensitive field sent by the client to the server can be (H(25)) ⁇ , where (H(25)) ⁇ represents the ciphertext of the value 25 of the sensitive field.
  • the client uses the same elliptic curve as the server, that is, it has the same elliptic curve parameters and generators.
  • the client generates the secret ⁇ itself and keeps it properly.
  • the client can use its own private key ⁇ to encrypt 25.
  • it can be to encrypt the hash value of 25, and the client and the server use the same hash algorithm.
  • the client can use ⁇ to perform ECC encryption on the hash value of 25 to obtain ⁇ H(25).
  • the sensitive field sent by the client to the server can be ⁇ H(25), where ⁇ H(25) represents the ciphertext of the value 25 of the sensitive field.
  • the client obtains the same sensitive field encrypted by the server through interaction with the server, which may include the server using its own key to re-encrypt the sensitive field encrypted by the client and then sending it to the client, and the client using its own key to decrypt the sensitive field encrypted twice to obtain the sensitive field encrypted by the server.
  • the core of this content is to find an encryption algorithm that can exchange the order of decryption for two consecutive encryption operations (two parties encrypt successively).
  • the two parties agree to use the same elliptic curve, that is, have the same elliptic curve parameters and generators, each holding private keys ⁇ and ⁇ , and the encryption operation is to perform scalar multiplication with ⁇ (or ⁇ ).
  • the encryption results can be decrypted in different orders.
  • both parties agree to use the same large prime number q and primitive root g, each holding private keys ⁇ and ⁇ .
  • the encryption operation is to use ⁇ (or ⁇ ) to exponentiate and modulo q. No matter whether ⁇ is used for encryption first and ⁇ is used for encryption or ⁇ is used for encryption first and ⁇ is used for encryption, the encryption results can be decrypted in the same or different order.
  • the encryption/decryption performed by the client and the server on the same target uses an encryption/decryption algorithm with interchangeable order.
  • the server After the server receives the sensitive field sent by the client and encrypted by the client itself, the server encrypts the encrypted sensitive field again with the server's own key and returns it to the client. Then, the client uses its own key to decrypt the twice encrypted sensitive field to obtain the sensitive field encrypted by the server.
  • case 1 the server can receive (H(25)) ⁇ sent by the client.
  • the server can re-encrypt the encrypted sensitive field (ie, the privacy field) and return the re-encrypted sensitive field to the client. Specifically, the server can re-encrypt the privacy field (H(25)) ⁇ using its own RSA private key ⁇ to obtain ((H(25)) ⁇ ) ⁇ .
  • case 2 the server can receive ⁇ H(25) sent by the client.
  • the server can re-encrypt the private field and return the re-encrypted private field to the client. Specifically, the server can re-encrypt the private field ⁇ H(25) using its own ECC private key ⁇ to obtain ⁇ H(25).
  • after the server uses its own key to re-encrypt the sensitive field encrypted by the client (i.e., the privacy field) and sends it to the client, the client can use its own key to decrypt the twice-encrypted privacy field to obtain the sensitive field encrypted by the server.
  • the client can use the inverse element $\alpha^{-1}$ of its own private key $\alpha$ to decrypt the twice-encrypted sensitive field as follows: $(((H(25))^{\alpha})^{\beta})^{\alpha^{-1}} = (H(25))^{\beta}$. In this way, the client obtains the sensitive field encrypted by the server, namely $(H(25))^{\beta}$.
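The blind, re-encrypt, unblind exchange described above can be sketched as follows. This is a toy illustration under stated assumptions: exponentiation modulo a small prime `q`, SHA-256 as the shared hash `H`, and hypothetical keys `alpha` (client) and `beta` (server). The inverse of `alpha` is taken modulo `q - 1`, which is an assumption about how the "inverse element" is computed in this setting.

```python
import hashlib
from math import gcd

q = 2_147_483_647  # toy prime modulus (2**31 - 1), not a secure size

def H(value) -> int:
    """Hash a field value into the range [0, q)."""
    digest = hashlib.sha256(str(value).encode()).digest()
    return int.from_bytes(digest, "big") % q

alpha = 65_537     # hypothetical client secret
beta = 1_000_003   # hypothetical server secret
assert gcd(alpha, q - 1) == 1       # alpha must be invertible modulo q - 1
alpha_inv = pow(alpha, -1, q - 1)   # modular inverse, Python 3.8+

blinded = pow(H(25), alpha, q)              # client sends H(25)^alpha
double = pow(blinded, beta, q)              # server returns (H(25)^alpha)^beta
server_only = pow(double, alpha_inv, q)     # client strips its own layer

print(server_only == pow(H(25), beta, q))
```

The server only ever sees the blinded value, and the client ends up with the same ciphertext the server would have produced for 25 on its own.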
  • S220 The client searches the query base according to the sensitive field encrypted by the server to obtain the identifier of the matching record.
  • the client can obtain the same sensitive field encrypted by the server.
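A minimal sketch of the lookup in this step is shown below. It assumes the query base is a mapping from record identifier to the server-encrypted sensitive value; the layout, the sample ages, and the names `query_base` and `match_ids` are hypothetical and only illustrate the idea.

```python
import hashlib

q = 2_147_483_647  # toy prime modulus
beta = 1_000_003   # server secret, used here only to build the toy query base

def H(value) -> int:
    digest = hashlib.sha256(str(value).encode()).digest()
    return int.from_bytes(digest, "big") % q

# Server side (done once): encrypt the sensitive column and publish it.
ages = {"id_0": 31, "id_1": 25, "id_2": 47, "id_3": 25}
query_base = {rid: pow(H(age), beta, q) for rid, age in ages.items()}

def match_ids(base: dict, target: int) -> list:
    """Client side: return identifiers whose encrypted value equals target."""
    return [rid for rid, cipher in base.items() if cipher == target]

# In the protocol the client obtains this value through the exchange with the
# server; it is computed directly here only to keep the sketch short.
target = pow(H(25), beta, q)
matches = match_ids(query_base, target)
print(matches)
```

The comparison runs entirely on the client, so the server does not learn which identifiers matched.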
  • S230 The server returns the value of the field of interest in the record corresponding to the set of identifiers of a predetermined size including the matching identifier in the database to the client by an oblivious transmission method.
  • the client does not return the matched ID to the server, so the server cannot know which record or records the client wants to find; in S210, the sensitive field sent by the client after client encryption makes the server also unable to know which record or records the sensitive field searched by the client will hit, and only the client knows it. In this way, the privacy of the client is protected. However, the search still needs to be completed in the end, which requires the server to return the record that the client wants to query to the client.
  • the server can use oblivious transmission.
  • Oblivious Transfer (OT) can be implemented based on RSA, ECC, etc., and comes in several variants such as 2-choose-1, n-choose-1, m-choose-1, and m-choose-k (k ≤ m ≤ n).
  • Take 2-choose-1 OT as an example to illustrate the principle.
  • the sender has two secrets, m1 and m2, and needs to send two secrets to the receiver.
  • the receiver can only choose to decrypt one of them and cannot know the other. At the same time, the sender cannot know which one the receiver chooses.
  • a simple implementation process of 2-choose-1 is as follows: First, the sender generates two different pairs of public and private keys and publishes two public keys.
  • the two public keys are public key 1 and public key 2.
  • the receiver wants to know m1, but does not want the sender to know that he wants m1.
  • the receiver generates a random number r, encrypts r with public key 1, and sends it to the sender.
  • the sender decrypts the received ciphertext with each of its two private keys: private key 1 yields r1 and private key 2 yields r2.
  • r1 is equal to r, because r was encrypted with public key 1.
  • r2 is a meaningless number, although it is still a valid decryption result.
  • the sender does not know which public key the receiver used for encryption, so it cannot tell which of r1 and r2 is the real r.
  • the sender then symmetrically encrypts m1 with r1 and m2 with r2, and sends the two ciphertexts to the receiver. The receiver decrypts the first ciphertext with r to recover m1; it cannot recover m2 because it does not know r2.
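The 2-choose-1 flow above can be sketched in a few lines. This is a toy under stated assumptions, not a secure implementation: it uses textbook RSA without padding, key pairs built from known Mersenne primes, and a hash-derived XOR pad as the symmetric cipher. The helper names `keygen` and `sym` are hypothetical.

```python
import hashlib
import secrets

E = 65537  # common RSA public exponent

def keygen(p: int, q: int):
    """Textbook RSA key pair from two primes: returns (modulus, private exponent)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return n, d

def sym(key: int, data: bytes) -> bytes:
    """XOR data with a SHA-256 pad derived from key; the same call encrypts and decrypts."""
    pad = hashlib.sha256(str(key).encode()).digest()
    assert len(data) <= len(pad), "toy cipher handles at most 32 bytes"
    return bytes(a ^ b for a, b in zip(data, pad))

# Sender: two key pairs (toy sizes) and two secrets.
n1, d1 = keygen(2**61 - 1, 2**127 - 1)
n2, d2 = keygen(2**89 - 1, 2**107 - 1)
m1, m2 = b"secret one", b"secret two"

# Receiver: wants m1, so it encrypts a random r under public key 1.
r = secrets.randbits(128)
c = pow(r, E, n1)

# Sender: decrypts c under both private keys without learning which result is real.
r1 = pow(c, d1, n1)
r2 = pow(c, d2, n2)
c1, c2 = sym(r1, m1), sym(r2, m2)

# Receiver: only the ciphertext keyed by the real r decrypts correctly.
recovered = sym(r, c1)
print(recovered)
```

In this sketch the two moduli have different sizes, which a real design would avoid because the ciphertext range could hint at the chosen key.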
  • the 2 public-private key pairs can be extended to n public-private key pairs, which gives n-choose-1 OT.
  • the core of n-choose-1 is that the server uses n different keys to encrypt n records in the data table/the values of the fields of interest in the corresponding records to obtain n encryption results, and sends the n encryption results to the client; the client uses the key corresponding to the matching identifier to decrypt the 1 encryption result corresponding to the matching identifier among the n encryption results sent by the server.
  • S231 The server generates n different public and private key pairs in advance and publishes the public key.
  • n is equal to the number of records in the database.
  • the server generates n different public-private key pairs (pk-sk; pk is the public key and sk is the secret key, i.e. the private key; the public key can be made public, while the private key must be kept secret), for example, pk_0-sk_0, pk_1-sk_1, pk_2-sk_2, ..., pk_n-1-sk_n-1, and makes the n public keys pk_0, pk_1, pk_2, ..., pk_n-1 public.
  • the client can obtain these n public keys.
  • S232 The client generates a random number r, encrypts r with the public key corresponding to the desired ID, and sends the encrypted number to the server.
  • for example, the client wants to obtain the record with id_1, but does not want the server to know which record it wants.
  • the client can use pk_1 to encrypt r and send the result to the server.
  • the ordering above mainly means that there is a correspondence between IDs and public keys, and that this correspondence is known to the client.
  • if the client wants to obtain the record with id_1 without revealing this to the server, the client can use the pk_1 corresponding to id_1 to encrypt r and send the result to the server; similarly, if the client wants to obtain the record with id_t without revealing this to the server, the client can use the pk_t corresponding to id_t to encrypt r and send the result to the server.
  • the server uses sk_0, sk_1, sk_2, ..., sk_n-1 to decrypt the random number r that was encrypted with pk_1.
  • the server decrypts with sk_0 to obtain r0, with sk_1 to obtain r1, ..., and with sk_n-1 to obtain r(n-1).
  • r1 is equal to r, because only sk_1 corresponds to the pk_1 used for encryption; the results r0, r2, ..., r(n-1) obtained with sk_0, sk_2, ..., sk_n-1, which do not correspond to pk_1, will not be equal to r.
  • the server only obtains n decryption results of the same form. It does not know which public key the client used to encrypt r, so it does not know which of r0, r1, r2, ..., r(n-1) is the real r.
  • S234 The server symmetrically encrypts each record in the database according to the serial number using the decryption result of the corresponding serial number, and sends the symmetrically encrypted result to the client.
  • the server symmetrically encrypts the record id_0 using r0, symmetrically encrypts the record id_1 using r1, ..., symmetrically encrypts the record id_n-1 using r(n-1), and sends the n symmetric encryption results to the client.
  • S235 The client uses the random number r to symmetrically decrypt, among the symmetric encryption results, the one corresponding to the desired ID, to obtain the retrieval result.
  • the client uses the random number r to symmetrically decrypt the encryption result corresponding to the ID expected to be obtained in the symmetric encryption result.
  • if the client expects to obtain the record corresponding to id_1 (or the value of the field of interest in that record), the client uses the corresponding public key pk_1 to encrypt the random number r;
  • the server uses r0, r1, r2, ..., r(n-1) to symmetrically encrypt the values of the fields of interest in the corresponding records/records of id_0, id
  • the client can obtain the correct search result, that is, the value of the corresponding record/the field of interest in the corresponding record.
  • the client can only use the random number r to symmetrically decrypt the encryption result corresponding to the ID expected to be obtained from the n symmetric encryption results, that is, the client only uses r to decrypt the encryption result of id_1, so as to obtain the corresponding record of id_1/the value of the field of interest in the corresponding record, without using r to symmetrically decrypt the symmetric encryption results of id_0, id_2,..., id_n-1, because the client can know that these encryption results are not symmetric encrypted using r, and even if r is used for symmetric decryption, the correct result cannot be obtained.
  • the client uses the random number r to symmetric decrypt the n symmetric encryption results, and the following explanation is given here:
  • the server uses r0, r1, r2, ..., r(n-1) to symmetrically encrypt the records id_0, id_1, ..., id_n-1 (or the values of the fields of interest in those records): Enc(id_0, r0), Enc(id_1, r1), Enc(id_2, r2), ..., Enc(id_n-1, r(n-1)).
  • here Enc means encryption (Encrypt); the first argument of Enc(), that is id_0, id_1, id_2, ..., id_n-1, represents the n records (or the values of the fields of interest in the n records), and the second argument, that is r0, r1, r2, ..., r(n-1), represents the encryption key.
  • the client uses the random number r to symmetrically decrypt the encryption results from S234. Specifically, the client uses r to decrypt each of the following: Dec(Enc(id_0, r0), r), Dec(Enc(id_1, r1), r), Dec(Enc(id_2, r2), r), ..., Dec(Enc(id_n-1, r(n-1)), r).
  • here Dec means decryption (Decrypt); the first argument of Dec() is the object to decrypt, namely the encryption result above, and the second argument is the key used for decryption.
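The n-choose-1 steps S231 to S235 can be sketched end to end as follows. This is a toy under stated assumptions: textbook RSA over small primes found by trial division, and a hash-derived XOR pad standing in for the symmetric Enc/Dec. The helper names and record contents are hypothetical.

```python
import hashlib
import secrets

E = 65537  # RSA public exponent shared by all toy key pairs

def is_prime(x: int) -> bool:
    if x < 2:
        return False
    i = 2
    while i * i <= x:
        if x % i == 0:
            return False
        i += 1
    return True

def small_primes(start: int, count: int) -> list:
    """Collect primes >= start for which E stays invertible modulo p - 1."""
    found, x = [], start
    while len(found) < count:
        if is_prime(x) and (x - 1) % E != 0:
            found.append(x)
        x += 1
    return found

def sym(key: int, data: bytes) -> bytes:
    """Toy Enc/Dec: XOR with a SHA-256 pad derived from key (at most 32 bytes)."""
    pad = hashlib.sha256(str(key).encode()).digest()
    assert len(data) <= len(pad)
    return bytes(a ^ b for a, b in zip(data, pad))

records = [b"record-0", b"record-1", b"record-2", b"record-3"]
n = len(records)

# S231: the server prepares n key pairs and publishes the moduli.
ps = small_primes(10**6, 2 * n)
keys = []
for i in range(n):
    p, q = ps[2 * i], ps[2 * i + 1]
    keys.append((p * q, pow(E, -1, (p - 1) * (q - 1))))
public = [modulus for modulus, _ in keys]

# S232: the client wants id_1 and encrypts a random r under pk_1.
want = 1
r = secrets.randbelow(10**11)  # smaller than every toy modulus
c = pow(r, E, public[want])

# S233: the server decrypts c under every private key.
rs = [pow(c, d, modulus) for modulus, d in keys]

# S234: the server encrypts record i with r_i and sends all n results.
cts = [sym(rs[i], records[i]) for i in range(n)]

# S235: the client decrypts only the entry for the desired ID.
result = sym(r, cts[want])
print(result)
```

The server performs n private-key operations and sends n ciphertexts per query, which matches the cost concern raised for large n.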
  • S231 may be after S230 or before S230, which is not limited here.
  • the server does not know which ID or IDs the client is querying, but encrypts all records in the database and returns them to the client, protecting the privacy of the client.
  • the server uses n private keys to decrypt the received encrypted r respectively, so a large number of asymmetric decryption calculations are performed, which consumes a large amount of CPU and memory resources.
  • the transmission of n symmetric encrypted results in S234 will also occupy a large amount of bandwidth. Especially when the number n is relatively large, the server's calculation amount is large and the bandwidth occupancy is also large.
  • if the number of possible matching results is greater than 1, for example k (k>1), this can be achieved through n-choose-k oblivious transfer.
  • one implementation of n-choose-k is to group every k of the n records into a set, with each set corresponding to one public-private key pair, so that there are $C_n^k$ sets in total ($C$ denotes the combination formula, i.e. the number of combinations of any k out of n).
  • the $C_n^k$-choose-1 oblivious transfer method then transmits to the client the records in the database corresponding to the identifiers, including the matching identifiers (or the values of the fields of interest in those records).
  • the implementation of $C_n^k$-choose-1 oblivious transfer is similar to that of n-choose-1 oblivious transfer.
  • the server uses $C_n^k$ different keys to encrypt each set of k records in the data table (or the values of the fields of interest in those records), obtaining $C_n^k$ encryption results, and sends them to the client; the client uses the key corresponding to the matching identifiers to decrypt the corresponding encryption result sent by the server.
  • the specific implementation is similar to the above-mentioned process of S231-S235, which will not be repeated here.
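To give a feel for how many key pairs the grouping approach needs, the following sketch enumerates the k-record sets for a small table and counts them for a larger one. The identifiers and sizes are illustrative assumptions.

```python
from itertools import combinations
from math import comb

n, k = 6, 2
ids = [f"id_{i}" for i in range(n)]

# Each k-subset of records would need its own public-private key pair.
groups = list(combinations(ids, k))
print(len(groups), comb(n, k))

# The count grows quickly with n: a 100-record table with k = 3.
print(comb(100, 3))
```

Because the number of sets is the binomial coefficient of n and k, this construction is practical only when n and k are small.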
  • the above r can also be a public key in an asymmetric key.
  • the server uses r0 to asymmetrically encrypt the record id_0, uses r1 to asymmetrically encrypt the record id_1, ..., uses r(n-1) to asymmetrically encrypt the record id_n-1, and sends these n encrypted results to the client.
  • after the client receives these encrypted results, it can use its own private key to asymmetrically decrypt the encrypted result corresponding to the ID it expects to obtain, and thus obtain the result. The remaining steps are similar and are not repeated here.
  • this specification provides the following implementation method that adds the construction of a confusion set.
  • S310 The client sends the sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server.
  • the client's search condition is that the value of the Age field is 25, but 25 is a sensitive field, that is, the client does not want to let the other party know.
  • the client can encrypt the 25.
  • RSA/ECC private key encryption is used, and the encryption algorithm used by the client is the same as the encryption algorithm used by the server to generate the query base.
  • when RSA private key encryption is used, the client generates a secret $\alpha$ and keeps it safe. The client can then encrypt 25 with its own private key $\alpha$. Specifically, either 25 or the hash value of 25 can be encrypted.
  • encryption of the hash value of 25 is used as the example here.
  • direct encryption of 25 is similar. The client and the server use the same hash algorithm. For example, the client uses the same large prime number q as the server as the modulus.
  • the client can perform RSA encryption on the hash value of 25 using $\alpha$ to obtain $(H(25))^{\alpha}$.
  • the sensitive field sent by the client to the server can be $(H(25))^{\alpha}$, which represents the ciphertext of the value 25 of the sensitive field.
  • the client uses the same elliptic curve as the server, that is, the same elliptic curve parameters and generator.
  • the client generates the secret $\alpha$ itself and keeps it safe.
  • the client can use its own private key $\alpha$ to encrypt 25.
  • specifically, the hash value of 25 can be encrypted, with the client and the server using the same hash algorithm.
  • the client can use $\alpha$ to perform ECC encryption on the hash value of 25 to obtain $\alpha H(25)$.
  • the sensitive field sent by the client to the server can be $\alpha H(25)$, which represents the ciphertext of the value 25 of the sensitive field.
  • the client obtains the same sensitive field encrypted by the server through interaction with the server, which may include the server using its own key to encrypt the sensitive field encrypted by the client again and then sending it to the client, and the client using its own key to decrypt the sensitive field encrypted twice to obtain the sensitive field encrypted by the server.
  • the core of this content is to find an encryption algorithm that can exchange the order of decryption for two consecutive encryption operations (two parties encrypt successively).
  • the two parties agree to use the same elliptic curve, that is, the same elliptic curve parameters and generator, and hold private keys $\alpha$ and $\beta$ respectively; the encryption operation is scalar multiplication by $\alpha$ (or $\beta$).
  • the encryption results can be decrypted in different orders.
  • both parties agree to use the same large prime number q and primitive root g, and hold private keys $\alpha$ and $\beta$ respectively.
  • the encryption operation is exponentiation by $\alpha$ (or $\beta$) modulo q. Whether $\alpha$ is applied first and $\beta$ second, or $\beta$ first and $\alpha$ second, the encryption results can be decrypted in the same or a different order.
  • the encryption/decryption performed by the client and the server on the same target uses an encryption/decryption algorithm with interchangeable order.
  • after the server receives the sensitive field that the client has encrypted, the server encrypts it again with the server's own key and returns it to the client. The client then uses its own key to decrypt the twice-encrypted sensitive field, obtaining the sensitive field encrypted only by the server.
  • case 1: the server receives $(H(25))^{\alpha}$ sent by the client.
  • the server can re-encrypt the encrypted sensitive field (i.e., the privacy field) and return the result to the client. Specifically, the server re-encrypts the privacy field $(H(25))^{\alpha}$ using its own RSA private key $\beta$ to obtain $((H(25))^{\alpha})^{\beta}$.
  • case 2: the server receives $\alpha H(25)$ sent by the client.
  • the server can re-encrypt the privacy field and return the result to the client. Specifically, the server re-encrypts the privacy field $\alpha H(25)$ using its own ECC private key $\beta$ to obtain $\beta\alpha H(25)$.
  • after the server uses its own key to re-encrypt the sensitive field encrypted by the client (i.e., the privacy field) and sends it to the client, the client can use its own key to decrypt the twice-encrypted privacy field to obtain the sensitive field encrypted by the server.
  • the client can use the inverse element $\alpha^{-1}$ of its own private key $\alpha$ to decrypt the twice-encrypted sensitive field as follows: $(((H(25))^{\alpha})^{\beta})^{\alpha^{-1}} = (H(25))^{\beta}$. In this way, the client obtains the sensitive field encrypted by the server, namely $(H(25))^{\beta}$.
  • S320 The client searches the query base according to the sensitive field encrypted by the server to obtain the identifier of the matching record.
  • the client can obtain the same sensitive field encrypted by the server.
  • S330 The server returns the value of the field of interest in the record corresponding to the set of identifiers of a predetermined size including the matching identifier in the database to the client by an oblivious transmission method.
  • the sensitive fields sent by the client in S310 are encrypted by the client so that the server cannot know which record the sensitive fields searched by the client will hit, and only the client knows it. In this way, the privacy of the client is protected. However, the search still needs to be completed in the end, which requires the server to return the record that the client wants to query to the client.
  • FIG. 2 provides an implementation method of n-choose-1 oblivious transfer.
  • m-choose-1 oblivious transfer can be adopted, where m ≤ n.
  • the client may also not return the matched ID alone to the server, but confuse the matched ID with some other forged IDs to form a confusion set, and send the confusion set to the server. In this way, the server cannot accurately know which record in the confusion set the client wants to find, and it is necessary to ensure that the client can only obtain the record to be found, but cannot obtain other records.
  • the confusion set can also be sent in the following S332, which is not limited here.
  • S331 The server generates n different public and private key pairs in advance and publishes the public key.
  • the server generates n different public-private key pairs (pk-sk; pk is the public key and sk is the secret key, i.e. the private key), for example, pk_0-sk_0, pk_1-sk_1, pk_2-sk_2, ..., pk_n-1-sk_n-1, and publishes the n public keys pk_0, pk_1, pk_2, ..., pk_n-1. After the server publishes these n ordered public keys, the client can obtain them.
  • S332 The client generates a confusion set of size m including the desired ID, generates a random number r, encrypts r with the public key corresponding to the desired ID, and sends the encrypted number r together with the confusion set to the server.
  • the client wants to obtain the record with id_1, but does not want the server to know that the record the client wants to obtain is the record with id_1, so a confusion set of size m is generated.
  • the confusion set is, for example: {id_1, id_2, id_3, id_4}.
  • the four IDs and the public-private key pairs have the following correspondence: id_1 corresponds to pk_1-sk_1, id_2 to pk_2-sk_2, id_3 to pk_3-sk_3, and id_4 to pk_4-sk_4.
  • the client can use pk_1 to encrypt r and send it to the server together with the confusion set.
  • the client can send the confusion set together with the search statement to the server.
  • the client can use pk_1 to encrypt r and send it together with the search statement containing the confusion set, for example:
  • the server uses sk_1, sk_2, sk_3, and sk_4 to decrypt the random number r that was encrypted with pk_1. For example, the server decrypts with sk_1 to obtain r1, with sk_2 to obtain r2, with sk_3 to obtain r3, and with sk_4 to obtain r4.
  • r1 is equal to r, because only sk_1 corresponds to the pk_1 used for encryption; the results r2, r3, and r4 obtained with sk_2, sk_3, and sk_4, which do not correspond to pk_1, will not be equal to r.
  • the server only obtains four decryption results of the same form. It does not know which public key the client used to encrypt r, so it does not know which of r1, r2, r3, and r4 is the real r.
  • after the server receives the obfuscation set {id_1, id_2, id_3, id_4}, it knows that the data the client wants is one of the four IDs in the set, but it cannot tell which one, which protects the client's privacy.
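A confusion set of size m can be built by mixing the desired ID with randomly chosen decoys. The sketch below is one possible construction under stated assumptions: the client knows the identifiers in the query base, decoys are drawn uniformly, and the result is sorted so that position reveals nothing. The function name `build_confusion_set` is hypothetical.

```python
import secrets

def build_confusion_set(wanted_id: str, all_ids: list, m: int) -> list:
    """Return m distinct IDs containing wanted_id plus m - 1 random decoys."""
    decoys = [i for i in all_ids if i != wanted_id]
    if m - 1 > len(decoys):
        raise ValueError("not enough identifiers to build a set of size m")
    rng = secrets.SystemRandom()
    chosen = rng.sample(decoys, m - 1)
    # Sorting removes any hint from the order in which IDs were picked.
    return sorted(chosen + [wanted_id])

all_ids = [f"id_{i}" for i in range(10)]
confusion_set = build_confusion_set("id_1", all_ids, 4)
print(confusion_set)
```

A larger m hides the target among more candidates, at the price of more decryptions on the server and more ciphertexts on the wire.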
  • S334 The server symmetrically encrypts the record specified in the obfuscation set using the decryption result of the corresponding serial number, and sends the symmetrically encrypted result to the client.
  • the server symmetrically encrypts the record id_1 using r1, symmetrically encrypts the record id_2 using r2, symmetrically encrypts the record id_3 using r3, and symmetrically encrypts the record id_4 using r4, and sends the four symmetric encryption results to the client.
  • S335 The client uses the random number r to symmetrically decrypt the encryption result corresponding to the ID expected to be obtained in the symmetrical encryption result to obtain a retrieval result.
  • the client uses the random number r to symmetrically decrypt the encryption result corresponding to the ID expected to be obtained in the symmetric encryption result.
  • if the client expects to obtain the record corresponding to id_1 (or the value of the field of interest in that record), the client uses the corresponding public key pk_1 to encrypt the random number r;
  • the server uses r1, r2, r3, and r4 to symmetrically encrypt the values of the fields of interest in the corresponding records/records of id_1, id_2, id_3, and id_4, respectively, and sends the
  • the client can obtain the correct search result, that is, the value of the corresponding record/the field of interest in the corresponding record.
  • the client can only use the random number r to symmetrically decrypt the encryption result corresponding to the ID expected to be obtained among the four symmetric encryption results, that is, the client only uses r to decrypt the encryption result of id_1, so as to obtain the corresponding record of id_1/the value of the field of interest in the corresponding record, and there is no need to use r to symmetrically decrypt the symmetric encryption results of id_2, id_3, and id_4, because the client can know that these encryption results are not symmetric encrypted using r, and even if r is used for symmetric decryption, the correct result cannot be obtained.
  • the construction and transmission of the obfuscation set can be decoupled from the execution of the OT protocol, and the OT protocol can be used to transmit the key.
  • the client can send an obfuscation set of size m to the server.
  • the client knows which of the obfuscation sets of size m is the identifier of the result that is really wanted to be obtained.
  • the server can generate m symmetric keys. Through the OT of m to 1, the client can obtain a specified symmetric key, that is, the client obtains the symmetric key corresponding to the identifier that really wants to obtain the result.
  • the server can encrypt the records corresponding to the m identifiers in the client obfuscation set with the corresponding symmetric key and send them to the client, so that the client uses the correct symmetric key to decrypt the result that is really wanted to obtain, thereby obtaining the result.
  • the server can generate m symmetric keys in advance, so that the key preparation work can be completed in batches before the OT interaction, without occupying the time of the OT protocol execution.
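The decoupled variant above, where OT carries only a symmetric key, can be sketched as follows. This is a toy under stated assumptions: the m-choose-1 OT is simulated in one process with textbook RSA over small primes, and a hash-derived XOR pad stands in for the symmetric cipher. All helper names and record contents are hypothetical.

```python
import hashlib
import secrets

E = 65537  # RSA public exponent shared by all toy key pairs

def is_prime(x: int) -> bool:
    if x < 2:
        return False
    i = 2
    while i * i <= x:
        if x % i == 0:
            return False
        i += 1
    return True

def toy_keypairs(count: int) -> list:
    """Return `count` textbook-RSA (modulus, private exponent) pairs."""
    primes, x = [], 10**6
    while len(primes) < 2 * count:
        if is_prime(x) and (x - 1) % E != 0:  # keep E invertible
            primes.append(x)
        x += 1
    return [(primes[2 * i] * primes[2 * i + 1],
             pow(E, -1, (primes[2 * i] - 1) * (primes[2 * i + 1] - 1)))
            for i in range(count)]

def sym(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 pad (at most 32 bytes)."""
    pad = hashlib.sha256(key).digest()
    assert len(data) <= len(pad)
    return bytes(a ^ b for a, b in zip(data, pad))

def ot_choose_one(items: list, choice: int) -> bytes:
    """Simulated m-choose-1 OT: the receiver learns only items[choice]."""
    keys = toy_keypairs(len(items))
    r = secrets.randbelow(10**11)                    # receiver's random number
    c = pow(r, E, keys[choice][0])                   # receiver -> sender
    pads = [pow(c, d, n) for n, d in keys]           # sender decrypts m times
    wrapped = [sym(str(p).encode(), it) for p, it in zip(pads, items)]
    return sym(str(r).encode(), wrapped[choice])     # receiver unwraps one item

confusion_set = ["id_1", "id_2", "id_3", "id_4"]
records = {"id_1": b"row-1", "id_2": b"row-2", "id_3": b"row-3", "id_4": b"row-4"}

# Server: symmetric keys can be prepared in advance, before any OT interaction.
sym_keys = {rid: secrets.token_bytes(16) for rid in confusion_set}
ciphertexts = {rid: sym(sym_keys[rid], records[rid]) for rid in confusion_set}

# Client: obtains exactly one key through OT and decrypts the wanted record.
wanted = "id_1"
key = ot_choose_one([sym_keys[rid] for rid in confusion_set],
                    confusion_set.index(wanted))
result = sym(key, ciphertexts[wanted])
print(result)
```

Only fixed-size keys pass through the OT step, so record encryption can be batched separately from the interactive part.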
  • the server transmits the m corresponding records of the identifiers specified in the obfuscation set/the values of the fields of interest in the corresponding records to the client through m-choose-1 oblivious transmission.
  • if the number of possible matching results is greater than 1, for example k (k>1), this can be achieved through m-choose-k oblivious transfer.
  • the following introduces a system for implementing privacy information retrieval in an embodiment of this specification, including a server and a client.
  • the encryption/decryption performed by the client and the server on the same target adopts an encryption/decryption algorithm with an interchangeable order, and: the server is configured with a database, and obtains a query base after encrypting the database, and sends the query base to the client; in a retrieval process: the client sends a sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server; the client searches the query base according to the sensitive field encrypted by the server, and obtains the identifier of the matching record; the server and the client transmit the corresponding record of the predetermined size identifier set including the matching identifier in the database/the value of the field of interest in the corresponding record to the client through an oblivious transmission method.
  • the encryption/decryption performed by the server and the client on the same target adopts an encryption/decryption algorithm with an interchangeable order, and: the server is configured with a database, and obtains a query base after encrypting the database, and sends the query base to the client; during a retrieval process: the server receives the sensitive field encrypted by itself sent by the client, encrypts it again and returns it to the client; the server also transmits the corresponding record of the predetermined size identifier set including the matching identifier in the database/the value of the field of interest in the corresponding record to the client through an oblivious transmission method.
  • the encryption/decryption performed by the client and the server on the same target adopts an encryption/decryption algorithm with an interchangeable order, and: the client is configured with a query base, and the query base is obtained by the server after encrypting the database; in a retrieval process: the client sends the sensitive field encrypted by itself to the server, and obtains the same sensitive field encrypted by the server through interaction with the server; the client also searches the query base according to the sensitive field encrypted by the server to obtain the identifier of the matching record; the client also obtains the value of the record corresponding to the matching identifier/the value of the field of interest in the corresponding record in the database through an oblivious transmission between the client and the server.
  • a programmable logic device such as a field programmable gate array (FPGA)
  • hardware description languages (HDL) include ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, RHDL, VHDL (Very-High-Speed Integrated Circuit Hardware Description Language), and Verilog.
  • the controller may be implemented in any suitable manner, for example, the controller may take the form of a microprocessor or processor and a computer readable medium storing a computer readable program code (e.g., software or firmware) executable by the (micro)processor, a logic gate, a switch, an application specific integrated circuit (ASIC), a programmable logic controller, and an embedded microcontroller, examples of which include but are not limited to the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320, and the memory controller may also be implemented as part of the control logic of the memory.
  • the controller may be implemented in the form of a logic gate, a switch, an application specific integrated circuit, a programmable logic controller, and an embedded microcontroller by logically programming the method steps. Therefore, such a controller may be considered as a hardware component, and the means for implementing various functions included therein may also be considered as a structure within the hardware component. Or even, the means for implementing various functions may be considered as both a software module for implementing the method and a structure within the hardware component.
  • the systems, devices, modules or units described in the above embodiments may be implemented by computer chips or entities, or by products with certain functions.
  • a typical implementation device is a server system.
  • the computer that implements the functions of the above embodiments may be, for example, a personal computer, a laptop computer, a vehicle-mounted human-computer interaction device, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • although one or more embodiments of the present specification provide method operation steps as described in the embodiments or flow charts, more or fewer operation steps may be included based on conventional or non-creative means.
  • the order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order.
  • when an actual device or terminal product executes the method, the steps can be executed in sequence or in parallel according to the method shown in the embodiments or the drawings (for example, in a parallel-processor or multi-threaded processing environment, or even a distributed data processing environment).
  • each module can be implemented in one or more pieces of software and/or hardware, or a module implementing a given function can be implemented by a combination of multiple sub-modules or sub-units, etc.
  • the device embodiments described above are only schematic.
  • the division of the units is only a logical function division. There may be other division methods in actual implementation.
  • multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
  • Another point is that the mutual coupling or direct coupling or communication connection shown or discussed can be through some interfaces, indirect coupling or communication connection of devices or units, which can be electrical, mechanical or other forms.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
  • a computing device includes one or more processors (CPU), input/output interfaces, network interfaces, and memory.
  • Memory may include non-permanent storage in a computer-readable medium, in the form of random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer readable media include permanent and non-permanent, removable and non-removable media that can be implemented by any method or technology to store information.
  • Information can be computer readable instructions, data structures, program modules or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, graphene storage or other magnetic storage devices, or any other non-transmission media that can be used to store information that can be accessed by a computing device.
  • computer readable media does not include temporary computer readable media (transitory media), such as modulated data signals and carrier waves.
  • one or more embodiments of the present specification may be provided as a method, system or computer program product. Therefore, one or more embodiments of the present specification may take the form of a complete hardware embodiment, a complete software embodiment or an embodiment combining software and hardware. Moreover, one or more embodiments of the present specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • One or more embodiments of this specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected through a communication network.
  • program modules may be located in local and remote computer storage media, including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

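One or more embodiments of this specification provide a method, system, server, and client for implementing private information retrieval. In the method, a server encrypts a database to obtain a query base and sends the query base to a client; the encryption/decryption that the client and the server perform on the same target uses an encryption/decryption algorithm whose order of operations is commutative. A single retrieval comprises: the client sends a sensitive field encrypted by the client itself to the server and, through interaction with the server, obtains the same sensitive field encrypted by the server; the client searches the query base using the server-encrypted sensitive field and obtains the identifier of the matching record; and the server and the client use oblivious transfer to transmit to the client the records (or the values of the fields of interest in those records) corresponding to a set of identifiers of predetermined size that includes the matching identifier.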

Description

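Implementing Private Information Retrieval
Technical Field
The embodiments of this specification belong to the field of privacy computing technology and relate in particular to a method, system, server, and client for implementing private information retrieval.
Background
Privacy-preserving computing (Privacy-Preserving Computing) is a collection of technologies that enable data analysis and computation while keeping the data itself from being disclosed, making data usable but not visible. With privacy-preserving computing, the value of data can be converted and released while data and privacy remain fully protected.
The mainstream technologies for privacy-preserving computing currently fall into three main directions. The first is cryptography-based privacy computing, represented by secure multi-party computation (Secure Multi-Party Computation, SMPC). The second is technology derived from combining artificial intelligence with privacy protection, represented by federated learning (Federated Learning, FL). The third is confidential computing (Confidential Computing, CC) based on trusted hardware, represented by the trusted execution environment (Trusted Execution Environment, TEE). In addition, there is differential privacy (Differential Privacy, DP). Differential privacy protects the result of a computation rather than the computation process, whereas federated learning, secure multi-party computation, and confidential computing protect the computation process and its intermediate results.
Secure multi-party computation, the first category, in turn rests on four basic techniques: garbled circuits (Garbled Circuit, GC), secret sharing (Secret Sharing), oblivious transfer (Oblivious Transfer), and homomorphic encryption (Homomorphic Encryption, HE). Homomorphic encryption is a special kind of encryption in which computing directly on ciphertext yields the same result as computing on the decrypted plaintext; it includes partially homomorphic encryption (Partially Homomorphic Encryption, PHE) and fully homomorphic encryption (Fully Homomorphic Encryption, FHE).
Resting on a solid theoretical security foundation, secure multi-party computation protects the privacy of secret input data and so secures the privacy-preserving computation process. There are currently two main implementation routes: general-purpose secure multi-party computation and problem-specific secure multi-party computation. The former can solve all kinds of computing problems, but this "universal" route usually involves a large system with high overheads of every kind. The latter designs dedicated protocols for specific problems, such as private set intersection (Private Set Intersection, PSI) and private information retrieval (Private Information Retrieval, PIR); it can often obtain results at lower cost than a general-purpose protocol, but it requires domain experts to design it carefully for the application scenario, generally does not apply to general scenarios, and is expensive to design.
In private set intersection, the two participants obtain the intersection of the data they hold without revealing any additional information, where additional information means anything other than the intersection itself. Private set intersection is very useful in practice, for example for data alignment in vertical federated learning or for friend discovery through address books in social software.
Private information retrieval is a method by which a client retrieves information from a database. During retrieval, the querying party hides the identifier of the query target, and the data service party provides the matching query result without learning what exactly was queried.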
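Summary
The purpose of this specification is to provide a method, system, server, and client for implementing private information retrieval.
A method for implementing private information retrieval: a server encrypts a database to obtain a query base and sends the query base to a client; the encryption/decryption that the client and the server perform on the same target uses an encryption/decryption algorithm whose order of operations is commutative. A single retrieval comprises: S210, the client sends a sensitive field encrypted by the client itself to the server and, through interaction with the server, obtains the same sensitive field encrypted by the server; S220, the client searches the query base using the server-encrypted sensitive field and obtains the identifier of the matching record; S230, the server and the client use oblivious transfer to transmit to the client the records (or the values of the fields of interest in those records) corresponding to a set of identifiers of predetermined size that includes the matching identifier.
A system for implementing private information retrieval comprises a server and a client. The encryption/decryption that the client and the server perform on the same target uses an encryption/decryption algorithm whose order of operations is commutative; the server is configured with a database, encrypts the database to obtain a query base, and sends the query base to the client. In a single retrieval: the client sends a sensitive field encrypted by the client itself to the server and, through interaction with the server, obtains the same sensitive field encrypted by the server; the client searches the query base using the server-encrypted sensitive field and obtains the identifier of the matching record; the server and the client use oblivious transfer to transmit to the client the records (or the values of the fields of interest in those records) corresponding to a set of identifiers of predetermined size that includes the matching identifier.
A server for implementing private information retrieval: the encryption/decryption that the server and a client perform on the same target uses an encryption/decryption algorithm whose order of operations is commutative; the server is configured with a database, encrypts the database to obtain a query base, and sends the query base to the client. In a single retrieval: the server receives the sensitive field that the client has encrypted itself, encrypts it again, and returns it to the client; the server further uses oblivious transfer with the client to transmit to the client the records (or the values of the fields of interest in those records) corresponding to a set of identifiers of predetermined size that includes the matching identifier.
A client for implementing private information retrieval: the encryption/decryption that the client and a server perform on the same target uses an encryption/decryption algorithm whose order of operations is commutative; the client is configured with a query base obtained by the server encrypting a database. In a single retrieval: the client sends a sensitive field encrypted by the client itself to the server and, through interaction with the server, obtains the same sensitive field encrypted by the server; the client further searches the query base using the server-encrypted sensitive field and obtains the identifier of the matching record; the client further uses oblivious transfer with the server to obtain the record (or the values of the fields of interest in the record) in the database that corresponds to the matching identifier.
In the above embodiments, the query base is configured on the client in advance. Without exposing the database plaintext, the client locates, through interaction with the server and with the help of the query base, the identifier in the query base of the field to be queried, and then queries the server by identifier. This protects the privacy of the server's database and supports structured query statements. Moreover, the record (or the values of its fields of interest) corresponding to the matching identifier is transmitted between the server and the client by oblivious transfer, so the client's matching identifier is not exposed to the server and the client's privacy is protected.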
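Brief Description of the Drawings
To explain the technical solutions of the embodiments of this specification more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in this specification, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an embodiment;
Fig. 2 is a schematic flowchart of an embodiment.
Detailed Description
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this specification without creative effort shall fall within the scope of protection of this specification.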
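As stated above, PIR is a method by which a client retrieves information from a database. The PIR scheme was proposed by Chor B et al. in 1995 to protect the privacy of user queries. Its main purpose is to ensure that a query submitted by a user to a database on a server is completed without leaking the private information being queried, that is, during retrieval the server learns neither what the user is querying nor which data items are retrieved.
Application scenarios for private information retrieval include the following. A patient wants to look up drugs for a disease in a medical system; if the disease name is used as the query condition, the medical system learns that the patient may have that disease and the patient's privacy is leaked. Private information retrieval avoids this kind of leak.
When applying for a domain name or trademark, a user first needs to submit the desired domain name or trademark to the relevant database to check whether it already exists, but does not want the service provider to learn the name being applied for, lest the provider register it first.
In the securities market, a user wants to look up information on a certain stock but cannot reveal to the service provider which stock is of interest, since doing so could affect the stock price and expose the user's preferences.
A simple implementation is for the database to send all of its data to the client, but this cannot protect the database, that is, it cannot guarantee the server's privacy. A PIR that protects the privacy of both the client and the database is called symmetric PIR (Symmetrical PIR, SPIR); a PIR that protects the privacy of only one of the two is called asymmetric PIR (Asymmetrical PIR, APIR). By the number of database replicas, PIR is divided into multi-replica PIR and single-replica PIR. Multi-replica PIR protocols require that the database replicas do not collude, which is hard to satisfy in practice, so single-replica PIR receives more attention. Single-replica PIR can only achieve computational security (Computational PIR, CPIR). Most PIR schemes assume that the client knows which bit of the database it wants to retrieve (a single bit). In practice, however, the client usually searches by keyword (without knowing where that keyword sits in the database) and wants to get back a string (multiple bits). In short, a practical PIR should ideally be symmetric, single-replica, keyword-based, and string-returning all at once, while balancing computation and communication efficiency. Cryptographic techniques such as homomorphic encryption, oblivious transfer (Oblivious Transfer, OT), and one-way trapdoor functions (One-way Trapdoor Function) can satisfy these conditions fully or in part.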
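This specification provides an embodiment of a method for implementing private information retrieval.
In this embodiment, the server (Server) may encrypt the database in advance to obtain a query base and send the query base to the client.
Generally, the server holds a local database that the client can query. The server's local database is, for example, as follows: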
ID Name Age Native_place
id_0 A 24 anhui
id_1 B 25 shanghai
id_2 C 30 anhui
id_3 D 46 henan
id_4 E 34 shandong
id_5 F 54 shanghai
id_6 G 24 beijing
id_7 H 34 shandong
id_8 I 42 guangdong
id_9 J 56 zhejiang
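Table 1. Database held by the server
The example in Table 1 has four fields, ID, Name, Age, and Native_place, and 10 records, id_0, ..., id_9, one record per row, where id_0, ..., id_9 are the identifiers of the rows.
To let the client search without compromising the server's privacy, the server may encrypt the database to obtain the query base. The encryption may use RSA (a widely used asymmetric encryption algorithm proposed in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman) or ECC (Elliptic Curve Cryptography). Specifically, the server may encrypt the data with an RSA private key or ECC private key α, that is, encrypt every field other than the ID column (the data in each cell) with α.
When ECC is used, the server may generate a secret value α and keep it safe; this secret value α is the ECC private key. In addition, the server may convert each field value, for example the value C of the Name field, into a point on the elliptic curve through a hash function, written Hash(C) or H(C).
By the properties of scalar multiplication on an elliptic curve, given a point P on the curve and an integer k, computing Q=kP is easy, and the result Q is also a point on the curve; conversely, given a pair of points P and Q on the curve, finding the k that satisfies Q=kP is hard.
Here, α·H(C) is easy to compute by scalar multiplication, but it is hard to derive α from α·H(C) and H(C). And when α is hard to obtain, it is also hard to recover H(C) from α·H(C).
The database encrypted by the server with the secret value α is then as follows: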
ID Name Age Native_place
id_0 α·H(A) α·H(24) α·H(anhui)
id_1 α·H(B) α·H(25) α·H(shanghai)
id_2 α·H(C) α·H(30) α·H(anhui)
id_3 α·H(D) α·H(46) α·H(henan)
id_4 α·H(E) α·H(34) α·H(shandong)
id_5 α·H(F) α·H(54) α·H(shanghai)
id_6 α·H(G) α·H(24) α·H(beijing)
id_7 α·H(H) α·H(34) α·H(shandong)
id_8 α·H(I) α·H(42) α·H(guangdong)
id_9 α·H(J) α·H(56) α·H(zhejiang)
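Table 2. Query base obtained by encrypting with the server's ECC private key
Note that the hash function above not only converts the original input into an output of fixed length and format, but can also convert that output into the x-coordinate of a point on the elliptic curve. For example, with an elliptic curve such as curve25519, any 256 bits of data can serve as a valid x-coordinate on the curve. Accordingly, sha256 or sha3-256 may be used, or 256 bits may be taken from the output of sha384, sha512, sha3-384, or sha3-512. More generally, any hash value (not limited to 256-bit results) can be reduced modulo the order of the elliptic curve, and the scalar product of the reduced value and the generator is a point on the curve.
The server may then send the query base to the client that needs to search. In one approach, the server sends the query base directly to the client, for example to the client's device or to the client's proxy server; in another approach, the server publishes the query base at a uniform resource locator (Uniform Resource Locator, URL), and the client fetches the query base from that URL.
Accordingly, the client receives the query base and stores it locally.
Similarly, when RSA is used, the server may generate a secret value α and keep it safe; this secret value is the RSA private key. In addition, the server may convert each field value, for example the value C of the Name field, through a hash function into an integer modulo q (where, by analogy with the ECC case, it was mapped to a curve point), written Hash(C) or H(C).
By the properties of modular exponentiation, given the secret value α, a large prime q, and a base g, computing $p = g^{\alpha} \bmod q$ is easy; conversely, given p, q, and the base g, finding the α that satisfies $p = g^{\alpha} \bmod q$ is hard. The base g is also called a primitive root.
Here, $(H(C))^{\alpha} \bmod q$ is easy to compute by modular exponentiation, but it is hard to derive α from the result $(H(C))^{\alpha} \bmod q$ together with H(C) and q. And when α is hard to obtain, it is also hard to recover H(C) from $(H(C))^{\alpha} \bmod q$. In what follows, the mod q in expressions of the form $(H(C))^{\alpha} \bmod q$ is omitted, and they are written simply as $(H(C))^{\alpha}$.
The database encrypted by the server with the secret value α is then as follows: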
ID Name Age Native_place
id_0 (H(A))^α (H(24))^α (H(anhui))^α
id_1 (H(B))^α (H(25))^α (H(shanghai))^α
id_2 (H(C))^α (H(30))^α (H(anhui))^α
id_3 (H(D))^α (H(46))^α (H(henan))^α
id_4 (H(E))^α (H(34))^α (H(shandong))^α
id_5 (H(F))^α (H(54))^α (H(shanghai))^α
id_6 (H(G))^α (H(24))^α (H(beijing))^α
id_7 (H(H))^α (H(34))^α (H(shandong))^α
id_8 (H(I))^α (H(42))^α (H(guangdong))^α
id_9 (H(J))^α (H(56))^α (H(zhejiang))^α
Table 3. Query base obtained by encrypting with the server's RSA private key
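How a query base like Table 3 could be produced can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the modulus `Q` (the Mersenne prime 2^127 - 1, far too small for real use), the helper names `hash_to_group`, `gen_secret`, and `build_query_base`, and the SHA-256 mapping are all hypothetical choices made here.

```python
import hashlib
import math
import secrets

# Assumed public modulus for illustration only: the Mersenne prime 2**127 - 1.
Q = 2**127 - 1


def hash_to_group(value: str) -> int:
    """Map a field value to an integer in [1, Q-1] via SHA-256 (assumed mapping)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def gen_secret() -> int:
    """Pick a secret exponent that is invertible modulo Q-1."""
    while True:
        k = secrets.randbelow(Q - 3) + 2  # 2 <= k <= Q-2
        if math.gcd(k, Q - 1) == 1:
            return k


def build_query_base(db: dict, alpha: int) -> dict:
    """Encrypt every field except the ID as H(value)^alpha mod Q."""
    return {
        rec_id: {col: pow(hash_to_group(str(val)), alpha, Q) for col, val in row.items()}
        for rec_id, row in db.items()
    }


# First three rows of Table 1; the ID stays in the clear as the dictionary key.
DB = {
    "id_0": {"Name": "A", "Age": 24, "Native_place": "anhui"},
    "id_1": {"Name": "B", "Age": 25, "Native_place": "shanghai"},
    "id_2": {"Name": "C", "Age": 30, "Native_place": "anhui"},
}
alpha = gen_secret()                 # server's secret value
query_base = build_query_base(DB, alpha)
```

Because the encryption is deterministic, equal plaintexts (the two `anhui` cells) map to equal ciphertexts, which is what later lets the client locate a record by comparing ciphertexts.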
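The server may then send the query base to the client that needs to search. As before, the server may send the query base directly to the client, for example to the client's device or to the client's proxy server; alternatively, the server may publish the query base at a uniform resource locator (Uniform Resource Locator, URL), and the client fetches the query base from that URL.
Accordingly, the client receives the query base and stores it locally.
S110: The client sends a sensitive field encrypted by the client itself to the server and, through interaction with the server, obtains the same sensitive field encrypted by the server.
For example, the client's search condition is that the Age field equals 25, and 25 is a sensitive field, that is, the client does not want the other side to learn it. To keep the server from learning that the search condition is the value 25 of the Age field, the client may encrypt 25, for example with an RSA/ECC private key, using the same encryption algorithm that the server used to generate the query base.
Specifically, when RSA private-key encryption is used, the client generates its own secret β and keeps it safe, then uses its private key β to encrypt 25. The client may encrypt either 25 itself or the hash of 25; encryption of the hash is described here, direct encryption of 25 being similar, with the client and the server using the same hash algorithm. For example, the client uses the same large prime q as the server as the modulus and RSA-encrypts the hash of 25 with β to obtain $(H(25))^{\beta}$. The sensitive field that the client sends to the server may then be $(H(25))^{\beta}$, which is the ciphertext of the sensitive value 25.
Alternatively, the client may construct a search statement, encrypt the sensitive field in it to obtain a privacy field, replace the sensitive field with the privacy field, and send the resulting private search statement to the server.
For example, the query statement constructed by the client is select Name where Age=25.
To protect privacy, that is, to keep the server from learning that the query condition is Age=25, the value 25 is hidden, giving:
select Name where Age=?
Here "?" stands for the privacy field that replaces the sensitive field in the search statement.
Specifically, the client may encrypt 25 with the RSA private key. For example, the client hashes 25 with the same hash function as the server and then RSA-encrypts the hash with β to obtain $(H(25))^{\beta}$. The query statement that the client sends to the server is then, for example:
select Name where Age=(H(25))^β
As stated above, $(H(25))^{\beta}$ is ciphertext, namely the content represented by "?" in the search statement above; on receiving it, the server cannot learn β or 25.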
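When ECC private-key encryption is used, the client uses the same elliptic curve as the server, that is, the same curve parameters and generator. The client generates its own secret β and keeps it safe, then uses its private key β to encrypt 25, specifically the hash of 25, with the client and the server using the same hash algorithm. For example, the client ECC-encrypts the hash of 25 with β to obtain β·H(25). The sensitive field that the client sends to the server may then be β·H(25), which is the ciphertext of the sensitive value 25.
Alternatively, the client may construct a search statement, encrypt the sensitive field in it to obtain a privacy field, replace the sensitive field with the privacy field, and send the resulting private search statement to the server.
For example, the query statement constructed by the client is select Name where Age=25.
To protect privacy, that is, to keep the server from learning that the query condition is Age=25, the value 25 is hidden, giving:
select Name where Age=?
Here "?" stands for the privacy field that replaces the sensitive field in the search statement.
Specifically, the client may encrypt 25 with the ECC private key. The client uses the same elliptic curve as the server, that is, the same curve parameters and generator, encrypts the sensitive field in the search statement with its own ECC private key, substitutes the result, and sends the private search statement to the server. For example, the client generates its own secret β and keeps it safe, hashes 25 with the same hash function as the server, and ECC-encrypts the hash with β to obtain β·H(25). The query statement that the client sends to the server is then, for example:
select Name where Age=β·H(25)
As stated above, β·H(25) is ciphertext, namely the content represented by "?" in the search statement above; on receiving it, the server cannot learn β or 25.
Obtaining the same sensitive field encrypted by the server through interaction with the server may include: the server uses its own key to encrypt again the sensitive field already encrypted by the client and sends it to the client, and the client uses its own key to decrypt the doubly encrypted sensitive field, obtaining the sensitive field encrypted by the server alone. The core requirement is an encryption algorithm for which two successive encryptions (one by each party) can be decrypted in either order. By the cryptographic properties of ECC, the two parties agree on the same elliptic curve (the same curve parameters and generator) and hold private keys α and β respectively; encryption is scalar multiplication by α (or β), and whether α is applied first and then β, or β first and then α, the result can be decrypted in the same or a different order. Similarly, by the cryptographic properties of RSA, the two parties agree on the same large prime q and primitive root g and hold private keys α and β respectively; encryption is exponentiation by α (or β) modulo q, and the result can likewise be decrypted in either order. Overall, the encryption/decryption that the client and the server perform on the same target uses an algorithm whose order of operations is commutative.
Specifically, the server may receive the private search statement, encrypt the privacy field again, and return it to the client; or the server may receive the sensitive field encrypted by the client, encrypt it again with the server's own key, and return it to the client. The client then uses its own key to decrypt the doubly encrypted sensitive field and obtains the sensitive field encrypted by the server.
For example, case 1: the server receives $(H(25))^{\beta}$ from the client.
The server may encrypt the encrypted sensitive field (that is, the privacy field) again and return the result to the client. Specifically, the server encrypts the privacy field $(H(25))^{\beta}$ again with its own RSA private key α to obtain $((H(25))^{\beta})^{\alpha}$.
For example, case 1': the server receives the private search statement select Name where Age=(H(25))^β from the client and thereby obtains the privacy field $(H(25))^{\beta}$ in that statement.
The server may encrypt the privacy field again and return the result to the client.
Specifically, the server encrypts the privacy field $(H(25))^{\beta}$ again with its own RSA private key α to obtain $((H(25))^{\beta})^{\alpha}$; the process is as above and is not repeated here.
For example, case 2: the server receives β·H(25) from the client.
The server may encrypt the privacy field again and return the result to the client. Specifically, the server encrypts the privacy field β·H(25) again with its own ECC private key α to obtain α·β·H(25).
For example, case 2': the server receives the private search statement select Name where Age=β·H(25) from the client and thereby obtains the privacy field β·H(25) in that statement.
The server may encrypt the privacy field again and return the result to the client.
Specifically, the server encrypts the privacy field β·H(25) again with its own ECC private key α to obtain α·β·H(25); the process is as above and is not repeated here.
After the server has encrypted the client-encrypted sensitive field (the privacy field) again with its own key and sent it to the client, the client can decrypt the doubly encrypted privacy field with its own key to obtain the sensitive field encrypted by the server.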
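For example, for cases 1 and 1' above, the client receives $((H(25))^{\beta})^{\alpha}$ from the server. Exponentiation has the following property: $((H(25))^{\beta})^{\alpha} = (H(25))^{\beta\alpha} = (H(25))^{\alpha\beta} = ((H(25))^{\alpha})^{\beta}$. The client can therefore decrypt the doubly encrypted sensitive field with the inverse $\beta^{-1}$ of its own private key β, as follows:
$$\left(((H(25))^{\beta})^{\alpha}\right)^{\beta^{-1}} = (H(25))^{\alpha\beta\beta^{-1}} = (H(25))^{\alpha}$$
In this way the client obtains the same sensitive field encrypted by the server, namely $(H(25))^{\alpha}$.
For cases 2 and 2' above, the client receives α·β·H(25) from the server. Scalar multiplication has the following property: α·β·H(25)=β·α·H(25). The client can therefore decrypt the doubly encrypted sensitive field with the inverse β^(-1) of its own private key β, as follows: β^(-1)·α·β·H(25)=β^(-1)·β·α·H(25)=α·H(25). In this way the client likewise obtains the same sensitive field encrypted by the server, namely α·H(25).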
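Note that in RSA, by Euler's theorem, pk·sk ≡ 1 mod (p-1)·(q-1), where p and q are two large primes, so pk and sk are mutual inverses; with the single prime modulus q used here, the inverse of β is taken modulo q-1. Similarly, in ECC a key pair satisfies pk=sk·G, where G is a generator of the chosen curve, and the inverse of β used above is the inverse of the scalar β modulo the order of G, so that β^(-1)·(β·P)=P for any point P generated by G.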
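The blind, re-encrypt, unblind exchange of S110 can be sketched for the modular-exponentiation variant as follows. This is an illustrative sketch only: the prime `Q`, the helpers `hash_to_group` and `gen_secret`, and the variable names are assumptions introduced here, and the parameter sizes are not suitable for real use.

```python
import hashlib
import math
import secrets

Q = 2**127 - 1  # assumed shared prime modulus (illustration only)


def hash_to_group(value: str) -> int:
    """Map a value to an integer in [1, Q-1] via SHA-256 (assumed mapping)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def gen_secret() -> int:
    """Secret exponent coprime to Q-1, so that it has an inverse modulo Q-1."""
    while True:
        k = secrets.randbelow(Q - 3) + 2
        if math.gcd(k, Q - 1) == 1:
            return k


alpha = gen_secret()  # server's secret
beta = gen_secret()   # client's secret

h = hash_to_group("25")
blinded = pow(h, beta, Q)                 # client -> server: (H(25))^beta
double = pow(blinded, alpha, Q)           # server -> client: ((H(25))^beta)^alpha
beta_inv = pow(beta, -1, Q - 1)           # inverse of beta in the exponent (Python 3.8+)
server_only = pow(double, beta_inv, Q)    # client strips its own layer: (H(25))^alpha
```

The server only ever sees `blinded`, and the client never sees `alpha`; the client ends up with exactly the value that the server would have stored in the query base for the plaintext 25.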
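S120: The client searches the query base using the server-encrypted sensitive field, obtains the identifier of the matching record, and returns the identifier to the server.
After S110, the client holds the same sensitive field encrypted by the server.
The client can search the query base with this server-encrypted sensitive field. For example, after decryption the client obtains the server-encrypted privacy field α·H(25) or $(H(25))^{\alpha}$ and uses it to search the query base, in Table 2 or Table 3 respectively. It finds that the record whose Age contains this privacy field is the record with ID=id_1, where ID is the identifier of the record. In this way, the client searches the query base with the server-encrypted privacy field and, once a record matches, obtains the identifier of the matching record (the matching identifier for short).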
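The lookup in S120 amounts to an equality match on ciphertexts. A minimal sketch, assuming the query base is held as a dictionary and using small stand-in integers in place of real ciphertexts (the function name `find_matching_ids` and all values are hypothetical):

```python
def find_matching_ids(query_base: dict, column: str, token: int) -> list:
    """Return the IDs of all records whose encrypted `column` equals `token`."""
    return [rec_id for rec_id, row in query_base.items() if row.get(column) == token]


# Stand-in ciphertexts: small integers play the role of alpha*H(v) or (H(v))^alpha.
query_base = {
    "id_0": {"Age": 7101, "Native_place": 9001},
    "id_1": {"Age": 7102, "Native_place": 9002},
    "id_4": {"Age": 7103, "Native_place": 9003},
    "id_7": {"Age": 7103, "Native_place": 9003},
}

token_for_25 = 7102  # what the client holds after stripping its own layer
matches = find_matching_ids(query_base, "Age", token_for_25)
```

A condition such as Age=34 would match more than one record (id_4 and id_7 in Table 1), which is the case handled later by transferring k records at once.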
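The client may return the identifier of the matching record to the server in one of two ways.
In the first, the client in S110 constructed a search statement, encrypted the sensitive field in it to obtain a privacy field, replaced the sensitive field with the privacy field, and sent the private search statement to the server. In this case the client can return the identifier of the matching record to the server directly.
In the second, the client in S110 sent only the value of the sensitive field encrypted by the client to the server. In this case the client can construct a search statement in S120, for example: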
select Name where ID=id_1
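In S120 the client can then send this search statement to the server. The statement contains the identifier of the matching record and indicates that the field of interest is Name, the field name immediately following select.
In other words, the search statement, which names the field of interest, can be sent in either of the two steps S110 and S120.
S130: The server returns to the client the value of the field of interest in the record of the database that corresponds to the identifier.
Continuing the example, after receiving the identifier from the client, the server looks up the corresponding record in the database, takes the value of the field of interest named in S110 or S120 from that record, and returns it to the client. For example, it returns Name=B from the record for id_1, that is, it returns B to the client.
In the above embodiment, the query base is configured on the client in advance. Without exposing the database plaintext, the client locates, through interaction with the server and with the help of the query base, the identifier in the query base of the field to be queried, and then queries the server by identifier to obtain the value of the field of interest in the corresponding record. Compared with traditional multi-replica PIR, this does not rely on the assumption that multiple database replicas do not collude and is therefore more practical. Compared with traditional single-replica PIR, which supports only bit retrieval, this embodiment does not need to know the exact position (bit position) of the keyword in the database, supports string queries, and supports Structured Query Language (SQL) statements. The database stays on the server, while the query base obtained by encrypting the database is configured on the client so that the client can locate data in the query base and obtain record identifiers; because the query base is encrypted, the client does not obtain the contents of the database, which protects the privacy of the server's database. Overall, the arrangement of database and query base in this embodiment can be called "asymmetric dual-replica" when one server holds the database and one client holds the query base, and "asymmetric multi-replica" when multiple clients hold the query base.
In the above embodiment, the client uses an SQL query to ask for a field of interest, for example the Name field in select Name... above. This exposes the client's field of interest to some extent. Alternatively, the client can query the records that satisfy the condition, that is, entire rows. This protects the client's privacy but requires the server to return entire records, which exposes whole rows of the server's data to some extent. Examples are search statements such as "select * where Age=?" or "select * where ID=id_1" in S110/S120. The result returned by the server can then be the record id_1, for example: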
id_1 B 25 shanghai
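In addition, to secure the transmission, the server may encrypt the record (or the value of the field of interest in the record) corresponding to the identifier before returning it to the client. For example, the server may encrypt it with a symmetric key negotiated with the client, or with the public key of the client's asymmetric key pair so that the client can decrypt it with its own private key, or it may use a digital envelope, and so on.
In S120 above, the client returns the matched ID directly to the server. Although the client can then obtain the corresponding record or the field of interest in it from the server, as in S130, this exposes the client's privacy to some extent, because the server learns that the identifier the client wants to query is id_1. To protect the client's privacy, the approach in the following embodiment can be used.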
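S210: The client sends a sensitive field encrypted by the client itself to the server and, through interaction with the server, obtains the same sensitive field encrypted by the server.
For example, the client's search condition is that the Age field equals 25, and 25 is a sensitive field, that is, the client does not want the other side to learn it. To keep the server from learning that the search condition is the value 25 of the Age field, the client may encrypt 25, for example with an RSA/ECC private key, using the same encryption algorithm that the server used to generate the query base.
Specifically, when RSA private-key encryption is used, the client generates its own secret β and keeps it safe, then uses its private key β to encrypt 25. The client may encrypt either 25 itself or the hash of 25; encryption of the hash is described here, direct encryption of 25 being similar, with the client and the server using the same hash algorithm. For example, the client uses the same large prime q as the server as the modulus and RSA-encrypts the hash of 25 with β to obtain $(H(25))^{\beta}$. The sensitive field that the client sends to the server may then be $(H(25))^{\beta}$, which is the ciphertext of the sensitive value 25.
When ECC private-key encryption is used, the client uses the same elliptic curve as the server, that is, the same curve parameters and generator. The client generates its own secret β and keeps it safe, then uses its private key β to encrypt 25, specifically the hash of 25, with the client and the server using the same hash algorithm. For example, the client ECC-encrypts the hash of 25 with β to obtain β·H(25). The sensitive field that the client sends to the server may then be β·H(25), which is the ciphertext of the sensitive value 25.
Obtaining the same sensitive field encrypted by the server through interaction with the server may include: the server uses its own key to encrypt again the sensitive field already encrypted by the client and sends it to the client, and the client uses its own key to decrypt the doubly encrypted sensitive field, obtaining the sensitive field encrypted by the server alone. The core requirement is an encryption algorithm for which two successive encryptions (one by each party) can be decrypted in either order. By the cryptographic properties of ECC, the two parties agree on the same elliptic curve (the same curve parameters and generator) and hold private keys α and β respectively; encryption is scalar multiplication by α (or β), and whether α is applied first and then β, or β first and then α, the result can be decrypted in the same or a different order. Similarly, by the cryptographic properties of RSA, the two parties agree on the same large prime q and primitive root g and hold private keys α and β respectively; encryption is exponentiation by α (or β) modulo q, and the result can likewise be decrypted in either order. Overall, the encryption/decryption that the client and the server perform on the same target uses an algorithm whose order of operations is commutative.
Specifically, after receiving the sensitive field encrypted by the client, the server encrypts it again with the server's own key and returns it to the client. The client then uses its own key to decrypt the doubly encrypted sensitive field and obtains the sensitive field encrypted by the server.
For example, case 1: the server receives $(H(25))^{\beta}$ from the client.
The server may encrypt the encrypted sensitive field (that is, the privacy field) again and return the result to the client. Specifically, the server encrypts the privacy field $(H(25))^{\beta}$ again with its own RSA private key α to obtain $((H(25))^{\beta})^{\alpha}$.
For example, case 2: the server receives β·H(25) from the client.
The server may encrypt the privacy field again and return the result to the client. Specifically, the server encrypts the privacy field β·H(25) again with its own ECC private key α to obtain α·β·H(25).
After the server has encrypted the client-encrypted sensitive field (the privacy field) again with its own key and sent it to the client, the client can decrypt the doubly encrypted privacy field with its own key to obtain the sensitive field encrypted by the server.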
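For example, for case 1 above, the client receives $((H(25))^{\beta})^{\alpha}$ from the server. Exponentiation has the following property: $((H(25))^{\beta})^{\alpha} = (H(25))^{\beta\alpha} = (H(25))^{\alpha\beta} = ((H(25))^{\alpha})^{\beta}$. The client can therefore decrypt the doubly encrypted sensitive field with the inverse $\beta^{-1}$ of its own private key β, as follows:
$$\left(((H(25))^{\beta})^{\alpha}\right)^{\beta^{-1}} = (H(25))^{\alpha\beta\beta^{-1}} = (H(25))^{\alpha}$$
In this way the client obtains the same sensitive field encrypted by the server, namely $(H(25))^{\alpha}$.
For case 2 above, the client receives α·β·H(25) from the server. Scalar multiplication has the following property: α·β·H(25)=β·α·H(25). The client can therefore decrypt the doubly encrypted sensitive field with the inverse β^(-1) of its own private key β, as follows: β^(-1)·α·β·H(25)=β^(-1)·β·α·H(25)=α·H(25). In this way the client likewise obtains the same sensitive field encrypted by the server, namely α·H(25).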
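Note that in RSA, by Euler's theorem, pk·sk ≡ 1 mod (p-1)·(q-1), where p and q are two large primes, so pk and sk are mutual inverses; with the single prime modulus q used here, the inverse of β is taken modulo q-1. Similarly, in ECC a key pair satisfies pk=sk·G, where G is a generator of the chosen curve, and the inverse of β used above is the inverse of the scalar β modulo the order of G, so that β^(-1)·(β·P)=P for any point P generated by G.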
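S220: The client searches the query base using the server-encrypted sensitive field and obtains the identifier of the matching record.
After S210, the client holds the same sensitive field encrypted by the server.
The client can search the query base with this server-encrypted sensitive field. For example, after decryption the client obtains the server-encrypted privacy field α·H(25) or $(H(25))^{\alpha}$ and uses it to search the query base, in Table 2 or Table 3 respectively. It finds that the record whose Age contains this privacy field is the record with ID=id_1, where ID is the identifier of the record. In this way, the client searches the query base with the server-encrypted privacy field and, once a record matches, obtains the identifier of the matching record.
S230: The server uses oblivious transfer to return to the client the values of the field of interest in the records of the database that correspond to a set of identifiers of predetermined size that includes the matching identifier.
In S220 the client does not return the matched ID to the server, so the server cannot learn which record or records the client is looking for; and because the sensitive field sent in S210 is encrypted by the client, the server cannot learn which record or records that field will hit. Only the client knows. This protects the client's privacy. The retrieval must still be completed, however, which requires the server to return to the client the records the client wants.
Here the server can use oblivious transfer.
Oblivious transfer (Oblivious Transfer, OT) can be built on RSA, ECC, and so on, and comes in several forms, such as 1-out-of-2, 1-out-of-n, 1-out-of-m, and k-out-of-m (k<m<n). Take 1-out-of-2 OT to illustrate the principle: the sender holds two secrets, m1 and m2, and sends both to the receiver; the receiver can decrypt only the one it chose and learns nothing about the other, while the sender cannot tell which one the receiver chose. Using RSA as an example, a simple 1-out-of-2 flow is as follows. First, the sender generates two different public/private key pairs and publishes the two public keys, denoted public key 1 and public key 2. Suppose the receiver wants m1 but does not want the sender to know this. The receiver generates a random number r, encrypts r with public key 1, and sends it to the sender. The sender decrypts the encrypted r with both of its private keys, obtaining r1 with private key 1 and r2 with private key 2. Clearly only r1 equals r; r2 is a meaningless string of digits (though still a decryption result). But the sender does not know which public key the receiver used, so it does not know which of r1 and r2 is the real r. Having computed r1 and r2, the sender symmetrically encrypts m1 with r1 and m2 with r2 and sends both ciphertexts to the receiver. Since the receiver's r equals r1, symmetric decryption with r recovers m1 but not m2: because r≠r2, the receiver lacks the correct symmetric key for m2. Throughout, the sender does not know which of m1 and m2 the receiver recovered.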
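With 1-out-of-2 as the basis, the 2 key pairs can be extended to n key pairs to give 1-out-of-n OT. The core of 1-out-of-n is that the server encrypts the n records in the data table (or the values of the field of interest in them) with n different keys to obtain n ciphertexts and sends them to the client; the client uses the key corresponding to the matching identifier to decrypt the one ciphertext, among the n, that corresponds to the matching identifier.
In line with the embodiments above, suppose the server's data table holds n records in total, so the client's query base correspondingly holds n encrypted records. For convenience, the record IDs are labeled in order id_0, id_1, id_2, ..., id_n-1. A simple flow is as follows.
S231: The server generates n different public/private key pairs in advance and publishes the public keys.
Here n equals the number of records in the database.
The server generates n different public/private key pairs (pk-sk, where pk is the public key, which may be published, and sk is the secret key, which must be kept private), for example pk_0-sk_0, pk_1-sk_1, pk_2-sk_2, ..., pk_n-1-sk_n-1, and publishes the n public keys pk_0, pk_1, pk_2, ..., pk_n-1. Once the server has published these n ordered public keys, the client can obtain them.
S232: The client generates a random number r, encrypts r with the public key corresponding to the ID it wants, and sends the result to the server.
Suppose the client wants the record id_1 but does not want the server to know this. The client encrypts r with pk_1 and sends it to the server. "Ordered" above mainly means that IDs and public keys correspond to each other and that the client can know this correspondence. In the example, the client wants id_1 without revealing this to the server, so it encrypts r with pk_1, the key corresponding to id_1; likewise, a client that wants id_t without revealing this encrypts r with pk_t, the key corresponding to id_t.
S233: On receiving the encrypted r, the server decrypts it with each of the n private keys.
The server decrypts the random number r encrypted under pk_1 with sk_0, sk_1, sk_2, ..., sk_n-1 in turn. For example, the server obtains r0 with sk_0, r1 with sk_1, ..., and r(n-1) with sk_n-1.
Clearly only r1 equals r, because only sk_1 corresponds to the pk_1 used for encryption; the results r0, r2, ..., r(n-1) obtained with sk_0, sk_2, ..., sk_n-1, which do not correspond to pk_1, differ from r. The server obtains decryption results that all look alike; it does not know what the real r is or which public key the client used. In other words, since the server does not know which public key the client used to encrypt r, it does not know which of the n results r0, r1, r2, ..., r(n-1) is the real r.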
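S234: The server symmetrically encrypts each record in the database with the decryption result of the same index and sends the ciphertexts to the client.
For example, the server symmetrically encrypts record id_0 with r0, record id_1 with r1, ..., and record id_n-1 with r(n-1), and sends these n ciphertexts to the client.
S235: The client uses the random number r to symmetrically decrypt the ciphertext that corresponds to the ID it wants and obtains the search result.
The client uses the random number r to symmetrically decrypt the ciphertext that corresponds to the ID it wants. Specifically, suppose that in S232 the client wants the record (or the value of the field of interest in the record) for id_1, so it encrypts the random number r with the corresponding public key pk_1. In S233 the server decrypts the encrypted r with the corresponding private key sk_1 and obtains r1=r, whereas the results r0, r2, ..., r(n-1) obtained with sk_0, sk_2, ..., sk_n-1, which do not correspond to pk_1, differ from r. In S234 the server symmetrically encrypts the records id_0, id_1, ..., id_n-1 (or the values of the field of interest in them) with r0, r1, r2, ..., r(n-1) respectively and sends the n ciphertexts to the client. In S235 the client symmetrically decrypts the n ciphertexts with r. Of the n ciphertexts, only the one for id_1 was encrypted with r, so only decrypting that one with r yields the correct value. The client thereby obtains the correct search result, that is, the corresponding record or the value of the field of interest in it.
To reduce computation, the client may of course decrypt only the ciphertext that corresponds to the wanted ID, that is, decrypt only the ciphertext for id_1 with r to obtain the record for id_1 (or the value of the field of interest in it), without decrypting the ciphertexts for id_0, id_2, ..., id_n-1. The client knows these were not encrypted with r, so decrypting them with r would not yield correct results anyway.
For clarity, the client's symmetric decryption of the n ciphertexts with the random number r is explained as follows:
In S234 the server symmetrically encrypts the records id_0, id_1, ..., id_n-1 (or the values of the field of interest in them) with r0, r1, r2, ..., r(n-1) respectively:
Enc(id_0,r0), where r0≠r;
Enc(id_1,r1), where r1=r;
Enc(id_2,r2), where r2≠r;
...
Enc(id_n-1,r(n-1)), where r(n-1)≠r;
Here Enc denotes encryption (Encrypt). The first argument of Enc(), id_0, id_1, id_2, ..., id_n-1, denotes the n records (or the values of the field of interest in them), and the second argument, r0, r1, r2, ..., r(n-1), denotes the encryption key.
In S235 the client symmetrically decrypts the ciphertexts from S234 with the random number r. Specifically, the client applies r to each of the following:
Dec(Enc(id_0,r0),r), where r0≠r;
Dec(Enc(id_1,r1),r), where r1=r;
Dec(Enc(id_2,r2),r), where r2≠r;
...
Dec(Enc(id_n-1,r(n-1)),r), where r(n-1)≠r;
Here Dec denotes decryption (Decrypt). The first argument of Dec() is the object to decrypt, that is, the ciphertext above, and the second argument is the decryption key.
As can be seen, the client can decrypt only the record id_1 and cannot infer the other records. This is because the server used the random number r only for the record id_1 and used values other than r for the other IDs, and the client cannot obtain r0, r2, ..., r(n-1), that is, any value other than r1=r.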
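Steps S231 to S235 can be walked through with a toy sketch. Everything here is an assumption made for illustration: the tiny RSA moduli built from primes just above one million, the helper names, and the SHA-256-based XOR stream standing in for "symmetric encryption". It shows the message flow only and is in no way a secure oblivious transfer.

```python
import hashlib
import math
import secrets

E = 65537  # assumed common public exponent


def is_prime(n: int) -> bool:
    """Trial division; adequate for the tiny primes used in this toy."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def next_prime(n: int) -> int:
    while not is_prime(n):
        n += 1
    return n


def gen_keypairs(count: int, start: int = 1_000_000) -> list:
    """S231: toy RSA key pairs as (modulus, private exponent), one per record."""
    pairs, cursor = [], start
    while len(pairs) < count:
        p = next_prime(cursor)
        q = next_prime(p + 1)
        cursor = q + 1
        phi = (p - 1) * (q - 1)
        if math.gcd(E, phi) == 1:
            pairs.append((p * q, pow(E, -1, phi)))
    return pairs


def stream_xor(key: int, data: bytes) -> bytes:
    """Stand-in symmetric cipher: XOR with a SHA-256-derived keystream."""
    stream, counter = bytearray(), 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


records = [b"A,24,anhui", b"B,25,shanghai", b"C,30,anhui", b"D,46,henan"]
keys = gen_keypairs(len(records))                      # S231 (server)

wanted = 1                                             # client wants id_1
r = secrets.randbelow(10**11) + 2                      # smaller than every modulus
c = pow(r, E, keys[wanted][0])                         # S232 (client): Enc(r, pk_1)

candidates = [pow(c, d, n) for n, d in keys]           # S233 (server): r0..r(n-1)
ciphertexts = [stream_xor(ri, rec)                     # S234 (server)
               for ri, rec in zip(candidates, records)]

result = stream_xor(r, ciphertexts[wanted])            # S235 (client)
```

Only `candidates[wanted]` equals `r`, so only the wanted ciphertext decrypts to a meaningful record; the server performs one private-key operation per record, which is the cost the text goes on to discuss.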
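Note that S231 (key generation) may be carried out either before or after S230 begins; no limitation is imposed here.
The above covers the case where the client finds exactly one matching record in the query base: by 1-out-of-n oblivious transfer, the records (or the values of the field of interest in them) corresponding to all identifiers in the database, including the matching identifier, are transmitted to the client.
In the above embodiment, the server does not know which ID or IDs the client is querying; it encrypts all records in the database and returns them to the client, which protects the client's privacy. However, in S233 the server decrypts the received encrypted r with each of the n private keys, and this large number of asymmetric decryptions consumes substantial CPU and memory. In S234, transmitting n symmetrically encrypted results also takes substantial bandwidth. When n is large in particular, both the server's computation and the bandwidth use are high.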
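In addition, there may be more than one match, for example k records (k>1), in which case k-out-of-n oblivious transfer can be used. One implementation of k-out-of-n groups every k of the n records into a set, with each set corresponding to one public/private key pair, giving $\binom{n}{k}$ sets in total (the binomial coefficient, that is, the number of ways to choose k items out of n). Then 1-out-of-$\binom{n}{k}$ oblivious transfer is used to transmit to the client the records (or the values of the field of interest in them) corresponding to all identifiers in the database, including the matching identifiers. The 1-out-of-$\binom{n}{k}$ oblivious transfer is implemented in the same way as the 1-out-of-n oblivious transfer above: the server encrypts every group of k records in the data table (or the values of the field of interest in them) with $\binom{n}{k}$ different keys to obtain $\binom{n}{k}$ ciphertexts and sends these $\binom{n}{k}$ ciphertexts to the client; the client uses the key corresponding to the matching identifiers to decrypt the one ciphertext, among the $\binom{n}{k}$, that corresponds to the matching identifiers. The details are similar to S231-S235 above and are not repeated here.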
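To give a feel for how quickly the number of k-record sets grows, the following short sketch enumerates them for the 10 records of Table 1 with k=2. The variable names are illustrative; `math.comb` and `itertools.combinations` are standard-library functions.

```python
import math
from itertools import combinations

ids = [f"id_{i}" for i in range(10)]   # the 10 record IDs of Table 1
k = 2

subsets = list(combinations(ids, k))   # every set of k records
num_keys = math.comb(len(ids), k)      # number of key pairs the server would need
print(num_keys)                        # 45
```

Even for this small table the server would need 45 key pairs instead of 10, and the count grows combinatorially with n and k, which is part of the motivation for the smaller obfuscation sets introduced next.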
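Note that the r above may also be the public key of an asymmetric key pair, so that after receiving the result encrypted with r, the client can decrypt it with its own private key. That is, in S234 and S235 the server asymmetrically encrypts record id_0 with r0, record id_1 with r1, ..., and record id_n-1 with r(n-1), and sends these n ciphertexts to the client; on receiving them, the client asymmetrically decrypts the ciphertext that corresponds to the wanted ID with its own private key to obtain the result. The same applies below and is not repeated.
On this basis, this specification gives the following implementation, which adds the construction of an obfuscation set.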
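S310: The client sends a sensitive field encrypted by the client itself to the server and, through interaction with the server, obtains the same sensitive field encrypted by the server.
For example, the client's search condition is that the Age field equals 25, and 25 is a sensitive field, that is, the client does not want the other side to learn it. To keep the server from learning that the search condition is the value 25 of the Age field, the client may encrypt 25, for example with an RSA/ECC private key, using the same encryption algorithm that the server used to generate the query base.
Specifically, when RSA private-key encryption is used, the client generates its own secret β and keeps it safe, then uses its private key β to encrypt 25. The client may encrypt either 25 itself or the hash of 25; encryption of the hash is described here, direct encryption of 25 being similar, with the client and the server using the same hash algorithm. For example, the client uses the same large prime q as the server as the modulus and RSA-encrypts the hash of 25 with β to obtain $(H(25))^{\beta}$. The sensitive field that the client sends to the server may then be $(H(25))^{\beta}$, which is the ciphertext of the sensitive value 25.
When ECC private-key encryption is used, the client uses the same elliptic curve as the server, that is, the same curve parameters and generator. The client generates its own secret β and keeps it safe, then uses its private key β to encrypt 25, specifically the hash of 25, with the client and the server using the same hash algorithm. For example, the client ECC-encrypts the hash of 25 with β to obtain β·H(25). The sensitive field that the client sends to the server may then be β·H(25), which is the ciphertext of the sensitive value 25.
Obtaining the same sensitive field encrypted by the server through interaction with the server may include: the server uses its own key to encrypt again the sensitive field already encrypted by the client and sends it to the client, and the client uses its own key to decrypt the doubly encrypted sensitive field, obtaining the sensitive field encrypted by the server alone. The core requirement is an encryption algorithm for which two successive encryptions (one by each party) can be decrypted in either order. By the cryptographic properties of ECC, the two parties agree on the same elliptic curve (the same curve parameters and generator) and hold private keys α and β respectively; encryption is scalar multiplication by α (or β), and whether α is applied first and then β, or β first and then α, the result can be decrypted in the same or a different order. Similarly, by the cryptographic properties of RSA, the two parties agree on the same large prime q and primitive root g and hold private keys α and β respectively; encryption is exponentiation by α (or β) modulo q, and the result can likewise be decrypted in either order. Overall, the encryption/decryption that the client and the server perform on the same target uses an algorithm whose order of operations is commutative.
Specifically, after receiving the sensitive field encrypted by the client, the server encrypts it again with the server's own key and returns it to the client. The client then uses its own key to decrypt the doubly encrypted sensitive field and obtains the sensitive field encrypted by the server.
For example, case 1: the server receives $(H(25))^{\beta}$ from the client.
The server may encrypt the encrypted sensitive field (that is, the privacy field) again and return the result to the client. Specifically, the server encrypts the privacy field $(H(25))^{\beta}$ again with its own RSA private key α to obtain $((H(25))^{\beta})^{\alpha}$.
For example, case 2: the server receives β·H(25) from the client.
The server may encrypt the privacy field again and return the result to the client. Specifically, the server encrypts the privacy field β·H(25) again with its own ECC private key α to obtain α·β·H(25).
After the server has encrypted the client-encrypted sensitive field (the privacy field) again with its own key and sent it to the client, the client can decrypt the doubly encrypted privacy field with its own key to obtain the sensitive field encrypted by the server.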
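For example, for case 1 above, the client receives $((H(25))^{\beta})^{\alpha}$ from the server. Exponentiation has the following property: $((H(25))^{\beta})^{\alpha} = (H(25))^{\beta\alpha} = (H(25))^{\alpha\beta} = ((H(25))^{\alpha})^{\beta}$. The client can therefore decrypt the doubly encrypted sensitive field with the inverse $\beta^{-1}$ of its own private key β, as follows:
$$\left(((H(25))^{\beta})^{\alpha}\right)^{\beta^{-1}} = (H(25))^{\alpha\beta\beta^{-1}} = (H(25))^{\alpha}$$
In this way the client obtains the same sensitive field encrypted by the server, namely $(H(25))^{\alpha}$.
For case 2 above, the client receives α·β·H(25) from the server. Scalar multiplication has the following property: α·β·H(25)=β·α·H(25). The client can therefore decrypt the doubly encrypted sensitive field with the inverse β^(-1) of its own private key β, as follows: β^(-1)·α·β·H(25)=β^(-1)·β·α·H(25)=α·H(25). In this way the client likewise obtains the same sensitive field encrypted by the server, namely α·H(25).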
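Note that in RSA, by Euler's theorem, pk·sk ≡ 1 mod (p-1)·(q-1), where p and q are two large primes, so pk and sk are mutual inverses; with the single prime modulus q used here, the inverse of β is taken modulo q-1. Similarly, in ECC a key pair satisfies pk=sk·G, where G is a generator of the chosen curve, and the inverse of β used above is the inverse of the scalar β modulo the order of G, so that β^(-1)·(β·P)=P for any point P generated by G.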
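S320: The client searches the query base using the server-encrypted sensitive field and obtains the identifier of the matching record.
After S310, the client holds the same sensitive field encrypted by the server.
The client can search the query base with this server-encrypted sensitive field. For example, after decryption the client obtains the server-encrypted privacy field α·H(25) or $(H(25))^{\alpha}$ and uses it to search the query base, in Table 2 or Table 3 respectively. It finds that the record whose Age contains this privacy field is the record with ID=id_1, where ID is the identifier of the record. In this way, the client searches the query base with the server-encrypted privacy field and, once a record matches, obtains the identifier of the matching record.
S330: The server uses oblivious transfer to return to the client the values of the field of interest in the records of the database that correspond to a set of identifiers of predetermined size that includes the matching identifier.
Because the sensitive field sent in S310 is encrypted by the client, the server cannot learn which record that field will hit; only the client knows. This protects the client's privacy. The retrieval must still be completed, however, which requires the server to return to the client the record the client wants.
The embodiment corresponding to Fig. 2 above gives an implementation of 1-out-of-n oblivious transfer; here, 1-out-of-m oblivious transfer can be used, where m<n. In S320, instead of returning the matched ID to the server on its own, the client may mix the matched ID with some other decoy IDs to form an obfuscation set and send the obfuscation set to the server. The server then cannot tell exactly which record in the obfuscation set the client is looking for, and it must be ensured that the client can obtain only the one record it is looking for and none of the others. In S320 the client may send the obfuscation set together with the search statement, for example select Name where ID=obfuscation set. Alternatively, the obfuscation set may be sent in S332 below; no limitation is imposed here.
In line with the embodiments above, suppose the server's data table holds n records in total, so the client's query base correspondingly holds n encrypted records. For convenience, the record IDs are labeled in order id_0, id_1, id_2, ..., id_n-1. A simple flow is as follows.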
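S331: The server generates n different public/private key pairs in advance and publishes the public keys.
The server generates n different public/private key pairs (pk-sk, where pk is the public key and sk is the secret key), for example pk_0-sk_0, pk_1-sk_1, pk_2-sk_2, ..., pk_n-1-sk_n-1, and publishes the n public keys pk_0, pk_1, pk_2, ..., pk_n-1. Once the server has published these n ordered public keys, the client can obtain them.
S332: The client generates an obfuscation set of size m that includes the wanted ID, generates a random number r, encrypts r with the public key corresponding to the wanted ID, and sends it to the server together with the obfuscation set.
Suppose the client wants the record id_1 but does not want the server to know this, so it generates an obfuscation set of size m. With m=4, the obfuscation set is, for example, {id_1,id_2,id_3,id_4}.
These 4 IDs correspond to public keys as follows, for example:
pk_1, id_1
pk_2, id_2
pk_3, id_3
pk_4, id_4
The client may encrypt r with pk_1 and send it to the server together with the obfuscation set. The client may also send the obfuscation set as part of the search statement, in which case it encrypts r with pk_1 and sends it together with the search statement containing the obfuscation set, for example:
select Name where ID={id_1,id_2,id_3,id_4}|Enc(r,pk_1)
Here "|" separates the search statement in front from the encrypted random number behind it; the same applies below.
S333: On receiving the obfuscation set and the encrypted r, the server decrypts the encrypted r with each of the m corresponding private keys.
The server decrypts the random number r encrypted under pk_1 with sk_1, sk_2, sk_3, and sk_4 in turn. For example, the server obtains r1 with sk_1, r2 with sk_2, r3 with sk_3, and r4 with sk_4.
Clearly only r1 equals r, because only sk_1 corresponds to the pk_1 used for encryption; the results r2, r3, and r4 obtained with sk_2, sk_3, and sk_4, which do not correspond to pk_1, differ from r. The server obtains decryption results that all look alike; it does not know what the real r is or which public key the client used. In other words, since the server does not know which public key the client used to encrypt r, it does not know which of the 4 results r1, r2, r3, and r4 is the real r.
In addition, from the obfuscation set {id_1,id_2,id_3,id_4} the server learns that the data the client wants is one of the 4 IDs in the set, but it cannot tell which one, so the client's privacy is protected.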
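S334: The server symmetrically encrypts each record named in the obfuscation set with the decryption result of the same index and sends the ciphertexts to the client.
For example, the server symmetrically encrypts record id_1 with r1, record id_2 with r2, record id_3 with r3, and record id_4 with r4, and sends these 4 ciphertexts to the client.
S335: The client uses the random number r to symmetrically decrypt the ciphertext that corresponds to the ID it wants and obtains the search result.
The client uses the random number r to symmetrically decrypt the ciphertext that corresponds to the ID it wants. Specifically, suppose that in S332 the client wants the record (or the value of the field of interest in the record) for id_1, so it encrypts the random number r with the corresponding public key pk_1. In S333 the server decrypts the encrypted r with the corresponding private key sk_1 and obtains r1=r, whereas the results r2, r3, and r4 obtained with sk_2, sk_3, and sk_4, which do not correspond to pk_1, differ from r. In S334 the server symmetrically encrypts the records id_1, id_2, id_3, and id_4 (or the values of the field of interest in them) with r1, r2, r3, and r4 respectively and sends the 4 ciphertexts to the client. In S335 the client symmetrically decrypts the 4 ciphertexts with r. Of the 4 ciphertexts, only the one for id_1 was encrypted with r, so only decrypting that one with r yields the correct value. The client thereby obtains the correct search result, that is, the corresponding record or the value of the field of interest in it.
To reduce computation, the client may of course decrypt only the ciphertext that corresponds to the wanted ID, that is, decrypt only the ciphertext for id_1 with r to obtain the record for id_1 (or the value of the field of interest in it), without decrypting the ciphertexts for id_2, id_3, and id_4. The client knows these were not encrypted with r, so decrypting them with r would not yield correct results anyway.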
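The construction of the obfuscation set in S332 might look like the sketch below. It assumes, beyond what the text states, that decoys are drawn uniformly at random from the other real record IDs and that the set is shuffled so the position of the real ID carries no information; the function name `build_obfuscation_set` is hypothetical.

```python
import secrets


def build_obfuscation_set(real_id: str, all_ids: list, m: int) -> list:
    """Return m distinct IDs containing real_id, in random order."""
    decoys = [i for i in all_ids if i != real_id]
    if m < 1 or m - 1 > len(decoys):
        raise ValueError("m must be between 1 and the number of known IDs")
    rng = secrets.SystemRandom()          # OS-backed randomness
    chosen = rng.sample(decoys, m - 1) + [real_id]
    rng.shuffle(chosen)
    return chosen


all_ids = [f"id_{i}" for i in range(10)]  # IDs known from the query base
obfuscation_set = build_obfuscation_set("id_1", all_ids, 4)
```

With m=4 the server performs 4 private-key operations and sends 4 ciphertexts instead of n, at the price of narrowing the client's target down to one of 4 candidates; choosing m is therefore a trade-off between cost and privacy.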
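S331 to S335 above are only one exemplary implementation. In another implementation, the construction and transmission of the obfuscation set can be decoupled from the execution of the OT protocol, with the OT protocol used to transfer keys. Specifically, on the one hand, the client sends the obfuscation set of size m to the server, and the client of course knows which position in the set holds the identifier of the result it really wants. On the other hand, the server generates m symmetric keys, and through 1-out-of-m OT the client obtains the one it designates, that is, the symmetric key corresponding to the identifier of the result it really wants. The server then encrypts the records for the m identifiers in the client's obfuscation set with the corresponding symmetric keys and sends them to the client, and the client decrypts the result it really wants with the correct symmetric key. The server can generate the m symmetric keys in advance, so that key preparation is done in bulk before the OT interaction and does not take up OT protocol execution time.
The above covers the case where the client finds exactly one matching record in the query base: by 1-out-of-m oblivious transfer, the server transmits to the client the records (or the values of the field of interest in them) for the m identifiers named in the obfuscation set. There may also be more than one match, for example k records (k>1), in which case k-out-of-m oblivious transfer can be used. The core of k-out-of-m oblivious transfer is that the client constructs an obfuscation set of m subsets and sends it to the server, 1<m<n, where one of the m subsets contains the identifiers of the k matching records. For example, with m=4, k=2, and matching identifiers id_1 and id_3, the obfuscation set could be {{id_1,id_3},{id_2,id_4},{id_3,id_4},{id_2}}, in which the first subset is the one formed by the matching identifiers. The server then generates m symmetric keys and, through the 1-out-of-m OT protocol, lets the client obtain the symmetric key it designates; the server encrypts the m subsets of the obfuscation set with the m different symmetric keys to obtain m encrypted subsets and sends them to the client; the client decrypts the subset formed by the k matching identifiers with the symmetric key it obtained and thereby gets the correct result. The details are similar to the process above in which the construction and transmission of the obfuscation set are decoupled from the OT protocol and OT is used to transfer keys, and are not expanded on here.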
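The following describes a system for implementing private information retrieval in an embodiment of this specification. The system comprises a server and a client, and the encryption/decryption that the client and the server perform on the same target uses an encryption/decryption algorithm whose order of operations is commutative. The server is configured with a database, encrypts the database to obtain a query base, and sends the query base to the client. In a single retrieval: the client sends a sensitive field encrypted by the client itself to the server and, through interaction with the server, obtains the same sensitive field encrypted by the server; the client searches the query base using the server-encrypted sensitive field and obtains the identifier of the matching record; the server and the client use oblivious transfer to transmit to the client the records (or the values of the fields of interest in those records) corresponding to a set of identifiers of predetermined size that includes the matching identifier.
The following describes a server for implementing private information retrieval in an embodiment of this specification. The encryption/decryption that the server and a client perform on the same target uses an encryption/decryption algorithm whose order of operations is commutative. The server is configured with a database, encrypts the database to obtain a query base, and sends the query base to the client. In a single retrieval: the server receives the sensitive field that the client has encrypted itself, encrypts it again, and returns it to the client; the server further uses oblivious transfer with the client to transmit to the client the records (or the values of the fields of interest in those records) corresponding to a set of identifiers of predetermined size that includes the matching identifier.
The following describes a client for implementing private information retrieval in an embodiment of this specification. The encryption/decryption that the client and a server perform on the same target uses an encryption/decryption algorithm whose order of operations is commutative. The client is configured with a query base obtained by the server encrypting a database. In a single retrieval: the client sends a sensitive field encrypted by the client itself to the server and, through interaction with the server, obtains the same sensitive field encrypted by the server; the client further searches the query base using the server-encrypted sensitive field and obtains the identifier of the matching record; the client further uses oblivious transfer with the server to obtain the record (or the values of the fields of interest in the record) in the database that corresponds to the matching identifier.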
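In the 1990s, an improvement to a technology could be clearly classified as an improvement in hardware (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). With the development of technology, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented with hardware entity modules. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of making integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must be written in a specific programming language called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art will also understand that a hardware circuit implementing a logical method flow can readily be obtained simply by describing the method flow in one of these hardware description languages and programming it into an integrated circuit.
A controller can be implemented in any suitable way. For example, a controller can take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by that (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of a memory. Those skilled in the art also know that, besides implementing a controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the means included in it for implementing various functions can be regarded as structures within the hardware component. Or the means for implementing various functions can even be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, apparatuses, modules, or units described in the above embodiments can be implemented by a computer chip or entity, or by a product with a certain function. A typical implementing device is a server system. Of course, this specification does not exclude that, as computer technology develops, the computer implementing the functions of the above embodiments may be, for example, a personal computer, a laptop computer, an in-vehicle human-machine interaction device, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an e-mail device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.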
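Although one or more embodiments of this specification provide the method operation steps described in the embodiments or flowcharts, more or fewer steps may be included on the basis of conventional or non-creative means. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only one. When an actual apparatus or end product executes, the steps may be executed sequentially or in parallel according to the methods shown in the embodiments or drawings (for example, in a parallel-processor or multi-threaded environment, or even a distributed data processing environment). The terms "include", "comprise", and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, product, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, product, or device. Without further limitation, the presence of additional identical or equivalent elements in the process, method, product, or device that includes the stated elements is not excluded. Words such as first and second, if used, denote names and do not indicate any particular order.
For convenience of description, the above apparatus is described in terms of modules divided by function. Of course, when implementing one or more embodiments of this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware, or a module implementing a given function may be implemented by a combination of multiple submodules or subunits. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
This specification is described with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.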
在一个典型的配置中,计算设备包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。
内存可能包括计算机可读介质中的非永久性存储器,随机存取存储器(RAM)和/或非易失性内存等形式,如只读存储器(ROM)或闪存(flash RAM)。内存是计算机可读介质的示例。
计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁磁盘存储、石墨烯存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。按照本文中的界定,计算机可读介质不包括暂存电脑可读媒体(transitory media),如调制的数据信号和载波。
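Those skilled in the art should understand that one or more embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) containing computer-usable program code.
One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.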
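The embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are basically similar to the method embodiments, they are described relatively briefly, and the relevant parts of the method embodiments may be consulted. In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this specification. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.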
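The above are merely embodiments of one or more embodiments of this specification and are not intended to limit one or more embodiments of this specification. For those skilled in the art, various modifications and changes may be made to one or more embodiments of this specification. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of this specification shall fall within the scope of the claims.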

Claims (15)

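  1. A method for implementing privacy information retrieval, wherein a server encrypts a database to obtain a query base and sends the query base to a client; the encryption/decryption performed by the client and the server on the same target uses an encryption/decryption algorithm whose order of application is commutative;
    one retrieval process comprises:
    the client sends a sensitive field encrypted by the client itself to the server, and obtains, through interaction with the server, the same sensitive field encrypted by the server;
    the client searches the query base according to the sensitive field encrypted by the server to obtain the identifier of a matching record;
    the server and the client transfer to the client, by oblivious transfer, the records in the database corresponding to a set of identifiers of predetermined size that includes the matching identifier, or the values of a field of interest in those records.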
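  2. The method of claim 1, wherein the client obtaining, through interaction with the server, the same sensitive field encrypted by the server comprises:
    the server encrypts the client-encrypted sensitive field again with the server's own key and sends the result to the client, and the client decrypts the twice-encrypted sensitive field with the client's own key to obtain the sensitive field encrypted by the server.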
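  3. The method of claim 1, wherein, when the total number of records in the database is n and the number of matching records found by the client in the query base is 1, the server and the client transferring to the client, by oblivious transfer, the records in the database corresponding to identifiers of predetermined size including the matching identifier, or the values of a field of interest in those records, comprises:
    the server and the client transfer to the client, by 1-out-of-n oblivious transfer, the records in the database corresponding to all the identifiers, including the matching identifier, or the values of a field of interest in those records.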
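  4. The method of claim 3, wherein the server and the client transferring to the client, by 1-out-of-n oblivious transfer, the records in the database corresponding to all the identifiers, including the matching identifier, or the values of a field of interest in those records, comprises:
    the server encrypts the n records in the data table, or the values of a field of interest in those records, with n different keys, respectively, to obtain n encryption results, and sends the n encryption results to the client;
    the client uses the key corresponding to the matching identifier to decrypt, among the n encryption results sent by the server, the one encryption result corresponding to the matching identifier.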
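  5. The method of claim 1, wherein, when the total number of records in the database is n and the number of matching records found by the client in the query base is k, 1<k<n, the server and the client transferring to the client, by oblivious transfer, the records in the database corresponding to a set of identifiers of predetermined size that includes the matching identifiers, or the values of a field of interest in those records, comprises:
    the server encrypts every k records in the data table, or the values of a field of interest in those records, with [Figure PCTCN2022135408-appb-100001] different keys, respectively, to obtain [Figure PCTCN2022135408-appb-100002] encryption results, and sends the [Figure PCTCN2022135408-appb-100003] encryption results to the client; the client uses the key corresponding to the matching identifiers to decrypt, among the [Figure PCTCN2022135408-appb-100004] encryption results sent by the server, the one encryption result corresponding to the matching identifiers.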
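  6. The method of claim 1, wherein, when the total number of records in the database is n and the number of matching records found by the client in the query base is 1, the server and the client transferring to the client, by oblivious transfer, the records in the database corresponding to a set of identifiers of predetermined size that includes the matching identifier, or the values of a field of interest in those records, comprises:
    the client constructs an obfuscation set containing m identifiers and sends it to the server, 1<m<n, the obfuscation set of m identifiers containing the identifier of the matching record;
    the server and the client transfer to the client, by 1-out-of-m oblivious transfer, the records in the database corresponding to the m identifiers indicated by the obfuscation set, or the values of a field of interest in those records.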
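  7. The method of claim 6, wherein the server and the client transferring to the client, by 1-out-of-m oblivious transfer, the records in the database corresponding to the m identifiers indicated by the obfuscation set, or the values of a field of interest in those records, comprises:
    the server encrypts the m records in the data table indicated by the obfuscation set, or the values of a field of interest in those records, with m different keys, respectively, to obtain m encryption results, and sends the m encryption results to the client;
    the client uses the key corresponding to the matching identifier to decrypt, among the m encryption results sent by the server, the one encryption result corresponding to the matching identifier.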
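  8. The method of claim 1, wherein, when the total number of records in the database is n and the number of matching records found by the client in the query base is k, the server and the client transferring to the client, by oblivious transfer, the records in the database corresponding to a set of identifiers of predetermined size that includes the matching identifiers, or the values of a field of interest in those records, comprises:
    the client constructs an obfuscation set of size m and sends it to the server, 1<m<n, one subset among the m subsets of the obfuscation set containing the identifiers of the k matching records;
    the server encrypts the identifiers corresponding to the m subsets of the obfuscation set with m different keys, respectively, to obtain m encrypted-result subsets, and sends the m encrypted-result subsets to the client;
    the server generates m keys and, through a 1-out-of-m OT protocol, enables the client to obtain the key that the client designates; the server encrypts the m subsets of the obfuscation set with the m different keys, respectively, to obtain m encrypted-result subsets, and sends the m encrypted-result subsets to the client;
    the client uses the obtained key to decrypt the subset formed by the k matching identifiers, thereby obtaining the correct decryption result.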
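  9. A method for implementing privacy information retrieval, wherein a server encrypts a database to obtain a query base and sends the query base to a client; the encryption/decryption performed by the client and the server on the same target uses an encryption/decryption algorithm whose order of application is commutative;
    one retrieval process comprises:
    S1: the client sends a sensitive field encrypted by the client itself to the server, and obtains, through interaction with the server, the same sensitive field encrypted by the server;
    S2: the client searches the query base according to the sensitive field encrypted by the server to obtain the identifier of a matching record, and generates an obfuscation set of size m that includes the ID it wishes to obtain;
    S3: the server and the client transfer to the client, by oblivious transfer, the records in the database corresponding to the obfuscation set, or the values of a field of interest in those records.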
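  10. The method of claim 9, wherein, when the total number of records in the database is n, the number of matching records found by the client in the query base is 1, and m<n, the server and the client transferring to the client, by oblivious transfer, the records in the database corresponding to the obfuscation set, or the values of a field of interest in those records, comprises:
    the server and the client transfer to the client, by 1-out-of-m oblivious transfer, the records in the database corresponding to the m identifiers indicated by the obfuscation set, or the values of a field of interest in those records.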
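  11. The method of claim 10, wherein the server and the client transferring to the client, by 1-out-of-m oblivious transfer, the records in the database corresponding to the m identifiers indicated by the obfuscation set, or the values of a field of interest in those records, comprises:
    the server encrypts the m records in the data table indicated by the obfuscation set, or the values of a field of interest in those records, with m different keys, respectively, to obtain m encryption results, and sends the m encryption results to the client;
    the client uses the key corresponding to the matching identifier to decrypt, among the m encryption results sent by the server, the one encryption result corresponding to the matching identifier.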
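  12. The method of claim 9, wherein, when the total number of records in the database is n and the number of matching records found by the client in the query base is k, 1<m<n,
    one subset of the obfuscation set of size m that the client generates to include the IDs it wishes to obtain contains the identifiers of the k matching records;
    the server and the client transferring to the client, by oblivious transfer, the records corresponding to the obfuscation set, or the values of a field of interest in those records, comprises:
    the server encrypts the identifiers corresponding to the m subsets of the obfuscation set with m different keys, respectively, to obtain m encrypted-result subsets, and sends the m encrypted-result subsets to the client;
    the server generates m keys and, through a 1-out-of-m OT protocol, enables the client to obtain the key that the client designates; the server encrypts the m subsets of the obfuscation set with the m different keys, respectively, to obtain m encrypted-result subsets, and sends the m encrypted-result subsets to the client;
    the client uses the obtained key to decrypt the subset formed by the k matching identifiers, thereby obtaining the correct decryption result.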
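  13. A system for implementing privacy information retrieval, comprising a server and a client, wherein the encryption/decryption performed by the client and the server on the same target uses an encryption/decryption algorithm whose order of application is commutative, and:
    the server is configured with a database, encrypts the database to obtain a query base, and sends the query base to the client;
    in one retrieval process:
    the client sends a sensitive field encrypted by the client itself to the server, and obtains, through interaction with the server, the same sensitive field encrypted by the server;
    the client searches the query base according to the sensitive field encrypted by the server to obtain the identifier of a matching record;
    the server and the client transfer to the client, by oblivious transfer, the records in the database corresponding to a set of identifiers of predetermined size that includes the matching identifier, or the values of a field of interest in those records.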
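  14. A server for implementing privacy information retrieval, wherein the encryption/decryption performed by the server and a client on the same target uses an encryption/decryption algorithm whose order of application is commutative, and:
    the server is configured with a database, encrypts the database to obtain a query base, and sends the query base to the client;
    in one retrieval process:
    the server receives the sensitive field that the client has encrypted itself and sent, encrypts it again, and returns it to the client; the server further transfers to the client, by oblivious transfer between the server and the client, the records in the database corresponding to a set of identifiers of predetermined size that includes the matching identifier, or the values of a field of interest in those records.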
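  15. A client for implementing privacy information retrieval, wherein the encryption/decryption performed by the client and a server on the same target uses an encryption/decryption algorithm whose order of application is commutative, and:
    the client is configured with a query base, the query base being obtained by the server encrypting a database;
    in one retrieval process:
    the client sends a sensitive field encrypted by the client itself to the server, and obtains, through interaction with the server, the same sensitive field encrypted by the server; the client further searches the query base according to the sensitive field encrypted by the server to obtain the identifier of a matching record; the client further obtains, by oblivious transfer between the client and the server, the records in the database corresponding to the matching identifier, or the values of a field of interest in those records.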
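
As an informal illustration only (not part of the claims), the exchange recited in claims 1, 2, and 6 can be sketched in Python. The sketch assumes a Pohlig-Hellman/SRA-style commutative cipher (modular exponentiation over a shared prime), which is one possible choice; the claims only require that encryption and decryption commute. The names `keygen`, `to_group`, `build_query_base`, `lookup`, and `obfuscation_set`, the toy modulus `P`, and the sample database are all hypothetical, and the oblivious-transfer step itself is omitted.

```python
import hashlib
import math
import secrets

# Assumed toy public modulus (the Mersenne prime 2**127 - 1); far too small for real use.
P = 2**127 - 1


def keygen() -> int:
    """Return a secret exponent that is invertible modulo P - 1."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def to_group(field: str) -> int:
    """Hash a sensitive field to an integer in [1, P - 1]."""
    digest = hashlib.sha256(field.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def encrypt(x: int, key: int) -> int:
    """Commutative encryption: exponentiation modulo P."""
    return pow(x, key, P)


def decrypt(y: int, key: int) -> int:
    """Remove one encryption layer using the inverse exponent modulo P - 1."""
    return pow(y, pow(key, -1, P - 1), P)


def build_query_base(database: dict, server_key: int) -> dict:
    """Server side: map each server-encrypted sensitive field to its record identifier."""
    return {encrypt(to_group(field), server_key): rid
            for rid, (field, _value) in database.items()}


def lookup(query: str, client_key: int, server_key: int, query_base: dict):
    """One retrieval round; server_key is used only in the step the server would run."""
    blinded = encrypt(to_group(query), client_key)   # client -> server
    double = encrypt(blinded, server_key)            # server -> client
    server_only = decrypt(double, client_key)        # client strips its own layer
    return query_base.get(server_only)               # matching identifier, or None


def obfuscation_set(match_id, all_ids, m: int) -> list:
    """Client side: hide match_id among m - 1 randomly chosen decoy identifiers."""
    rng = secrets.SystemRandom()
    decoys = rng.sample([i for i in all_ids if i != match_id], m - 1)
    ids = decoys + [match_id]
    rng.shuffle(ids)
    return ids


# Hypothetical database: identifier -> (sensitive field, field of interest).
database = {1: ("alice@example.com", "A"), 2: ("bob@example.com", "B"),
            3: ("carol@example.com", "C"), 4: ("dave@example.com", "D")}
ks, kc = keygen(), keygen()
query_base = build_query_base(database, ks)
match = lookup("carol@example.com", kc, ks, query_base)
candidates = obfuscation_set(match, database.keys(), m=3)
```

In this sketch the server only ever sees the client-blinded value, and the client only learns the server-encrypted form of its own query; the identifiers in `candidates` would then be the input to a 1-out-of-m oblivious transfer as described in claim 6.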
PCT/CN2022/135408 2022-09-30 2022-11-30 Implementing privacy information retrieval WO2024066015A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211216293.8A CN115664722A (zh) 2022-09-30 2022-09-30 Method, system, server, and client for implementing privacy information retrieval
CN202211216293.8 2022-09-30

Publications (1)

Publication Number Publication Date
WO2024066015A1 true WO2024066015A1 (zh) 2024-04-04

Family

ID=84986472

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/135408 WO2024066015A1 (zh) 2022-09-30 2022-11-30 Implementing privacy information retrieval

Country Status (2)

Country Link
CN (1) CN115664722A (zh)
WO (1) WO2024066015A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070294539A1 (en) * 2006-01-27 2007-12-20 Imperva, Inc. Method and system for transparently encrypting sensitive information
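CN111062052A (zh) * 2019-12-09 2020-04-24 支付宝(杭州)信息技术有限公司 Data query method and system
CN112583809A (zh) * 2020-12-09 2021-03-30 北京国研数通软件技术有限公司 Non-intrusive data encryption and decryption method using multiple encryption algorithms
CN114036565A (zh) * 2021-11-19 2022-02-11 上海勃池信息技术有限公司 Privacy information retrieval system and privacy information retrieval method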
US20220100884A1 (en) * 2020-09-29 2022-03-31 The Johns Hopkins University Term-Based Encrypted Retrieval Privacy

Also Published As

Publication number Publication date
CN115664722A (zh) 2023-01-31


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22960611

Country of ref document: EP

Kind code of ref document: A1