CN111741020B - Public data set determination method, device and system based on data privacy protection - Google Patents

Public data set determination method, device and system based on data privacy protection

Info

Publication number
CN111741020B
Authority
CN
China
Prior art keywords
data
encrypted
owner
sequence
data sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010759417.1A
Other languages
Chinese (zh)
Other versions
CN111741020A (en)
Inventor
李漓春
赵原
孙勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010759417.1A
Publication of CN111741020A
Application granted
Publication of CN111741020B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L63/0442 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply asymmetric encryption, i.e. different keys for encryption and decryption
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

Embodiments of the present specification provide methods and apparatus for determining a common data set for two data owners. Each data owner encrypts its own data set using its own key and sends the resulting data set encryption result to the other data owner. The second data owner secondarily encrypts, using its own key, the data set encryption result received from the first data owner, scrambles the order of the secondary encryption result, and sends the scrambled secondary encryption result to the first data owner. The first data owner decrypts the scrambled secondary encryption result using its own key, and determines intersection information from the decrypted result and the data set encryption result received from the second data owner. The second data owner determines plaintext data of the common data set of the first and second data owners based on the intersection information.

Description

Public data set determination method, device and system based on data privacy protection
Technical Field
The embodiments of the present specification generally relate to the field of artificial intelligence, and in particular, to a public data set determination method, apparatus, and system based on data privacy protection.
Background
With the development of artificial intelligence technology, business models such as machine learning models are increasingly applied to business scenarios such as risk assessment, speech recognition, and natural language processing. To achieve better service results, business processing often requires jointly processing the local business data of multiple data owners. In some cases, it is necessary to determine a common data set between data owners and use the determined common data set for subsequent business processing. For example, in the case of joint marketing by two business parties, the two business parties need to determine common customer data and then use the determined common customer data for joint marketing. However, apart from the public data set, the remaining business data of each data owner is that data owner's private data and cannot be revealed to other data owners. How to determine the public data set while keeping each data owner's private data secure has therefore become a problem that urgently needs to be solved.
Disclosure of Invention
In view of the foregoing, the present specification embodiments provide a method, apparatus and system for determining a common data set for first and second data owners. With the method, apparatus and system, each of the two data owners encrypts its own data set using its own key, and the encrypted data sets are shared between the two data owners. At the second data owner, the encrypted data set received from the first data owner is secondarily encrypted using the second data owner's key, and the results of the secondary encryption are scrambled and then returned to the first data owner. The first data owner decrypts the scrambled result using its own key, determines intersection information between the decrypted result and the encrypted data set received from the second data owner, and returns the intersection information to the second data owner. The second data owner determines plaintext data of the public data set using the intersection information. Under this public data set determination scheme, only encrypted messages are exchanged between the two data owners, so that the private data owned by each data owner can be prevented from being leaked.
According to an aspect of embodiments herein, there is provided a method for determining a common data set for first and second data owners, the first data owner having a first data set and a first key, the second data owner having a second data set and a second key, the method comprising: encrypting, at the first data owner and the second data owner respectively, the first data set using the first key and the second data set using the second key, to obtain a first encrypted data sequence and a second encrypted data sequence; sending, by the first data owner, the first encrypted data sequence to the second data owner, and sending, by the second data owner, a second encryption result to the first data owner, wherein the second encryption result comprises the second encrypted data sequence or a variant of the second encrypted data sequence; at the second data owner, encrypting the first encrypted data sequence using the second key, scrambling the order of the obtained encryption result, and sending the scrambled encryption result to the first data owner; at the first data owner, decrypting the scrambled encryption result using the first key to obtain a third encrypted data sequence, determining intersection information of the third encrypted data sequence and the second encrypted data sequence according to the third encrypted data sequence and the second encryption result, and sending the intersection information to the second data owner; and determining, at the second data owner, plaintext data of a common data set of the first and second data owners based on the intersection information.
Optionally, in one example of the above aspect, the first data set is a small set data set and the second data set is a large set data set.
Optionally, in one example of the above aspect, the variant of the second encrypted data sequence includes: a first hash value set of the ciphertext data elements in the second encrypted data sequence, or a first Bloom filter constructed from the ciphertext data elements in the second encrypted data sequence.
Optionally, in an example of the above aspect, the intersection information includes: intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence; a second hash value set of intersection elements of the third encrypted data sequence and the second encrypted data sequence, the intersection elements being queried in the first hash value set by using element hash values of the third encrypted data sequence; or a second bloom filter constructed by using the intersection element of the third encrypted data sequence and the second encrypted data sequence, wherein the intersection element is matched from the third encrypted data sequence by using the first bloom filter.
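The hash-value-set form of the intersection information can be sketched as follows. This is a minimal illustration only: the helper names `element_hash`, `first_hash_value_set`, and `intersection_hash_set`, the choice of SHA-256, and the small integers standing in for ciphertext data elements are all assumptions and are not taken from this specification.

```python
import hashlib

def element_hash(ciphertext: int) -> str:
    """Hash one ciphertext data element (SHA-256 is an assumed choice)."""
    return hashlib.sha256(str(ciphertext).encode("ascii")).hexdigest()

def first_hash_value_set(second_encrypted_sequence):
    """Built by the second data owner and sent instead of the raw sequence."""
    return {element_hash(c) for c in second_encrypted_sequence}

def intersection_hash_set(third_encrypted_sequence, first_hashes):
    """Built by the first data owner: hashes of third-sequence elements
    that are found in the first hash value set."""
    return {
        h
        for h in (element_hash(c) for c in third_encrypted_sequence)
        if h in first_hashes
    }

# Stand-in ciphertexts (hypothetical values, not real encryption output).
Y1 = [101, 202, 303, 404]   # second encrypted data sequence
X3 = [303, 909, 101]        # third encrypted data sequence

first_hashes = first_hash_value_set(Y1)
second_hashes = intersection_hash_set(X3, first_hashes)
```

Here `second_hashes` plays the role of the second hash value set that would be returned to the second data owner.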
Optionally, in one example of the above aspect, determining, at the second data owner, plaintext data of a common data set of the first and second data owners from the intersection information comprises: decrypting the intersection ciphertext information using the second key to obtain plaintext data of the public data set of the first data owner and the second data owner; finding, in the second data set, the data elements whose corresponding ciphertext data elements in the second encrypted data sequence appear in the intersection ciphertext information, to obtain plaintext data of the public data set of the first data owner and the second data owner; finding, in the second data set, the data elements for which the hash values of the corresponding ciphertext data elements in the second encrypted data sequence appear in the second hash value set, to obtain plaintext data of the public data set of the first data owner and the second data owner; or finding, in the second data set, the data elements whose corresponding ciphertext data elements in the second encrypted data sequence match the second Bloom filter, to obtain plaintext data of the public data set of the first data owner and the second data owner.
Optionally, in one example of the above aspect, the encryption processes at the first and second data owners are implemented using interchangeable (that is, commutative) deterministic encryption algorithms.
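The lookup-based options for recovering plaintext at the second data owner can be sketched as below. The sketch assumes the second data owner keeps its second data set aligned index by index with its own second encrypted data sequence; the function names, the SHA-256 hash, and the stand-in ciphertext values are hypothetical.

```python
import hashlib

def element_hash(ciphertext: int) -> str:
    """Hash one ciphertext data element (SHA-256 is an assumed choice)."""
    return hashlib.sha256(str(ciphertext).encode("ascii")).hexdigest()

def recover_from_ciphertexts(second_data_set, second_encrypted_sequence,
                             intersection_ciphertexts):
    """Keep plaintext elements whose own ciphertext is in the intersection."""
    wanted = set(intersection_ciphertexts)
    return [
        y for y, c in zip(second_data_set, second_encrypted_sequence)
        if c in wanted
    ]

def recover_from_hashes(second_data_set, second_encrypted_sequence,
                        second_hash_value_set):
    """Keep plaintext elements whose ciphertext hash is in the second hash set."""
    return [
        y for y, c in zip(second_data_set, second_encrypted_sequence)
        if element_hash(c) in second_hash_value_set
    ]

# Stand-in data: plaintexts and their (hypothetical) ciphertexts, index-aligned.
second_data_set = ["alice", "bob", "carol"]
Y1 = [11, 22, 33]

by_ciphertext = recover_from_ciphertexts(second_data_set, Y1, [33, 22])
by_hash = recover_from_hashes(
    second_data_set, Y1, {element_hash(22), element_hash(33)}
)
```

Both calls select the same elements here; only the form of the intersection information differs.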
Optionally, in one example of the above aspect, the interchangeable deterministic encryption algorithm comprises a DH algorithm or an RSA algorithm.
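To illustrate the property that the protocol relies on, the following minimal Python sketch uses DH-style modular exponentiation as the deterministic encryption. The modulus `P`, the helper names `keygen`, `hash_to_group`, `encrypt`, and `decrypt`, and the hash-based encoding of data elements are assumptions made for illustration and do not come from this specification. The parameters are toy values and are not secure. The modular inverse via `pow(key, -1, n)` requires Python 3.8 or later.

```python
import hashlib
import math
import secrets

# Toy modulus (assumption): the Mersenne prime 2**127 - 1. Not a secure choice.
P = 2**127 - 1

def keygen() -> int:
    """Pick an exponent invertible modulo P - 1, so that decryption exists."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k

def hash_to_group(item: str) -> int:
    """Map a data element to an integer in [1, P - 1] (hypothetical encoding)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def encrypt(key: int, value: int) -> int:
    """Deterministic encryption: raise the value to the key, modulo P."""
    return pow(value, key, P)

def decrypt(key: int, value: int) -> int:
    """Undo one encryption layer using the inverse exponent modulo P - 1."""
    return pow(value, pow(key, -1, P - 1), P)

k1, k2 = keygen(), keygen()
m = hash_to_group("customer-42")

# Deterministic: the same input and key always give the same ciphertext.
is_deterministic = encrypt(k1, m) == encrypt(k1, m)
# Commutative: the order of the two encryption layers does not matter.
is_commutative = encrypt(k2, encrypt(k1, m)) == encrypt(k1, encrypt(k2, m))
# Removing the K1 layer from a doubly encrypted value leaves the K2 ciphertext.
strips_to_k2 = decrypt(k1, encrypt(k2, encrypt(k1, m))) == encrypt(k2, m)
```

The last check is what allows the first data owner to compare its decrypted values directly against ciphertexts produced with the second key alone.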
Optionally, in one example of the above aspect, the processes at the first and second data owners are performed in parallel.
Optionally, in an example of the above aspect, the method may further include: a first key and a second key are generated at the first and second data owners, respectively.
According to another aspect of embodiments of the present specification, there is provided a method for determining a common data set for first and second data owners, the first data owner having a first data set and a first key, the second data owner having a second data set and a second key, the method applied to the first data owner, the method comprising: encrypting the first data set using the first key to obtain a first encrypted data sequence; sending the first encrypted data sequence to a second data owner, and receiving a second encryption result from the second data owner, the second encryption result comprising a second encrypted data sequence or a variant of the second encrypted data sequence, the second encrypted data sequence being obtained by the second data owner encrypting a second data set using a second key; receiving an out-of-order encrypted result from a second data owner, the out-of-order encrypted result being obtained by out-of-order processing, at the second data owner, a first encrypted data sequence encrypted using a second key; decrypting the scrambled encryption result by using a first key to obtain a third encryption data sequence; and determining intersection information of the third encrypted data sequence and the second encrypted data sequence according to the third encrypted data sequence and the second encryption result, and sending the intersection information to the second data owner, wherein the intersection information is used by the second data owner to determine plaintext data of a public data set of the first data owner and the second data owner.
According to another aspect of embodiments of the present specification, there is provided a method for determining a common data set for first and second data owners, the first data owner having a first data set and a first key, the second data owner having a second data set and a second key, the method applied to the second data owner, the method comprising: encrypting the second data set by using a second key to obtain a second encrypted data sequence, and sending a second encryption result to the first data owner, wherein the second encryption result comprises the second encrypted data sequence or a variant of the second encrypted data sequence; receiving a first encrypted data sequence from a first data owner, the first encrypted data sequence being obtained by encrypting a first data set using a first key by the first data owner; encrypting the first encrypted data sequence by using a second key, disordering the obtained encryption result, and sending the disordering encryption result to a first data owner; receiving intersection information of a third encrypted data sequence and a second encrypted data sequence from a first data owner, wherein the intersection information is determined by the first data owner according to the third encrypted data sequence and a second encrypted result, and the third encrypted data sequence is obtained by decrypting the scrambled encrypted result by the first data owner by using a first secret key; and determining plaintext data of the public data sets of the first and second data owners according to the intersection information.
According to another aspect of embodiments of the present specification, there is provided an apparatus for determining a common data set for first and second data owners, the first data owner having a first data set and a first key, the second data owner having a second data set and a second key, the apparatus applied to the first data owner, the apparatus comprising: a data encryption unit that encrypts the first data set using the first key to obtain a first encrypted data sequence; an encrypted data sharing unit that transmits the first encrypted data sequence to the second data owner, and receives a second encryption result from the second data owner, the second encryption result including the second encrypted data sequence or a variant of the second encrypted data sequence, the second encrypted data sequence being obtained by the second data owner encrypting the second data set using the second key; an out-of-order result acquisition unit that receives an out-of-order encrypted result from the second data owner, the out-of-order encrypted result being obtained by out-of-order processing, at the second data owner, of the first encrypted data sequence after encryption using the second key; a data decryption unit that decrypts the out-of-order encrypted result using the first key to obtain a third encrypted data sequence; an intersection information determining unit that determines intersection information of the third encrypted data sequence and the second encrypted data sequence according to the third encrypted data sequence and the second encryption result; and an intersection information sending unit that sends the intersection information to the second data owner, the intersection information being used by the second data owner to determine plaintext data of a common data set of the first and second data owners.
Optionally, in one example of the above aspect, the variant of the second encrypted data sequence includes: a first hash value set of the ciphertext data elements in the second encrypted data sequence, or a first Bloom filter constructed from the ciphertext data elements in the second encrypted data sequence.
Optionally, in one example of the above aspect, the intersection information determination unit: determining intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence as the intersection information; using the element hash values of the third encrypted data sequence to query an intersection element of the third encrypted data sequence and the second encrypted data sequence in the first hash value set, and determining a second hash value set of the intersection element as the intersection information; or matching intersection elements of the third encrypted data sequence and the second encrypted data sequence from the third encrypted data sequence by using the first bloom filter, and determining that the second bloom filter constructed by using the matched intersection elements is the intersection information.
Optionally, in an example of the above aspect, the apparatus may further include: a key generation unit that generates the first key.
According to another aspect of embodiments of the present specification, there is provided an apparatus for determining a common data set for first and second data owners, the first data owner having a first data set and a first key, the second data owner having a second data set and a second key, the apparatus being applied to the second data owner, the apparatus comprising: a first data encryption unit that encrypts the second data set using the second key to obtain a second encrypted data sequence; an encrypted data sharing unit that transmits a second encryption result to the first data owner, the second encryption result including the second encrypted data sequence or a variant of the second encrypted data sequence, and receives the first encrypted data sequence from the first data owner, the first encrypted data sequence being obtained by the first data owner encrypting the first data set using the first key; a second data encryption unit that encrypts the first encrypted data sequence using the second key; an out-of-order processing unit that performs out-of-order processing on the first encrypted data sequence after encryption using the second key, to obtain an out-of-order encrypted result; an out-of-order result sending unit that sends the out-of-order encrypted result to the first data owner; an intersection information acquisition unit that receives intersection information of a third encrypted data sequence and the second encrypted data sequence from the first data owner, the intersection information being determined by the first data owner according to the third encrypted data sequence and the second encryption result, and the third encrypted data sequence being obtained by the first data owner decrypting the out-of-order encrypted result using the first key; and a plaintext data determination unit that determines plaintext data of a common data set of the first and second data owners based on the intersection information.
Optionally, in one example of the above aspect, the plaintext data determination unit: decrypts the intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence using the second key to obtain plaintext data of the public data set of the first data owner and the second data owner; finds, in the second data set, the data elements whose corresponding ciphertext data elements in the second encrypted data sequence appear in the intersection ciphertext information, to obtain plaintext data of the public data set of the first data owner and the second data owner; finds, in the second data set, the data elements for which the hash values of the corresponding ciphertext data elements in the second encrypted data sequence appear in a second hash value set of intersection elements of the third encrypted data sequence and the second encrypted data sequence, to obtain plaintext data of the public data set of the first data owner and the second data owner, wherein the second hash value set is obtained by querying the first hash value set using the element hash values of the third encrypted data sequence, and the first hash value set is formed by the hash values of the ciphertext data elements in the second encrypted data sequence; or finds, in the second data set, the data elements whose corresponding ciphertext data elements in the second encrypted data sequence match a second Bloom filter, to obtain plaintext data of the public data set of the first data owner and the second data owner, wherein the second Bloom filter is constructed from the intersection elements of the third encrypted data sequence and the second encrypted data sequence, and the intersection elements are matched from the third encrypted data sequence using the first Bloom filter constructed from the ciphertext data elements of the second encrypted data sequence.
According to another aspect of embodiments herein, there is provided a system for determining a common data set for first and second data owners, comprising: a first data owner having a first data set and a first key and comprising an apparatus as described above; and a second data owner having a second data set and a second key and comprising the apparatus as described above.
According to another aspect of embodiments of the present specification, there is provided an electronic apparatus including: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a method performed on the first or second data owner side as described above.
According to another aspect of embodiments of the present specification, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method performed on the first or second data owner side as described above.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.
FIG. 1 illustrates an example schematic diagram of a common data set determination system in accordance with embodiments of the present description.
FIG. 2 illustrates an example flow diagram of a method for determining a common data set for first and second data owners in accordance with an embodiment of the present description.
FIG. 3 shows a block diagram of a common data set determination apparatus on the first data owner side according to an embodiment of the present specification.
FIG. 4 shows a block diagram of a common data set determination apparatus on the second data owner side according to an embodiment of the present specification.
FIG. 5 shows a schematic diagram of an electronic device for implementing a common data set determination process on the first data owner side according to an embodiment of the present description.
FIG. 6 shows a schematic diagram of an electronic device for implementing a common data set determination process on the second data owner side according to an embodiment of the present description.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and are not intended to limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "include" and its variants are open-ended terms meaning "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same object. Other definitions, whether explicit or implicit, may be included below. The definition of a term is consistent throughout the specification unless the context clearly dictates otherwise.
In this specification, the terms "business service provider" and "data owner" are used interchangeably. The terms "first data owner" and "first data owner device" may be used interchangeably. The terms "second data owner" and "second data owner device" may be used interchangeably.
In some application scenarios where two service providers jointly provide a service, it is necessary for the two service providers to determine a common data set between the service data sets that they have, and to use the determined common data set for subsequent service processing. For example, in the case of joint marketing by two business parties, the two business parties need to determine common customer data and then use the determined common customer data for joint marketing. However, the remaining business data of each data owner belongs to the private data of each data owner, except for the public data set, and cannot be revealed to other data owners.
In view of the foregoing, embodiments of the present specification propose a public data set determination method, apparatus, and system based on data privacy protection. With the method, apparatus and system, each of the two data owners encrypts its own data set using its own key, and the encrypted data sets are shared between the two data owners. At the second data owner, the encrypted data set received from the first data owner is secondarily encrypted using the second data owner's key, and the results of the secondary encryption are scrambled and then returned to the first data owner. The first data owner decrypts the scrambled result using its own key, determines intersection information between the decrypted result and the encrypted data set received from the second data owner, and returns the intersection information to the second data owner. The second data owner determines plaintext data of the public data set using the intersection information. Under this public data set determination scheme, only encrypted messages are exchanged between the two data owners, so that the private data owned by each data owner can be prevented from being leaked.
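The overall message flow summarized above can be sketched end to end in Python. This is a minimal sketch under stated assumptions, not the implementation described in this specification: both data owners are simulated inside one function, the encryption is a toy modular exponentiation over the assumed modulus `P`, and the names `keygen`, `hash_to_group`, `encrypt`, `decrypt`, and `find_common` are hypothetical. The intersection information is taken in its simplest form, the intersection ciphertexts.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # toy prime modulus (assumption); not secure

def keygen() -> int:
    """Pick an exponent invertible modulo P - 1."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k

def hash_to_group(item: str) -> int:
    """Hypothetical encoding of a data element as an integer in [1, P - 1]."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def encrypt(key: int, value: int) -> int:
    return pow(value, key, P)

def decrypt(key: int, value: int) -> int:
    return pow(value, pow(key, -1, P - 1), P)

def find_common(first_data_set, second_data_set):
    """Simulate both data owners in one process and return the common elements."""
    k1, k2 = keygen(), keygen()

    # Each owner encrypts its own data set with its own key.
    x1 = [encrypt(k1, hash_to_group(x)) for x in first_data_set]
    y1 = [encrypt(k2, hash_to_group(y)) for y in second_data_set]

    # Second owner: encrypt the received sequence again, then shuffle it.
    x2_shuffled = [encrypt(k2, c) for c in x1]
    secrets.SystemRandom().shuffle(x2_shuffled)

    # First owner: remove its own layer and intersect with the second sequence.
    x3 = [decrypt(k1, c) for c in x2_shuffled]
    intersection_ciphertexts = set(x3) & set(y1)

    # Second owner: map intersection ciphertexts back to its own plaintexts.
    return sorted(
        y for y, c in zip(second_data_set, y1) if c in intersection_ciphertexts
    )

common = find_common(["alice", "bob", "carol"], ["bob", "dave", "carol", "erin"])
```

Because the second owner shuffles before returning the doubly encrypted values, the first owner cannot tell which of its own elements produced a given match. Only the second owner, which holds the index-aligned pair of its plaintexts and ciphertexts, maps the result back to plaintext.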
A public data set determination method, apparatus, and system based on data privacy protection according to embodiments of the present specification are described below with reference to the accompanying drawings.
FIG. 1 shows an example schematic diagram of a common data set determination system 100 in accordance with embodiments of the present description. As shown in FIG. 1, the public data set determination system 100 includes a first data owner 110 and a second data owner 120. The first data owner 110 has a first data set and the second data owner 120 has a second data set. The first data set may be local data collected locally by the first data owner 110 and the second data set may be local data collected locally by the second data owner 120.
In this specification, the first data owner 110 and the second data owner 120 may be business participants participating in business processes or data owners providing data for the business participants. For example, the first data owner 110 and the second data owner 120 may be private data storage servers or intelligent terminal devices of different financial or medical institutions.
In this description, the first data owner 110 and the second data owner 120 may be any suitable computing device with computing capabilities. The computing devices include, but are not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handheld devices, messaging devices, wearable computing devices, consumer electronics, and so forth.
The first data owner 110 has a public data set determination apparatus 111 and the second data owner 120 has a public data set determination apparatus 121. The public data set determination apparatus 111 in the first data owner 110 and the public data set determination apparatus 121 in the second data owner 120 may communicate with each other via a network 130 (including, but not limited to, the Internet or a local area network), whereby the public data set determination apparatus 111 cooperates with the public data set determination apparatus 121 to determine a public data set of the first data set and the second data set. In other embodiments of the present specification, the public data set determination apparatus 111 in the first data owner 110 and the public data set determination apparatus 121 in the second data owner 120 may also be directly communicatively connected to communicate with each other.
FIG. 2 illustrates an example flow diagram of a method 200 for determining a common data set for first and second data owners in accordance with an embodiment of the present description.
As shown in FIG. 2, at 210, a first key K1 is generated at the first data owner 110, and a second key K2 is generated at the second data owner 120.
At 220, the first data set $X = \{x_i\}$ is encrypted at the first data owner 110 using the first key K1 to obtain a first encrypted data sequence $X1 = \{E_{K1}(x_i)\}$, where i is equal to 1 to n. At the second data owner 120, the second data set $Y = \{y_j\}$ is encrypted using the second key K2 to obtain a second encrypted data sequence $Y1 = \{E_{K2}(y_j)\}$, where j is equal to 1 to m.
At 230, the first data owner 110 sends the first encrypted data sequence X1 to the second data owner 120, and the second data owner 120 sends a second encryption result to the first data owner 110. Here, the second encryption result may be the second encrypted data sequence Y1 or a variant of the second encrypted data sequence Y1. In other words, the second data owner 120 may send either the second encrypted data sequence Y1 itself or a variant of it to the first data owner 110.
In this description, in one example, the variant of the second encrypted data sequence Y1 may include a first set of hash values of the ciphertext data elements $E_{K2}(y_j)$ of the second encrypted data sequence Y1. The first set of hash values consists of the hash values computed from the respective ciphertext data elements $E_{K2}(y_j)$. In another example, the variant of the second encrypted data sequence Y1 may include a first bloom filter constructed with each ciphertext data element of the second encrypted data sequence Y1.
At 240, the first encrypted data sequence X1 is encrypted using the second key K2 at the second data owner 120, resulting in an encrypted sequence $X2 = \{E_{K2}(E_{K1}(x_i))\}$. The obtained encrypted sequence X2 is then subjected to out-of-order processing (shuffling), thereby obtaining a scrambled encrypted sequence X2' (i.e., the scrambled encryption result). The ordering of the individual data elements in the scrambled encrypted sequence X2' is different from the ordering of the individual data elements in the first encrypted data sequence X1.
At 250, the second data owner 120 sends the scrambled encrypted sequence X2' to the first data owner 110.
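At 260, at the first data owner 110, decrypting the scrambled encrypted sequence X2' using the first key K1 results in a third encrypted data sequence X3. When the encryption is interchangeable, this decryption removes only the K1 layer, so each element of X3 remains encrypted under the second key K2 and can be compared with the elements of the second encrypted data sequence Y1. The ordering of the individual data elements in the third encrypted data sequence X3 is different from the ordering of the individual data elements in the first encrypted data sequence X1.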
At 270, intersection information of the third encrypted data sequence and the second encrypted data sequence is determined at the first data owner 110 based on the third encrypted data sequence X3 and the second encryption result.
Optionally, in one example, in a case where the second encryption result is the second encrypted data sequence Y1, the intersection information may be intersection ciphertext information of the third encrypted data sequence X3 and the second encrypted data sequence Y1. For example, for each ciphertext data element in the third encrypted data sequence X3, a query is made as to whether the same element exists in the second encrypted data sequence Y1. If so, the ciphertext data element is considered to belong to the common data set. In this manner, the intersection ciphertext information of the third encrypted data sequence X3 and the second encrypted data sequence Y1 is obtained as the intersection information.
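The element-by-element query described above might look like the following sketch. The function name `intersect_ciphertexts` and the small integer ciphertext values are hypothetical, chosen only for readability.

```python
def intersect_ciphertexts(x3, y1):
    """Return the elements of X3 that also occur in Y1, in X3 order."""
    y1_lookup = set(y1)  # set membership avoids rescanning Y1 for every query
    return [c for c in x3 if c in y1_lookup]


# Hypothetical ciphertext values; real ciphertexts would be large integers.
x3 = [907, 112, 455]
y1 = [455, 31, 907, 678]
print(intersect_ciphertexts(x3, y1))  # [907, 455]
```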
In another example, in the case where the second encryption result is the first set of hash values of the ciphertext data elements $E_{K2}(y_j)$ of the second encrypted data sequence Y1, the intersection information may be a second set of hash values of the intersection elements of the third encrypted data sequence and the second encrypted data sequence. Specifically, for each ciphertext data element in the third encrypted data sequence X3, its element hash value is calculated, and the calculated element hash value is then used to query whether the same hash value exists in the first hash value set. If so, the ciphertext data element is determined to belong to the public data set, and the element hash value corresponding to the ciphertext data element is added to a second hash value set. In this manner, the second hash value set of the intersection elements of the third encrypted data sequence X3 and the second encrypted data sequence Y1 is obtained as the intersection information.
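The hash-set variant can be sketched as below. The specification does not name a hash function, so SHA-256 over the decimal encoding of each ciphertext is an assumption, as are the function names and the sample ciphertext values.

```python
import hashlib


def element_hash(ciphertext: int) -> str:
    """Hash one ciphertext element (assumed encoding: decimal ASCII string)."""
    return hashlib.sha256(str(ciphertext).encode("ascii")).hexdigest()


def build_first_hash_set(y1):
    """Second owner: hash every element of Y1 before sending."""
    return {element_hash(c) for c in y1}


def build_second_hash_set(x3, first_hash_set):
    """First owner: keep the hashes of X3 elements found in the first hash set."""
    return {h for h in map(element_hash, x3) if h in first_hash_set}


# Hypothetical ciphertext values.
y1 = [455, 31, 907, 678]
x3 = [907, 112, 455]
first_hash_set = build_first_hash_set(y1)
second_hash_set = build_second_hash_set(x3, first_hash_set)
print(len(second_hash_set))  # 2
```

Sending hashes rather than full ciphertexts reduces the transferred data when the ciphertext elements are longer than the hash output.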
In another example, in the case where the second encryption result is the first bloom filter constructed using the respective ciphertext data elements of the second encrypted data sequence Y1, the first bloom filter is used to match, from the third encrypted data sequence X3, the intersection elements of the third encrypted data sequence X3 and the second encrypted data sequence Y1. The intersection information is then a second bloom filter constructed using the matched intersection elements.
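The bloom filter variant can be sketched as follows. The filter size, the number of hash functions, the SHA-256-based position derivation, and all names are assumptions made for illustration; the specification does not fix any of them.

```python
import hashlib


class BloomFilter:
    """Minimal bloom filter over integer ciphertexts (illustrative parameters)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: int):
        data = str(item).encode("ascii")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: int) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: int) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def build_filter(items) -> BloomFilter:
    bloom = BloomFilter()
    for item in items:
        bloom.add(item)
    return bloom


# Hypothetical ciphertext values.
y1 = [455, 31, 907, 678]
x3 = [907, 112, 455]
first_filter = build_filter(y1)                  # built by the second owner
matched = [c for c in x3 if c in first_filter]   # matching at the first owner
second_filter = build_filter(matched)            # returned as intersection info
```

A bloom filter never misses an inserted element but can report false positives, so `matched` may occasionally contain an element that is not in Y1; the filter size and hash count trade bandwidth against that error rate.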
At 280, the first data owner 110 sends the determined intersection information to the second data owner 120.
At 290, at the second data owner 120, plaintext data for a common data set between the first data set of the first data owner 110 and the second data set of the second data owner 120 is determined from the received intersection information.
Optionally, in one example, in the case that the intersection information is intersection ciphertext information of the third encrypted data sequence X3 and the second encrypted data sequence Y1, at the second data owner 120, the intersection ciphertext information is decrypted using the second key K2 to obtain plaintext data of the public data set of the first data owner 110 and the second data owner 120. Alternatively, in another example, at the second data owner 120, the data elements whose corresponding ciphertext data elements in the second encrypted data sequence Y1 appear in the intersection ciphertext information are found from the second data set Y, thereby obtaining the plaintext data of the common data set of the first data owner 110 and the second data owner 120.
In another example, in the case where the intersection information is the second hash value set of the intersection elements of the third encrypted data sequence X3 and the second encrypted data sequence Y1, at the second data owner 120, hash values of the respective ciphertext data elements of the second encrypted data sequence Y1 are calculated. The data elements whose corresponding ciphertext data elements in the second encrypted data sequence Y1 have hash values in the second hash value set are then found from the second data set Y, thereby obtaining plaintext data of a common data set of the first data owner 110 and the second data owner 120.
In another example, where the intersection information is a second bloom filter constructed using the intersection elements of the third encrypted data sequence X3 and the second encrypted data sequence Y1, at the second data owner 120, the second bloom filter is used to find, from the second data set Y, the data elements whose corresponding ciphertext data elements in the second encrypted data sequence Y1 match the second bloom filter, thereby obtaining plaintext data of the common data set of the first data owner 110 and the second data owner 120.
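The lookup-based recovery at the second data owner can be sketched for the ciphertext and hash-set forms of the intersection information (the bloom filter form follows the same pattern with a filter membership test). The sketch assumes that the second data owner keeps Y and Y1 in the same order, so that `y1[j]` is the ciphertext of `second_set[j]`; the function names, the SHA-256 hashing convention, and the sample values are hypothetical.

```python
import hashlib


def element_hash(ciphertext: int) -> str:
    """Assumed hashing convention; it must match the one used by the first owner."""
    return hashlib.sha256(str(ciphertext).encode("ascii")).hexdigest()


def recover_from_ciphertexts(second_set, y1, intersection_ciphertexts):
    """Return the plaintext items whose ciphertexts are in the intersection."""
    wanted = set(intersection_ciphertexts)
    return [y for y, c in zip(second_set, y1) if c in wanted]


def recover_from_hashes(second_set, y1, second_hash_set):
    """Return the plaintext items whose ciphertext hashes are in the second hash set."""
    return [y for y, c in zip(second_set, y1) if element_hash(c) in second_hash_set]


second_set = ["bob", "carol", "dave", "erin"]
y1 = [455, 31, 907, 678]  # hypothetical ciphertexts, aligned with second_set

print(recover_from_ciphertexts(second_set, y1, [907, 455]))
# ['bob', 'dave']
print(recover_from_hashes(second_set, y1, {element_hash(907), element_hash(455)}))
# ['bob', 'dave']
```

Recovery by lookup avoids a decryption per element, at the cost of retaining the pairing between Y and Y1 until the intersection information arrives.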
By using the public data set determining method, because the two data owners interact with each other through the encrypted ciphertext information, the private data owned by each data owner can be prevented from being leaked.
Further optionally, in one example of the above aspect, the encryption processes at the first and second data owners are implemented using an interchangeable (i.e., commutative) deterministic encryption algorithm. The term "deterministic encryption" means that encrypting the same plaintext always yields the same ciphertext. The term "interchangeable encryption" means that, in the case of double encryption using two different keys K1 and K2, the order in which the keys are used does not change the encryption result. In other words, for plaintext data X, the ciphertext obtained by encrypting first with key K1 and then with key K2 is the same as the ciphertext obtained by encrypting first with key K2 and then with key K1. In addition, the decryption order may be the same as or different from the encryption order. Examples of the interchangeable deterministic encryption algorithm may include, but are not limited to, the DH (Diffie-Hellman) algorithm or the RSA algorithm.
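The two properties can be checked with a small sketch. It assumes, for illustration only, a cipher based on modular exponentiation over a toy prime; the specification names DH and RSA merely as examples and does not prescribe this construction. The names `keygen`, `encrypt`, and `decrypt` are hypothetical.

```python
import math
import secrets

# Assumption: toy modulus (the Mersenne prime 2**127 - 1), not a secure choice.
P = 2**127 - 1


def keygen() -> int:
    """Pick a secret exponent that is invertible modulo P - 1."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def encrypt(m: int, k: int) -> int:
    return pow(m, k, P)


def decrypt(c: int, k: int) -> int:
    return pow(c, pow(k, -1, P - 1), P)


k1, k2 = keygen(), keygen()
m = 123456789  # plaintext already encoded as an integer in [1, P - 1]

# Deterministic: the same plaintext and key always give the same ciphertext.
deterministic = encrypt(m, k1) == encrypt(m, k1)

# Interchangeable: the order of the two encryption keys does not matter.
double = encrypt(encrypt(m, k1), k2)
interchangeable = double == encrypt(encrypt(m, k2), k1)

# The decryption order may match or differ from the encryption order.
same_order = decrypt(decrypt(double, k1), k2) == m
other_order = decrypt(decrypt(double, k2), k1) == m

print(deterministic, interchangeable, same_order, other_order)
# True True True True
```

Both properties follow from the identity (m^K1)^K2 = (m^K2)^K1 = m^(K1*K2) mod P, which is what lets the first data owner strip its own layer from X2' and still compare the result with Y1.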
Further optionally, in one example, the first data set is a small set data set (i.e., a data set with few elements of the data set), and the second data set is a large set data set (i.e., a data set with many elements of the data set). In this way, the public data set is determined and the recovery process of the plaintext data is performed at the data-owning side having the large-aggregate data set, so that the plaintext of the public data set is disclosed only to the large-aggregate data side and kept secret from the small-aggregate data side.
Further, optionally, in one example, the processes at the first data owner 110 and the second data owner 120 may be performed in parallel, so that the common data set determination time may be shortened, thereby improving the common data set determination efficiency.
Further optionally, in one example, the first data owner 110 may have the first key K1 in advance, and/or the second data owner 120 may have the second key K2 in advance, thereby eliminating the need to generate the first key K1 and/or the second key K2 in real-time during the public data set determination process.
Fig. 3 shows a block diagram of a common data set determination apparatus 300 on the first data-owner side according to an embodiment of the present specification. As shown in fig. 3, the common data set determining apparatus 300 includes a data encrypting unit 310, an encrypted data sharing unit 320, an out-of-order result acquiring unit 330, a data decrypting unit 340, an intersection information determining unit 350, and an intersection information sending unit 360.
The data encryption unit 310 is configured to encrypt the first data set using the first key resulting in a first encrypted data sequence. The operation of the data encryption unit 310 may refer to the operation of 220 described above with reference to fig. 2.
The encrypted data sharing unit 320 is configured to send the first sequence of encrypted data to the second data owner and receive a second encryption result from the second data owner, the second encryption result including the second sequence of encrypted data or a variant of the second sequence of encrypted data, the second sequence of encrypted data being obtained by the second data owner encrypting the second data set using the second key. The operation of the encrypted data sharing unit 320 may refer to the operation of 230 described above with reference to fig. 2.
The out-of-order result acquisition unit 330 is configured to receive a scrambled encryption result from the second data owner. The scrambled encryption result is obtained at the second data owner by performing out-of-order processing on the first encrypted data sequence after it has been encrypted using the second key. The operation of the out-of-order result acquisition unit 330 may refer to the operation of 250 described above with reference to fig. 2.
The data decryption unit 340 is configured to decrypt the scrambled encryption result using the first key to obtain a third encrypted data sequence. The operation of the data decryption unit 340 may refer to the operation of 260 described above with reference to fig. 2.
The intersection information determination unit 350 is configured to determine intersection information of the third encrypted data sequence and the second encrypted data sequence from the third encrypted data sequence and the second encryption result. The operation of the intersection information determination unit 350 may refer to the operation of 270 described above with reference to fig. 2.
The intersection information sending unit 360 is configured to send intersection information to the second data owner, which is used by the second data owner to determine plaintext data for a common data set of the first and second data owners. The operation of the intersection information sending unit 360 may refer to the operation of 280 described above with reference to fig. 2.
In one example, variations of the second encrypted data sequence include: the first hash value set of each ciphertext data element in the second encrypted data sequence, or a first bloom filter constructed by using each ciphertext data element in the second encrypted data sequence.
In another example, in a case where the second encryption result is the second encrypted data sequence, the intersection information determination unit 350 determines intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence as the intersection information.
In another example, in a case where the second encryption result is a first hash value set of each ciphertext data element in the second encrypted data sequence, the intersection information determination unit 350 uses the element hash values of the third encrypted data sequence to query the first hash value set for intersection elements of the third encrypted data sequence and the second encrypted data sequence, and determines a second hash value set of the queried intersection elements as the intersection information.
In another example, in a case where the second encryption result is the first bloom filter constructed using the respective ciphertext data elements of the second encrypted data sequence, the intersection information determination unit 350 uses the first bloom filter to match, from the third encrypted data sequence, the intersection elements of the third encrypted data sequence and the second encrypted data sequence, and determines a second bloom filter constructed using the matched intersection elements as the intersection information.
Further optionally, in one example, the public data set determining apparatus 300 may further include a key generating unit (not shown). The key generation unit is configured to generate a first key.
Fig. 4 shows a block diagram of a common data set determination apparatus 400 on the second data-owner side according to an embodiment of the present specification. As shown in fig. 4, the common data set determination apparatus 400 includes a first data encryption unit 410, an encrypted data sharing unit 420, a second data encryption unit 430, an out-of-order processing unit 440, an out-of-order result sending unit 450, an intersection information acquisition unit 460, and a plaintext data determination unit 470.
The first data encryption unit 410 is configured to encrypt the second data set using the second key resulting in a second encrypted data sequence. The operation of the first data encryption unit 410 may refer to the operation of 220 described above with reference to fig. 2.
The encrypted data sharing unit 420 is configured to send a second encryption result to the first data owner, the second encryption result including the second encrypted data sequence or a variant of the second encrypted data sequence, and receive the first encrypted data sequence from the first data owner, the first encrypted data sequence being obtained via the first data owner by encrypting the first data set using the first key. The operation of the encrypted data sharing unit 420 may refer to the operation of 230 described above with reference to fig. 2.
The second data encryption unit 430 is configured to encrypt the first encrypted data sequence using a second key. The out-of-order processing unit 440 is configured to perform out-of-order processing on the encrypted first encrypted data to obtain an out-of-order encrypted result. The operations of the second data encryption unit 430 and the out-of-order processing unit 440 may refer to the operations of 240 described above with reference to fig. 2.
The out-of-order result transmitting unit 450 is configured to transmit the scrambled encryption result to the first data owner. The operation of the out-of-order result transmitting unit 450 may refer to the operation of 250 described above with reference to fig. 2.
The intersection information obtaining unit 460 is configured to receive intersection information of a third encrypted data sequence and a second encrypted data sequence from the first data owner, the intersection information being determined by the first data owner from the third encrypted data sequence and the second encrypted result, the third encrypted data sequence being obtained by the first data owner decrypting the scrambled encrypted result using the first key. The operation of the intersection information acquisition unit 460 may refer to the operation of 280 described above with reference to fig. 2.
The plaintext data determination unit 470 is configured to determine plaintext data of a common data set of the first data-owner and the second data-owner based on the intersection information. The operation of the plaintext data determination unit 470 may refer to the operation of 290 described above with reference to fig. 2.
Further, in one example, in the case where the intersection information is intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence, the plaintext data determination unit 470 decrypts the intersection ciphertext information using the second key, thereby obtaining plaintext data of the common data set of the first data owner and the second data owner. Alternatively, in another example, the plaintext data determination unit 470 finds, from the second data set, the data elements whose corresponding ciphertext data elements in the second encrypted data sequence appear in the intersection ciphertext information, thereby obtaining plaintext data of a common data set of the first data owner and the second data owner.
In another example, in the case where the intersection information is the second hash value set of the intersection elements of the third encrypted data sequence and the second encrypted data sequence, the plaintext data determination unit 470 calculates hash values of the respective ciphertext data elements of the second encrypted data sequence, and finds, from the second data set, the data elements whose corresponding ciphertext data elements in the second encrypted data sequence have hash values in the second hash value set, thereby obtaining plaintext data of the common data set of the first data owner and the second data owner.
In another example, in a case where the intersection information is a second bloom filter constructed using intersection elements of the third encrypted data sequence and the second encrypted data sequence, the plaintext data determination unit 470 uses the second bloom filter to find, from the second data set, the data elements whose corresponding ciphertext data elements in the second encrypted data sequence match the second bloom filter, thereby obtaining plaintext data of the common data set of the first data owner and the second data owner.
Further, optionally, in another example, the public data set determining apparatus 400 may further include a key generating unit (not shown). The key generation unit is configured to generate a second key.
Further, it is to be noted that, in the example of fig. 4, the first data encryption unit 410 and the second data encryption unit 430 are shown as two independent data encryption units, but in other embodiments of the present specification, the first data encryption unit 410 and the second data encryption unit 430 may be implemented using the same data encryption unit.
Further, in the example of fig. 4, the encrypted data sharing unit 420 and the out-of-order result sending unit 450 are shown as two separate units, but in other embodiments of the present specification, the encrypted data sharing unit 420 and the out-of-order result sending unit 450 may be implemented using the same unit.
As described above with reference to fig. 1 to 4, the common data set determination method and the common data set determination apparatus according to the embodiments of the present specification are described. The above common data set determining means may be implemented in hardware, or may be implemented in software, or a combination of hardware and software.
Fig. 5 shows a schematic diagram of an electronic device 500 for implementing a common data set determination process on a first data owner side according to an embodiment of the present description. As shown in fig. 5, the electronic device 500 may include at least one processor 510, a storage (e.g., non-volatile storage) 520, a memory 530, and a communication interface 540, and the at least one processor 510, the storage 520, the memory 530, and the communication interface 540 are connected together via a bus 560. The at least one processor 510 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 510 to: encrypting the first data set using the first key to obtain a first encrypted data sequence; sending the first encrypted data sequence to a second data owner, and receiving a second encryption result from the second data owner, the second encryption result comprising a second encrypted data sequence or a variant of the second encrypted data sequence, the second encrypted data sequence being obtained by the second data owner encrypting a second data set using a second key; receiving an out-of-order encrypted result from a second data owner, the out-of-order encrypted result being obtained by out-of-order processing, at the second data owner, a first encrypted data sequence encrypted using a second key; decrypting the scrambled encryption result by using a first key to obtain a third encryption data sequence; and determining intersection information of the third encrypted data sequence and the second encrypted data sequence according to the third encrypted data sequence and the second encryption result, and sending the determined intersection information to the second data owner, wherein the intersection information is used by the second data owner to determine plaintext data of a common data set of the first data owner and the second data owner.
It should be appreciated that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 510 to perform the various operations and functions described above in connection with fig. 1-4 in the various embodiments of the present description.
Fig. 6 shows a schematic diagram of an electronic device 600 for implementing a common data set determination process on the second data owner side according to an embodiment of the present description. As shown in fig. 6, electronic device 600 may include at least one processor 610, storage (e.g., non-volatile storage) 620, memory 630, and communication interface 640, and at least one processor 610, storage 620, memory 630, and communication interface 640 are connected together via a bus 660. The at least one processor 610 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 610 to: encrypt the second data set using a second key to obtain a second encrypted data sequence, and send a second encryption result to the first data owner, wherein the second encryption result comprises the second encrypted data sequence or a variant of the second encrypted data sequence; receive a first encrypted data sequence from the first data owner, the first encrypted data sequence being obtained by the first data owner encrypting a first data set using a first key; encrypt the first encrypted data sequence using the second key, perform out-of-order processing on the obtained encryption result, and send the scrambled encryption result to the first data owner; receive intersection information of a third encrypted data sequence and the second encrypted data sequence from the first data owner, wherein the intersection information is determined by the first data owner according to the third encrypted data sequence and the second encryption result, and the third encrypted data sequence is obtained by the first data owner decrypting the scrambled encryption result using the first key; and determine plaintext data of the public data set of the first and second data owners according to the intersection information.
It should be appreciated that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 610 to perform the various operations and functions described above in connection with fig. 1-4 in the various embodiments of the present description.
According to one embodiment, a program product, such as a machine-readable medium (e.g., a non-transitory machine-readable medium), is provided. A machine-readable medium may have instructions (i.e., elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform various operations and functions described above in connection with fig. 1-4 in the various embodiments of the present specification. Specifically, a system or apparatus may be provided which is provided with a readable storage medium on which software program code implementing the functions of any of the above embodiments is stored, and causes a computer or processor of the system or apparatus to read out and execute instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or from the cloud via a communications network.
It will be understood by those skilled in the art that various changes and modifications may be made in the above-disclosed embodiments without departing from the spirit of the invention. Accordingly, the scope of the invention should be determined from the following claims.
It should be noted that not all steps and units in the above flows and system structure diagrams are necessary, and some steps or units may be omitted according to actual needs. The execution order of the steps is not fixed, and can be determined as required. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by a plurality of physical entities, or some units may be implemented by some components in a plurality of independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware units or processors may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments but does not represent all embodiments that may be practiced or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (16)

1. A method for determining a common data set for first and second data owners, the first data owner having a first data set and a first key, the second data owner having a second data set and a second key, the method comprising:
encrypting, at the first data owner and the second data owner respectively, the first data set using the first key and the second data set using the second key to obtain a first encrypted data sequence and a second encrypted data sequence;
the first data owner sends the first encrypted data sequence to the second data owner, and the second data owner sends a second encryption result to the first data owner, wherein the second encryption result comprises the second encrypted data sequence or a variant of the second encrypted data sequence;
at the second data owner, encrypting the first encrypted data sequence by using the second key, performing out-of-order processing on the obtained encryption result, and sending the scrambled encryption result to the first data owner;
at the first data owner, decrypting the scrambled encryption result by using the first key to obtain a third encryption data sequence, determining intersection information of the third encryption data sequence and the second encryption data sequence according to the third encryption data sequence and the second encryption result, and sending the intersection information to the second data owner; and
determining, at the second data-owner, plaintext data for a common data set of the first and second data-owners based on the intersection information,
wherein the variations of the second encrypted data sequence include: a first set of hash values for each ciphertext data element of the second encrypted data sequence or a first bloom filter constructed using each ciphertext data element of the second encrypted data sequence,
wherein the determined intersection information comprises:
intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence;
a second set of hash values for intersection elements of the third encrypted data sequence and the second encrypted data sequence; or
a second bloom filter constructed using the intersection elements of the third encrypted data sequence and the second encrypted data sequence, the intersection elements being matched from the third encrypted data sequence using the first bloom filter.
2. The method of claim 1, wherein the first data set is a small set data set and the second data set is a large set data set.
3. The method of claim 1, wherein determining, at the second data owner, plaintext data for a common data set of the first and second data owners based on the intersection information comprises:
decrypting the intersection ciphertext information using the second key to obtain the plaintext data of the common data set of the first and second data owners;
finding, in the second data set, the data elements whose corresponding ciphertext data elements of the second encrypted data sequence appear in the intersection ciphertext information, to obtain the plaintext data of the common data set of the first and second data owners;
finding, in the second data set, the data elements for which the hash values of the corresponding ciphertext data elements of the second encrypted data sequence appear in the second set of hash values, to obtain the plaintext data of the common data set of the first and second data owners; or
finding, in the second data set, the data elements whose corresponding ciphertext data elements of the second encrypted data sequence match the second Bloom filter, to obtain the plaintext data of the common data set of the first and second data owners.
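The hash-set alternative for recovering plaintext can be illustrated with a short Python sketch. It is an example under stated assumptions rather than the claimed implementation: ciphertexts are represented as plain integers, SHA-256 is an assumed choice of hash, and `digest`, `first_hash_set`, `second_hash_set`, and `recover_plaintext` are hypothetical names.

```python
import hashlib


def digest(ciphertext: int) -> str:
    """Hash one ciphertext data element (SHA-256 is an assumed choice)."""
    return hashlib.sha256(str(ciphertext).encode()).hexdigest()


def first_hash_set(second_seq: list) -> set:
    """Second owner: variant of the second encrypted data sequence."""
    return {digest(c) for c in second_seq}


def second_hash_set(third_seq: list, first_hashes: set) -> set:
    """First owner: hash values of the elements found in both sequences."""
    return {digest(c) for c in third_seq} & first_hashes


def recover_plaintext(second_set: list, second_seq: list, second_hashes: set) -> list:
    """Second owner: keep the elements whose ciphertext hash is in the second hash set."""
    return [y for y, c in zip(second_set, second_seq) if digest(c) in second_hashes]
```

The second owner never needs to decrypt anything in this variant; it only relies on the positional correspondence between its data set and its own encrypted sequence.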
4. The method of claim 1, wherein the encryption processes at the first and second data owners are implemented using a commutative deterministic encryption algorithm.
5. The method of claim 4, wherein the commutative deterministic encryption algorithm comprises a DH algorithm or an RSA algorithm.
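The property that claims 4 and 5 rely on, an interchangeable (that is, commutative) and deterministic cipher, can be checked with a small Python sketch. The construction below is an assumption chosen for illustration: a DH-style exponentiation cipher modulo the Mersenne prime 2^127 - 1, with hypothetical helper names `to_group`, `enc`, and `dec` and arbitrary example keys.

```python
import hashlib

P = 2**127 - 1  # toy prime modulus (assumption)


def to_group(item: str) -> int:
    """Hash a plaintext element to a value in [1, P - 1]."""
    digest = hashlib.sha256(item.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def enc(value: int, key: int) -> int:
    """Encrypt by modular exponentiation; no randomness, so it is deterministic."""
    return pow(value, key, P)


def dec(value: int, key: int) -> int:
    """Undo one layer; requires gcd(key, P - 1) == 1."""
    return pow(value, pow(key, -1, P - 1), P)


k1, k2 = 65537, 257  # example keys, both coprime to P - 1
m = to_group("alice")

# Commutative: the order in which the two keys are applied does not matter.
commutes = enc(enc(m, k1), k2) == enc(enc(m, k2), k1)
# Removing the first key from the double encryption leaves encryption under the second key.
peels = dec(enc(enc(m, k1), k2), k1) == enc(m, k2)
# Deterministic: equal plaintexts always give equal ciphertexts.
deterministic = enc(m, k1) == enc(to_group("alice"), k1)
```

Determinism is what allows equal elements from the two data sets to collide as ciphertexts, and commutativity is what allows the first owner to remove its own layer after the second owner has added one.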
6. The method of claim 1, wherein the processes at the first and second data owners are performed in parallel.
7. The method of claim 1, further comprising:
generating the first key and the second key at the first data owner and the second data owner, respectively.
8. A method for determining a common data set for first and second data owners, a first data owner having a first data set and a first key, a second data owner having a second data set and a second key, the method applied to the first data owner, the method comprising:
encrypting the first data set using the first key to obtain a first encrypted data sequence;
sending the first encrypted data sequence to a second data owner, and receiving a second encryption result from the second data owner, the second encryption result comprising a second encrypted data sequence or a variant of the second encrypted data sequence, the second encrypted data sequence being obtained by the second data owner encrypting a second data set using a second key;
receiving a scrambled encryption result from the second data owner, the scrambled encryption result being obtained by the second data owner encrypting the first encrypted data sequence using the second key and scrambling the order of the result;
decrypting the scrambled encryption result using the first key to obtain a third encrypted data sequence; and
determining intersection information of the third encrypted data sequence and the second encrypted data sequence based on the third encrypted data sequence and the second encryption result, and sending the intersection information to the second data owner, the intersection information being used by the second data owner to determine plaintext data for a common data set of the first and second data owners,
wherein the variant of the second encrypted data sequence comprises: a first set of hash values of the ciphertext data elements of the second encrypted data sequence, or a first Bloom filter constructed using the ciphertext data elements of the second encrypted data sequence,
wherein the determined intersection information comprises:
intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence;
a second set of hash values for intersection elements of the third encrypted data sequence and the second encrypted data sequence; or
a second Bloom filter constructed using the intersection elements of the third encrypted data sequence and the second encrypted data sequence, the intersection elements being matched from the third encrypted data sequence by using the first Bloom filter.
9. A method for determining a common data set for first and second data owners, a first data owner having a first data set and a first key, a second data owner having a second data set and a second key, the method being applied to a second data owner, the method comprising:
encrypting the second data set by using a second key to obtain a second encrypted data sequence, and sending a second encryption result to the first data owner, wherein the second encryption result comprises the second encrypted data sequence or a variant of the second encrypted data sequence;
receiving a first encrypted data sequence from a first data owner, the first encrypted data sequence being obtained by encrypting a first data set using a first key by the first data owner;
encrypting the first encrypted data sequence using the second key, scrambling the order of the obtained encryption result, and sending the scrambled encryption result to the first data owner;
receiving intersection information of a third encrypted data sequence and the second encrypted data sequence from the first data owner, wherein the intersection information is determined by the first data owner according to the third encrypted data sequence and the second encryption result, and the third encrypted data sequence is obtained by the first data owner decrypting the scrambled encryption result using the first key; and
determining plaintext data for a common data set of the first and second data owners based on the intersection information,
wherein the variant of the second encrypted data sequence comprises: a first set of hash values of the ciphertext data elements of the second encrypted data sequence, or a first Bloom filter constructed using the ciphertext data elements of the second encrypted data sequence,
wherein the determined intersection information comprises:
intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence;
a second set of hash values for intersection elements of the third encrypted data sequence and the second encrypted data sequence; or
a second Bloom filter constructed using the intersection elements of the third encrypted data sequence and the second encrypted data sequence, the intersection elements being matched from the third encrypted data sequence by using the first Bloom filter.
10. An apparatus for determining a common data set for first and second data owners, the first data owner having a first data set and a first key, the second data owner having a second data set and a second key, the apparatus for application to the first data owner, the apparatus comprising:
a data encryption unit which encrypts the first data set by using a first key to obtain a first encrypted data sequence;
an encrypted data sharing unit that sends the first encrypted data sequence to the second data owner and receives a second encryption result from the second data owner, the second encryption result comprising a second encrypted data sequence or a variant of the second encrypted data sequence, the second encrypted data sequence being obtained by the second data owner encrypting the second data set using the second key;
a scrambled result acquisition unit that receives a scrambled encryption result from the second data owner, the scrambled encryption result being obtained by the second data owner encrypting the first encrypted data sequence using the second key and scrambling the order of the result;
a data decryption unit that decrypts the scrambled encryption result using the first key to obtain a third encrypted data sequence;
an intersection information determination unit that determines intersection information of the third encrypted data sequence and the second encrypted data sequence according to the third encrypted data sequence and the second encryption result; and
an intersection information sending unit that sends the intersection information to the second data owner, the intersection information being used by the second data owner to determine plaintext data of a common data set of the first and second data owners,
wherein the variant of the second encrypted data sequence comprises: a first set of hash values of the ciphertext data elements of the second encrypted data sequence, or a first Bloom filter constructed using the ciphertext data elements of the second encrypted data sequence,
wherein the intersection information determination unit is configured to:
determine intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence as the intersection information;
query the first set of hash values using the hash values of the elements of the third encrypted data sequence to find the intersection elements of the third encrypted data sequence and the second encrypted data sequence, and determine a second set of hash values of the intersection elements as the intersection information; or
match intersection elements of the third encrypted data sequence and the second encrypted data sequence from the third encrypted data sequence by using the first Bloom filter, and determine a second Bloom filter constructed using the matched intersection elements as the intersection information.
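The Bloom-filter alternative can be sketched as follows. This is a minimal, assumption-laden Python example and not the patented implementation: the `BloomFilter` class, its size of 4096 bits, its four SHA-256-derived hash positions, and the helpers `build_filter` and `intersection_filter` are all hypothetical, and ciphertexts are represented as plain integers.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter backed by a Python int used as a bit set."""

    def __init__(self, size_bits: int = 4096, num_hashes: int = 4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item: int):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.size

    def add(self, item: int) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: int) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def build_filter(ciphertexts: list) -> BloomFilter:
    """Construct a filter from a sequence of ciphertext data elements."""
    bloom = BloomFilter()
    for c in ciphertexts:
        bloom.add(c)
    return bloom


def intersection_filter(third_seq: list, first_filter: BloomFilter) -> BloomFilter:
    """First owner: match the third sequence against the first filter,
    then build the second filter from the matched elements."""
    matched = [c for c in third_seq if c in first_filter]
    return build_filter(matched)
```

A Bloom filter never misses an element that was added but can report false positives, so the filter size and number of hash positions would need to be chosen for the expected data set sizes; the values here are placeholders.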
11. The apparatus of claim 10, further comprising:
a key generation unit that generates the first key.
12. An apparatus for determining a common data set for first and second data owners, a first data owner having a first data set and a first key, a second data owner having a second data set and a second key, the apparatus for application to a second data owner, the apparatus comprising:
a first data encryption unit which encrypts a second data set by using a second key to obtain a second encrypted data sequence;
an encrypted data sharing unit that sends a second encryption result to the first data owner, the second encryption result comprising the second encrypted data sequence or a variant of the second encrypted data sequence, and receives a first encrypted data sequence from the first data owner, the first encrypted data sequence being obtained by the first data owner encrypting the first data set using the first key;
a second data encryption unit that encrypts the first encrypted data sequence using a second key;
a scrambling unit that scrambles the order of the encrypted first encrypted data sequence to obtain a scrambled encryption result;
a scrambled result sending unit that sends the scrambled encryption result to the first data owner;
an intersection information acquisition unit that receives intersection information of a third encrypted data sequence and the second encrypted data sequence from the first data owner, the intersection information being determined by the first data owner according to the third encrypted data sequence and the second encryption result, and the third encrypted data sequence being obtained by the first data owner decrypting the scrambled encryption result using the first key; and
a plaintext data determination unit that determines plaintext data of a common data set of the first and second data owners based on the intersection information,
wherein the variant of the second encrypted data sequence comprises: a first set of hash values of the ciphertext data elements of the second encrypted data sequence, or a first Bloom filter constructed using the ciphertext data elements of the second encrypted data sequence,
wherein the intersection information includes:
intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence;
a second set of hash values for intersection elements of the third encrypted data sequence and the second encrypted data sequence; or
a second Bloom filter constructed using the intersection elements of the third encrypted data sequence and the second encrypted data sequence, the intersection elements being matched from the third encrypted data sequence by using the first Bloom filter.
13. The apparatus according to claim 12, wherein the plaintext data determination unit is configured to:
decrypt the intersection ciphertext information of the third encrypted data sequence and the second encrypted data sequence using the second key to obtain the plaintext data of the common data set of the first and second data owners;
find, in the second data set, the data elements whose corresponding ciphertext data elements of the second encrypted data sequence appear in the intersection ciphertext information, to obtain the plaintext data of the common data set of the first and second data owners;
find, in the second data set, the data elements for which the hash values of the corresponding ciphertext data elements of the second encrypted data sequence appear in a second set of hash values of the intersection elements of the third encrypted data sequence and the second encrypted data sequence, to obtain the plaintext data of the common data set of the first and second data owners, wherein the second set of hash values is obtained by querying the first set of hash values using the hash values of the elements of the third encrypted data sequence, and the first set of hash values consists of the hash values of all ciphertext data elements in the second encrypted data sequence; or
find, in the second data set, the data elements whose corresponding ciphertext data elements of the second encrypted data sequence match a second Bloom filter, to obtain the plaintext data of the common data set of the first and second data owners, the second Bloom filter being constructed using the intersection elements of the third encrypted data sequence and the second encrypted data sequence, and the intersection elements being matched from the third encrypted data sequence by using the first Bloom filter constructed using the ciphertext data elements of the second encrypted data sequence.
14. A system for determining a common data set for first and second data owners, comprising:
a first data owner having a first data set and a first key and comprising the apparatus of claim 10 or 11; and
a second data owner having a second data set and a second key and comprising the apparatus of claim 12 or 13.
15. An electronic device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of claim 8 or 9.
16. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of claim 8 or 9.
CN202010759417.1A 2020-07-31 2020-07-31 Public data set determination method, device and system based on data privacy protection Active CN111741020B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010759417.1A CN111741020B (en) 2020-07-31 2020-07-31 Public data set determination method, device and system based on data privacy protection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010759417.1A CN111741020B (en) 2020-07-31 2020-07-31 Public data set determination method, device and system based on data privacy protection

Publications (2)

Publication Number Publication Date
CN111741020A CN111741020A (en) 2020-10-02
CN111741020B CN111741020B (en) 2020-12-22

Family

ID=72656780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010759417.1A Active CN111741020B (en) 2020-07-31 2020-07-31 Public data set determination method, device and system based on data privacy protection

Country Status (1)

Country Link
CN (1) CN111741020B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112073444B (en) * 2020-11-16 2021-02-05 支付宝(杭州)信息技术有限公司 Data set processing method and device and server
CN112887297B (en) * 2021-01-22 2022-09-02 支付宝(杭州)信息技术有限公司 Privacy-protecting differential data determining method, device, equipment and system
CN112822201B (en) * 2021-01-22 2023-03-24 支付宝(杭州)信息技术有限公司 Privacy-protecting difference data determination method, device, equipment and system
CN114611131B (en) * 2022-05-10 2023-05-30 支付宝(杭州)信息技术有限公司 Method, device and system for determining shared data for protecting privacy

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110086817A (en) * 2019-04-30 2019-08-02 阿里巴巴集团控股有限公司 Reliable teller system and method
CN110399741A (en) * 2019-07-29 2019-11-01 深圳前海微众银行股份有限公司 Data alignment method, equipment and computer readable storage medium
CN110944011A (en) * 2019-12-16 2020-03-31 支付宝(杭州)信息技术有限公司 Joint prediction method and system based on tree model

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11195149B2 (en) * 2016-05-31 2021-12-07 Microsoft Technology Licensing, Llc Relating data while preventing inter-entity data sharing
CN109726580B (en) * 2017-10-31 2020-04-14 阿里巴巴集团控股有限公司 Data statistical method and device
CN109525386B (en) * 2018-11-29 2021-05-18 东北大学 Paillier homomorphic encryption private aggregation and method based on Paillier
CN109886029B (en) * 2019-01-28 2020-09-22 湖北工业大学 Polynomial expression based privacy protection set intersection calculation method and system
CN110851869B (en) * 2019-11-14 2023-09-19 深圳前海微众银行股份有限公司 Sensitive information processing method, device and readable storage medium

Also Published As

Publication number Publication date
CN111741020A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN111741020B (en) Public data set determination method, device and system based on data privacy protection
Liu et al. Toward highly secure yet efficient KNN classification scheme on outsourced cloud data
EP3673640B1 (en) Processing data elements stored in blockchain networks
US11210658B2 (en) Constructing a distributed ledger transaction on a cold hardware wallet
CN110969431B (en) Secure hosting method, device and system for private key of blockchain digital coin
CN112949545B (en) Method, apparatus, computing device and medium for recognizing face image
CN112101531B (en) Neural network model training method, device and system based on privacy protection
EP4150879A1 (en) Constructing a distributed ledger transaction on a cold hardware wallet
CN109274644A (en) A kind of data processing method, terminal and watermark server
CN112788001B (en) Data encryption-based data processing service processing method, device and equipment
CN111523134B (en) Homomorphic encryption-based model training method, device and system
Yilmaz et al. Armor: An anti-counterfeit security mechanism for low cost radio frequency identification systems
CN110213202B (en) Identification encryption matching method and device, and identification processing method and device
Dhiran et al. Video fraud detection using blockchain
CN112380404B (en) Data filtering method, device and system
CN111737756B (en) XGB model prediction method, device and system performed through two data owners
CN111046431B (en) Data processing method, query method, device, electronic equipment and system
US20190394018A1 (en) Ciphertext matching system and ciphertext matching method
CN111046408A (en) Judgment result processing method, query method, device, electronic equipment and system
CN111984932B (en) Two-party data packet statistics method, device and system
CN115599959A (en) Data sharing method, device, equipment and storage medium
Jin et al. Efficient blind face recognition in the cloud
CN113052044A (en) Method, apparatus, computing device, and medium for recognizing iris image
CN113052045A (en) Method, apparatus, computing device and medium for recognizing finger vein image
KR20170001633A (en) Tokenization-based encryption key management system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant