CN114021198B

CN114021198B - Method and device for determining common data for protecting data privacy

Info

Publication number: CN114021198B
Application number: CN202111635107.XA
Authority: CN
Inventors: 潘无穷; 韦韬; 李婷婷; 钱中天
Original assignee: Alipay Hangzhou Information Technology Co Ltd
Current assignee: Alipay Hangzhou Information Technology Co Ltd
Priority date: 2021-12-29
Filing date: 2021-12-29
Publication date: 2022-04-08
Anticipated expiration: 2041-12-29
Also published as: CN114021198A

Abstract

Embodiments of this specification provide a method and a device for determining common data while protecting data privacy. An intermediate party acquires bucketed data from a first party and a second party, respectively. Each party obtains its bucketed data by performing preset bucketing processing on the private data set it holds; the preset bucketing processing comprises, for any piece of private data, filling the mapping value of that private data into a target bucket space among a plurality of bucket spaces according to the first section into which the private data falls. Any first bucket space among the plurality of bucket spaces comprises a plurality of first sections. For a first mapping value in a first bucket space in the bucketed data of either party, the intermediate party compares the first mapping value with each mapping value in a second bucket space in the bucketed data of the other party to obtain a comparison result for the first mapping value, where the second bucket space and the first bucket space share a first section. The intermediate party sends a result set formed by the comparison results of all mapping values in the bucketed data of either party to the first party and the second party, for determining the common data of the two parties' private data sets.

Description

Method and device for determining common data for protecting data privacy

Technical Field

The present disclosure relates to the field of data security technologies, and in particular, to a method and an apparatus for determining common data to protect data privacy.

Background

Private set intersection (PSI) algorithms are typically run before federated machine learning. When two (or more) parties want to train a model jointly on their data, they typically first identify the samples that all parties hold in common through a private set intersection algorithm, and then perform privacy-preserving machine learning based on these common samples.

At present, there are many private set intersection algorithms, such as the DH (Diffie-Hellman) based algorithm, which suffers from a large amount of computation because it introduces asymmetric operations.

Therefore, an improved scheme is desired that protects the security of each party's private data while reducing the amount of computation when multiple parties process data jointly.

Disclosure of Invention

One or more embodiments of the present specification provide a method and an apparatus for determining common data to protect data privacy, so as to reduce the amount of computation while protecting the security of private data of each party.

According to a first aspect, there is provided a method of determining common data for protecting data privacy, the method performed by an intermediary party, comprising:

acquiring bucketed data from a first party and a second party, respectively, wherein each party obtains its bucketed data by performing preset bucketing processing on the private data set it holds, and the preset bucketing processing comprises, for any piece of private data, filling the mapping value corresponding to the private data into a target bucket space among a plurality of preset bucket spaces according to which of t pre-divided first sections the private data falls into; any first bucket space among the plurality of bucket spaces comprises a plurality of first sections;

for a first mapping value contained in the first bucket space in the bucketed data of either party, comparing the first mapping value with each mapping value contained in a second bucket space in the bucketed data of the other party to obtain a comparison result for the first mapping value, wherein the second bucket space and the first bucket space share a first section;

and sending a result set formed by the comparison results of all mapping values in the bucketed data of either party to the first party and the second party, wherein the result set is used for determining the common data of the private data sets of the first party and the second party.

In an alternative embodiment, the plurality of bucket spaces includes a plurality of levels of bucket spaces, and different levels of bucket spaces contain different numbers of first sections.

In an alternative embodiment, the first bucket space comprises p consecutive first sections; before the comparing, the method further comprises:

determining a second bucket space, wherein the second bucket space comprises a first subspace, a second subspace and/or a third subspace, the first subspace comprises part of the p first sections, and the second subspace corresponds to the first bucket space; the third subspace includes and is larger than the first bucket space.
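The relationship between a first bucket space and its candidate second bucket spaces can be sketched as follows. This is only an illustration under stated assumptions: each bucket space is modelled as a half-open range of first-section indices, and the names `Bucket`, `classify`, and `second_bucket_spaces` are hypothetical and do not appear in the specification.

```python
from typing import List, Tuple

# Assumption: a bucket space is a half-open range (start, end) of section indices.
Bucket = Tuple[int, int]

def classify(first: Bucket, other: Bucket) -> str:
    """Classify `other` relative to `first` by the first sections each covers."""
    fs, fe = first
    os_, oe = other
    if min(fe, oe) <= max(fs, os_):
        return "disjoint"          # no shared first section, never compared
    if (os_, oe) == (fs, fe):
        return "second subspace"   # covers exactly the same first sections
    if fs <= os_ and oe <= fe:
        return "first subspace"    # covers part of the p first sections
    if os_ <= fs and fe <= oe:
        return "third subspace"    # contains and is larger than the first bucket space
    return "partial overlap"       # shares a section without nesting

def second_bucket_spaces(first: Bucket, candidates: List[Bucket]) -> List[Bucket]:
    """Keep every candidate that shares at least one first section with `first`."""
    return [c for c in candidates if classify(first, c) != "disjoint"]
```

For example, with a first bucket space covering sections 4 to 7, a candidate covering sections 4 to 5 is a first subspace, and a candidate covering sections 0 to 7 is a third subspace.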

In an optional implementation, before the obtaining the respective bucket data from the first party and the second party, further includes:

determining a suggested number of bucket spaces for each level;

sending the suggested number to the first party and the second party, respectively, so that each party determines the plurality of bucket spaces according to the suggested number.

determining the division number t of the first section based on the maximum value of the number of data in the privacy data sets of the two parties;

determining section division information based on the division number t; and respectively sending the section division information to a first party and a second party to ensure that the t first sections are determined.

In an optional embodiment, the mapping value corresponding to the privacy data is determined based on the remainder of the privacy data modulo a first random number;

before the obtaining of the respective bucketized data from the first and second parties, respectively, the method further comprises:

generating the first random number;

and sending the first random number to the first party and the second party respectively.
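A minimal sketch of this mapping step is given below. The 64-bit size of the first random number and the function names are assumptions made for illustration; the specification only states that the mapping value is based on the remainder of the privacy data modulo the first random number.

```python
import secrets

def generate_first_random_number(bits: int = 64) -> int:
    """Intermediate party: draw a random modulus with exactly `bits` bits (assumed size)."""
    # Setting the top bit guarantees a non-zero modulus of the requested length.
    return secrets.randbits(bits) | (1 << (bits - 1))

def mapping_value(private_data: int, first_random_number: int) -> int:
    """First or second party: remainder of the private data modulo the random number."""
    return private_data % first_random_number
```

Because both parties receive the same first random number, equal private data always yields equal mapping values, which is what allows the intermediate party to compare mapping values instead of the private data itself.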

In an optional implementation manner, each piece of privacy data is a hash value obtained by performing hash calculation on the corresponding object identifier.

In an alternative embodiment, the method further comprises:

acquiring respective XOR results from a first party and a second party respectively, wherein the XOR results comprise T XOR values corresponding to T second sections which are divided in advance, any ith XOR value is obtained by carrying out XOR operation on mapping values corresponding to privacy data of an updated privacy data set of each party, which falls into the ith second section, and the updated privacy data set is obtained by deleting non-shared data from the privacy data set of each party based on the result set;

judging whether the ith exclusive OR value in the exclusive OR result of any one party is the same as the ith exclusive OR value in the exclusive OR result of the other party to obtain a judgment result aiming at the ith exclusive OR value;

and sending the judgment result corresponding to each second section to the first party and the second party for determining the common data of the updated privacy data sets of the two parties.
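The exclusive-OR check described above can be sketched as follows. The sketch assumes that the T second sections are equal-width ranges of a `data_bits`-bit value space and that the mapping value is the remainder modulo `r`; the function names and these parameters are hypothetical.

```python
from typing import List

def xor_result(private_data: List[int], r: int, T: int, data_bits: int) -> List[int]:
    """Each party: XOR the mapping values of the data falling into each second section."""
    out = [0] * T
    for x in private_data:
        i = (x * T) >> data_bits   # second-section index, assumes 0 <= x < 2**data_bits
        out[i] ^= x % r            # mapping value assumed to be x mod r
    return out

def judge(xor_a: List[int], xor_b: List[int]) -> List[bool]:
    """Intermediate party: is the i-th XOR value the same for both parties?"""
    return [a == b for a, b in zip(xor_a, xor_b)]
```

A section whose judgment result is False indicates that the two updated privacy data sets still differ within that section.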

In an alternative embodiment, the intermediate party is a secret computing center that includes M executing parties;

the obtaining respective sub-bucket data from the first party and the second party respectively comprises:

each executing party obtains a bucketed data fragment from each of the first party and the second party, wherein each party obtains its bucketed data fragments by splitting each mapping value in the plurality of bucket spaces of its bucketed data into M parts;

the obtaining of the comparison result for the first mapping value includes:

and the executing parties compare the first mapping value with each mapping value contained in the second bucket space by means of secure multi-party computation (MPC) to obtain the comparison result.
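The specification does not state how a mapping value is split into M parts. As one possible illustration, the sketch below uses XOR-based secret sharing, which is an assumption; the function names and the 64-bit share width are likewise hypothetical, and the MPC comparison itself is not shown.

```python
import secrets
from functools import reduce
from operator import xor
from typing import List

def split(value: int, M: int, bits: int = 64) -> List[int]:
    """Split `value` (assumed < 2**bits) into M shares whose XOR equals `value`."""
    shares = [secrets.randbits(bits) for _ in range(M - 1)]
    # The last share is chosen so that all M shares XOR back to the value.
    shares.append(reduce(xor, shares, value))
    return shares

def reconstruct(shares: List[int]) -> int:
    """Recombine all M shares; any M-1 of them alone look like random bits."""
    return reduce(xor, shares, 0)
```

Under this assumption, each executing party would receive one share per mapping value, so no single executing party sees a complete mapping value.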

In an alternative embodiment, the obtaining the comparison result for the first mapping value includes:

comparing the first mapping value with each mapping value contained in the second barrel space to obtain each intermediate result corresponding to each mapping value;

and carrying out exclusive OR operation on each intermediate result to obtain the comparison result aiming at the first mapping value.
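The combination of intermediate results can be sketched as below. Plain equality on integers stands in for the ciphertext or MPC comparison, and the function name is hypothetical.

```python
from typing import Iterable

def comparison_result(first_mapping_value: int, second_bucket_values: Iterable[int]) -> int:
    """XOR the per-value equality bits into a single comparison result."""
    result = 0
    for v in second_bucket_values:
        intermediate = int(first_mapping_value == v)  # 1 if equal, else 0
        result ^= intermediate
    return result
```

When the mapping values in the second bucket space are pairwise distinct, at most one intermediate result is 1, so the exclusive-OR equals a logical OR and the result is 1 exactly when the first mapping value appears in the second bucket space.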

According to a second aspect, there is provided a method of determining common data for protecting data privacy, the method being performed by a first party, the method comprising:

performing preset barrel dividing processing on each piece of privacy data in a held first privacy data set to obtain first barrel dividing data, wherein the preset barrel dividing processing comprises the step of filling a mapping value corresponding to the privacy data into target barrel spaces in a plurality of preset barrel spaces according to first sections, into which the privacy data fall, of t pre-divided first sections, for any piece of privacy data; any first barrel space of the plurality of barrel spaces comprises a plurality of first sections;

sending the first barreled data to an intermediate party to enable the intermediate party to determine a result set based on the first barreled data and second barreled data, wherein the second barreled data is obtained by performing preset barreling processing on a second private data set held by a second party by the second party; the result set comprises a comparison result obtained by comparing a first mapping value in a first bucket space in the first barreled data with each mapping value contained in a second bucket space in the second barreled data, wherein a common first section exists between the second bucket space and the first bucket space;

obtaining the result set from the intermediary;

determining, based on the result set, data common to both parties from the first set of privacy data.

In an alternative embodiment, the method further comprises:

obtaining a first random number from the intermediary;

and taking the remainder of each piece of data in the first privacy data set modulo the first random number to obtain the mapping value corresponding to each piece of privacy data.

In an alternative embodiment, the method further comprises:

obtaining section division information from the intermediary party, the section division information being determined based on a division number t, the division number t being determined by the intermediary party based on the maximum of the numbers of data in the first privacy data set and the second privacy data set;

determining the t first sections based on the section division information.

In an alternative embodiment, the method further comprises:

obtaining a suggested number of bucket spaces for each level from the intermediary;

determining the plurality of bucket spaces according to the suggested number.

In an optional implementation manner, the filling a mapping value corresponding to the private data into a target bucket space of a plurality of preset bucket spaces includes:

determining a plurality of candidate bucket spaces containing a target section, the target section being a first section of the t first sections into which the private data falls;

among the plurality of candidate bucket spaces, preferentially adding the mapping value to a bucket space that contains the fewest first sections and is not full, and preferentially adding the mapping value to a bucket space that already contains mapping values.
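The filling preference can be sketched as follows. The sketch assumes that a bucket space is a range of section indices with a fixed capacity, and that the "fewest first sections" criterion takes priority over the "already contains mapping values" criterion; the specification does not fix this ordering, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class BucketSpace:
    start: int                      # first covered section index (inclusive)
    end: int                        # one past the last covered section index
    capacity: int                   # assumed maximum number of mapping values
    values: List[int] = field(default_factory=list)

    def covers(self, section: int) -> bool:
        return self.start <= section < self.end

    def full(self) -> bool:
        return len(self.values) >= self.capacity

def fill(buckets: List[BucketSpace], section: int, mapping_value: int) -> Optional[BucketSpace]:
    """Add the mapping value to the preferred non-full candidate, or return None."""
    candidates = [b for b in buckets if b.covers(section) and not b.full()]
    if not candidates:
        return None
    # Fewest covered sections first; among equals, prefer a non-empty bucket space.
    target = min(candidates, key=lambda b: (b.end - b.start, len(b.values) == 0))
    target.values.append(mapping_value)
    return target
```

With two narrow bucket spaces of capacity 1 and one wide bucket space of capacity 2, a value goes to the narrow bucket space covering its section while that space has room, and spills into the wide one afterwards.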

In an optional implementation manner, each piece of data in the first private data set is a hash value obtained by performing hash calculation on an object identifier corresponding to each piece of data.

In an alternative embodiment, the determining common data of two parties from the first privacy data set based on the result set includes:

deleting data which is not common with the second privacy data set from the first privacy data set based on the result set to obtain an updated first privacy data set;

for T pre-divided second sections, performing an exclusive-OR operation on the mapping values of the privacy data in the updated first privacy data set that falls into each second section, to obtain T exclusive-OR values corresponding to the T second sections as a first exclusive-OR result;

sending the first XOR result to the intermediate party so that the intermediate party obtains a judgment result of whether the corresponding XOR values of the two parties are the same or not based on the first XOR result and a second XOR result correspondingly sent by the second party;

obtaining the judgment result from the intermediate party;

and determining common data of two parties in the updated first privacy data set based on the judgment result.

According to a third aspect, there is provided a method of determining common data for protecting data privacy, the method being performed by a first party and an intermediary party, the method comprising:

the method comprises the steps that a first party carries out preset barrel dividing processing on each piece of private data in a first private data set to obtain first barrel dividing data and sends the first barrel dividing data to an intermediate party, wherein the preset barrel dividing processing comprises the step of filling mapping values corresponding to the private data into target barrel spaces in a plurality of preset barrel spaces according to first sections, into which the private data fall, of t pre-divided first sections aiming at any piece of private data; any first barrel space of the plurality of barrel spaces comprises a plurality of first sections;

after the middle party obtains the first bucket dividing data and the second bucket dividing data, aiming at a first mapping value contained in the first bucket space in the bucket dividing data of any party, the middle party compares the first mapping value with each mapping value contained in the second bucket space in the bucket dividing data of the other party to obtain a comparison result aiming at the first mapping value; sending a result set formed by comparison results of all mapping values in the bucket dividing data of any party to the first party and the second party; the second bucket space and the first bucket space share a first section, and the second bucket data is obtained by performing the preset bucket processing on a second privacy data set held by a second party by the second party;

the first party determines common data of the private data sets of both parties from the first private data set based on the result set.

According to a fourth aspect, there is provided a system for determining common data for protecting privacy of data, comprising a first party, a second party and an intermediary party, wherein,

the first party is configured to perform preset barreling processing on each piece of private data in a first private data set to obtain first barreled data and send the first barreled data to the intermediate party, wherein the preset barreling processing includes that for any piece of private data, a mapping value corresponding to the private data is filled into a target barrel space in a plurality of preset barrel spaces according to a first section in which the private data falls in t pre-divided first sections; any first barrel space of the plurality of barrel spaces comprises a plurality of first sections;

the middle party is configured to compare a first mapping value contained in the first bucket space in the bucket data of any party with each mapping value contained in the second bucket space in the bucket data of the other party after obtaining the first bucket data and the second bucket data to obtain a comparison result aiming at the first mapping value; sending a result set formed by comparison results of all mapping values in the bucket dividing data of any party to the first party and the second party; the second bucket space and the first bucket space share a first section, and the second bucket data is obtained by performing the preset bucket processing on a second privacy data set held by a second party by the second party;

the first party is further configured to determine common data of the private data sets of both parties from the first private data set based on the result set.

According to a fifth aspect, there is provided an apparatus for determining common data for protecting data privacy, the apparatus being deployed at an intermediary party, comprising:

the first acquisition module is configured to respectively acquire respective barrel data from a first party and a second party, wherein the barrel data is obtained by performing preset barrel processing on private data sets held by the parties, and the preset barrel processing comprises that for any private data, according to a first section in which the private data falls in t pre-divided first sections, a mapping value corresponding to the private data is filled in a target barrel space in a plurality of preset barrel spaces; any first barrel space of the plurality of barrel spaces comprises a plurality of first sections;

the first comparison module is configured to compare a first mapping value contained in the first bucket space in the bucket dividing data of any one party with each mapping value contained in the second bucket space in the bucket dividing data of the other party to obtain a comparison result aiming at the first mapping value; wherein the second bucket space and the first bucket space present a common first section;

and the first sending module is configured to send a result set formed by comparison results of all mapping values in the bucket data of any party to the first party and the second party, and the result set is used for determining shared data of the private data sets of the two parties.

According to a sixth aspect, there is provided an apparatus for determining common data for protecting data privacy, the apparatus being deployed at a first party, comprising:

the first bucket dividing processing module is configured to perform preset bucket dividing processing on each piece of private data in a held first private data set to obtain first bucket dividing data, wherein the preset bucket dividing processing includes, for any piece of private data, filling a mapping value corresponding to the private data into a target bucket space in preset multiple bucket spaces according to a first section in which the private data falls in t pre-divided first sections; any first barrel space of the plurality of barrel spaces comprises a plurality of first sections;

a second sending module configured to send the first barreled data to an intermediate party, so that the intermediate party determines a result set based on the first barreled data and a second barreled data, wherein the second barreled data is obtained by performing the preset barreling on a second private data set held by the second party; the result set comprises a comparison result obtained by comparing a first mapping value in a first bucket space in the first barreled data with each mapping value contained in a second bucket space in the second barreled data, wherein a common first section exists between the second bucket space and the first bucket space;

a second obtaining module configured to obtain the result set from the intermediary party;

a first determining module configured to determine common data of both parties from the first privacy data set based on the result set.

According to a seventh aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.

According to an eighth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the second aspect.

According to a ninth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has stored therein executable code, and the processor, when executing the executable code, implements the method of the first aspect.

According to a tenth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has stored therein executable code, and the processor, when executing the executable code, implements the method of the second aspect.

According to the method and the device provided by the embodiments of this specification, the first party and the second party each fill the mapping values of the private data in the private data sets they hold into target bucket spaces among a plurality of preset bucket spaces, according to which of the t pre-divided first sections each piece of private data falls into. The private data is thereby grouped by first section, each party obtains its bucketed data, and the bucketed data is sent to the intermediate party. The intermediate party then compares a first mapping value contained in a first bucket space in the bucketed data of either party with each mapping value contained in a second bucket space in the bucketed data of the other party, and obtains a comparison result for the first mapping value. A result set formed by the comparison results of the mapping values in the bucketed data of either party is then sent to the first party and the second party, and is used for determining the common data of the two parties' private data sets. Because only the mapping values of a first bucket space and a second bucket space that share a first section are compared, the private data is protected while the number of comparisons and the corresponding amount of computation are reduced. Moreover, the comparison involved is a ciphertext comparison, which requires less computation than asymmetric operations.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.

FIG. 1 is a schematic diagram of a framework for implementing one embodiment disclosed herein;

FIG. 2 is a schematic flowchart of a method for determining common data according to an embodiment;

FIG. 3 is a schematic diagram of an embodiment of a determined second barrel space;

FIG. 4 is a schematic diagram of an embodiment of filling a bucket space with data;

FIG. 5 is a flowchart illustrating a method for determining common data according to an embodiment;

FIG. 6 is a schematic block diagram of a system for determining common data for protecting data privacy provided by an embodiment;

FIG. 7 is a schematic block diagram of an apparatus for determining common data for protecting data privacy according to an embodiment;

FIG. 8 is a schematic block diagram of an apparatus for determining common data for protecting data privacy according to an embodiment.

Detailed Description

The technical solutions of the embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.

In the context of big data, it is often necessary to process the business data of different data parties jointly. For example, in a machine-learning-based merchant classification scenario, an electronic payment platform holds transaction flow data of merchants, a banking institution holds settlement data of the merchants, and the two parties need to jointly train a model for classifying the merchants. Before training the model, the two parties first need to determine the samples they hold in common through a private set intersection algorithm, and then jointly train the merchant classification model based on these common samples.

At present, the Diffie-Hellman-based algorithm is the most common private set intersection algorithm. Its main process is as follows. Party A holds data set $X_n$ and party B holds data set $Y_m$. Party A and party B each perform a hash calculation on every piece of data in their data set to obtain the corresponding hash value, and each encrypts these hash values with the key it holds (written here as $a$ for party A and $b$ for party B, with $H$ denoting the hash function) to obtain the corresponding encrypted data. Party A obtains the encrypted data set:

$\{H(x_i)^{a}\},\ x_i \in X_n$;

party B obtains the encrypted data set:

$\{H(y_j)^{b}\},\ y_j \in Y_m$.

Party A sends the encrypted data set $\{H(x_i)^{a}\}$ to party B, and party B sends the encrypted data set $\{H(y_j)^{b}\}$ to party A.

Party A uses its own key $a$ to encrypt the received encrypted data set $\{H(y_j)^{b}\}$, obtaining $\{H(y_j)^{ba}\}$, and sends it to party B;

party B uses its own key $b$ to encrypt the received encrypted data set $\{H(x_i)^{a}\}$, obtaining $\{H(x_i)^{ab}\}$, and sends it to party A;

accordingly, party A and party B each use $\{H(x_i)^{ab}\}$ and $\{H(y_j)^{ba}\}$ to determine the data held by both parties, that is, the intersection of data set $X_n$ and data set $Y_m$.

The exponential operation here represents an operation on a group, and may specifically be a modular exponentiation or a point multiplication on an elliptic curve. In the above process, each piece of data requires two asymmetric operations (modular exponentiations or point multiplications) and two ciphertext transmissions, which results in a large amount of computation.
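For orientation, the Diffie-Hellman-based baseline can be sketched as a toy example using modular exponentiation. The modulus `P`, the use of SHA-256, the key names `a` and `b`, and all function names are assumptions chosen for illustration; the parameters are not intended to be secure, and both parties are simulated in one process.

```python
import hashlib
import secrets
from typing import List

P = 2**255 - 19  # a well-known prime, used here only as a toy modulus

def hash_to_group(item: str) -> int:
    """Hash an item to a non-zero residue modulo P."""
    h = int.from_bytes(hashlib.sha256(item.encode("utf-8")).digest(), "big") % P
    return h or 1

def encrypt(values: List[int], key: int) -> List[int]:
    """One asymmetric operation per value: modular exponentiation with a secret key."""
    return [pow(v, key, P) for v in values]

def dh_psi(x_items: List[str], y_items: List[str]) -> List[str]:
    a = secrets.randbelow(P - 3) + 2                      # party A's key
    b = secrets.randbelow(P - 3) + 2                      # party B's key
    xa = encrypt([hash_to_group(x) for x in x_items], a)  # A encrypts, sends to B
    yb = encrypt([hash_to_group(y) for y in y_items], b)  # B encrypts, sends to A
    yba = set(encrypt(yb, a))                             # A re-encrypts B's set
    xab = encrypt(xa, b)                                  # B re-encrypts A's set, order kept
    # H(x)^(ab) equals H(y)^(ba) when x == y, so matches mark the intersection.
    return [x for x, c in zip(x_items, xab) if c in yba]
```

Each item passes through two exponentiations and two transmissions, which is the cost that the bucket-based scheme described in this specification aims to avoid.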

Therefore, the embodiment of the specification provides a method for determining common data for protecting data privacy, so as to reduce the calculation amount in the intersection solving process to a certain extent while realizing protection of private data.

FIG. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. In this implementation scenario, participants A, B, and C are shown schematically. Each participant may be embodied as a device, platform, server, or device cluster with computing and processing capabilities. Participants A and B hold a first privacy data set X(n) and a second privacy data set Y(m), respectively, whose intersection is to be found. The two parties wish to determine their common data, namely the intersection of the first privacy data set and the second privacy data set, without revealing the plaintext of their private data.

To this end, according to an embodiment of this specification, participant A, which holds the first privacy data set X(n), performs preset bucketing processing on each piece of data x_i in X(n) to obtain first bucketed data, and sends the first bucketed data to participant C. The preset bucketing processing comprises filling the mapping value corresponding to a piece of private data into a target bucket space among a plurality of preset bucket spaces, according to which of t pre-divided sections the private data falls into; any first bucket space among the plurality of bucket spaces comprises a plurality of sections. To distinguish them from other partitions that may be introduced later, each of the t sections is referred to herein as a first section, or a single bucket.

Correspondingly, participant B, which holds the second privacy data set Y(m), performs the preset bucketing processing on each piece of data y_j in Y(m) to obtain second bucketed data, and sends the second bucketed data to participant C.

After participant C obtains the first bucketed data and the second bucketed data, for a first mapping value contained in a first bucket space in the bucketed data of either party, it compares the first mapping value with each mapping value contained in a second bucket space in the bucketed data of the other party, and obtains a comparison result for the first mapping value, where the second bucket space and the first bucket space share a first section.

It can be understood that the preset bucketing processing fills the mapping value corresponding to a piece of private data into a target bucket space according to which of the t first sections the private data falls into; in other words, the data is stored in buckets by first section. Private data falling into the same first section may be identical, whereas private data falling into different first sections cannot be identical. During comparison, a mapping value in the first bucket space therefore only needs to be compared with the mapping values in second bucket spaces that share a first section with the first bucket space.
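The restriction of comparisons to bucket spaces that share a first section can be sketched as follows. The sketch assumes that each party's bucketed data is a dictionary from a half-open range of section indices to the mapping values stored there, and it compares integers directly in place of the ciphertext comparison; the names are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: (start, end) range of first-section indices -> mapping values.
BucketData = Dict[Tuple[int, int], List[int]]

def shares_section(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if the two half-open section ranges overlap."""
    return max(a[0], b[0]) < min(a[1], b[1])

def compare_buckets(mine: BucketData, other: BucketData) -> Tuple[Dict[int, int], int]:
    """Return {mapping value: 0 or 1} for `mine` and the number of comparisons made."""
    results: Dict[int, int] = {}
    comparisons = 0
    for rng, values in mine.items():
        # Only mapping values from bucket spaces sharing a first section are considered.
        peers = [v for o_rng, vals in other.items()
                 if shares_section(rng, o_rng) for v in vals]
        for v in values:
            comparisons += len(peers)
            results[v] = int(v in peers)
    return results, comparisons
```

In a small case with four mapping values on each side, this restriction needs 10 comparisons instead of the 16 required to compare every pair.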

And the party C sends a result set formed by comparison results of all mapping values in the bucket data of any party to the party A and the party B.

Based on the result set, participant A determines the common data of the first privacy data set X(n) and the second privacy data set Y(m) from the first privacy data set X(n).

Based on the result set, participant B determines the common data of the first privacy data set X(n) and the second privacy data set Y(m) from the second privacy data set Y(m).

In one implementation, each piece of private data in the first privacy data set X(n) and the second privacy data set Y(m) is a hash value obtained by performing hash calculation on the corresponding object identifier.

In the above process, neither party reveals the plaintext of the private data in its private data set, so a secure private data intersection operation is realized. Participant A and participant B each perform bucketing processing on their private data according to which of the t pre-divided first sections each piece of private data falls into, that is, they divide the private data into different bucket spaces. Participant C then compares a first mapping value contained in a first bucket space in the bucketed data of either party with each mapping value contained in a second bucket space, in the bucketed data of the other party, that shares a first section with the first bucket space, and obtains a comparison result for the first mapping value. This reduces the number of comparisons for each mapping value and the corresponding amount of computation. Moreover, the comparison involved is a ciphertext (that is, mapping value) comparison, which requires less computation than an asymmetric operation.

The following describes a specific implementation procedure of the above scheme.

FIG. 2 illustrates a process diagram for joint determination of common data by multiple parties, in one embodiment.

In order to perform secure private data intersection calculation, an initial preparation stage is firstly performed among the first party, the second party and the intermediate party to prepare for a specific process of subsequent private data intersection calculation.

The initial preparation stage may include the first party (party A) and the second party (party B) first determining the private data sets on which the secure private data intersection is to be performed: the first party A determines its first private data set X(n), and the second party B determines its second private data set Y(m). In one implementation, each piece of private data x_i in the first private data set X(n) and each piece of private data y_j in the second private data set Y(m) may be a hash value obtained by performing hash calculation on the corresponding object identifier. The object identifier corresponding to the private data may be an identifier of a user (e.g., an identification number or a mobile phone number) or an identifier of an article (e.g., an article serial number). In one case, the size of the private data may be within a preset range; for example, the binary representation of private data x_i does not exceed L bits, where the specific value of L can be set according to the practical situation, for example, 80.
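As a non-limiting illustration, the derivation of a piece of private data from an object identifier might be sketched as follows. The choice of SHA-256, the truncation to the top L bits, and the names `private_data_from_identifier` and `L_BITS` are assumptions made for the sketch; the specification only requires a hash value of at most L bits.

```python
import hashlib

L_BITS = 80  # example value of L from the text; adjustable


def private_data_from_identifier(identifier: str, l_bits: int = L_BITS) -> int:
    """Hash an object identifier and keep the top l_bits bits as the private data."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    # SHA-256 yields 256 bits; shift right so that at most l_bits bits remain.
    return value >> (256 - l_bits)


# Hypothetical mobile phone number used as the object identifier.
x_i = private_data_from_identifier("13800000000")
```

Both parties would need to use the same hash function and the same L so that identical identifiers yield identical private data.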

Then, the first party A, the second party B, and the intermediate party (party C) determine the section division information used to bucket the private data, which facilitates the subsequent private data intersection. In one implementation, the first party A and the second party B may determine the section division information according to the private data sets they hold, divide sections based on that information to obtain t first sections, and then notify the intermediate party C of the section division information corresponding to the t first sections. In another implementation, the intermediate party C determines the section division information according to the private data sets held by the first party A and the second party B, and then notifies the first party A and the second party B of that information so that both parties can obtain the t first sections. Specifically, the intermediate party C may first obtain the number of data items in the first private data set X(n) and the number of data items in the second private data set Y(m), and then determine the division number t of the first sections based on the larger of the two numbers. The intermediate party C determines the section division information based on the division number t and sends it to the first party A and the second party B, so that the t first sections are determined.

In one case, the intermediate party C may be provided in advance with a data area that covers all the private data in the first and second private data sets. After determining the division number t, the intermediate party C determines the section division information based on the data area and the division number t.

In another case, the intermediate party C may determine the data area from the largest private data x_max of the first private data set X(n) and the largest private data y_max of the second private data set Y(m), both obtained in advance, so that the data area covers all the private data in the first and second private data sets, and then determine the section division information based on the data area and the division number t.

The t first sections may be determined by uniform division or non-uniform division. If the size of the data area is known to the first party A, the second party B, and the intermediate party C, the section division information may include only t. If the first party A and the second party B do not know the size of the data area, or the t first sections are determined by non-uniform division, the section division information includes the start position information and end position information corresponding to the t first sections.

In one case, the data area is represented in binary form and may be expressed as [0, 2^K], where K is not less than the aforementioned L. When the t first sections are uniformly divided, t may be determined by the following formula (1):

t = max(n, m) / a    (1),

where a is a preset constant, which may be set to 32. Correspondingly, the z-th first section may be denoted as [z/t * 2^K, (z+1)/t * 2^K), where z is an integer in [0, t-1].
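By way of example only, formula (1) and the lookup of the first section containing a given piece of private data might be sketched as follows. Rounding t up to an integer and clamping the value 2^K into the last section are assumptions of the sketch, as are the function names `segment_count` and `segment_index`.

```python
def segment_count(n: int, m: int, a: int = 32) -> int:
    """Formula (1): t = max(n, m) / a, rounded up here (the rounding is an assumption)."""
    return max(1, -(-max(n, m) // a))


def segment_index(x: int, t: int, k: int) -> int:
    """Index z of the first section [z/t * 2^K, (z+1)/t * 2^K) that contains x."""
    if not 0 <= x <= 1 << k:
        raise ValueError("x lies outside the data area [0, 2^K]")
    # floor(x * t / 2^K); the closed upper end 2^K is clamped into the last section.
    return min((x * t) >> k, t - 1)


t = segment_count(1000, 640)   # max(1000, 640) / 32, rounded up to 32
z = segment_index(3, 8, 4)     # with t = 8 and K = 4, section 1 is [2, 4)
```

Because the index is computed with integer arithmetic only, both parties obtain the same section for the same private data as long as they agree on t and K.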

It is to be understood that the section division information may also be determined by the first party a and the second party B, and the determination process may refer to the determination process by the intermediate party C, which is not described herein again.

Subsequently, after the first party A and the second party B have obtained the t first sections, the common data determination stage may be entered. The following embodiments are described from the perspective of the first party A and the intermediate party C; for the actions that the second party B needs to perform, reference may be made to the actions performed by the first party A.

Specifically, in the process of determining the common data, as shown in FIG. 2, first, in step S201, the first party A performs preset bucketing processing on each piece of private data x_i in the first private data set X(n) that it holds, to obtain first bucketed data [X]. The preset bucketing processing includes: for any piece of private data, filling the mapping value corresponding to the private data into a target bucket space among a plurality of preset bucket spaces, according to which of the t pre-divided first sections the private data falls into. Any first bucket space of the plurality of bucket spaces includes a number of first sections.

The mapping value corresponding to the private data may be the private data itself, or may be obtained by performing preset mapping processing on the private data. In one implementation, the preset mapping processing may be to compute the remainder of the private data with respect to a first random number and to determine the mapping value corresponding to the private data based on that remainder result.

In one case, to reduce the transmission amount, the last E bits of the remainder result corresponding to the private data may be used as the mapping value corresponding to the private data. In addition, to avoid data collisions (for example, the last several consecutive bits of the remainder results of different private data being the same), the value of E should not be too low. The value of E can be set empirically; for example, it can be set to 32.

In one implementation, the first random number may be generated by the intermediate party C and sent to the first party A and the second party B, respectively. In another implementation, the first random number may be generated by the first party A or the second party B and then sent to the other party. The first random number has no fewer than E bits.
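As a minimal sketch of the mapping processing described above, the mapping value may be computed as the last E bits of the remainder of the private data modulo the first random number. The 64-bit default length of the random number and the function names are assumptions chosen for illustration; the specification only requires that the first random number have no fewer than E bits.

```python
import secrets

E_BITS = 32  # example value of E from the text


def generate_first_random_number(bits: int = 64) -> int:
    """Draw a random number with exactly `bits` bits (bits >= E is required)."""
    if bits < E_BITS:
        raise ValueError("the first random number needs at least E bits")
    # Force the top bit so that the number really has `bits` bits.
    return secrets.randbits(bits) | (1 << (bits - 1))


def mapping_value(x: int, first_random: int, e_bits: int = E_BITS) -> int:
    """Remainder of x modulo the first random number, truncated to its last E bits."""
    remainder = x % first_random
    return remainder & ((1 << e_bits) - 1)
```

Both parties must use the same first random number, since only then do identical private data produce identical mapping values for the intermediate party to compare.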

The plurality of preset bucket spaces are determined based on the t first sections, and any first bucket space of the plurality of bucket spaces includes a number of first sections. Different bucket spaces may include different first sections or the same first section. For each piece of private data x_i in the first private data set X(n) that it holds, the first party A performs the preset bucketing processing on x_i, that is, according to which of the t pre-divided first sections x_i falls into, it fills the mapping value <x_i> corresponding to x_i into a target bucket space among the preset bucket spaces. The first section into which x_i falls may be determined based on the value of x_i.

In one implementation, every bucket space in the plurality of bucket spaces may include the same number of first sections. For example, each bucket space may include one first section; this type of bucket space may be referred to as a single-bucket space.

In another implementation, the plurality of bucket spaces may include multiple levels of bucket spaces, with bucket spaces of different levels containing different numbers of first sections. In one example, some bucket spaces may contain one first section (single-bucket spaces); some may contain two consecutive first sections (double-bucket spaces); some may contain three consecutive first sections (three-bucket spaces); and so on, so that some bucket spaces may contain p consecutive first sections (p-bucket spaces). In another example, each level of the multi-level bucket spaces may be a 2^c-bucket space, that is, a bucket space containing 2^c consecutive first sections, where 0 < 2^c <= t.

In one case, where t is odd, taking the double-bucket space as an example, when determining the double-bucket space that includes the last first section, the bucket space including the last (t-th) first section may be set directly as a double-bucket space, half of which does not include any of the t first sections. Similar situations in bucket spaces of other levels may be handled with reference to this arrangement.

It can be understood that different pieces of private data may fall into different first sections among the t pre-divided first sections. To avoid disclosing how the private data in a private data set is distributed, bucket spaces not filled with data may be filled with random numbers, and the filled random numbers differ from the mapping values corresponding to the pieces of private data in the private data set. In addition, the number of bucket spaces containing each first section may be set to be equal, that is, the number of bucket spaces containing the z-th first section is the same for every z. For example, if the plurality of bucket spaces are all single-bucket spaces, the number of single-bucket spaces containing each first section is equal.
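For illustration only, the random filling described above might be sketched as follows. The sketch assumes that each bucket space has a fixed capacity and that fillers are drawn with the same bit length E as the mapping values; the function name `pad_bucket` and its parameters are hypothetical.

```python
import secrets


def pad_bucket(values: list[int], capacity: int, used: set[int], e_bits: int = 32) -> list[int]:
    """Pad a bucket space to `capacity` entries with random fillers.

    `used` holds every real mapping value of the party's private data set,
    so that no filler coincides with a real mapping value.
    """
    if len(values) > capacity:
        raise ValueError("bucket space holds more values than its capacity")
    padded = list(values)
    taken = set(used)
    while len(padded) < capacity:
        filler = secrets.randbits(e_bits)
        if filler not in taken:  # reject real mapping values and earlier fillers
            taken.add(filler)
            padded.append(filler)
    return padded


padded = pad_bucket([5, 9], capacity=4, used={5, 9, 11})
```

After padding, every bucket space has the same length, so the bucketed data does not reveal how many real mapping values fell into each first section.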

As another example, when the plurality of bucket spaces include multiple levels of bucket spaces, the bucket spaces of each level are uniformly distributed to prevent leakage of the distribution of the private data, that is, within each level, the numbers of bucket spaces containing different first sections are equal. For single-bucket spaces, the number of single-bucket spaces containing the z-th first section is the same for every z. For double-bucket spaces, the number of double-bucket spaces containing the first and second first sections equals the number containing the third and fourth first sections, which equals the number containing the fifth and sixth first sections, and so on, up to the number containing the (t-1)-th and t-th first sections (when t is even).

In one case, each bucket space may correspond to a filled-data threshold. For example, with a threshold of 20, each bucket space is filled with at most 20 mapping values. When the number of mapping values filled into one bucket space reaches the threshold, subsequent mapping values are filled into another bucket space that includes the first section into which the corresponding private data falls.

Next, in step S202, the first party A sends the first bucketed data [X] to the intermediate party C. The first bucketed data [X] includes the mapping value corresponding to each piece of private data and does not expose the plaintext of any private data in the first private data set of the first party A.

Correspondingly, the second party B performs the preset bucketing processing on each piece of private data y_j in the second private data set Y(m) that it holds, to obtain second bucketed data [Y], and sends the second bucketed data [Y] to the intermediate party C.

The first party A and the second party B send the first bucketed data [X] and the second bucketed data [Y], respectively, to the intermediate party C according to the data transmission requirements of the intermediate party C.

In step S203, the intermediate party C obtains the first bucketed data [X] and the second bucketed data [Y]. Then, in step S204, it compares a first mapping value contained in a first bucket space of either party's bucketed data with each mapping value contained in a second bucket space of the other party's bucketed data, to obtain a comparison result for the first mapping value. The second bucket space and the first bucket space share a first section.

It can be understood that the preset bucketing processing fills the mapping value corresponding to the private data into the target bucket space according to which of the t first sections the private data falls into, that is, the data is stored in buckets by first section. Private data falling into the same first section may be identical, whereas private data falling into different first sections cannot be identical. Therefore, when making the comparison, it is only necessary to compare the mapping values in the first bucket space with the mapping values in a second bucket space that shares a first section with the first bucket space.

The following takes as an example the case where, for a mapping value contained in a bucket space of the first bucketed data [X] of the first party A, the corresponding mapping values are determined from the second bucketed data and compared. For a mapping value contained in a bucket space of the second bucketed data [Y] of the second party B, the corresponding mapping values are determined from the first bucketed data [X] and compared in the same way; reference may be made to the comparison process below, and details are not repeated.

Specifically, for a first bucket space in the first bucketed data of the first party A, the intermediate party C may first determine a second bucket space from the plurality of bucket spaces in the second bucketed data, where the second bucket space and the first bucket space share a first section, for example, both include the z-th first section. The intermediate party C then compares the first mapping value <x_i> contained in the first bucket space of the first bucketed data with each mapping value <y_j> contained in the second bucket space of the second bucketed data, to obtain a comparison result for the first mapping value. Where the plurality of bucket spaces include bucket spaces of different levels, the determined second bucket spaces are also of different levels; for clarity of layout, the types of the determined second bucket spaces are described later.

In one implementation, the comparison may specifically be as follows. The intermediate party C compares the first mapping value <x_i> contained in the first bucket space of the first bucketed data with each mapping value <y_j> contained in the second bucket space of the second bucketed data, that is, it determines whether the first mapping value <x_i> is the same as each mapping value <y_j>, and obtains an intermediate result for the first mapping value <x_i> and each mapping value <y_j>, i.e., as many intermediate results as there are mapping values <y_j> in the second bucket space. Each intermediate result characterizes whether the first mapping value <x_i> and the corresponding mapping value <y_j> are the same or different: a first value may characterize that <x_i> is the same as the corresponding <y_j>, and a second value may characterize that they are different. In one case, the first value may be 1 and the second value may be 0.

For example, the first mapping values <x_i> contained in a first bucket space of the first bucketed data are <x_i1>, <x_i2>, and <x_i3>, and the mapping values <y_j> contained in a second bucket space of the second bucketed data (which shares a first section with the first bucket space) are <y_j1>, <y_j2>, <y_j3>, and <y_j4>. The first mapping value <x_i1> is compared with <y_j1>, <y_j2>, <y_j3>, and <y_j4>, respectively, to obtain 4 intermediate results for <x_i1>; the first mapping value <x_i2> is compared with <y_j1>, <y_j2>, <y_j3>, and <y_j4>, respectively, to obtain 4 intermediate results for <x_i2>; and the first mapping value <x_i3> is compared with <y_j1>, <y_j2>, <y_j3>, and <y_j4>, respectively, to obtain 4 intermediate results for <x_i3>.

Subsequently, the intermediate party C performs an XOR operation on the intermediate results to obtain the comparison result of the first mapping value <x_i>. The comparison result of the first mapping value <x_i> characterizes whether the second bucketed data contains a mapping value that is the same as the first mapping value <x_i>. Continuing the above example, the 4 intermediate results for <x_i1> are XORed to obtain the comparison result of <x_i1>; the 4 intermediate results for <x_i2> are XORed to obtain the comparison result of <x_i2>; and the 4 intermediate results for <x_i3> are XORed to obtain the comparison result of <x_i3>.

Taking the first mapping value <x_i1> as an example, if the 4 intermediate results for <x_i1> characterize that one of <y_j1>, <y_j2>, <y_j3>, and <y_j4> is the same as <x_i1>, the comparison result of <x_i1> indicates that the second bucketed data contains a mapping value that is the same as <x_i1>. If the 4 intermediate results for <x_i1> characterize that none of <y_j1>, <y_j2>, <y_j3>, and <y_j4> is the same as <x_i1>, the comparison result of <x_i1> indicates that the second bucketed data contains no mapping value that is the same as <x_i1>.
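The comparison described above might be sketched as follows, with intermediate results of 1 (same) and 0 (different) combined by XOR. The sketch assumes that at most one mapping value in the second bucket space equals the first mapping value, in which case the XOR is 1 exactly when a match exists; the sample values and the function name `compare_mapping_value` are hypothetical.

```python
from functools import reduce
from operator import xor


def compare_mapping_value(first_value: int, second_bucket: list[int]) -> int:
    """Comparison result for one first mapping value against a second bucket space."""
    # Intermediate results: 1 (first value) if equal, 0 (second value) otherwise.
    intermediate = [1 if first_value == y else 0 for y in second_bucket]
    # XOR of all intermediate results; with at most one match, 1 means "match exists".
    return reduce(xor, intermediate, 0)


first_bucket = [17, 42, 99]         # stand-ins for <x_i1>, <x_i2>, <x_i3>
second_bucket = [8, 42, 77, 1234]   # stand-ins for <y_j1> ... <y_j4>
results = [compare_mapping_value(x, second_bucket) for x in first_bucket]
print(results)  # [0, 1, 0]
```

Each call is independent of the others, so the comparisons for different first mapping values can run in parallel.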

To save time, the comparisons of the different first mapping values contained in the first bucket space with the mapping values contained in the second bucket space of the other party's bucketed data can be executed in parallel.

Through the above comparison process, the intermediate party C can obtain the comparison result of each mapping value in each bucket space of the first bucketed data [X] and/or the comparison result of each mapping value in each bucket space of the second bucketed data [Y]. Subsequently, in step S205, the intermediate party C sends a result set formed by the comparison results of the mapping values in either party's bucketed data to the first party and the second party.

Next, in step S206, the first party A determines the data common to the private data sets of both parties from the first private data set X(n) based on the result set. In this step, where the intermediate party C has determined, for the mapping values contained in the bucket spaces of the first bucketed data [X] of the first party A, the corresponding mapping values from the second bucketed data and compared them, the result set includes a comparison result for each mapping value in the first bucketed data. Each comparison result characterizes whether the second bucketed data contains a mapping value that is the same as that mapping value in the first bucketed data, so the first party A can determine the common data of the two parties' private data sets from the first private data set X(n) based on the result set.

Accordingly, the second party B may determine the common data of the private data sets of both parties from the second private data set Y(m) based on the result set.
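As a hedged sketch of how a party might read the result set, assume that the party remembers which piece of private data it placed in each slot of its bucketed data and that the result set lists comparison results in the same slot order. Neither the slot bookkeeping nor the name `select_common_data` is prescribed by the specification.

```python
from typing import Optional


def select_common_data(slot_owner: list[Optional[int]], result_set: list[int]) -> list[int]:
    """Return the private data whose comparison result is 1.

    slot_owner[k] is the private data whose mapping value the party placed in
    slot k of its bucketed data, or None if slot k holds a random filler.
    result_set[k] is the comparison result returned for slot k.
    """
    if len(slot_owner) != len(result_set):
        raise ValueError("result set does not match the bucketed data layout")
    return [x for x, hit in zip(slot_owner, result_set) if x is not None and hit == 1]


# Hypothetical layout: slots 0, 2, 3 hold real data; slot 1 is a random filler.
common = select_common_data([10, None, 30, 40], [1, 0, 0, 1])
print(common)  # [10, 40]
```

Results reported for filler slots are simply ignored, since fillers do not correspond to any private data.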

In this embodiment, neither party reveals the plaintext of the private data in its private data set, so a secure private data intersection operation is realized. The first party A and the second party B each perform bucketing processing on their private data according to which of the t pre-divided first sections each piece of private data falls into, that is, they divide the private data into different bucket spaces. A first mapping value contained in a first bucket space of either party's bucketed data is then compared with each mapping value contained in a second bucket space of the other party's bucketed data that shares a first section with the first bucket space, to obtain a comparison result for the first mapping value. No global comparison of the private data is needed, which reduces the number of comparisons for each mapping value and the corresponding amount of computation. Moreover, the comparison involved is a ciphertext (mapping value) comparison, which is computationally inexpensive.

Through the process shown in FIG. 2, the data common to both parties can be determined from the first private data set and the second private data set. In one case, the first party A (or the second party B) can, based on the result set, delete from its private data set the data determined not to be common, thereby reducing the differing private data (i.e., non-common data) in the private data set. The process shown in FIG. 2 may therefore be referred to as a difference-reduction process.

In one embodiment of this specification, the plurality of bucket spaces may include multiple levels of bucket spaces, with bucket spaces of different levels containing different numbers of first sections. It will be appreciated that any first bucket space may include p consecutive first sections. Correspondingly, before performing the comparison in step S204, the intermediate party C determines, for the first bucket space, a second bucket space that may include bucket spaces of each level: the second bucket space includes a first subspace, a second subspace, and/or a third subspace. The first subspace includes a part of the p first sections (i.e., it is a bucket space containing fewer first sections than the first bucket space, and thus of a lower level than the first bucket space). The second subspace corresponds to the same first sections as the first bucket space (i.e., it is a bucket space containing the same number of first sections as the first bucket space, and thus of the same level). The third subspace includes, and is larger than, the first bucket space (i.e., it is a bucket space containing more first sections than the first bucket space, and thus of a higher level).

For example, when the first bucket space is a single-bucket space including the z-th first section, the second bucket space determined from the second bucketed data may include: the second subspace, i.e., the single-bucket space containing the z-th first section; and the third subspace, i.e., the double-bucket space containing the z-th first section, the three-bucket space containing the z-th first section, and bucket spaces of higher levels containing the z-th first section.

Also for example, when the first bucket space is a double-bucket space including the z-th and (z+1)-th first sections, the determined second bucket space may include: the first subspace, i.e., the single-bucket space containing the z-th first section and the single-bucket space containing the (z+1)-th first section; the second subspace, i.e., the double-bucket space containing the z-th and (z+1)-th first sections; and the third subspace, i.e., the three-bucket space containing at least the z-th and (z+1)-th first sections, and bucket spaces of higher levels. As shown in FIG. 3, t = 8, the 8 first sections are uniformly divided, and the first bucket space is a double-bucket space including the first and second first sections. The second bucket space determined for this double-bucket space may include: the single-bucket space containing the first first section and the single-bucket space containing the second first section (the first subspace mentioned above); the double-bucket space containing the first and second first sections (the second subspace mentioned above); and the three-bucket space containing the first, second, and third first sections, the four-bucket space containing the first to fourth first sections, and bucket spaces of higher levels. In another case, where the bucket spaces of each level are 2^c-bucket spaces, the third subspace is the four-bucket space containing at least the z-th and (z+1)-th first sections, the eight-bucket space, and 2^c-bucket spaces of higher levels, where c is an integer.

Also for example, when the first bucket space is a t-bucket space including the first to t-th first sections, the determined second bucket space may include: the first subspace, i.e., the single-bucket spaces each containing any one of the first to t-th first sections, the double-bucket spaces each containing two consecutive first sections among the first to t-th first sections (the first sections contained in different double-bucket spaces do not repeat), and so on, up to the (t-1)-bucket space containing the first to (t-1)-th first sections; and the second subspace, i.e., the t-bucket space containing the first to t-th first sections.
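The determination of the second bucket space and its subspaces might be sketched as follows. A bucket space is modeled here as a 0-based start section plus a width (its level), and the example layout uses aligned 2^c-bucket spaces for t = 8; both the representation and the names `BucketSpace` and `second_bucket_spaces` are assumptions made for the sketch.

```python
from typing import NamedTuple


class BucketSpace(NamedTuple):
    start: int  # index of the first covered first section (0-based)
    width: int  # number of consecutive first sections covered (the level)


def shares_section(a: BucketSpace, b: BucketSpace) -> bool:
    """True if the two bucket spaces have at least one first section in common."""
    return a.start < b.start + b.width and b.start < a.start + a.width


def second_bucket_spaces(first: BucketSpace, others: list[BucketSpace]) -> dict[str, list[BucketSpace]]:
    """Group the other party's overlapping bucket spaces by level relative to `first`."""
    groups: dict[str, list[BucketSpace]] = {
        "first_subspace": [], "second_subspace": [], "third_subspace": []}
    for b in others:
        if not shares_section(first, b):
            continue  # no common first section, so no comparison is needed
        if b.width < first.width:
            groups["first_subspace"].append(b)
        elif b.width == first.width:
            groups["second_subspace"].append(b)
        else:
            groups["third_subspace"].append(b)
    return groups


# t = 8 with single-, double-, four- and eight-bucket spaces in the other party's data.
others = [BucketSpace(s, w) for w in (1, 2, 4, 8) for s in range(0, 8, w)]
groups = second_bucket_spaces(BucketSpace(0, 2), others)
```

For the double-bucket space covering the first and second first sections, only 5 of the 15 bucket spaces in this layout are selected for comparison.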

Arranging the bucket spaces in multiple levels protects the private data of the first party A and the second party B, and the distribution of that data, while reducing the data transmission amount to a certain extent. When bucket spaces of only one level are used, for example only single-bucket spaces each containing one first section, more private data may fall into some first sections (i.e., the single-bucket spaces containing those first sections hold more private data) and less private data may fall into others (i.e., the single-bucket spaces containing those first sections hold less private data). In this case, since the distribution of the private data must not be revealed, the number (data storage amount) of the single-bucket spaces holding less private data needs to be set according to the number (data storage amount) of the single-bucket spaces holding more private data, and correspondingly a matching number of random numbers needs to be filled into the single-bucket spaces holding less private data to conceal that they hold less. The filled random numbers increase the data transmission amount and the workload of the subsequent comparison process.

To solve this problem, multi-level bucket spaces are proposed. A basic data storage amount (provided by at least one single-bucket space) is set for each single-bucket space level; after all the single-bucket spaces containing a certain first section are full, the private data falling into that first section may be filled into a higher-level bucket space (e.g., a double-bucket space or a three-bucket space) containing that first section. In this way, a plurality of single-bucket spaces and a shared data storage amount (provided by the higher-level bucket spaces that contain the first sections of those single-bucket spaces) are provided.

For example, suppose 16 pieces of private data fall into the first first section and 8 pieces fall into the second first section. If only single-bucket spaces are provided, 8 random numbers need to be added to the single-bucket space containing the second first section to conceal that it holds less private data, and 16 × 2 = 32 data items need to be transmitted.

If multiple levels of bucket spaces are provided, for example single-bucket spaces and a double-bucket space, the data storage amount of each single-bucket space may be set to 8 (i.e., 8 data items are stored). Once the single-bucket space containing the first first section has stored 8 data items, i.e., is full, the mapping values corresponding to the remaining 8 pieces of private data falling into the first first section can be stored in the double-bucket space containing the first and second first sections. The double-bucket space then conceals, to some extent, the fact that less private data falls into the second first section, and the amount of data transmitted is 24 items rather than 32, so the data transmission amount is reduced.
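The arithmetic of this example can be checked with a short sketch. It assumes one single-bucket space per first section and one double-bucket space per pair of adjacent first sections, with the double-bucket space also holding 8 items; the function names are hypothetical.

```python
def single_level_transfer(counts: list[int]) -> int:
    """Items sent when every single-bucket space is padded to the largest count."""
    return max(counts) * len(counts)


def two_level_transfer(counts: list[int], single_capacity: int, double_capacity: int) -> int:
    """Items sent with one single-bucket space per section and one double-bucket space per pair."""
    pairs = [counts[i:i + 2] for i in range(0, len(counts), 2)]
    for pair in pairs:
        # Private data that does not fit into the single-bucket spaces of this pair.
        overflow = sum(max(0, c - single_capacity) for c in pair)
        if overflow > double_capacity:
            raise ValueError("overflow does not fit into the double-bucket space")
    return single_capacity * len(counts) + double_capacity * len(pairs)


counts = [16, 8]  # private data falling into the first and second first sections
print(single_level_transfer(counts))     # 32
print(two_level_transfer(counts, 8, 8))  # 24
```

The saving comes from the two sections sharing the double-bucket space instead of each being padded to the size of the fuller section.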

In an embodiment of this specification, when filling the mapping value corresponding to the private data into a target bucket space among the plurality of preset bucket spaces, the first party A may specifically determine a plurality of candidate bucket spaces that include a target section, where the target section is the first section, among the t first sections, into which the private data falls. Among the plurality of candidate bucket spaces, the mapping value is preferentially filled into a bucket space that contains the fewest first sections and is not full, and is preferentially added to a bucket space that already contains mapping values.

In this embodiment, the plurality of candidate bucket spaces may include multi-level bucket spaces. According to the first section, among the t first sections, into which the private data falls, the first party A may preferentially add the mapping value, among the plurality of candidate bucket spaces, to a bucket space that contains the fewest first sections and is not full, and preferentially to a bucket space that already contains mapping values. That is, the mapping value is preferentially added to a bucket space of a lower level that is not full. For example, for private data falling into the first first section, the corresponding mapping values are preferentially added to a single-bucket space containing the first first section (hereinafter referred to as a first single-bucket space); after one first single-bucket space is full, the mapping values are added to the next first single-bucket space, and so on; after all the first single-bucket spaces are full, the mapping values are added to a double-bucket space containing the first and second first sections (hereinafter referred to as a first double-bucket space); after one first double-bucket space is full, the mapping values are added to the next first double-bucket space, and so on.

Then, when a piece of private data is determined to fall into the second first section, the corresponding mapping value is added to a single-bucket space containing the second first section (hereinafter, a second single-bucket space), and so on; once all the second single-bucket spaces are full, the mapping value is added to a double-bucket space containing the first and second first sections (i.e., a first double-bucket space). In one case, if a first double-bucket space is not full, the mapping value is preferentially added to a first double-bucket space that already contains a mapping value. In another case, if all the first double-bucket spaces are full, the mapping value is added to a three-bucket space containing the first to third first sections (i.e., a first three-bucket space), preferably one in which a mapping value is already stored.

As shown in fig. 4, the plurality of candidate bucket spaces includes multi-level bucket spaces. With t = 8 and the 8 first sections uniformly divided, as in fig. 4, the candidate bucket spaces include: 3 sets of single-bucket spaces (i.e., there are 3 single-bucket spaces containing each first section); 2 sets of double-bucket spaces (i.e., there are 2 each of the double-bucket spaces containing the first and second, the third and fourth, the fifth and sixth, and the seventh and eighth first sections); 2 sets of four-bucket spaces (i.e., there are 2 each of the four-bucket spaces containing the first to fourth and the fifth to eighth first sections); and 1 set of eight-bucket spaces.

Bucket spaces of the same level that together cover the t first sections may be called a level bucket zone. For example, for single-bucket spaces, a single-bucket space containing the first first section, a single-bucket space containing the second first section, ..., a single-bucket space containing the z-th first section, ..., and a single-bucket space containing the t-th first section make up one single-bucket zone, as shown in fig. 4. For double-bucket spaces, a double-bucket space containing the first and second first sections, a double-bucket space containing the third and fourth first sections, ..., and a double-bucket space containing the (t-1)-th and t-th first sections (t being even) make up one double-bucket zone, as shown in fig. 4, and so on. That is, the 3 sets of single-bucket spaces may be called 3 single-bucket zones, the 2 sets of double-bucket spaces 2 double-bucket zones, the 2 sets of four-bucket spaces 2 four-bucket zones, and the 1 set of eight-bucket spaces 1 eight-bucket zone.

It is understood that, within one level bucket zone, the bucket spaces of that level contain no first section in common, as shown in fig. 4.

As shown in fig. 4, for private data 1 falling into the first of the first sections, the corresponding mapping value 1 is preferentially added to a bucket space that is not full and contains the fewest first sections (i.e., is lowest in the hierarchy) and, among such spaces, to one that already contains a mapping value. In fig. 4, the single-bucket spaces containing that section in the 1st and 2nd single-bucket zones are fully occupied, while the single-bucket space containing that section in the 3rd single-bucket zone still has room, so mapping value 1 is filled into the single-bucket space containing that section in the 3rd single-bucket zone.

For private data 2 falling into the same first section, if the single-bucket space containing that section in the 3rd single-bucket zone is still not full, the corresponding mapping value 2 is also filled into that single-bucket space.

For private data 3 falling into the second first section, as shown in fig. 4, all single-bucket spaces and double-bucket spaces containing the second first section are fully occupied, and the 1st four-bucket space containing the second first section already stores mapping values, so the corresponding mapping value 3 is filled into that 1st four-bucket space.
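The filling priority described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the class `BucketSpace`, the functions `build_spaces` and `fill`, the fixed per-space capacity, the zero-based section indices, and the tie-break by zone order are all introduced here for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BucketSpace:
    width: int      # number of first sections covered (1 = single, 2 = double, ...)
    zone: int       # index of the level bucket zone this space belongs to
    block: int      # which aligned group of `width` first sections it covers
    capacity: int   # assumed fixed number of mapping values a space can hold
    values: list = field(default_factory=list)

    def is_full(self):
        return len(self.values) >= self.capacity


def build_spaces(t, zones_per_width, capacity):
    """Create every bucket space, e.g. zones_per_width = {1: 3, 2: 2, 4: 2, 8: 1}."""
    spaces = []
    for width, zones in sorted(zones_per_width.items()):
        for zone in range(zones):
            for block in range(t // width):
                spaces.append(BucketSpace(width, zone, block, capacity))
    return spaces


def fill(spaces, section, mapping_value):
    """Add a mapping value for private data falling into `section` (0-based)."""
    candidates = [s for s in spaces
                  if s.block == section // s.width and not s.is_full()]
    if not candidates:
        raise OverflowError("no candidate bucket space has room")
    # Lowest level first; within a level prefer a space that already holds
    # mapping values; remaining ties are broken by zone order (an assumption).
    target = min(candidates, key=lambda s: (s.width, not s.values, s.zone))
    target.values.append(mapping_value)
    return target
```

With the t = 8 layout and a capacity of 2, the first six values for one section fill its three single-bucket spaces in turn, and the seventh moves up to the double-bucket space covering that section.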

In one case, the plurality of candidate bucket spaces is determined by the first party A based on the first private data set that it holds. Alternatively, to better serve the first party A and the second party B, the plurality of candidate bucket spaces may be determined by the intermediary party C. Correspondingly, when the intermediary party C determines that the first party A and the second party B are to perform a secure private data intersection operation, before step S201 the intermediary party C may also determine a suggested number of bucket spaces of each level for the first party A and for the second party B, and send the suggested numbers to the first party A and the second party B respectively, so that each party determines its own plurality of bucket spaces.

The intermediary party C can determine the suggested number PA of bucket spaces of each level for the first party A according to the number of data in the first private data set of the first party A, and determine the suggested number PB of bucket spaces of each level for the second party B according to the number of data in the second private data set of the second party B. It can be understood that, to ensure that all private data can be added to the bucket spaces, the total number of data slots provided by the suggested numbers of bucket spaces across all levels is greater than the number of pieces of private data.

In one case, the middle party C may pre-store the suggested number coefficients corresponding to the respective hierarchical bucket spaces, and accordingly, the middle party C determines the product of the pre-stored suggested number coefficients corresponding to the respective hierarchical bucket spaces and the number of data in the private data set of any party (the first party a or the second party B) as the suggested number of the respective hierarchical bucket spaces of any party. In another case, the mediator C stores in advance a correspondence between each data number range and the number of each hierarchical bucket space, and determines the suggested number of each hierarchical bucket space of any one party based on the correspondence and the number of data of the private data set of any one party.

Correspondingly, the first party A obtains the suggested number PA of the barrel space of each layer from the middle party C; and determines a number of bucket spaces based on the proposed number PA. In one implementation, after obtaining the suggested number of each hierarchical bucket space, the first party a determines, according to the suggested number, each hierarchical bucket space of a corresponding number as a candidate bucket space. Then, in the plurality of candidate bucket spaces, the mapping value is preferentially added to the less than full bucket space containing the smallest number of the first sections, and the mapping value is preferentially added to the bucket space already containing the mapping value, so as to obtain a plurality of bucket spaces filled with the mapping value.

When the first party a completes filling the mapping values corresponding to all the private data, there may be some hierarchical bucket spaces in which no mapping value is stored, and accordingly, the first party a may discard the hierarchical bucket spaces in which no mapping value is stored, that is, the hierarchical bucket spaces in which no mapping value is stored are not used as the plurality of bucket spaces (the bucket spaces transmitted to the middle party C).

Subsequently, the first party a may obtain the actual number of each hierarchical bucket space filled with the mapping value, and send the actual number of each hierarchical bucket space to the middle party C, so that the middle party C may determine, based on the actual number of each hierarchical bucket space, a first bucket space in the first sub-bucket data and a second bucket space in the second sub-bucket data that share the first section.

In one implementation, where the plurality of bucket spaces includes multi-level bucket spaces, an address identifier can be used to indicate which lower-level bucket spaces share a first section with an upper-level bucket space. For example, for a double-bucket space, 1 bit indicates which single-bucket space contains the first section in question.

In one case, the intermediary party C may provide the first party A and the second party B with a suggested number for each level bucket zone, to ensure that the bucket spaces within each level bucket zone are evenly distributed. Subsequently, after the first party A (or the second party B) has added the mapping values of all private data to the corresponding bucket spaces, any level bucket zone in which no mapping value is stored (i.e., all of its bucket spaces) is discarded. For example, in fig. 4, if the mapping values corresponding to all private data have been filled and no mapping value is stored in the eight-bucket zone, the eight-bucket zone is discarded. For the other level bucket zones, in which mapping values are stored, there may be bucket spaces that store no mapping value and/or are not full. In one case, to avoid disclosing the distribution of the private data, random numbers are added to such bucket spaces. Alternatively, considering that the data transmission volume should not be too large, the bucket spaces in which no mapping value is stored may be discarded.

In one embodiment of the present specification, to better protect the private data of the first party A and the second party B, the intermediary party is a secret computing center comprising M executing parties. The M executing parties may run in trusted execution environments (TEEs), with each executing party implemented by one TEE; in this case, an executing party may be called a trusted executing party, and the secret computing center may be called a trusted secret computing center. Alternatively, the M executing parties may run in ordinary execution environments. Accordingly, the first party A and the second party B send the first sub-bucket data [X] and the second sub-bucket data [Y], respectively, to the intermediary party C in encrypted form. Specifically, the first party A divides each mapping value in the plurality of bucket spaces of the first sub-bucket data [X] into M parts to obtain M first sub-bucket data fragments, and sends the M fragments to the M executing parties respectively. Likewise, the second party B divides each mapping value in the plurality of bucket spaces of the second sub-bucket data [Y] into M parts to obtain M second sub-bucket data fragments, and sends the M fragments to the M executing parties respectively.

Correspondingly, step S203 is specifically configured such that each executing party obtains the respective sub-bucket data fragments from the first party A and the second party B, where each party has divided each mapping value in the plurality of bucket spaces of its sub-bucket data into M parts.

Accordingly, step S204 is specifically configured such that the executing parties compare, by means of secure multi-party computation (MPC), the first mapping value with each mapping value contained in the second bucket space, to obtain a comparison result for the first mapping value.

In this embodiment, each of the M executing parties obtains only fragments of the sub-bucket data, that is, one part of each mapping value in the plurality of bucket spaces, so no executing party can obtain the plaintext of a mapping value. Moreover, when each executing party runs in a trusted execution environment (TEE), the secret computing center protects the data more strongly and resists attacks better.

In addition, when the first party A and the second party B upload their sub-bucket data to the intermediary party C (the trusted secret computing center), the data is uploaded as sub-bucket data fragments. The sub-bucket data of both parties is thus uploaded in encrypted form, which also protects it, to a certain extent, during transmission.
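One way to picture the division of each mapping value into M parts is additive secret sharing, sketched below. The specification does not name the sharing scheme, so the additive scheme, the ring size `MODULUS`, and the function names `split_into_shares`, `reconstruct` and `shard_bucket_data` are all assumptions made for illustration.

```python
import secrets

MODULUS = 2 ** 64  # assumed ring size; the source does not fix one


def split_into_shares(value, m):
    """Split `value` into m additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(m - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine shares; only possible when all m shares are available."""
    return sum(shares) % MODULUS


def shard_bucket_data(bucket_data, m):
    """bucket_data: list of bucket spaces, each a list of mapping values.

    Returns m fragments with the same shape, one per executing party.
    """
    fragments = [[[] for _ in bucket_data] for _ in range(m)]
    for b, space in enumerate(bucket_data):
        for value in space:
            for party, share in enumerate(split_into_shares(value, m)):
                fragments[party][b].append(share)
    return fragments
```

Under this assumed scheme, any single fragment is uniformly random on its own, which matches the statement that no executing party can obtain the plaintext of a mapping value.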

In theory, the common data of the first party A and the second party B determined from their private data through the process shown in fig. 2 is the data truly shared by the two parties. However, when the mapping values corresponding to the private data are determined, data collisions are difficult to avoid; accordingly, the common data determined through the process shown in fig. 2 may include private data that merely has the same mapping value and is not actually common. In view of this, the present specification provides a further process for determining common data, shown schematically in fig. 5. First, in step S501, the first party A performs the preset bucket-dividing processing on each piece of private data x_i in the first private data set X(n) that it holds, to obtain first sub-bucket data [X].

Correspondingly, the second party B performs the preset bucket-dividing processing on each piece of private data y_j in the second private data set Y(m) that it holds, to obtain second sub-bucket data [Y], and sends the second sub-bucket data [Y] to the intermediary party C.

In step S502, the first party A sends the first sub-bucket data [X] to the intermediary party C. The first sub-bucket data [X] includes the mapping value corresponding to each piece of private data, so the plaintext of the private data in the first private data set of the first party A is not exposed.

In step S503, the intermediary party C obtains the first sub-bucket data [X] and the second sub-bucket data [Y]. Then, in step S504, the intermediary party C compares a first mapping value contained in a first bucket space of either party's sub-bucket data with each mapping value contained in a second bucket space of the other party's sub-bucket data, obtaining a comparison result for the first mapping value. The second bucket space and the first bucket space share a first section.

In step S505, the intermediary party C sends a result set, formed from the comparison results of the mapping values in either party's sub-bucket data, to the first party and the second party.
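The comparison in step S504 can be pictured with the plaintext sketch below. It is deliberately simplified: in the embodiments with M executing parties the comparison runs over fragments via MPC, whereas here the mapping values are compared directly. The representation of a bucket space as a `(start_section, width)` range plus a list of values, and the names `overlaps` and `compare`, are assumptions made for the example.

```python
def overlaps(a, b):
    """a, b: (start_section, width) ranges of first sections."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]


def compare(own, other):
    """own / other: lists of ((start_section, width), [mapping values]).

    Returns one boolean per mapping value of `own`, in order: True when an
    identical mapping value exists in a bucket space of `other` that shares
    at least one first section with the value's own bucket space.
    """
    results = []
    for span, values in own:
        pool = {v for s, vs in other if overlaps(span, s) for v in vs}
        results.extend(v in pool for v in values)
    return results
```

Restricting the comparison to bucket spaces that share a first section is what keeps the number of pairwise comparisons small compared with comparing every value against every value.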

The specific implementation process of steps S501 to S505 may refer to the specific implementation process of steps S201 to S205 in the flow shown in fig. 2, and is not described herein again.

Next, in step S506, the first party A deletes from the first private data set X(n), based on the result set, the data that is not common with the second private data set Y(m), obtaining an updated first private data set X(n)'. The result set includes a comparison result for each mapping value in the first sub-bucket data. Each comparison result indicates whether the second sub-bucket data contains a mapping value identical to that mapping value, that is, whether the private data corresponding to the mapping value is common data of the first and second private data sets. Compared with X(n), the updated first private data set X(n)' may contain fewer pieces of private data.

Correspondingly, the second party B deletes from the second private data set Y(m), based on the result set, the data that is not common with the first private data set X(n), obtaining an updated second private data set Y(m)'. Compared with Y(m), the updated second private data set Y(m)' may contain fewer pieces of private data.

Subsequently, in step S507, for T pre-divided second sections, the first party A performs an XOR operation on the mapping values of the private data in the updated first private data set X(n)' that fall into each second section, obtaining T XOR values corresponding to the T second sections, and takes the T XOR values as a first XOR result.

In one implementation, the T second sections may be divided by the first party A and the second party B based at least on the number of data in their respective updated private data sets, after which the division corresponding to the T second sections is notified to the intermediary party C. Alternatively, the intermediary party C may determine the division corresponding to the T second sections based at least on the number of data in the updated private data sets of the first party A and the second party B, and then notify the first party A and the second party B of the division, so that both parties obtain the T second sections.

In one case, the division may be determined based on the number of data in the updated private data set of each of the first party A and the second party B and on the data area [0, 2^K]. The T second sections may be divided uniformly or non-uniformly. When the T second sections are divided uniformly, T is determined as T = max(h, g)/b, where h denotes the number of data in the updated first private data set, g denotes the number of data in the updated second private data set, and b is a preset empirical value, for example, 80. The u-th second section can be expressed as [u/T * 2^K, (u+1)/T * 2^K), where u is an integer in [0, T-1].
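The uniform division can be sketched as follows. The source does not say how T = max(h, g)/b is rounded or how non-integer boundaries are handled; rounding T up, rounding boundaries up to integers, and restricting private data to integers in [0, 2^K) are assumptions of this sketch, as are the function names.

```python
def section_count(h, g, b=80):
    """T = max(h, g) / b, rounded up so that T >= 1 (rounding is assumed)."""
    return max(1, -(-max(h, g) // b))


def second_sections(T, K):
    """Half-open integer ranges [lo, hi) of the T uniform second sections."""
    size = 2 ** K
    # Ceiling division keeps the ranges consistent with section_index below.
    return [(-(-u * size // T), -(-(u + 1) * size // T)) for u in range(T)]


def section_index(x, T, K):
    """Index u of the second section containing integer x, 0 <= x < 2**K."""
    return x * T // 2 ** K
```

For example, with h = 800, g = 640, b = 80 and K = 10, this gives T = 10 sections, the first covering [0, 103) and the last [922, 1024).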

In another case, the new data area may be re-determined according to the maximum private data in the first private data set updated by the first party a and the second private data set updated by the second party B, and the division may be determined based on the number of data in the updated private data sets of the first party a and the second party B and the new data area.

For the T pre-divided second sections, the first party A determines, for each piece of private data in the updated first private data set X(n)', the second section into which the private data falls based on its size, and fills the mapping value corresponding to the private data into the storage area corresponding to that second section. In this way, the mapping value corresponding to each piece of private data in the updated first private data set X(n)' is filled into the storage area corresponding to the second section into which that private data falls. A storage area storing mapping values of private data in the updated first private data set X(n)' is called a first storage area.

The first party A then performs an XOR operation on the mapping values in the first storage area corresponding to each of the T pre-divided second sections, obtaining T XOR values corresponding to the T second sections as the first XOR result. Each second section corresponds to one XOR value.

In one implementation, each mapping value may be represented in binary form, and the XOR value corresponding to a second section is obtained by performing a bitwise XOR operation over all mapping values in the first storage area corresponding to that second section.
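The per-section XOR can be sketched as below. The function name `xor_result`, the uniform section index `x * T // 2**K`, and the convention that an empty section yields 0 are assumptions for illustration; the `mapping` argument stands in for whatever mapping the parties have agreed on.

```python
from functools import reduce
from operator import xor


def xor_result(private_data, T, K, mapping):
    """Return the T XOR values, one per second section.

    private_data: integers in [0, 2**K); mapping: private data -> mapping value.
    """
    storage = [[] for _ in range(T)]          # one storage area per section
    for x in private_data:
        storage[x * T // 2 ** K].append(mapping(x))
    # Bitwise XOR over each storage area; an empty area yields 0 (assumed).
    return [reduce(xor, values, 0) for values in storage]
```

Because XOR is commutative and associative, two parties holding the same private data in a section obtain the same XOR value regardless of the order in which they process it.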

In one possible embodiment, the mapping value corresponding to each piece of private data in the updated first private data set X(n)' is determined based on the remainder of the private data with respect to a second random number. If two pieces of private data a and b have the same remainder with respect to the first random number (i.e., a data collision occurs), the probability that they also have the same remainder with respect to the second random number is comparatively low. Therefore, to reduce data collisions, the second random number is different from the first random number used in the "reduce different" flow.
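A small numeric illustration of why a different second random number helps is given below. The two moduli and the two data values are hypothetical, chosen only so that a collision occurs under the first random number.

```python
def mapping_value(private_data, random_number):
    """Mapping value as the remainder of the private data (per the text)."""
    return private_data % random_number


first_random, second_random = 1009, 1013   # hypothetical random numbers
a, b = 5, 5 + first_random                 # constructed to collide mod 1009

collide_first = mapping_value(a, first_random) == mapping_value(b, first_random)
collide_second = mapping_value(a, second_random) == mapping_value(b, second_random)
```

Here `collide_first` is true while `collide_second` is false, so the false match produced by the first mapping is exposed by the second.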

The second random number may be generated by the first party A or the second party B, which then notifies the other party, so that both parties use the same second random number to determine the mapping values of the private data in their respective updated private data sets. Alternatively, the second random number may be generated by the intermediary party C and then notified to the first party A and the second party B.

Correspondingly, for the T pre-divided second sections, the second party B performs an XOR operation on the mapping values of the private data in the updated second private data set that fall into each second section, obtaining T XOR values corresponding to the T second sections as a second XOR result. For the specific process, refer to the process by which the first party A determines the first XOR result, which is not repeated here. The second party B stores the mapping values of the private data in the updated second private data set that fall into each second section in the storage area corresponding to that second section; for clarity of description, a storage area storing mapping values of private data in the updated second private data set is called a second storage area.

In step S508, the first party a sends the first xor result to the intermediary C. Accordingly, the second party B sends the second xor result to the intermediate party C. In one implementation, the first party a and the second party B send the first xor result and the second xor result to the intermediary party C, respectively, according to the data transmission requirement of the intermediary party C. In one case, the intermediate party C is a secret computing center, and includes M executing parties, and the data transmission requirement may be that each party divides each xor value in the xor result thereof into M parts to obtain M xor result fragments, and sends each xor result fragment to each executing party, respectively.

Next, in step S509, the intermediary party C obtains, based on the first XOR result sent by the first party A and the second XOR result sent by the second party B, a determination result indicating whether the corresponding XOR values of the two parties are the same. Specifically, the intermediary party C determines whether the r-th XOR value in the XOR result of either party is the same as the r-th XOR value in the XOR result of the other party, obtaining a determination result for the r-th XOR value. The r-th XOR values of the two parties, i.e., the corresponding XOR values of the two parties, correspond to the same second section, namely the r-th second section. Through this process, a determination result is obtained for each second section.

If the determination result corresponding to the r-th second section indicates that the r-th XOR values of the two parties are the same, then the mapping values that the first party A filled into the first storage area corresponding to the r-th second section are taken to be the same as the mapping values that the second party B filled into the second storage area corresponding to the r-th second section; accordingly, the private data of the first party A falling into that second section and the private data of the second party B falling into that second section are common data. If the determination result indicates that the r-th XOR values of the two parties are different, then the mapping values filled into the first storage area and into the second storage area corresponding to the r-th second section differ; accordingly, the private data of the first party A falling into that second section and the private data of the second party B falling into that second section are treated as non-common data.

It can be understood that, whenever the mapping values of the private data of the first party A and of the second party B falling into the same second section differ in any way, all private data falling into that second section is treated as non-common data. Therefore, when the second sections are divided, each second section is made as small as possible, so that as few pieces of private data as possible fall into it.

In step S510, the intermediary party C sends the determination results corresponding to the respective second sections to the first party A and the second party B. Thereafter, in step S511, the first party A determines the common data of both parties in the updated first private data set based on the determination results. In one case, the first party A selects, from the determination results corresponding to the second sections, those indicating that the corresponding XOR values are the same, and takes the second sections they correspond to as target second sections; it then determines the private data in the updated first private data set that falls into a target second section to be common data of both parties.

Accordingly, the second party B determines common data of both parties in the updated second private data set based on the determination result. See above for a way in which the first party a determines the common data of both parties in the updated first private data set.

Subsequently, the first party A may remove from the updated first private data set the private data that falls into the target second sections, which has been output as common data, to obtain a further updated first private data set containing only the private data that does not fall into a target second section. Likewise, the second party B may remove from the updated second private data set the private data that falls into the target second sections, to obtain a further updated second private data set containing only the private data that does not fall into a target second section. Steps S507 to S511 thus output the common data in the updated private data sets, and may be referred to as the "output same" flow.
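Steps S509 to S511 and the subsequent removal can be sketched in plaintext as follows. The function names `judge` and `split_by_judgment` and the uniform section index are assumptions; with a secret computing center, the equality test would be evaluated over fragments rather than on plaintext XOR values as it is here.

```python
def judge(first_xor, second_xor):
    """Intermediary side: one determination result per second section."""
    return [a == b for a, b in zip(first_xor, second_xor)]


def split_by_judgment(private_data, judgments, K):
    """Party side: separate data in target second sections from the rest.

    Returns (common, remaining): `common` is output as common data and
    `remaining` is the further updated private data set.
    """
    T = len(judgments)
    common, remaining = [], []
    for x in private_data:
        (common if judgments[x * T // 2 ** K] else remaining).append(x)
    return common, remaining
```

The `remaining` list is what a party would carry into a further round when higher precision is required.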

When the common data needs to be determined with high precision, the first party A and the second party B, together with the intermediary party C, may return to step S301 and continue with the most recently updated first and second private data sets, ending the loop once the updated first private data set and the updated second private data set are empty. In this way, high-precision common data of the first party A and the second party B is determined round by round.

In another implementation, the "reduce different" flow may be performed multiple times on the private data sets held by the first party A and the second party B, after which the "output same" flow is performed on the updated private data sets produced by the last "reduce different" flow. Alternatively, the "reduce different" flow may be performed once on the private data sets held by the first party A and the second party B, after which the "output same" flow is performed multiple times to obtain the common data.

For the "reduce different" flow above, assume that the number n of data in the first private data set equals the number m of data in the second private data set, i.e., n = m, and correspondingly t = n/a, where a = 32. Assume further that the first and second sub-bucket data each include 36 single-bucket zones, 10 double-bucket zones, 8 each of four-bucket, eight-bucket and sixteen-bucket zones, and 32 thirty-two-bucket zones, and that every bucket space in every level bucket zone is full. The number of bucket spaces transmitted is then: 36t + (10/2)t + (8/4)t + (8/8)t + (8/16)t + (8/32)t + (32/64)t = 45.25t. The total amount of data transmitted is 45.25 × t × 32 = 45.25n. Assuming the secret computing center transmits with two components (two executing parties), the amount of data transmitted is 45.25n × 2 = 90.5n.
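The arithmetic above can be checked with the short calculation below. The zone counts are read off the terms of the sum (each term is a zone count divided by a bucket width); treating each bucket space as holding a = 32 values is an assumption implied by the step 45.25 × t × 32 = 45.25n.

```python
from fractions import Fraction

a = 32  # t = n / a, and each bucket space is assumed to hold a values

# bucket width -> number of zones, taken from the terms of the sum
zones = {1: 36, 2: 10, 4: 8, 8: 8, 16: 8, 32: 8, 64: 32}

# Bucket spaces transmitted, as a multiple of t: each zone of a given
# width contributes t / width bucket spaces.
spaces_per_t = sum(Fraction(count, width) for width, count in zones.items())

# Values transmitted per piece of private data: spaces_per_t * t * a / n,
# and t = n / a, so the factors a cancel.
values_per_item = spaces_per_t * a * Fraction(1, a)

# Two executing parties each receive one fragment of every value.
values_per_item_two_parties = values_per_item * 2
```

Exact fractions are used so that the result can be compared with the 45.25 and 90.5 figures without floating-point rounding.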

For the "output same" flow, assume that the number n' of data in the updated first private data set equals the number m' of data in the updated second private data set, i.e., n' = m', and T = n'/80. The amount of plaintext is T × 80 = n'; if the secret computing center transmits with two components (two executing parties), the amount of data transmitted is 2n'.

After the "reduce different" flow and the "output same" flow have each been performed once, the number of data in the updated first and second private data sets is greatly reduced, so the data transmission volume of subsequent rounds is negligible.

Accordingly, the amount of data transmitted per piece of private data in the embodiments of the present specification is less than 100 bits (90.5 + 2 = 92.5), which is comparable to the amount of data needed to transmit the private data in plaintext.

The foregoing describes certain embodiments of the present specification, and other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily have to be in the particular order shown or in sequential order to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

Corresponding to the above method embodiments, the present specification provides a system for determining common data for protecting data privacy, a schematic block diagram of which is shown in fig. 6, and includes a first party 610, a second party 620 and an intermediary party 630, wherein,

the first party 610 is configured to perform preset barreling on each piece of private data in a first private data set to obtain first barreled data, and send the first barreled data to the intermediary party 630, where the preset barreling includes, for any piece of private data, filling a mapping value corresponding to the private data into a target barrel space in a plurality of preset barrel spaces according to a first segment in which the private data falls in t pre-divided first segments; any first barrel space of the plurality of barrel spaces comprises a plurality of first sections;

the middle party 630 is configured to, after obtaining the first bucket data and the second bucket data, compare the first mapping value included in the first bucket space in the bucket data of any party with each mapping value included in the second bucket space in the bucket data of the other party to obtain a comparison result for the first mapping value; sending a result set formed by comparison results of each mapping value in the bucket dividing data of any party to the first party 610 and the second party 620; wherein the second bucket space and the first bucket space share a first section, and the second sub-bucket data is obtained by the second party 620 performing the preset sub-bucket processing on a second private data set held by the second party;

the first party 610 is further configured to determine common data of the private data sets of both parties from the first private data set based on the result set.

In an alternative embodiment, the first bucket space comprises p consecutive first sections; the intermediary party 630, prior to the comparing, is further configured to:

In an alternative embodiment, the intermediary party 630, prior to the obtaining of the respective bucketized data from the first and second parties, is further configured to: determining a suggested number of bucket spaces for each level; sending the suggested number to a first party and a second party respectively;

the first party 610 is further configured to determine the plurality of bucket spaces based on the suggested number.

In an alternative embodiment, the intermediary party 630, prior to the obtaining of the respective bucketized data from the first and second parties, is further configured to: determining the division number t of the first section based on the maximum value of the number of data in the privacy data sets of the two parties; determining section division information based on the division number t; respectively sending the section division information to a first party and a second party;

the first party 610 is further configured to determine the t first sections based on the section division information.

the intermediary party 630, prior to the obtaining of the respective bucket data from the first and second parties, is further configured to: generating the first random number; sending the first random number to the first party and the second party respectively;

the first party 610 is further configured to use the first random number to perform a remainder operation on each piece of data in the first privacy data set to obtain a mapping value corresponding to each piece of privacy data.
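The remainder-based mapping described above can be sketched as follows. This is a minimal illustration only, assuming each piece of private data is a non-negative integer (for example a hash of an object identifier) and that the first random number is a positive integer. The function names, the 32-bit size, and the sample values are hypothetical and do not come from the specification.

```python
import secrets


def generate_first_random_number(bits: int = 32) -> int:
    """Intermediary side (hypothetical helper): draw a random positive modulus."""
    # Setting the top bit guarantees a non-zero value with exactly `bits` bits.
    return secrets.randbits(bits) | (1 << (bits - 1))


def mapping_values(private_data: list[int], first_random_number: int) -> list[int]:
    """First-party side: mapping value = private datum mod the first random number."""
    return [x % first_random_number for x in private_data]


# A fixed modulus is used here only to make the example reproducible;
# in the described system the value would be received from the intermediary party.
values = mapping_values([1000, 12345, 96], 97)
print(values)
```

Because both parties receive the same first random number, identical private data yield identical mapping values on both sides.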

In an alternative embodiment, the intermediary party 630 is a secret computing center that includes M executing parties;

the intermediate party 630 is specifically configured to, in the process of respectively obtaining the respective bucket data from the first party and the second party:

the intermediary 630, in the process of obtaining the comparison result for the first mapping value, is specifically configured to:

In an optional implementation manner, the intermediary party 630, in the process of obtaining the comparison result for the first mapping value, is specifically configured to: comparing the first mapping value with each mapping value contained in the second barrel space to obtain each intermediate result corresponding to each mapping value; and carrying out exclusive OR operation on each intermediate result to obtain the comparison result aiming at the first mapping value.
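The exclusive-or aggregation of intermediate results can be sketched in plaintext as below. The sketch assumes that each intermediate result is a single bit (1 for equal, 0 otherwise) and that the mapping values within the second bucket space are distinct, so at most one intermediate result is 1 and the exclusive-or then coincides with "a match exists". The function name is hypothetical, and a real deployment would perform the comparison on protected data rather than in the clear.

```python
from functools import reduce
from operator import xor


def compare_with_bucket(first_value: int, second_bucket: list[int]) -> int:
    """Return 1 if first_value matches a value in second_bucket, else 0."""
    # One intermediate result per mapping value in the second bucket space.
    intermediate = [int(first_value == v) for v in second_bucket]
    # XOR of the intermediate bits; equals OR when at most one bit is set.
    return reduce(xor, intermediate, 0)


hit = compare_with_bucket(5, [1, 5, 9])
miss = compare_with_bucket(4, [1, 5, 9])
print(hit, miss)
```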

In an optional implementation manner, in the process of filling the mapping value corresponding to the private data into the target bucket space among the plurality of preset bucket spaces, the first party 610 is specifically configured to: determine a plurality of candidate bucket spaces containing a target section, the target section being the first section, among the t first sections, into which the private data falls; and, among the plurality of candidate bucket spaces, preferentially add the mapping value to a bucket space that contains the fewest first sections and is not full, and preferentially add the mapping value to a bucket space that already contains mapping values.
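One possible reading of this filling rule is sketched below. The `BucketSpace` class, its fields, and the capacities are hypothetical. The sketch also assumes that the two preferences are applied in order: fewest first sections first, then, as a tie-breaker, a bucket space that already holds mapping values. The specification does not state how the two preferences are combined.

```python
from dataclasses import dataclass, field


@dataclass
class BucketSpace:
    start: int          # index of the first covered first section
    num_sections: int   # number of consecutive first sections covered
    capacity: int       # maximum number of mapping values (assumed)
    values: list[int] = field(default_factory=list)

    def covers(self, section: int) -> bool:
        return self.start <= section < self.start + self.num_sections

    def is_full(self) -> bool:
        return len(self.values) >= self.capacity


def fill(buckets: list[BucketSpace], target_section: int, mapping_value: int) -> BucketSpace:
    """Add mapping_value to the preferred non-full candidate bucket space."""
    candidates = [b for b in buckets if b.covers(target_section) and not b.is_full()]
    if not candidates:
        raise OverflowError("no candidate bucket space has room")
    # Prefer fewest sections; among equals, prefer a non-empty bucket space.
    target = min(candidates, key=lambda b: (b.num_sections, len(b.values) == 0))
    target.values.append(mapping_value)
    return target


# Two single-section bucket spaces plus one larger space covering both sections.
buckets = [BucketSpace(0, 1, 1), BucketSpace(1, 1, 1), BucketSpace(0, 2, 2)]
fill(buckets, 0, 11)  # goes to the single-section bucket for section 0
fill(buckets, 0, 22)  # that bucket is full, so the value moves up to the larger one
fill(buckets, 1, 33)  # goes to the single-section bucket for section 1
print([b.values for b in buckets])
```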

In an alternative embodiment, the first party 610, in determining the common data of the two parties from the first private data set based on the result set, is specifically configured to: delete, based on the result set, data that is not common with the second private data set from the first private data set to obtain an updated first private data set; for T pre-divided second sections, perform an exclusive-or operation on the mapping values of the private data in the updated first private data set that fall into each second section, to obtain T exclusive-or values corresponding to the T second sections as a first exclusive-or result; and send the first exclusive-or result to the intermediary party 630;

the intermediary party 630 is further configured to determine, based on the first exclusive-or result and a second exclusive-or result correspondingly sent by the second party 620, whether the i-th exclusive-or value in the exclusive-or result of either party is the same as the i-th exclusive-or value in the exclusive-or result of the other party, so as to obtain a determination result for the i-th exclusive-or value; and to send the determination result corresponding to each second section to the first party 610 and the second party 620;

the first party 610 is further configured to determine common data of both parties in the updated first private data set based on the determination result.
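The per-section exclusive-or check can be sketched as follows. The sketch assumes that the T second sections are equal-width ranges over a known value space and that mapping values are remainders modulo a shared number. Both choices, along with all names and sample data, are illustrative assumptions. How each party uses the determination results afterwards is not shown.

```python
def section_index(x: int, T: int, space_size: int) -> int:
    """Hypothetical rule: equal-width second sections over [0, space_size)."""
    return x * T // space_size


def xor_per_section(data: list[int], T: int, space_size: int, modulus: int) -> list[int]:
    """Party side: XOR the mapping values of the data falling into each second section."""
    result = [0] * T
    for x in data:
        result[section_index(x, T, space_size)] ^= x % modulus
    return result


def judge(first_xor: list[int], second_xor: list[int]) -> list[bool]:
    """Intermediary side: is the i-th XOR value the same for both parties?"""
    return [a == b for a, b in zip(first_xor, second_xor)]


# Updated private data sets of the two parties (made-up values).
first_xor = xor_per_section([3, 30, 55, 80], T=4, space_size=100, modulus=7)
second_xor = xor_per_section([3, 30, 56, 80], T=4, space_size=100, modulus=7)
judgment = judge(first_xor, second_xor)
print(first_xor, second_xor, judgment)
```

In this example only the third second section differs, because 55 and 56 fall into the same section but have different mapping values.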

Corresponding to the above method embodiment, this specification embodiment provides an apparatus 700 for determining common data for protecting data privacy, where the apparatus is deployed at an intermediary party, and a schematic block diagram of the apparatus is shown in fig. 7, and includes:

a first obtaining module 710 configured to obtain respective bucket data from a first party and a second party, where the bucket data of each party is obtained by that party performing preset bucketing processing on the private data set it holds, and the preset bucketing processing includes, for any piece of private data, filling a mapping value corresponding to the private data into a target bucket space among a plurality of preset bucket spaces according to the first section, among t pre-divided first sections, into which the private data falls; any first bucket space of the plurality of bucket spaces comprises a plurality of first sections;

a first comparing module 720 configured to compare, for a first mapping value contained in a first bucket space in the bucket data of either party, the first mapping value with each mapping value contained in a second bucket space in the bucket data of the other party, to obtain a comparison result for the first mapping value; wherein the second bucket space and the first bucket space share a common first section;

a first sending module 730, configured to send a result set formed by comparison results of each mapping value in the bucket data of any party to the first party and the second party, for determining common data of the private data sets of both parties.
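The comparison carried out by the modules above can be sketched in plaintext as follows. The representation of bucket data as `(section_range, mapping_values)` pairs, with `section_range = (start, num_sections)`, is a hypothetical encoding chosen for the example. A real intermediary party would not see the mapping values in the clear if the comparison is performed under multi-party computation.

```python
# Bucket data: list of ((start, num_sections), [mapping values]) pairs (assumed encoding).
BucketData = list[tuple[tuple[int, int], list[int]]]


def share_section(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two ranges of consecutive first sections overlap."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]


def compare_bucket_data(own: BucketData, other: BucketData) -> list[int]:
    """For each mapping value of `own`, 1 if it appears in an overlapping bucket of `other`."""
    results = []
    for own_range, own_values in own:
        for value in own_values:
            hit = any(
                value in other_values
                for other_range, other_values in other
                if share_section(own_range, other_range)
            )
            results.append(int(hit))
    return results


first = [((0, 1), [11]), ((1, 1), [33]), ((0, 2), [22])]
second = [((0, 1), [22]), ((1, 1), [44]), ((0, 2), [11, 33])]
first_results = compare_bucket_data(first, second)
second_results = compare_bucket_data(second, first)
print(first_results, second_results)
```

Restricting each comparison to bucket spaces that share a first section is what keeps the number of pairwise comparisons below a full cross-comparison of the two data sets.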

In an alternative embodiment, the first bucket space comprises p consecutive first sections; the apparatus further comprises:

a space determining module (not shown in the figure) configured to determine the second bucket space before the first mapping value contained in the first bucket space in the bucket data of either party is compared with each mapping value contained in the second bucket space in the bucket data of the other party, where the second bucket space includes a first subspace, a second subspace and/or a third subspace; the first subspace includes a part of the p first sections; the second subspace corresponds to the same first sections as the first bucket space; and the third subspace contains, and is larger than, the first bucket space.

In an alternative embodiment, the apparatus further comprises:

a quantity determination module (not shown in the figure) configured to determine a suggested number of bucket spaces of each level before the respective bucket data is obtained from the first party and the second party;

a third sending module (not shown in the figure) configured to send the suggested number to the first party and the second party respectively, so that the first party and the second party determine the plurality of bucket spaces based on the suggested number.

In an alternative embodiment, the apparatus further comprises:

a second determining module (not shown in the figure) configured to determine the division number t of the first sections based on the maximum of the numbers of data items in the private data sets of the first party and the second party, before the respective bucket data is obtained from the first party and the second party;

a determination and sending module (not shown in the figure) configured to determine section division information based on the division number t, and to send the section division information to the first party and the second party respectively, so that the first party and the second party determine the t first sections.

the device further comprises:

a generating module (not shown in the figure) configured to generate the first random number before the respective bucket data is obtained from the first party and the second party;

a fourth sending module (not shown in the figure) configured to send the first random number to the first party and the second party, respectively.

In an alternative embodiment, the apparatus further comprises:

a third obtaining module (not shown in the figure), configured to obtain respective xor results from the first party and the second party, wherein the xor results include T xor values corresponding to T pre-divided second sections, and any ith xor value is obtained by performing xor operation on mapping values corresponding to privacy data falling into the ith second section in an updated privacy data set of the parties, and the updated privacy data set is obtained by deleting non-shared data from the privacy data set of the parties based on the result set;

a judging module (not shown in the figure) configured to judge whether the ith exclusive-or value in the exclusive-or result of any one party is the same as the ith exclusive-or value in the exclusive-or result of the other party, so as to obtain a judgment result for the ith exclusive-or value;

a fifth sending module (not shown in the figure) configured to send the determination result corresponding to each second section to the first party and the second party, for determining the common data of the updated privacy data sets of the two parties.

the first obtaining module 710 is specifically configured to obtain, by each executing party, a respective bucket data fragment from each of the first party and the second party, where the bucket data fragments are obtained by each party dividing each mapping value in its plurality of bucket spaces into M parts;

the first comparing module 720 is specifically configured to compare, by the executing parties through secure multi-party computation (MPC), the first mapping value with each mapping value included in the second bucket space, so as to obtain the comparison result.
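The division of a mapping value into M parts can be illustrated with XOR-based secret sharing, as sketched below. The specification does not state which sharing scheme is used, so the XOR scheme, the 64-bit share size, and the function names are assumptions made for illustration. The MPC comparison protocol itself is not shown.

```python
import secrets
from functools import reduce
from operator import xor


def split_into_shares(value: int, m: int, bits: int = 64) -> list[int]:
    """Split `value` into m shares whose XOR equals `value` (assumed scheme)."""
    # m - 1 shares are uniformly random; the last one makes the XOR come out right.
    shares = [secrets.randbits(bits) for _ in range(m - 1)]
    shares.append(reduce(xor, shares, value))
    return shares


def reconstruct(shares: list[int]) -> int:
    """XOR all shares together to recover the original value."""
    return reduce(xor, shares, 0)


# Each of the M = 3 executing parties would receive one share of the mapping value.
shares = split_into_shares(12345, m=3)
recovered = reconstruct(shares)
print(len(shares), recovered)
```

Under this scheme any M - 1 shares are uniformly random, so no single executing party learns the mapping value from the fragment it holds.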

In an optional implementation manner, the first comparing module 720 is specifically configured to compare the first mapping value with each mapping value included in the second barrel space, so as to obtain each intermediate result corresponding to each mapping value;

Corresponding to the above method embodiment, this specification embodiment provides an apparatus 800 for determining common data for protecting data privacy, where the apparatus is deployed at a first party, and a schematic block diagram of the apparatus is shown in fig. 8, and includes:

a first bucketing processing module 810 configured to perform preset bucketing processing on each piece of private data in a held first private data set to obtain first bucket data, where the preset bucketing processing includes, for any piece of private data, filling a mapping value corresponding to the private data into a target bucket space among a plurality of preset bucket spaces according to the first section, among t pre-divided first sections, into which the private data falls; any first bucket space of the plurality of bucket spaces comprises a plurality of first sections;

a second sending module 820 configured to send the first bucket data to an intermediary party so that the intermediary party determines a result set based on the first bucket data and second bucket data, wherein the second bucket data is obtained by a second party performing the preset bucketing processing on a second private data set held by the second party; the result set comprises a comparison result obtained by comparing a first mapping value in a first bucket space in the first bucket data with each mapping value contained in a second bucket space in the second bucket data, wherein the second bucket space and the first bucket space share a common first section;

a second obtaining module 830 configured to obtain the result set from the intermediary party;

a first determining module 840 configured to determine common data of both parties from the first set of privacy data based on the set of results.

In an alternative embodiment, the apparatus further comprises:

a fourth obtaining module (not shown in the figure) configured to obtain a first random number from the intermediary party;

a remainder module (not shown in the figure) configured to perform, using the first random number, a remainder operation on each piece of data in the first private data set to obtain a mapping value corresponding to each piece of private data.

In an alternative embodiment, the apparatus further comprises:

a fifth obtaining module (not shown in the figure) configured to obtain section division information from the intermediary party, the section division information being determined based on a division number t, and the division number t being determined by the intermediary party based on the maximum of the numbers of data items in the first private data set and the second private data set;

a third determining module (not shown in the figure) configured to determine the t first sections based on the section division information.

In an alternative embodiment, the apparatus further comprises:

a sixth obtaining module (not shown in the figures) configured to obtain a suggested number of bucket spaces of each hierarchy from the intermediary;

a fourth determining module (not shown in the figures) configured to determine the plurality of bucket spaces according to the suggested number.

In an optional embodiment, the first bucket processing module 810 is specifically configured to determine a plurality of candidate bucket spaces including a target section, where the target section is a first section of the t first sections, into which the private data falls;

In an optional embodiment, the first determining module 840 is specifically configured to delete, based on the result set, data that is not common to the second private data set from the first private data set, resulting in an updated first private data set;

obtaining the judgment result from the intermediary party;

The above apparatus embodiments correspond to the method embodiments and achieve the same technical effects; for specific descriptions, reference may be made to the corresponding method embodiments, which are not repeated here.

The present specification also provides a computer-readable storage medium having a computer program stored thereon which, when executed in a computer, causes the computer to perform the method for determining common data for protecting data privacy provided in this specification.

The embodiment of the specification further provides a computing device comprising a memory and a processor, wherein the memory stores executable code, and the processor, when executing the executable code, implements the method for determining common data for protecting data privacy provided in this specification.

The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the storage medium and the computing device embodiments, since they are substantially similar to the method embodiments, they are described relatively simply, and reference may be made to some descriptions of the method embodiments for relevant points.

Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in connection with the embodiments of the invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.

The above-mentioned embodiments further describe the objects, technical solutions and advantages of the embodiments of the present invention in detail. It should be understood that the above description is only exemplary of the embodiments of the present invention, and is not intended to limit the scope of the present invention, and any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention should be included in the scope of the present invention.

Claims

1. A method of determining common data that protects data privacy, the method performed by an intermediary, comprising:

2. The method of claim 1, wherein the plurality of bucket spaces comprises a plurality of levels of bucket spaces, different levels of bucket spaces containing different numbers of first sections.

3. The method of claim 2, wherein the first barrel space comprises p consecutive first sections; prior to said comparing, further comprising:

4. The method of claim 2, further comprising, prior to said obtaining respective bucketized data from the first and second parties, respectively:

determining a suggested number of bucket spaces for each level;

5. The method of claim 1, further comprising, prior to said obtaining respective bucketized data from the first and second parties, respectively:

6. The method of claim 1, wherein the mapping value corresponding to the private data is determined based on a remainder result of the private data taken for the first random number;

generating the first random number;

7. The method according to claim 1, wherein each piece of privacy data is a hash value obtained by performing a hash calculation on the corresponding object identifier.

8. The method of any of claims 1-7, further comprising:

9. The method of any of claims 1-7, wherein the intermediary party is a secret computing center comprising M executing parties;

the obtaining of the comparison result for the first mapping value includes:

10. The method of any of claims 1-7, wherein the obtaining a comparison result for a first mapping value comprises:

11. A method of determining common data that protects data privacy, the method performed by a first party, the method comprising:

obtaining the result set from the intermediary;

12. The method of claim 11, further comprising:

obtaining a first random number from the intermediary;

13. The method of claim 11, wherein the plurality of bucket spaces comprises a plurality of levels of bucket spaces, different levels of bucket spaces containing different numbers of first sections.

14. The method of claim 11, further comprising:

determining the t first sections based on the spatial division information.

15. The method of claim 13, further comprising:

determining the plurality of bucket spaces according to the suggested number.

16. The method of claim 13, wherein the filling a mapping value corresponding to the private data into a target bucket space of a plurality of preset bucket spaces comprises:

17. The method according to claim 11, wherein each piece of data in the first private data set is a hash value obtained by performing a hash calculation on the corresponding object identifier.

18. The method of any of claims 11-17, wherein the determining, based on the result set, data common to both parties from the first privacy dataset comprises:

obtaining the judgment result from the intermediary party;

19. A method of determining common data that protects data privacy, the method performed by a first party and an intermediary party, the method comprising:

20. A system for determining common data for protecting data privacy includes a first party, a second party, and an intermediary party, wherein,

21. An apparatus for determining common data for protecting data privacy, the apparatus being deployed at an intermediary party, comprising:

22. An apparatus for determining common data for protecting privacy of data, the apparatus being deployed at a first party, comprising:

a first determination module configured to determine common data of both parties from the first set of privacy data based on the set of results.

23. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that when executed by the processor implements the method of any of claims 1-10 or 11-18.