WO2023124400A1 - Multi-party data query method and apparatus for protecting data privacy - Google Patents

Multi-party data query method and apparatus for protecting data privacy

Info

Publication number
WO2023124400A1
WO2023124400A1 PCT/CN2022/125462 CN2022125462W WO2023124400A1 WO 2023124400 A1 WO2023124400 A1 WO 2023124400A1 CN 2022125462 W CN2022125462 W CN 2022125462W WO 2023124400 A1 WO2023124400 A1 WO 2023124400A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
ciphertext
row
sorting
attribute
Prior art date
Application number
PCT/CN2022/125462
Other languages
English (en)
French (fr)
Inventor
潘无穷
韦韬
李婷婷
李天一
Original Assignee
支付宝(杭州)信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司
Priority to EP22913664.3A (EP4345670A1)
Publication of WO2023124400A1
Priority to US18/400,427 (US20240135026A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries

Definitions

  • This specification relates to the technical field of data security, and in particular to a multi-party data query method and device for protecting data privacy.
  • party A holds the user's height, weight and other physical status data
  • party B holds the user's salary data
  • party C holds the user's loan data.
  • the protection and security of data privacy become issues worthy of attention.
  • Party A may need to query the sum of the salaries of the users whose salaries rank in the top N, or the loan status of the users whose salaries rank in the top N, or the sum of the salaries of the users whose heights rank in the top M, and so on.
  • parties A, B, and C directly upload the private data of the users they hold to a third party in plaintext
  • the third party aggregates the data into a table and then sorts the corresponding data in the table; based on the sorting results
  • this process may reveal the private data of users of all parties.
  • One or more embodiments of this specification provide a multi-party data query method and device for protecting data privacy, so as to realize the protection of multi-party private data.
  • a multi-party data query method for protecting data privacy, wherein the multiple parties include a plurality of data owners, each of which holds the attribute values of several attribute items of N target objects, and the method is executed by an intermediary other than the multiple parties; the method includes:
  • sorting the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorting table, and obtaining, based on the target sorting table, the sorting-related data as a query result.
  • the intermediary is a secret-state computing system that includes M executors;
  • the obtaining of the attribute value ciphertexts of the N target objects sent by each data owner includes:
  • each of the M executors obtains one attribute value fragment sent by each data owner as an attribute value ciphertext, wherein each attribute value fragment is determined by the data owner dividing an attribute value it holds into M shares
  • the shuffling of the ciphertext table in units of rows includes:
  • the sorting-related data is the result of performing a specified calculation on the attribute value ciphertexts at specified X positions in the sorting sequence corresponding to the target attribute item;
  • the obtaining of the sorting-related data as the query result includes:
  • the shuffling of the ciphertext table in units of rows includes:
  • the ciphertext table is shuffled through at least one round of a shuffling process, wherein any current round of the shuffling process includes:
  • at least one pair of replacement rows is determined from the target sub-table;
  • for each pair of replacement rows, the positions of the two rows are swapped with a certain probability to obtain the shuffled sub-table corresponding to the target sub-table; the output table of the current round of the shuffling process is determined based on the shuffled sub-table and is used to form the shuffled table.
  • the certain probability is determined according to random numbers generated for the pair of replacement rows.
  • when the current round is the first round of the shuffling process, the target sub-table is the ciphertext table; when the current round is not the first round, the target sub-table is a table, among the output tables of the previous round, whose number of rows is not less than 2.
  • the determining of at least one pair of replacement rows from the target sub-table includes:
  • each iteration of the loop includes selecting, at least based on the current loop count, at least one pair of replacement rows from the target sub-table, until a loop end condition related to the loop threshold is reached.
  • the selecting of at least one pair of replacement rows from the target sub-table includes:
  • the determining of the output table of the current round of the shuffling process based on the shuffled sub-table includes:
  • the shuffled sub-table is divided to determine the output table of the current round of the shuffling process.
  • the row label of each row in the ciphertext table is expressed in binary
  • the sorting of the attribute value ciphertexts corresponding to the target attribute item in the shuffled table includes:
  • any one of the sorting processes includes:
  • the grouping also yields a third row set whose attribute value ciphertexts are equal to the reference ciphertext.
  • any one of the sorting processes further includes arranging the first row set, the second row set, and the third row set corresponding to each table part, so that the first row set is placed below the third row set and the second row set is placed above the third row set.
  • the shuffling of the ciphertext table in units of rows includes:
  • the preset shuffling condition includes: the total number of rows is equal to an integer power of a preset value;
  • the ciphertext table is padded with specific rows to obtain a padded table whose total number of rows meets the preset shuffling condition;
  • the obtaining of the target sorting table includes:
  • the specific padding rows are removed from the sorted table to obtain the target sorting table.
  • a multi-party data query device for protecting data privacy, wherein the multiple parties include multiple data owners, each of which holds the attribute values of several attribute items of N target objects, and the device is deployed at an intermediary other than the multiple parties; the device includes:
  • an obtaining module configured to obtain the attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table, wherein, in the ciphertext table, one row corresponds to one target object and one column corresponds to one attribute item;
  • a shuffling module configured to shuffle the ciphertext table in units of rows to obtain a shuffled table;
  • a sorting module configured to, in response to a query instruction for querying sorting-related data for a target attribute item, sort the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorting table;
  • a query module configured to obtain the sorting-related data as a query result based on the target sorting table.
  • the intermediary is a secret-state computing system that includes M executors;
  • the obtaining module is specifically configured such that each of the M executors obtains one attribute value fragment sent by each data owner as an attribute value ciphertext, wherein each attribute value fragment is determined by the data owner dividing an attribute value it holds into M shares;
  • the shuffling module is specifically configured such that each executor, based on the attribute value ciphertexts it holds and through a secure multi-party computation (MPC) scheme, cooperates with the other M-1 executors to shuffle the ciphertext table in units of rows.
  • the sorting-related data is the result of performing a specified calculation on the attribute value ciphertexts at specified X positions in the sorting sequence corresponding to the target attribute item;
  • the query module is specifically configured to obtain the target attribute value ciphertexts at the specified X positions from the target sorting table;
  • the shuffling module is specifically configured to shuffle the ciphertext table through at least one round of a shuffling process, wherein the shuffling module carries out any current round of the shuffling process through the following units:
  • a first determining unit configured to, for each target sub-table of the ciphertext table, determine at least one pair of replacement rows from the target sub-table;
  • a position replacement unit configured to swap the positions of each pair of replacement rows with a certain probability, so as to obtain the shuffled sub-table corresponding to the target sub-table;
  • a second determining unit configured to determine the output table of the current round of the shuffling process based on the shuffled sub-table, the output table being used to form the shuffled table.
  • the certain probability is determined according to random numbers generated for the pair of replacement rows.
  • when the current round is the first round of the shuffling process, the target sub-table is the ciphertext table; when the current round is not the first round, the target sub-table is a table, among the output tables of the previous round, whose number of rows is not less than 2.
  • the first determining unit is specifically configured to determine a loop threshold based on the number of rows of the target sub-table
  • each iteration of the loop includes selecting, at least based on the current loop count, at least one pair of replacement rows from the target sub-table, until a loop end condition related to the loop threshold is reached.
  • the first determining unit is specifically configured to select, from the target sub-table, the row whose row label is equal to the current loop count and the row whose row label is equal to the sum of the current loop count and the loop threshold, as a pair of replacement rows.
  • the second determining unit is specifically configured to use the loop threshold to divide the shuffled sub-table, so as to determine the output table of the current round of the shuffling process.
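
The round structure described above (pair row k with row k + threshold, swap with some probability, then divide at the threshold) can be sketched as follows. This is a minimal plaintext illustration, not the patented protocol: the threshold of half the row count, the swap probability of 1/2, and the names `shuffle_round` and `shuffle_table` are all assumptions, and a real deployment would perform the conditional swap obliviously over ciphertexts. The sketch makes no claim about how uniform the resulting permutation is.

```python
import random

def shuffle_round(sub_table, rng):
    """Run one round on one target sub-table and return its two output tables."""
    rows = list(sub_table)
    threshold = len(rows) // 2              # assumed loop threshold: half the rows
    for count in range(threshold):          # current loop count doubles as a row label
        partner = count + threshold         # row label = loop count + loop threshold
        if rng.random() < 0.5:              # assumed swap probability, from a random draw
            rows[count], rows[partner] = rows[partner], rows[count]
    # Divide the shuffled sub-table at the threshold to form the output tables.
    return rows[:threshold], rows[threshold:]

def shuffle_table(table, rng):
    """Repeat rounds on every output table that still has at least 2 rows."""
    if len(table) < 2:
        return list(table)
    upper, lower = shuffle_round(table, rng)
    return shuffle_table(upper, rng) + shuffle_table(lower, rng)

# Hypothetical 8-row table; plaintext tuples stand in for ciphertext rows.
table = [("user%d" % i, 1000 * i) for i in range(8)]
shuffled = shuffle_table(table, random.Random(7))
```

Each row moves as a unit, so the attribute values inside a row stay together while the row positions change.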
  • the row label of each row in the ciphertext table is expressed in binary
  • the first determining unit is specifically configured to select, from the target sub-table, two rows whose row labels are identical except for the i-th bit as a pair of replacement rows, wherein i is equal to the current loop count and the target sub-table is the ciphertext table itself.
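
The binary-label variant can be illustrated in the same spirit. The sketch below assumes that bit 0 is the least significant bit, that the loop count runs over every bit of the largest row label, and that each pair is swapped with probability 1/2; none of these details is stated in the text above, and `bit_pairs` and `shuffle_by_bits` are hypothetical names.

```python
import random

def bit_pairs(n_rows, i):
    """Yield index pairs (a, b), a < b, whose binary labels differ only in bit i."""
    for a in range(n_rows):
        b = a ^ (1 << i)                    # flip bit i of the row label
        if a < b < n_rows:
            yield a, b

def shuffle_by_bits(table, rng):
    """Shuffle the whole table in place of sub-tables, one bit position per loop."""
    rows = list(table)
    n_bits = max(1, (len(rows) - 1).bit_length())
    for i in range(n_bits):                 # i equals the current loop count
        for a, b in bit_pairs(len(rows), i):
            if rng.random() < 0.5:          # assumed swap probability
                rows[a], rows[b] = rows[b], rows[a]
    return rows

# With 4 rows, loop 0 pairs labels 00/01 and 10/11; loop 1 pairs 00/10 and 01/11.
pairs_bit0 = list(bit_pairs(4, 0))
pairs_bit1 = list(bit_pairs(4, 1))
```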
  • the sorting module is specifically configured to iteratively execute multiple sorting processes, wherein any one of the sorting processes includes:
  • the grouping also yields a third row set whose attribute value ciphertexts are equal to the reference ciphertext.
  • the sorting module is further specifically configured to arrange the first row set, the second row set, and the third row set corresponding to each table part, so that the first row set is placed below the third row set and the second row set is placed above the third row set.
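
A plaintext sketch of this grouping-based sort is given below. It assumes that the first row set holds values smaller than the reference and the second row set holds values larger than it (which yields descending order), and that the reference is taken from the first row of each table part; the text above does not confirm either choice. In the actual system the comparisons would be secure comparisons over ciphertexts, and `partition_sort_desc` is a hypothetical name.

```python
def partition_sort_desc(rows, key_col):
    """Sort rows in descending order of column key_col by repeated three-way grouping."""
    if len(rows) < 2:
        return list(rows)
    reference = rows[0][key_col]                             # assumed choice of reference
    first = [r for r in rows if r[key_col] < reference]      # first row set (assumed: smaller)
    second = [r for r in rows if r[key_col] > reference]     # second row set (assumed: larger)
    third = [r for r in rows if r[key_col] == reference]     # third row set: equal to reference
    # The second set goes above the third set and the first set below it;
    # each of the two outer table parts is then processed again.
    return (partition_sort_desc(second, key_col)
            + third
            + partition_sort_desc(first, key_col))

rows = [("a", 3), ("b", 7), ("c", 3), ("d", 1)]
ranked = partition_sort_desc(rows, key_col=1)
```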
  • the shuffling module includes:
  • a judging unit configured to judge whether the total number of rows of the ciphertext table satisfies a preset shuffling condition, wherein the preset shuffling condition includes: the total number of rows is equal to an integer power of a preset value;
  • a padding unit configured to, if it is judged that the preset shuffling condition is not met, pad the ciphertext table with specific rows to obtain a padded table whose total number of rows meets the preset shuffling condition;
  • a shuffling unit configured to shuffle the padded table in units of rows to obtain the shuffled table.
  • the sorting module is specifically configured such that:
  • the specific padding rows are removed from the sorted table to obtain the target sorting table.
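
The padding step can be sketched as below, assuming a preset value of 2 and a sentinel object as the "specific row"; both are illustrative choices, and `next_power`, `pad_table`, and `strip_padding` are hypothetical helper names. In a ciphertext setting the padding rows would themselves have to be encrypted and marked in a way that does not leak which rows are real.

```python
PAD = object()  # hypothetical marker for a padding row

def next_power(n, base=2):
    """Smallest integer power of base that is not less than n."""
    size = 1
    while size < n:
        size *= base
    return size

def pad_table(table, base=2):
    """Append padding rows until the row count is an integer power of base."""
    target = next_power(len(table), base)
    return list(table) + [PAD] * (target - len(table))

def strip_padding(table):
    """Remove the padding rows, e.g. after shuffling and sorting."""
    return [row for row in table if row is not PAD]

padded = pad_table(["r1", "r2", "r3", "r4", "r5"])   # 5 rows are padded to 8 rows
restored = strip_padding(padded)
```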
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed in a computer, the computer is caused to execute the method described in the first aspect.
  • a computing device including a memory and a processor, wherein executable code is stored in the memory, and when the processor executes the executable code, the method described in the first aspect is implemented.
  • each data owner sends attribute value ciphertexts to the intermediary, which avoids leaking the attribute values.
  • after the intermediary obtains the ciphertext table constructed from the attribute value ciphertexts sent by each data owner, it first shuffles the ciphertext table in units of rows, so as to disrupt the order of the rows in the ciphertext table, and obtains the shuffled table. It then sorts the attribute value ciphertexts corresponding to the target attribute item in the shuffled table.
  • Fig. 1A is a schematic diagram of an implementation framework of an embodiment disclosed in this specification.
  • Fig. 1B is a schematic diagram of an implementation framework of an embodiment disclosed in this specification.
  • Fig. 2 is a schematic flow diagram of a multi-party data query method for protecting data privacy provided by an embodiment.
  • Fig. 3 is a schematic diagram of a shuffling process provided by an embodiment.
  • Fig. 4 is a schematic diagram of a shuffling process provided by an embodiment.
  • Fig. 5 is a schematic diagram of a shuffling scenario provided by an embodiment.
  • Fig. 6 is a schematic block diagram of a multi-party data query device for protecting data privacy provided by an embodiment.
  • party A holds the attribute values of the user's height, weight and other physical status attributes
  • party B holds the attribute value of the user's salary attribute (specific salary amount)
  • party C holds the attribute value of the user's loan attribute (specific loan amount).
  • Party A may need to query the sum of the salaries of the users whose salaries rank in the top N, or the loan status of the users whose salaries rank in the top N, or the sum of salary and weight of the users whose salaries rank in the top M
  • query requirements such as the sum of the salary of a user
  • when the third party jointly sorts the data sent by A, B, and C during the data query process, it can use an existing sorting method based on an approach similar to order-preserving encryption.
  • in this sorting method, the third party obtains the attribute value ciphertexts uploaded by all parties (A, B, and C) to obtain a table to be sorted, in which the user IDs and the attribute values of each attribute item are ciphertexts (for example, one row in the table stores the attribute value ciphertexts of one user for multiple attribute items, and one column stores the attribute value ciphertexts of each user for one attribute item), but the order relationship between the rows is in plaintext.
  • the third party directly sorts the attribute value ciphertexts of the target attribute items in the table based on the query request indicating that the target attribute items are to be sorted.
  • the third party uses the above sorting method to sort the data
  • each row in the table will be sorted.
  • the order relationship on the column is exposed, for example: the target attribute items include the user's weight attribute item and salary attribute item.
  • the third party may obtain personal private information, for example that the person ranked fifth in weight is ranked first in salary, which is not allowed when private data must be kept confidential.
  • the processing method provided by the embodiment of this specification is mainly designed for the scenario of multi-party data query holding private data (attribute values of several attribute items of N target objects).
  • Fig. 1A is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification.
  • multiple data owners are schematically shown, for example, party A, party B, and party C, and an intermediary D, where the intermediary D can be implemented using a secret-state computing system.
  • Each data owner and intermediate party can be specifically embodied as: a device, platform, server or device cluster with computing and processing capabilities.
  • each data owner holds its own private data, that is, the attribute values of several attribute items of N target objects, and the intermediary D is used to receive the attribute value ciphertexts of the N target objects sent by each data owner (i.e., party A, party B, and party C), assemble them into a ciphertext table, and then execute the subsequent data query process.
  • Each data owner does not want its own data (attribute values of several attribute items of N target objects) to be leaked, and also does not want the order relationship of the rows (the attribute values of each target object) on the sorted column to be exposed.
  • each data owner processes the attribute values of the several attribute items of the N target objects it holds according to the data upload requirements of the intermediary D, and obtains the ciphertext of each attribute value; after that, each data owner uploads its attribute value ciphertexts to the intermediary D.
  • the intermediary D obtains the attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table, in which one row corresponds to one target object and one column corresponds to one attribute item; that is, one row of the ciphertext table stores the attribute value ciphertexts of one target object for multiple attribute items, and one column stores the attribute value ciphertexts of each target object for one attribute item.
  • the ciphertext table can be shown in Table 1 below.
  • the intermediary D shuffles the ciphertext table in units of rows, that is, shuffles the rows of the ciphertext table to disrupt the order relationship between them, so that the positions of rows before and after shuffling cannot be tracked, and obtains the table after shuffling, i.e., the shuffled table.
  • the intermediary D sorts the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain the target sorting table, wherein the rows of the target sorting table can be arranged in descending (or ascending) order of the attribute value ciphertexts corresponding to the target attribute item.
  • the sorting-related data is obtained as the query result. Afterwards, the query result is fed back to the initiator of the query instruction.
  • the sorting, by the intermediary D, of the attribute value ciphertexts corresponding to the target attribute item in the shuffled table can serve as an intermediate step in determining the query result corresponding to the query instruction.
  • the sorted target sorting table is generally not fed back to the query instruction initiator.
  • the intermediary D can be implemented through a secret-state computing system, which includes M executors, and the M executors can run in the trusted execution environment TEE, and one executor can be implemented through one TEE.
  • the executor can be called a trusted executor.
  • the M executors may also run in an ordinary execution environment. As shown in Fig. 1B, the number M of executors can be set to 3, and each data owner divides each attribute value it holds into 3 shares, obtaining 3 attribute value fragments for each attribute value. The data owner sends the attribute value fragments corresponding to each attribute value it holds to the respective executors.
  • Each of the three executors obtains one attribute value fragment sent by each data owner as an attribute value ciphertext and, based on the attribute value ciphertexts it holds and through a secure multi-party computation (MPC) scheme, jointly shuffles the ciphertext table in units of rows with the other two executors.
  • each executor obtains only a part of each attribute value (an attribute value fragment) and cannot obtain the plaintext of the attribute value, which avoids leaking the attribute value plaintext at any executor. If one executor, or a specified number of executors, is successfully attacked, the attacker cannot obtain the plaintext of the attribute values of each data owner, because the attacked executor holds only a part of each attribute value.
  • the secret-state computing system therefore provides a higher degree of protection for its data and better resistance to attack.
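
One common way to divide a value into M fragments so that no single fragment reveals it is additive secret sharing. The sketch below assumes that scheme and a modulus of 2^64; the text above only says that each attribute value is divided into 3 shares and does not name the scheme, so `split_into_shares`, `reconstruct`, and `MODULUS` are illustrative assumptions.

```python
import secrets

MODULUS = 2 ** 64  # assumed ring size for the shares

def split_into_shares(value, m=3):
    """Split value into m additive shares; any m - 1 of them look uniformly random."""
    shares = [secrets.randbelow(MODULUS) for _ in range(m - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    """Recover the value; this requires all m shares."""
    return sum(shares) % MODULUS

salary_shares = split_into_shares(12000)   # one share is sent to each of 3 executors
bonus_shares = split_into_shares(500)
# Each executor can add its own two shares locally without seeing either value.
sum_shares = [(s + b) % MODULUS for s, b in zip(salary_shares, bonus_shares)]
```

Under this assumption, an attacker who compromises fewer than all M executors sees only random-looking numbers, which matches the protection argument made above.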
  • the target object at position a in the ranking of the first attribute item is the target object at position b in the ranking of the second attribute item).
  • Fig. 2 shows a flow chart of a multi-party data query method for protecting data privacy in an embodiment of this specification.
  • the multiple parties include multiple data owners, each of which holds the attribute values of several attribute items of N target objects, and the method is executed by an intermediary other than the multiple data owners.
  • the method includes the following steps S210-S240:
  • S210: Obtain the attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table.
  • one row of the ciphertext table corresponds to one target object
  • one column corresponds to one attribute item
  • one row in the ciphertext table stores the attribute value ciphertexts of one target object for multiple attribute items
  • one column stores the attribute value ciphertexts of each target object for one attribute item.
  • before the multiple data owners A_i send the attribute value ciphertexts of the N target objects they hold to the intermediary according to the data upload requirements of the intermediary, the multiple data owners A_i need to align the attribute values of the N target objects, so that the query process can be carried out effectively on the intermediary side.
  • the alignment may be that the multiple data owners A_i arrange the attribute values of each attribute item of the target objects according to a preset target object order; for example, the attribute values of all attribute items of target object 1 are in the first row, the attribute values of all attribute items of target object 2 are in the second row, and so on, until the attribute values of all attribute items of target object N are in the Nth row.
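
The alignment step can be illustrated with a small plaintext sketch. The object identifiers, the attribute values, and the helper `align_rows` are hypothetical; in the described flow each owner would encrypt (or secret-share) its aligned values before sending them.

```python
def align_rows(records, object_order):
    """Return one owner's attribute rows arranged in the agreed target object order."""
    return [records[obj_id] for obj_id in object_order]

object_order = ["u1", "u2", "u3"]                                # hypothetical agreed order
party_a = {"u3": (181, 80), "u1": (165, 55), "u2": (172, 68)}    # (height, weight)
party_b = {"u2": (9000,), "u3": (7000,), "u1": (12000,)}         # (salary,)

rows_a = align_rows(party_a, object_order)
rows_b = align_rows(party_b, object_order)
# Row k of every owner now refers to the same target object, so the
# intermediary can join the columns row by row into one table.
joined = [a + b for a, b in zip(rows_a, rows_b)]
```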
  • the target object may be one of the following: user, item.
  • when each data owner A_i sends the aligned attribute value ciphertexts of several attribute items of the N target objects to the intermediary, it also needs to send the corresponding object identifier ciphertexts of the N target objects to the intermediary.
  • S220: Shuffle the ciphertext table in units of rows to obtain a shuffled table.
  • the intermediary can shuffle the ciphertext table in units of rows based on a preset shuffling scheme or in a random manner, that is, disrupt the order relationship between the rows in the ciphertext table, to obtain the shuffled table.
  • in the shuffled table, it is impossible to know which row stores the attribute value ciphertexts of a given target object for the multiple attribute items; within each row of the shuffled table, the correspondence between the target object and its attribute value ciphertexts for the attribute items is unchanged.
  • the query instruction may be initiated by any data owner, or by a user of a service jointly provided by the above-mentioned multiple data owners, wherein the service is related to the attribute values of the target objects held by the multiple data owners.
  • the query instruction includes at least information indicating that the attribute value ciphertexts of the target attribute item are to be sorted. That is, the query result corresponding to the query instruction is sorting-related data, which needs to be determined by using the sorting result of the attribute value ciphertexts corresponding to the target attribute item.
  • the rows are arranged in descending (or ascending) order of the attribute value ciphertexts corresponding to the target attribute item. For example, in the target sorting table, the row containing the largest attribute value ciphertext for the target attribute item is the first row, the row containing the second largest is the second row, and so on, and the row containing the smallest attribute value ciphertext for the target attribute item is the Nth row.
  • the row containing the smallest attribute value ciphertext for the target attribute item is the first row, the row containing the second smallest is the second row, and so on.
  • the row containing the largest attribute value ciphertext for the target attribute item is the Nth row.
  • S240: Obtain the sorting-related data as a query result based on the target sorting table. After the intermediary obtains the target sorting table, it knows the order of the attribute value ciphertexts corresponding to the target attribute item. On the basis of this order, a series of logical analyses is performed according to the query instruction to obtain the sorting-related data, that is, the query result corresponding to the query instruction. The query result is sent to the initiator of the query instruction.
  • the sorting-related data is the result of performing a specified calculation on the attribute value ciphertexts at specified X positions in the sorting sequence corresponding to the target attribute item;
  • S240 may include the following steps. Step 01: obtain the target attribute value ciphertexts at the specified X positions from the target sorting table. Step 02: perform the specified calculation on the target attribute value ciphertexts.
  • the specified calculation includes, but is not limited to, summation, averaging, multiplication, and size comparison.
  • the data holder that initiates the query instruction may be a data holder that does not hold the target attribute item.
  • the multiple data owners are A, B, and C respectively
  • the target objects are users 1-N respectively.
  • party A holds the attribute values of the physical status attribute items (height, weight, and age) of users 1-N
  • party B holds the attribute values of the salary attribute item (specific salary amount) and the working-years attribute item (specific number of years) of users 1-N
  • party C holds the attribute value of the loan attribute item of users 1-N (specific loan amount).
  • The query instruction is sent by party A and is used to query the sum of the top 10 salaries of the salary attribute item.
  • the sorting-related data is the result of summing the top 10 attribute value ciphertexts (salary amount ciphertexts) in the sorting sequence corresponding to the salary attribute item.
  • the intermediary obtains the attribute value ciphertexts of users 1-N uploaded by each of A, B, and C to obtain the ciphertext table; the ciphertext table is shuffled in units of rows to obtain the shuffled table; the attribute value ciphertexts corresponding to the salary attribute item in the shuffled table are sorted to obtain the target sorting table.
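
The flow for this example (shuffle the rows, sort by the salary column, then sum the top positions) can be traced with plaintext stand-ins. The table contents, the column names, and the function `query_top_sum` are hypothetical, and `random.shuffle` and `sorted` merely stand in for the row-wise shuffle and the ciphertext sort described above.

```python
import random

def query_top_sum(table, target_item, top_x, rng):
    """Plaintext walk-through of S220 to S240 for a 'sum of the top X' query."""
    shuffled = list(table)
    rng.shuffle(shuffled)                                    # S220: row-wise shuffle (stand-in)
    ranked = sorted(shuffled, key=lambda row: row[target_item],
                    reverse=True)                            # S230: sort by target item, descending
    return sum(row[target_item] for row in ranked[:top_x])   # S240: specified calculation

# Hypothetical joined table: one row per user, one column per attribute item.
table = [
    {"height": 170, "salary": 5000, "loan": 200},
    {"height": 182, "salary": 12000, "loan": 0},
    {"height": 165, "salary": 8000, "loan": 1500},
    {"height": 176, "salary": 3000, "loan": 900},
]
top2_salary_sum = query_top_sum(table, "salary", 2, random.Random(0))
```

Only the aggregate is returned to the initiator; the sorted table itself stays with the intermediary.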
  • the sorting-related data is the calculation result of the specified calculation of the attribute value ciphertext corresponding to the specified attribute item of the specified h-bit target object in the sorting sequence corresponding to the target attribute item, and the specified attribute item is An attribute item that is different from the target attribute item; in this case, the data holder that initiates the query instruction may be a data holder that does not hold the specified attribute item and/or the target attribute item.
  • the intermediary determines the attribute value ciphertext of specified Y bits from the attribute value ciphertext corresponding to the specified attribute item in the target sorting table, and performs specified calculation on the attribute value ciphertext of specified Y bits to obtain the calculated result, as a query result.
  • Party A needs to query the sum of loans and loans of users whose salary ranks in the top Y.
  • the query command is sent by Party A to query the attributes corresponding to the loan and loan attributes of the top 10 users in the salary attribute item
  • the sum of value ciphertext that is, the sum of the loan amount
  • the query command is used to query the loan attributes of the top 10 users in the sorting sequence corresponding to the salary attribute item in the sorting table (target sorting table) corresponding to the salary attribute item
  • the sum of the ciphertext of the attribute value of the item is sent by Party A to query the attributes corresponding to the loan and loan attributes of the top 10 users in the salary attribute item.
  • The intermediary sorts the attribute value ciphertexts corresponding to the salary attribute item in the out-of-order table to obtain the sorting table corresponding to the salary attribute item, that is, the target sorting table. From the target sorting table it determines the top 10 users in the sorting sequence corresponding to the salary attribute item as the target users, determines the attribute value ciphertexts of the loan attribute item corresponding to the target users, and sums those ciphertexts to obtain the sum, which serves as the query result.
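  • The query flow above can be sketched in Python. This is only a plaintext stand-in under stated assumptions: the integers stand for attribute value ciphertexts, "top" is taken to mean the largest values, and the names `shuffle_rows`, `top_sum`, and the sample rows are hypothetical. A real intermediary would perform the shuffle, the sort, and the sum on ciphertexts.

```python
import random

def shuffle_rows(table, rng):
    """Return a copy of the table with its rows in random order."""
    rows = list(table)
    rng.shuffle(rows)
    return rows

def top_sum(table, sort_key, sum_key, top, rng):
    """Shuffle the rows, sort them by sort_key in descending order,
    and sum sum_key over the first `top` rows."""
    out_of_order_table = shuffle_rows(table, rng)
    target_sorting_table = sorted(
        out_of_order_table, key=lambda row: row[sort_key], reverse=True
    )
    return sum(row[sum_key] for row in target_sorting_table[:top])

# Hypothetical sample data: plaintext integers stand in for ciphertexts.
rows = [
    {"salary": 30, "loan": 5},
    {"salary": 90, "loan": 7},
    {"salary": 60, "loan": 1},
    {"salary": 10, "loan": 9},
]
result = top_sum(rows, "salary", "loan", top=2, rng=random.Random(0))
print(result)
```

  • Because the sample salaries are distinct, the result does not depend on the random shuffle; the shuffle only hides which input row ends up at which rank.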
  • The query instruction can also be used to query the comparison result between a first calculated value and a second calculated value. The first calculated value is determined based on the attribute value ciphertexts at the specified h positions corresponding to the specified attribute item in the sorting table corresponding to the first attribute item, together with a first calculation method; the second calculated value is determined based on the attribute value ciphertexts at the specified h positions corresponding to the specified attribute item in the sorting table corresponding to the second attribute item, together with the first calculation method. That is, the sorting-related data concerns the attribute value ciphertexts corresponding to the specified attribute item of the target objects at the specified h positions in the sorting sequence corresponding to the first attribute item and of those at the specified h positions in the sorting sequence corresponding to the second attribute item.
  • both the above-mentioned first attribute item and the second attribute item are used as target attribute items.
  • the first attribute item, the second attribute item and the specified attribute item are different from one another.
  • the data holder that initiates the query instruction may be any data holder that does not hold the specified attribute item, the first attribute item, and/or the second attribute item.
  • The intermediary first performs sorting with the first attribute item as the target attribute item, that is, it sorts the attribute value ciphertexts corresponding to the first attribute item in the out-of-order table to obtain the first target sorting table. From the first target sorting table it determines the target objects at the specified h positions in the sorting sequence corresponding to the first attribute item as the first group of objects, and determines the attribute value ciphertexts corresponding to the specified attribute item of the first group of objects as the first group of attribute value ciphertexts; the first calculated value is obtained based on the first group of attribute value ciphertexts and the first calculation method.
  • Sorting is then performed with the second attribute item as the target attribute item: the ciphertext table (or the first target sorting table) is shuffled again in units of rows to obtain a first table, and, in response to the aforementioned query instruction, the attribute value ciphertexts corresponding to the second attribute item in the first table are sorted to obtain the second target sorting table.
  • From the second target sorting table, the target objects at the specified h positions in the sorting sequence corresponding to the second attribute item are determined as the second group of objects, and the attribute value ciphertexts corresponding to the specified attribute item of the second group of objects are determined as the second group of attribute value ciphertexts; the second calculated value is obtained based on the second group of attribute value ciphertexts and the first calculation method.
  • Party A needs to query the sum of the salaries of the top 10 users in the sorting sequence corresponding to the height (the first attribute item), and the sum of the salaries of the top 10 users in the sorting sequence corresponding to the weight (the second attribute item).
  • The sorting-related data here is the result of comparing the sum of the salaries of the top 10 users in the sorting sequence corresponding to the height attribute item with the sum of the salaries of the top 10 users in the sorting sequence corresponding to the weight attribute item.
  • The intermediary sorts the attribute value ciphertexts corresponding to the height attribute item in the out-of-order table to obtain the sorting table corresponding to the height attribute item, that is, the first target sorting table. From the first target sorting table, it determines the attribute value ciphertexts corresponding to the salary attribute item of the users whose height ranks in the top 10 as the first group of attribute value ciphertexts, and then sums the first group of attribute value ciphertexts in the encrypted state to obtain the first sum.
  • The intermediary shuffles the ciphertext table (or the first target sorting table) in units of rows to obtain the first table; in response to the query instruction, it sorts the attribute value ciphertexts corresponding to the weight attribute item in the first table to obtain the second target sorting table; from the second target sorting table, it determines the attribute value ciphertexts corresponding to the salary attribute item of the top 10 users as the second group of attribute value ciphertexts, and sums the second group of attribute value ciphertexts in the encrypted state to obtain the second sum; the first sum and the second sum are compared to obtain the query result (that is, the sorting-related data).
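  • A minimal sketch of this comparison query, again with plaintext integers standing in for ciphertexts. The helper names, the sample users, and the encoding of the comparison result as -1, 0, or 1 are assumptions made for illustration. The point shown is that the table is shuffled afresh before each sort.

```python
import random

def top_sum_after_shuffle(table, sort_key, sum_key, top, rng):
    """Shuffle the rows, sort by sort_key in descending order,
    and sum sum_key over the first `top` rows."""
    rows = list(table)
    rng.shuffle(rows)  # fresh shuffle before every sort
    rows.sort(key=lambda row: row[sort_key], reverse=True)
    return sum(row[sum_key] for row in rows[:top])

def compare_top_sums(table, first_key, second_key, sum_key, top, rng):
    """Return -1, 0, or 1 as the first sum is below, equal to, or above the second."""
    first_sum = top_sum_after_shuffle(table, first_key, sum_key, top, rng)
    second_sum = top_sum_after_shuffle(table, second_key, sum_key, top, rng)
    return (first_sum > second_sum) - (first_sum < second_sum)

# Hypothetical sample data.
users = [
    {"height": 180, "weight": 60, "salary": 10},
    {"height": 175, "weight": 90, "salary": 20},
    {"height": 160, "weight": 80, "salary": 30},
    {"height": 150, "weight": 50, "salary": 40},
]
outcome = compare_top_sums(users, "height", "weight", "salary", 2, random.Random(0))
print(outcome)
```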
  • The third-party intermediary can neither obtain the plaintext of any attribute value nor obtain the order relationship of the rows of the table across sorting columns (for example, that the target object ranked a for the first attribute item is the target object ranked b for the second attribute item).
  • The intermediary is a cryptographic computing system that includes M executors.
  • Each executor can run the corresponding cryptographic computation in a trusted execution environment (TEE); in that case the cryptographic computing system is a trusted cryptographic computing (TrustEd Cryptographic Computing, TECC) system.
  • Each data owner splits each attribute value it holds into M shares, obtaining M attribute value fragments per attribute value, and sends the fragments of each attribute value to the M executors, one fragment per executor.
  • S210 may be configured as follows: each of the M executors obtains one attribute value fragment sent by each data owner as the attribute value ciphertext, where the attribute value fragments are obtained by each data owner splitting the attribute value it holds into M shares.
  • Each executor, based on the attribute value ciphertexts it holds, cooperates with the other M-1 executors through a secure multi-party computation (MPC) scheme to shuffle the ciphertext table in units of rows.
  • Each executor obtains only a part of each attribute value (one attribute value fragment) and cannot obtain the plaintext of the attribute value, which avoids leaking the plaintext of the attribute value at any executor.
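  • One common way to split a value into M shares is additive secret sharing. The text does not say which sharing scheme is used, so the sketch below is an assumption: the modulus `2 ** 64` and the names `split_into_shares`, `reconstruct`, and `add_shared` are hypothetical.

```python
import secrets

MODULUS = 2 ** 64  # assumed share ring; the source does not fix one

def split_into_shares(value, m):
    """Split value into m additive shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(m - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    """Recombine all shares; no single share reveals the value."""
    return sum(shares) % MODULUS

def add_shared(a_shares, b_shares):
    """Each executor adds its own two shares locally, without seeing plaintext."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]

# A data owner splits two attribute values for M = 3 executors.
salary_shares = split_into_shares(5000, 3)
bonus_shares = split_into_shares(1200, 3)
total = reconstruct(add_shared(salary_shares, bonus_shares))
print(total)
```

  • With additive shares, sums such as the top-10 loan total can be computed share by share at each executor; comparisons and sorting require an interactive MPC protocol, which is not shown here.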
  • When the intermediary shuffles the ciphertext table in units of rows, it can use any method in the related art that can shuffle a table, or it can use the shuffling method provided in this specification, which is described in detail below.
  • S220 may include the following step: shuffling the ciphertext table through at least one out-of-order process, where any current out-of-order process may include the following steps 11-13:
  • Step 11 For each target sub-table of the ciphertext table, determine at least one pair of replacement rows from the target sub-table.
  • Step 12 For each pair of replacement rows, perform position replacement with a certain probability, so as to obtain the out-of-order subtable corresponding to the target subtable.
  • Step 13 Determine the output table of the current round of the out-of-order process based on the out-of-order sub-table to form the out-of-order table.
  • the certain probability is determined according to random numbers generated for the pair of replacement rows. That is, after determining at least one pair of replacement rows in the target subtable, the intermediate party generates random numbers for each pair of replacement rows, and performs position replacement based on the random numbers.
  • the random number may be a first value or a second value, the first value may be 1, indicating that the position of the corresponding replacement row is replaced; the second value may be 0, indicating that the position of the corresponding replacement row is not replaced.
  • the random number may have a value of 0-1, wherein a larger random number indicates a higher probability of replacing the position of the corresponding replacement row.
  • The position replacement of replacement rows mentioned in this specification refers to exchanging the attribute value ciphertexts of the target objects stored in each pair of replacement rows. For example, if it is determined that the positions of the first row and the fourth row need to be replaced, the attribute value ciphertexts of the target object stored in the first row are stored in the fourth row, and the attribute value ciphertexts of the target object stored in the fourth row are stored in the first row.
  • The intermediary is implemented by a cryptographic computing system that includes M executors, and the random number generated for each pair of replacement rows is determined jointly through the MPC scheme: each of the M executors generates a random number fragment and, based on it, determines the random number together with the other M-1 executors.
  • each executor only generates a fragment of the random number, and cannot know the random number.
  • each executor cannot know whether the positions of each pair of replacement rows are actually replaced.
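  • The sketch below illustrates the 0/1 random number and the position replacement it controls. It rests on assumptions: the fragments are combined by XOR, and the combined bit is revealed in the clear, which is done here purely for illustration. In the scheme described above no executor learns the bit, and the exchange would be carried out obliviously under MPC. All function names are hypothetical.

```python
import secrets

def random_bit_fragments(m):
    """Each of the m executors contributes one random bit fragment."""
    return [secrets.randbelow(2) for _ in range(m)]

def combine_fragments(fragments):
    """XOR all fragments (assumed combination rule); the result is uniform
    as long as at least one fragment is uniform."""
    bit = 0
    for fragment in fragments:
        bit ^= fragment
    return bit

def maybe_swap(table, i, j, bit):
    """Exchange rows i and j when bit is 1 (first value);
    leave them in place when bit is 0 (second value)."""
    if bit == 1:
        table[i], table[j] = table[j], table[i]
    return table

table = ["row1", "row2", "row3", "row4"]
bit = combine_fragments(random_bit_fragments(3))
maybe_swap(table, 0, 3, bit)  # first and fourth rows are the replacement pair
print(table)
```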
  • The intermediary can pre-store an out-of-order process threshold; correspondingly, in the first out-of-order process, the cryptographic computing system can directly use the ciphertext table as the target sub-table.
  • at least one target sub-table may be determined from the ciphertext table, wherein each target sub-table includes at least two consecutive rows.
  • The intermediary determines at least one pair of replacement rows from the target sub-table and performs position replacement with a certain probability for each pair of replacement rows, so as to obtain the out-of-order sub-table corresponding to the target sub-table.
  • the middle party obtains the out-of-order subtables corresponding to all the target sub-tables of the current out-of-order process, and determines the output table of the out-of-order process in the current round based on the out-of-order sub-tables corresponding to all the target sub-tables.
  • The output table of the current round of the out-of-order process (the ciphertext table after some of its rows have been shuffled) can be used directly as a target sub-table, and the subsequent process (that is, steps 11-13) is executed cyclically until the number of executed out-of-order processes reaches the out-of-order process threshold, yielding the out-of-order table.
  • Alternatively, the table obtained after shuffling is divided into multiple sub-tables, and these sub-tables are used as the output tables of the current round of the out-of-order process; each output table is a target sub-table of the next out-of-order process, and the subsequent process is executed until the number of executed out-of-order processes reaches the out-of-order process threshold, yielding the out-of-order table.
  • any current out-of-order process may include the following steps:
  • The intermediary may treat the target sub-table as a separate table to be shuffled and assign new row labels to the rows it contains. For example, the row label of the first row in the target sub-table is reset to 0 (or 1), and the row labels of subsequent rows are incremented by 1 in turn.
  • S320 Select a pair of replacement rows from the target subtable at least based on the current cycle times.
  • S340 Update the number of cycles, and determine whether a cycle end condition related to the cycle threshold is reached based on the updated number of cycles. If it is judged that the loop end condition related to the loop threshold is not reached, return to S320;
  • Whether the loop end condition related to the cycle threshold is reached may be judged as follows. In one case, the initial value of the cycle number is 0 and each update adds one to the current cycle number; when the cycle number after adding one reaches the cycle threshold, the loop end condition related to the cycle threshold is reached.
  • In another case, the initial value of the cycle number is the cycle threshold and each update subtracts one from the current cycle number; when the cycle number after subtracting one is 0, the loop end condition related to the cycle threshold is reached.
  • the intermediary may select at least one pair of replacement rows from the target subtable based on the current cycle number and the cycle threshold. Specifically, it may be: from the target subtable, respectively select a row whose corresponding row label is equal to the current cycle number and a row whose corresponding row label is equal to the sum of the current cycle number and the cycle threshold as a pair of replacement rows .
  • the aforementioned S350 may be configured to divide the out-of-order sub-tables by using the cycle threshold, so as to determine the output table of the current round of out-of-order procedures.
  • The row in the out-of-order sub-table corresponding to the target sub-table whose row label equals the cycle threshold can be used directly as the division reference row; the division reference row and the rows before it in the out-of-order sub-table form one new table, and the rows after the division reference row form another new table; these new tables are the output tables of the current round of the out-of-order process.
  • The above S310-S350 are executed for all target sub-tables, so as to obtain all output tables of the current round of the out-of-order process. It can be understood that if, in the current out-of-order process, every new table obtained by dividing the out-of-order sub-tables of the target sub-tables has fewer than 2 rows, that is, every output table of the current round has fewer than 2 rows, the out-of-order table is generated directly from the output tables of the current round of the out-of-order process.
  • The intermediary can record the positional relationship between the ciphertext table and each target sub-table, and record the positional relationship between the out-of-order sub-table corresponding to each target sub-table and the new sub-tables obtained by dividing it. Correspondingly, the intermediary can determine the positional relationship between the output tables of each round of the out-of-order process based on these recorded relationships. Once the output tables of a round of the out-of-order process have fewer than 2 rows, the out-of-order table is generated directly based on the positional relationship between the output tables of that round.
  • the reordering steps for each target subtable can be executed in parallel.
  • The ciphertext table (Table 1) contains 4 rows with row labels 0 to 3, denoted X[0] to X[3], where X[0] stores the attribute value ciphertexts of all attribute items of target object 1, X[1] stores those of target object 2, X[2] stores those of target object 3, and X[3] stores those of target object 4.
  • The row whose row label equals the current cycle number 0 (X[0]) and the row whose row label equals the sum of the current cycle number 0 and the cycle threshold 2 (X[2]) are selected from Table 1 as a pair of replacement rows; position replacement is performed on X[0] and X[2] with a certain probability, so whether the two rows actually exchange positions is random.
  • The row whose row label equals the current cycle number 1 (X[1]) and the row whose row label equals the sum of the current cycle number 1 and the cycle threshold 2 (X[3]) are selected from Table 1 as a pair of replacement rows; position replacement is performed on X[1] and X[3] with a certain probability, so whether the two rows actually exchange positions is random.
  • Table 2 is divided based on the cycle threshold 2, and two new sub-tables are obtained, namely Table 3 (including X[0] and X[1]) and Table 4 (including X[2] and X[3]), as the output tables of the first round of the out-of-order process.
  • For Table 4, the row labels of its rows are reset to X[0]' and X[1]', where X[0]' corresponds to X[2] and X[1]' corresponds to X[3].
  • Table 3 and Table 4 are then each used as a target sub-table. For Table 3 (Table 4), the cycle threshold is determined to be 0 based on its row count of 2; the row X[0] (X[0]') whose row label equals the current cycle number 0 and the row X[1] (X[1]') are selected from Table 3 (Table 4) as a pair of replacement rows.
  • Table 5 (Table 6) is divided based on the cycle threshold 0, and two new sub-tables are obtained, namely Table 7 and Table 8 (Table 9 and Table 10), as the output tables of the second round of the out-of-order process. Each output table of the second round has 1 row, which is fewer than 2, so the out-of-order table is generated based on the output tables of the second round of the out-of-order process.
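  • The worked example above can be sketched as a round-by-round procedure. The sketch assumes that the cycle threshold is half the row count of the target sub-table (which gives 2 for the 4-row Table 1), that each pair is exchanged with probability 1/2, and that the sub-table is split at the threshold position. The names `shuffle_round` and `recursive_shuffle` are hypothetical, and plain strings stand in for ciphertext rows.

```python
import random

def shuffle_round(sub_table, rng):
    """One out-of-order round on a target sub-table: pair row i with
    row i + threshold, exchange each pair with probability 1/2,
    then divide the result at the threshold position."""
    threshold = len(sub_table) // 2  # assumed: half the row count
    rows = list(sub_table)
    for i in range(threshold):  # i plays the role of the current cycle number
        if rng.getrandbits(1):
            rows[i], rows[i + threshold] = rows[i + threshold], rows[i]
    return rows[:threshold], rows[threshold:]

def recursive_shuffle(table, rng):
    """Repeat rounds until every output table has fewer than 2 rows,
    then concatenate the output tables in their recorded order."""
    targets = [list(table)]
    while any(len(target) >= 2 for target in targets):
        outputs = []
        for target in targets:
            if len(target) >= 2:
                outputs.extend(shuffle_round(target, rng))
            else:
                outputs.append(target)
        targets = outputs
    return [row for target in targets for row in target]

table_1 = ["X[0]", "X[1]", "X[2]", "X[3]"]
out_of_order_table = recursive_shuffle(table_1, random.Random(0))
print(out_of_order_table)
```

  • The per-sub-table rounds are independent of one another, which is why they can be run in parallel; the loop over `targets` here is sequential only for simplicity.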
  • any current out-of-order flow may include the following steps S410-S450:
  • S410 Determine a cycle threshold based on the number of rows in the ciphertext table.
  • When the number of rows in the ciphertext table has the form 2^m, the cycle threshold is m.
  • S420 From the ciphertext table, select two rows whose corresponding row numbers are the same except for the i-th digit, as a pair of replacement rows, to obtain at least one pair of replacement rows, where i is equal to the current number of cycles.
  • S430 Perform position replacement with a certain probability for each pair of replacement rows. Wherein, the certain probability is determined according to the random number generated for the corresponding replacement row.
  • S440 Update the number of cycles, and determine whether a cycle end condition related to the cycle threshold is reached based on the updated number of cycles. If the judgment is not reached, return to S420;
  • The row label of each row in the ciphertext table is represented in binary, and the number of binary digits can be set according to the actual situation, for example, 4 digits or 8 digits.
  • For the first row X[0], the corresponding row label is represented as 0000.
  • For the second row X[1], the corresponding row label is represented as 0001, and so on.
  • The value range of the cycle number i can be set to the integers in [0, m-1]. In one case, i can start from 0.
  • In that case, the cycle number is updated by adding one to the current cycle number, and the loop end condition related to the cycle threshold is reached when the updated cycle number reaches the cycle threshold. In another case, i can start from m-1.
  • In that case, the cycle number is updated by subtracting one from the current cycle number, and the loop end condition related to the cycle threshold is reached when the updated cycle number is 0.
  • the position replacement step for each pair of replacement rows can be performed in parallel in any out-of-order process.
  • The row labels corresponding to the 4 rows are expressed as follows: the row label of the first row X[0] is 0000, the row label of the second row X[1] is 0001, the row label of the third row X[2] is 0010, and the row label of the fourth row X[3] is 0011.
  • the cycle threshold is 2, and correspondingly, i can take a value of 0,1.
  • Update i; the updated i is equal to 1. From the intermediate table (the result of the previous cycle), select two rows whose row labels are identical except for the i-th digit (here i = 1) as a pair of replacement rows, to obtain at least one pair of replacement rows. Specifically, the first row X[0] and the third row X[2] are a pair of replacement rows, and the second row X[1] and the fourth row X[3] are a pair of replacement rows. Position replacement is performed with a certain probability on the first row X[0] and the third row X[2], and likewise on the second row X[1] and the fourth row X[3].
  • Update i; the updated i is equal to 2, so the loop end condition is reached and the out-of-order table is obtained.
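  • Steps S410-S450 and the 4-row example can be sketched as follows. The sketch assumes the row count is exactly 2^m, counts binary digits from the lowest digit starting at 0, and exchanges each pair with probability 1/2; `replacement_pairs` and `bitwise_shuffle` are hypothetical names, and strings stand in for ciphertext rows.

```python
import random

def replacement_pairs(n, i):
    """Pairs of row labels in range(n) that differ only in binary digit i."""
    return [(r, r ^ (1 << i)) for r in range(n) if r < (r ^ (1 << i))]

def bitwise_shuffle(table, rng):
    """For i = 0 .. m-1, exchange each pair of rows whose labels differ
    only in digit i, each with probability 1/2."""
    n = len(table)
    m = n.bit_length() - 1  # cycle threshold, assuming n == 2 ** m
    if n == 0 or n != 1 << m:
        raise ValueError("row count must be a power of two")
    rows = list(table)
    for i in range(m):  # i is the current cycle number
        for a, b in replacement_pairs(n, i):
            if rng.getrandbits(1):
                rows[a], rows[b] = rows[b], rows[a]
    return rows

print(replacement_pairs(4, 0))  # pairs for i = 0
print(replacement_pairs(4, 1))  # pairs for i = 1
shuffled = bitwise_shuffle(["X[0]", "X[1]", "X[2]", "X[3]"], random.Random(0))
print(shuffled)
```

  • For i = 1 the pairs are (X[0], X[2]) and (X[1], X[3]), matching the example above. Pairs within one cycle touch disjoint rows, so their replacements can run in parallel.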
  • This embodiment shuffles the ciphertext table through an out-of-order process, disrupting the order relationship of the rows in the table, so that the positions of rows before and after each sort cannot be tracked. This avoids leaking the order relationship of the rows across sorting columns (for example, that the target object at position a in the ranking of the first attribute item is the target object at position b in the ranking of the second attribute item).
  • the S220 may include the following steps 21-22:
  • Step 21 Determine whether the total number of rows in the ciphertext table satisfies the preset out-of-order condition, wherein the preset out-of-order condition includes: the total number of rows is equal to the integer power of the preset value.
  • the preset value can be set according to the actual situation, in one case, the preset value is 2.
  • Step 22 If the preset out-of-order condition is judged to be satisfied, shuffle the ciphertext table in units of rows. When the total number of rows in the ciphertext table satisfies the preset out-of-order condition, every row in the ciphertext table has, to a certain extent, an equal probability of being selected as one of a pair of replacement rows, which avoids the situation where some rows have a very small probability of being replaced and the shuffling effect is weakened. For example, if the ciphertext table contains 5 rows, then with the out-of-order process shown in Figure 3 above the fifth row may not participate in the shuffling.
  • S220 may also include steps 23-24: Step 23, pad the ciphertext table with specific rows to obtain a padded table whose total number of rows satisfies the preset out-of-order condition; Step 24, shuffle the padded table in units of rows to obtain the out-of-order table.
  • the row number corresponding to the specific row is different from the row numbers of the rows in the ciphertext table, and the specific row may contain the ciphertext of the specified type of data.
  • The attribute value ciphertexts corresponding to the target attribute item in the out-of-order table are sorted to obtain a sorted table, and the specific rows that were filled in are removed from the sorted table to obtain the target sorting table.
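  • The padding flow (pad, shuffle, sort, remove the filler rows) can be sketched as below. The assumptions are that rows are `(row_label, value)` pairs with labels 0 to n-1, that filler rows receive fresh labels and a dummy value of 0, and that a plain random shuffle stands in for the row-wise out-of-order process. `pad_table` and `remove_filler` are hypothetical names.

```python
import random

def pad_table(table, base=2):
    """Append filler rows until the row count is an integer power of `base`.
    Returns the padded rows and the original row count."""
    if base < 2:
        raise ValueError("base must be at least 2")
    rows = list(table)
    original = len(rows)
    target = 1
    while target < original:
        target *= base
    # Filler rows get labels that do not occur in the original table.
    rows.extend((label, 0) for label in range(original, target))
    return rows, original

def remove_filler(rows, original):
    """Drop every row whose label was introduced by padding."""
    return [row for row in rows if row[0] < original]

table = [(0, 42), (1, 7), (2, 19), (3, 88), (4, 5)]  # 5 rows, not a power of 2
padded, original = pad_table(table)
random.Random(0).shuffle(padded)  # stand-in for the row-wise shuffle
sorted_rows = sorted(padded, key=lambda row: row[1])
target_sorted = remove_filler(sorted_rows, original)
print(len(padded), target_sorted)
```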
  • After the ciphertext table has been shuffled in units of rows, the rows that participated in the out-of-order process fewer times (for example, rows whose row label is greater than 2^m) can be shuffled randomly again, for example by randomly shuffling these rows among themselves, or by randomly inserting them among other rows (for example, among the rows of the table whose row label is not greater than 2^m), so as to better disrupt the positional relationship between these rows and their neighbouring rows.
  • the intermediary can sort the attribute value ciphertext of the target attribute item in the out-of-order table through various sorting algorithms, for example: merge sorting algorithm, heap sorting algorithm, etc.
  • the embodiment of this specification provides a sorting method.
  • the S240 may include: iteratively executing multiple sorting processes, wherein any sorting process includes the following steps 31-33:
  • Step 31 For each table part currently to be sorted in the out-of-order table, determine a reference ciphertext from the attribute value ciphertexts corresponding to the target attribute item contained in that table part.
  • Step 32 Based on the reference ciphertext, group the other attribute value ciphertexts contained in the table part to obtain a first row set whose ciphertexts are larger than the reference ciphertext and a second row set whose ciphertexts are smaller than the reference ciphertext.
  • Step 33 Use each first row set and second row set that contains more than 1 row as a table part to be sorted in the next sorting process, and repeat until the target sorting table is obtained.
  • The target sorting table is obtained when no first row set or second row set corresponding to any table part to be sorted contains more than 1 row.
  • the intermediate party after the intermediate party obtains the out-of-order table, iteratively executes multiple sorting processes on the out-of-order table in response to a query instruction.
  • Initially, the out-of-order table is used as the current table part to be sorted, and one attribute value ciphertext is selected from the attribute value ciphertexts corresponding to the target attribute item contained in the table part as the reference ciphertext.
  • Based on the reference ciphertext, the other attribute value ciphertexts are grouped to obtain a first row set larger than the reference ciphertext and a second row set smaller than the reference ciphertext.
  • The first row set includes the rows of the table part in which the other attribute value ciphertexts of the target attribute item that are larger than the reference ciphertext are located.
  • The second row set includes the rows of the table part in which the other attribute value ciphertexts of the target attribute item that are smaller than the reference ciphertext are located.
  • Among the first row set and the second row set corresponding to the table part, each set with more than 1 row is used as a table part to be sorted in the next sorting process, and steps 31-33 are executed again, until no first row set or second row set corresponding to any table part to be sorted contains more than 1 row, at which point the target sorting table is obtained.
  • The grouping may also yield a third row set whose ciphertexts are equal to the reference ciphertext.
  • The third row set includes the rows of the table part in which the other attribute value ciphertexts of the target attribute item that are equal to the reference ciphertext are located.
  • The rows in the third row set need no further sorting on the target attribute item.
  • Any sorting process may also include arranging the first row set, the second row set, and the third row set corresponding to each table part so that the first row set is placed below the third row set and the second row set is placed above the third row set.
  • Arranging the first row set, the second row set, and the third row set of each table part in this way sorts the rows of the out-of-order table in ascending order of the attribute value ciphertext corresponding to the target attribute item.
  • Alternatively, any sorting process may arrange the first row set, the second row set, and the third row set corresponding to each table part so that the first row set is placed above the third row set and the second row set is placed below the third row set, which sorts the rows of the out-of-order table in descending order of the attribute value ciphertext corresponding to the target attribute item.
  • the sorting of each table part can be executed in parallel.
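  • The grouping in steps 31-33, together with the third row set and the ascending arrangement, can be sketched as follows. Plaintext integers stand in for ciphertexts, the first row of each table part is taken as the reference (an assumption; the text only says one ciphertext is chosen), and the procedure is written recursively for brevity rather than as explicit iterated sorting processes. `partition` and `partition_sort` are hypothetical names.

```python
def partition(part, key):
    """Group the rows of a table part around a reference value."""
    reference = key(part[0])  # assumed choice of reference ciphertext
    first = [row for row in part if key(row) > reference]   # larger than reference
    second = [row for row in part if key(row) < reference]  # smaller than reference
    third = [row for row in part if key(row) == reference]  # equal to reference
    return first, second, third

def partition_sort(table, key):
    """Ascending order: second row set above, third in the middle, first below."""
    if len(table) <= 1:
        return list(table)
    first, second, third = partition(table, key)
    return partition_sort(second, key) + third + partition_sort(first, key)

# Hypothetical (user, salary) rows of an out-of-order table.
out_of_order_table = [("u3", 60), ("u1", 30), ("u4", 60), ("u2", 90), ("u5", 10)]
target_sorting_table = partition_sort(out_of_order_table, key=lambda row: row[1])
print(target_sorting_table)
```

  • Swapping the positions of `first` and `second` in the return statement gives the descending arrangement. The first and second row sets of one table part are independent, which is why they can be sorted in parallel.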
  • the embodiment of this specification provides a multi-party data query device 600 for protecting data privacy.
  • The device is deployed at an intermediary other than the multiple parties, and its schematic block diagram is shown in Figure 6; the device includes:
  • Obtaining module 610 configured to obtain the attribute value ciphertext of N target objects sent by each data owner, so as to obtain a ciphertext table, wherein a row in the ciphertext table corresponds to a target object, and a column corresponds to an attribute item ;
  • the out-of-order module 620 is configured to shuffle the ciphertext table in units of rows to obtain an out-of-order table;
  • the sorting module 630 is configured to sort the attribute value ciphertext corresponding to the target attribute item in the out-of-order table in response to a query instruction for querying sorting-related data for the target attribute item to obtain a target sorting table;
  • the query module 640 is configured to obtain the sorting related data as a query result based on the target sorting table.
  • the intermediary is a cryptographic computing system that includes M executors;
  • the obtaining module 610 is specifically configured such that each executor among the M executors respectively obtains an attribute value fragment sent by each data owner as an attribute value ciphertext, wherein each attribute value fragment is , determined by each data owner dividing the attribute value held by it into M shares;
  • the out-of-order module 620 is specifically configured such that each executor, based on the attribute value ciphertexts it holds, cooperates with the other M-1 executors through a secure multi-party computation (MPC) scheme to shuffle the ciphertext table in units of rows.
  • the sorting-related data is the result of a specified calculation on the attribute value ciphertexts at the specified X positions in the sorting sequence corresponding to the target attribute item;
  • the query module 640 is specifically configured to obtain the target attribute value ciphertexts at the specified X positions from the target sorting table;
  • the out-of-order module 620 is specifically configured to shuffle the ciphertext table through at least one out-of-order process, wherein the out-of-order module 620 executes any current out-of-order process through the following units:
  • the first determination unit (not shown in the figure) is configured to determine at least one pair of replacement rows from the target sub-table for each target sub-table of the ciphertext table;
  • the position replacement unit (not shown in the figure) is configured to perform position replacement with a certain probability for each pair of replacement rows, so as to obtain the disordered subtable corresponding to the target subtable;
  • the second determination unit (not shown in the figure) is configured to determine an output table of the current round of the out-of-order process based on the out-of-order sub-table, so as to form the out-of-order table.
  • the certain probability is determined according to random numbers generated for the pair of replacement rows.
  • when the current out-of-order process is the first out-of-order process, the target subtable is the ciphertext table; when the current out-of-order process is not the first out-of-order process, the target subtable is a table with at least 2 rows among the output tables of the previous out-of-order process.
  • the first determining unit is specifically configured to determine a cycle threshold based on the number of rows of the target subtable
  • each loop process includes, at least based on the current loop count, selecting at least one pair of replacement rows from the target subtable; until a loop end condition related to the loop threshold is reached.
  • the first determining unit is specifically configured to select, from the target subtable, the row whose row label equals the current cycle number and the row whose row label equals the sum of the current cycle number and the cycle threshold, as a pair of replacement rows.
  • the second determining unit is specifically configured to use the cycle threshold to divide the out-of-order sub-table, so as to determine the output table of the current round of out-of-order process.
  • the row numbers corresponding to each row in the ciphertext table are expressed in binary
  • the first determination unit is specifically configured to select two rows whose corresponding row labels are identical except for the i-th digit from the target subtable as a pair of replacement rows, wherein the i is equal to the current The number of cycles, the target subtable is the ciphertext table itself.
  • the sorting module 630 is specifically configured to iteratively execute multiple sorting processes, wherein any sorting process includes:
  • the grouping also obtains a third row set equal to the reference ciphertext.
  • the sorting module 630 is also specifically configured to arrange and place the first row set, the second row set and the third row set corresponding to each table part, so that the first row set is placed in The lower part of the third row set, the second row set is placed on the upper part of the third row set.
  • the out-of-order module 620 includes:
  • a judging unit (not shown in the figure), configured to judge whether the total number of rows of the ciphertext table satisfies a preset out-of-order condition, wherein the preset out-of-order condition includes: the total number of rows is equal to an integer power of a preset value ;
  • a filling unit (not shown in the figure), configured to fill the ciphertext table with a specific row if it is judged that the preset out-of-sequence condition is not satisfied, to obtain a filled table, the total number of rows of the filled table meets the preset Out of order condition;
  • the shuffle unit (not shown in the figure) is configured to shuffle the filled table in row units to obtain a shuffle table.
  • the sorting module 630 is specifically configured as:
  • the specific rows filled in the sorted table are removed to obtain the target sorted table.
  • the foregoing device embodiments correspond to the method embodiments, and for specific descriptions, refer to the description of the method embodiments, and details are not repeated here.
  • the device embodiment is obtained based on the corresponding method embodiment, and has the same technical effect as the corresponding method embodiment. For specific description, please refer to the corresponding method embodiment.
  • the embodiment of this specification also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to execute the multi-party data query method for protecting data privacy provided in this specification.
  • the embodiment of this specification also provides a computing device, including a memory and a processor, wherein executable code is stored in the memory, and when the processor executes the executable code, the multi-party data query method for protecting data privacy provided in this specification is implemented.
  • each embodiment in this specification is described in a progressive manner, the same and similar parts of each embodiment can be referred to each other, and each embodiment focuses on the differences from other embodiments.
  • in particular, the storage medium and computing device embodiments are basically similar to the method embodiments, so their description is relatively simple; for relevant parts, refer to the description of the method embodiments.
  • the functions described in the embodiments of the present invention may be implemented by hardware, software, firmware or any combination thereof.
  • the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

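A multi-party data query method and apparatus for protecting data privacy. The multiple parties include a plurality of data owners, each holding attribute values of several attribute items of N target objects. The method is performed by an intermediate party other than the multiple parties and includes: obtaining attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table (S210), where one row of the ciphertext table corresponds to one target object and one column corresponds to one attribute item; shuffling the ciphertext table row by row to obtain a shuffled table (S220); in response to a query instruction that queries sorting-related data for a target attribute item, sorting the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorted table (S230); and obtaining the sorting-related data as a query result based on the target sorted table (S240).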

Description

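Multi-party data query method and apparatus for protecting data privacy
This application claims priority to Chinese patent application No. 202111621978.6, entitled "Multi-party data query method and apparatus for protecting data privacy", filed with the China National Intellectual Property Administration on December 28, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
This specification relates to the field of data security technologies, and in particular to a multi-party data query method and apparatus for protecting data privacy.
Background
In the context of big data, business data from different data parties often needs to be processed together. For example, in a user information analysis scenario, party A holds physical condition data such as users' height and weight, party B holds users' salary data, and party C holds users' loan data. When multi-party data is processed jointly, the protection and security of data privacy becomes a concern.
Data processing inevitably involves joint sorting. For example, party A may need to query the total salary of the users whose salaries rank in the top N, the loan situation of the users whose salaries rank in the top N, or the total salary of the users whose heights rank in the top M. In these cases the data held by A, B and C must be sorted jointly, and the query result is derived from the sorting result. If A, B and C upload their users' private data in plaintext directly to a third party, which aggregates the data into a table, sorts the relevant data in the table and derives the requested query results from the sorting result, the process may leak the private data of each party's users.
How to provide a multi-party data query method that protects data privacy has therefore become a problem to be solved.
Summary
One or more embodiments of this specification provide a multi-party data query method and apparatus for protecting data privacy, so as to protect the private data of multiple parties.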
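According to a first aspect, a multi-party data query method for protecting data privacy is provided. The multiple parties include a plurality of data owners, each holding attribute values of several attribute items of N target objects. The method is performed by an intermediate party other than the multiple parties and includes:
obtaining attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table, where one row of the ciphertext table corresponds to one target object and one column corresponds to one attribute item;
shuffling the ciphertext table row by row to obtain a shuffled table;
in response to a query instruction that queries sorting-related data for a target attribute item, sorting the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorted table, and obtaining the sorting-related data as a query result based on the target sorted table.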
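In an optional implementation, the intermediate party is a ciphertext computing system that includes M executing parties;
obtaining the attribute value ciphertexts of the N target objects sent by each data owner includes:
each of the M executing parties obtaining one attribute value share sent by each data owner as an attribute value ciphertext, where each attribute value share is determined by the data owner dividing an attribute value it holds into M shares;
shuffling the ciphertext table row by row includes:
each executing party, based on the attribute value ciphertexts it holds, shuffling the ciphertext table row by row jointly with the other M-1 executing parties through a secure multi-party computation (MPC) scheme.
In an optional implementation, the sorting-related data is the result of performing a specified calculation on the attribute value ciphertexts at specified X positions in the sorted sequence corresponding to the target attribute item;
obtaining the sorting-related data as the query result includes:
obtaining the target attribute value ciphertexts at the specified X positions from the target sorted table;
performing the specified calculation on the target attribute value ciphertexts.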
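In an optional implementation, shuffling the ciphertext table row by row includes:
shuffling the ciphertext table through at least one shuffle round, where any current shuffle round includes:
for each target sub-table of the ciphertext table, determining at least one pair of swap rows from the target sub-table;
for each pair of swap rows, performing a position swap with a certain probability to obtain the shuffled sub-table corresponding to the target sub-table, and determining the output tables of the current shuffle round based on the shuffled sub-table, the output tables being used to form the shuffled table.
In an optional implementation, the certain probability is determined by a random number generated for the pair of swap rows.
In an optional implementation, when the current shuffle round is the first shuffle round, the target sub-table is the ciphertext table; when the current shuffle round is not the first shuffle round, the target sub-table is a table with at least 2 rows among the output tables of the previous shuffle round.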
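In an optional implementation, determining at least one pair of swap rows from the target sub-table includes:
determining a loop threshold based on the number of rows of the target sub-table;
iteratively executing a loop multiple times, where each iteration includes selecting at least one pair of swap rows from the target sub-table based at least on the current loop count, until a loop end condition related to the loop threshold is reached.
In an optional implementation, selecting at least one pair of swap rows from the target sub-table based at least on the current loop count includes:
selecting, from the target sub-table, the row whose row label equals the current loop count and the row whose row label equals the sum of the current loop count and the loop threshold, as a pair of swap rows.
In an optional implementation, determining the output tables of the current shuffle round based on the shuffled sub-table includes:
splitting the shuffled sub-table using the loop threshold to determine the output tables of the current shuffle round.
In an optional implementation, the loop threshold has the form 2^m, where m and the number of rows n of the target sub-table satisfy: 2^m < n <= 2^(m+1).
In an optional implementation, the row label of each row in the ciphertext table is expressed in binary;
selecting at least one pair of swap rows from the target sub-table based at least on the current loop count includes:
selecting, from the target sub-table, two rows whose row labels are identical in every bit except the i-th bit, as a pair of swap rows, where i equals the current loop count and the target sub-table is the ciphertext table itself.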
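In an optional implementation, sorting the attribute value ciphertexts corresponding to the target attribute item in the shuffled table includes:
iteratively executing multiple sorting passes, where any sorting pass includes:
for each table part of the shuffled table that is currently to be sorted, determining a reference ciphertext from the attribute value ciphertexts of the target attribute item contained in the table part;
grouping, based on the reference ciphertext, the other attribute value ciphertexts of the target attribute item contained in the table part, to obtain a first row set greater than the reference ciphertext and a second row set less than the reference ciphertext;
taking those first row sets and second row sets of the table parts that contain more than 1 row as the table parts to be sorted in the next sorting pass, until the target sorted table is obtained.
In an optional implementation, the grouping also yields a third row set equal to the reference ciphertext.
In an optional implementation, any sorting pass further includes arranging the first row set, the second row set and the third row set of each table part so that the first row set is placed below the third row set and the second row set is placed above the third row set.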
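In an optional implementation, shuffling the ciphertext table row by row includes:
judging whether the total number of rows of the ciphertext table satisfies a preset shuffle condition, where the preset shuffle condition includes: the total number of rows equals an integer power of a preset value;
if the preset shuffle condition is not satisfied, padding the ciphertext table with specific rows to obtain a padded table whose total number of rows satisfies the preset shuffle condition;
shuffling the padded table row by row to obtain the shuffled table.
In an optional implementation, obtaining the target sorted table includes:
sorting the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a sorted table;
removing the padded specific rows from the sorted table to obtain the target sorted table.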
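According to a second aspect, a multi-party data query apparatus for protecting data privacy is provided. The multiple parties include a plurality of data owners, each holding attribute values of several attribute items of N target objects. The apparatus is deployed at an intermediate party other than the multiple parties and includes:
an obtaining module, configured to obtain attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table, where one row of the ciphertext table corresponds to one target object and one column corresponds to one attribute item;
a shuffle module, configured to shuffle the ciphertext table row by row to obtain a shuffled table;
a sorting module, configured to sort, in response to a query instruction that queries sorting-related data for a target attribute item, the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorted table;
a query module, configured to obtain the sorting-related data as a query result based on the target sorted table.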
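In an optional implementation, the intermediate party is a ciphertext computing system that includes M executing parties;
the obtaining module is specifically configured such that each of the M executing parties obtains one attribute value share sent by each data owner as an attribute value ciphertext, where each attribute value share is determined by the data owner dividing an attribute value it holds into M shares;
the shuffle module is specifically configured such that each executing party, based on the attribute value ciphertexts it holds, shuffles the ciphertext table row by row jointly with the other M-1 executing parties through a secure multi-party computation (MPC) scheme.
In an optional implementation, the sorting-related data is the result of performing a specified calculation on the attribute value ciphertexts at specified X positions in the sorted sequence corresponding to the target attribute item;
the query module is specifically configured to obtain the target attribute value ciphertexts at the specified X positions from the target sorted table,
and to perform the specified calculation on the target attribute value ciphertexts.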
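In an optional implementation, the shuffle module is specifically configured to shuffle the ciphertext table through at least one shuffle round, where the shuffle module executes any current shuffle round through the following units:
a first determining unit, configured to determine, for each target sub-table of the ciphertext table, at least one pair of swap rows from the target sub-table;
a position swap unit, configured to perform, for each pair of swap rows, a position swap with a certain probability to obtain the shuffled sub-table corresponding to the target sub-table;
a second determining unit, configured to determine the output tables of the current shuffle round based on the shuffled sub-table, the output tables being used to form the shuffled table.
In an optional implementation, the certain probability is determined by a random number generated for the pair of swap rows.
In an optional implementation, when the current shuffle round is the first shuffle round, the target sub-table is the ciphertext table; when the current shuffle round is not the first shuffle round, the target sub-table is a table with at least 2 rows among the output tables of the previous shuffle round.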
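In an optional implementation, the first determining unit is specifically configured to determine a loop threshold based on the number of rows of the target sub-table,
and to iteratively execute a loop multiple times, where each iteration includes selecting at least one pair of swap rows from the target sub-table based at least on the current loop count, until a loop end condition related to the loop threshold is reached.
In an optional implementation, the first determining unit is specifically configured to select, from the target sub-table, the row whose row label equals the current loop count and the row whose row label equals the sum of the current loop count and the loop threshold, as a pair of swap rows.
In an optional implementation, the second determining unit is specifically configured to split the shuffled sub-table using the loop threshold to determine the output tables of the current shuffle round.
In an optional implementation, the loop threshold has the form 2^m, where m and the number of rows n of the target sub-table satisfy: 2^m < n <= 2^(m+1).
In an optional implementation, the row label of each row in the ciphertext table is expressed in binary;
the first determining unit is specifically configured to select, from the target sub-table, two rows whose row labels are identical in every bit except the i-th bit, as a pair of swap rows, where i equals the current loop count and the target sub-table is the ciphertext table itself.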
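In an optional implementation, the sorting module is specifically configured to iteratively execute multiple sorting passes, where any sorting pass includes:
for each table part of the shuffled table that is currently to be sorted, determining a reference ciphertext from the attribute value ciphertexts of the target attribute item contained in the table part;
grouping, based on the reference ciphertext, the other attribute value ciphertexts of the target attribute item contained in the table part, to obtain a first row set greater than the reference ciphertext and a second row set less than the reference ciphertext;
taking those first row sets and second row sets of the table parts that contain more than 1 row as the table parts to be sorted in the next sorting pass, until the target sorted table is obtained.
In an optional implementation, the grouping also yields a third row set equal to the reference ciphertext.
In an optional implementation, the sorting module is further specifically configured to arrange the first row set, the second row set and the third row set of each table part so that the first row set is placed below the third row set and the second row set is placed above the third row set.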
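In an optional implementation, the shuffle module includes:
a judging unit, configured to judge whether the total number of rows of the ciphertext table satisfies a preset shuffle condition, where the preset shuffle condition includes: the total number of rows equals an integer power of a preset value;
a padding unit, configured to pad the ciphertext table with specific rows if the preset shuffle condition is not satisfied, to obtain a padded table whose total number of rows satisfies the preset shuffle condition;
a shuffle unit, configured to shuffle the padded table row by row to obtain the shuffled table.
In an optional implementation, the sorting module is specifically configured to:
sort the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a sorted table;
remove the padded specific rows from the sorted table to obtain the target sorted table.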
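According to a third aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to perform the method of the first aspect.
According to a fourth aspect, a computing device is provided, including a memory and a processor, where executable code is stored in the memory and the processor implements the method of the first aspect when executing the executable code.
According to the method and apparatus provided in the embodiments of this specification, each data owner sends attribute value ciphertexts to the intermediate party, which avoids leaking the attribute values. Moreover, after obtaining the ciphertext table built from the attribute value ciphertexts sent by the data owners, the intermediate party first shuffles the ciphertext table row by row to scramble the order of its rows and obtain a shuffled table, and then sorts the attribute value ciphertexts corresponding to the target attribute item in the shuffled table. Even when the attribute value ciphertexts of several target attribute items are sorted, the row order is scrambled before each sort, so the positions of rows cannot be tracked from one sort to the next. This avoids exposing the order relationship of rows across sorted columns (for example, that the target object ranked a-th in the sorted column of a first attribute item is the target object ranked b-th in the sorted column of a second attribute item).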
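Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1A is a schematic diagram of an implementation framework of an embodiment disclosed in this specification;
FIG. 1B is a schematic diagram of an implementation framework of an embodiment disclosed in this specification;
FIG. 2 is a schematic flowchart of the multi-party data query method for protecting data privacy provided by an embodiment;
FIG. 3 is a schematic diagram of a shuffle round provided by an embodiment;
FIG. 4 is a schematic diagram of a shuffle round provided by an embodiment;
FIG. 5 is a schematic diagram of a shuffle scenario provided by an embodiment;
FIG. 6 is a schematic block diagram of the multi-party data query apparatus for protecting data privacy provided by an embodiment.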
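Detailed Description
The technical solutions of the embodiments of this specification are described in detail below with reference to the drawings.
As mentioned above, data privacy and security is a prominent concern when multiple parties process data jointly, and joint sorting of data is often unavoidable. For example, in a user information analysis scenario, party A holds the attribute values of physical condition attributes such as users' height and weight, party B holds the attribute values of the users' salary attribute (the specific salary amounts), and party C holds the attribute values of the users' loan attribute (the specific loan amounts). Suppose party A needs to query the total salary of the users whose salaries rank in the top N, the loan situation of those users, or how the total salary of the users whose heights rank in the top M compares with the total salary of the users whose weights rank in the top M. The data held by A, B and C then has to be sent to a third party, which sorts the data jointly and derives the query result from the sorting result.
When sorting the data sent by A, B and C during a query, the third party may use an existing sorting approach similar to order-preserving encryption. In that approach the third party obtains the attribute value ciphertexts uploaded by each party (A, B and C) to build the table to be sorted. The user identifiers and the attribute values in the table are ciphertexts (for example, one row stores one user's attribute value ciphertexts for several attribute items, and one column stores all users' attribute value ciphertexts for one attribute item), but the order relationship between rows is in plaintext. Subsequently, on receiving a query request that indicates sorting on a target attribute item, the third party sorts the attribute value ciphertexts of that target attribute item in the table directly.
With this approach, when the attribute value ciphertexts of several columns (several target attribute items) are sorted, the order relationship of rows across the sorted columns is exposed. For example, if the target attribute items include users' weight and salary, then after sorting the table by weight and by salary separately, the third party may learn personal private information such as that the person ranked 5th in weight is ranked 1st in salary, which is not allowed when private data must be kept confidential.
The processing method provided in the embodiments of this specification is therefore designed mainly for multi-party data query scenarios in which the parties hold private data (attribute values of several attribute items of N target objects).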
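FIG. 1A is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. The scenario shows a plurality of data owners, for example party A, party B and party C, and an intermediate party D, which may be implemented as a ciphertext computing system. Each data owner and the intermediate party may be embodied as a device, platform, server or device cluster with computing and processing capabilities. Each data owner holds its own private data, namely attribute values of several attribute items of N target objects. Intermediate party D receives the attribute value ciphertexts of the N target objects sent by each data owner (parties A, B and C), assembles them into a ciphertext table, and then performs the subsequent data query flow. No data owner wants the data it holds (the attribute values of several attribute items of N target objects) to leak, or the order relationship of rows (the attribute values of each target object) across sorted columns to be exposed.
To this end, according to the embodiments of this specification, each data owner processes the attribute values of the several attribute items of the N target objects it holds according to the data upload requirements of intermediate party D to obtain the attribute value ciphertext of each attribute value, and then uploads its attribute value ciphertexts to intermediate party D. Intermediate party D obtains the attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table. One row of the ciphertext table corresponds to one target object and one column corresponds to one attribute item; that is, one row stores one target object's attribute value ciphertexts for several attribute items, and one column stores all target objects' attribute value ciphertexts for one attribute item. The ciphertext table may be as shown in Table 1 below.
Table 1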
Figure PCTCN2022125462-appb-000001
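Intermediate party D shuffles the ciphertext table row by row, that is, it performs a row shuffle that scrambles the order of the rows so that the positions of rows in the ciphertext table cannot be tracked, and obtains the table after shuffling, namely the shuffled table.
In response to a query instruction that queries sorting-related data for a target attribute item, intermediate party D sorts the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorted table, in which the rows may be ordered by the attribute value ciphertexts of the target attribute item from largest to smallest (or from smallest to largest). Based on the target sorted table, the sorting-related data is obtained as the query result, and the query result is then returned to the initiator of the query instruction.
It can be understood that the sorting performed by intermediate party D in response to the query instruction may be an intermediate step in determining the query result. In that case the target sorted table obtained by sorting is generally not returned to the initiator of the query instruction.
In one implementation, intermediate party D may be implemented as a ciphertext computing system that includes M executing parties. The M executing parties may run in trusted execution environments (TEEs), with each executing party implemented by one TEE, in which case an executing party may be called a trusted executing party. Alternatively, the M executing parties may run in ordinary execution environments. As shown in FIG. 1B, the number M of executing parties may be set to 3. Each data owner divides each attribute value it holds into 3 shares to obtain 3 attribute value shares per attribute value, and sends one share of each attribute value to each executing party. Each of the 3 executing parties obtains one attribute value share sent by each data owner as an attribute value ciphertext and, based on the attribute value ciphertexts it holds, shuffles the ciphertext table row by row jointly with the other 2 executing parties through a secure multi-party computation (MPC) scheme.
Each executing party obtains only part of an attribute value (an attribute value share) and cannot obtain the attribute value plaintext, which avoids leaking the plaintext at any executing party. If one executing party, or fewer than a specified number of executing parties, is successfully attacked, the attacker still cannot obtain the data owners' attribute value plaintext through the compromised executing parties, because each of them holds only part of each attribute value.
In addition, when the executing parties run in trusted execution environments (TEEs), the ciphertext computing system protects its data to a higher degree and resists attacks better.
In this embodiment, the row order of the table is scrambled before sorting on any target attribute item, so the positions of rows cannot be tracked from one sort to the next. This avoids exposing the order relationship of rows across sorted columns (for example, that the target object ranked a-th in the sorted column of a first attribute item is the target object ranked b-th in the sorted column of a second attribute item).
The multi-party data query method for protecting data privacy provided in this specification is described in detail below with reference to specific embodiments.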
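FIG. 2 shows a flowchart of the multi-party data query method for protecting data privacy in an embodiment of this specification. The multiple parties include a plurality of data owners, each holding attribute values of several attribute items of N target objects, and the method is performed by an intermediate party other than the data owners. The method includes the following steps S210-S240:
S210: Obtain the attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table. One row of the ciphertext table corresponds to one target object and one column corresponds to one attribute item; that is, one row stores one target object's attribute value ciphertexts for several attribute items, and one column stores all target objects' attribute value ciphertexts for one attribute item.
In one implementation, before the data owners A_i send the attribute values of the N target objects they hold to the intermediate party according to its data upload requirements, the data owners A_i need to align the attribute values of the N target objects among themselves, so that the query flow on the intermediate party's side can proceed correctly. The alignment may be that all data owners A_i arrange the attribute values of the target objects in a preset order of target objects: for example, the attribute values of all attribute items of target object 1 are in the first row, those of target object 2 are in the second row, and so on, with those of target object N in the N-th row. In one implementation, the target object may be one of the following: a user, an item.
It can be understood that when each data owner A_i sends the aligned attribute value ciphertexts of the several attribute items of the N target objects to the intermediate party, it also needs to send the corresponding object identifier ciphertexts of the N target objects.
S220: Shuffle the ciphertext table row by row to obtain a shuffled table. The intermediate party may shuffle the ciphertext table row by row based on a preset shuffle scheme or in a random manner, that is, scramble the order of the rows of the ciphertext table. In the shuffled table it is not possible to know which target object's attribute value ciphertexts a given row stores; within each row, the correspondence between the target object and the attribute value ciphertexts of its attribute items is unchanged.
S230: In response to a query instruction that queries sorting-related data for a target attribute item, sort the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorted table. The query instruction may be initiated by any data owner, or by a user of a service provided jointly by the data owners, where the service relates to the attribute values of the target objects held by the data owners. The query instruction contains at least information indicating that the attribute value ciphertexts of the target attribute item are to be sorted. In other words, the query result of the query instruction, namely the sorting-related data, has to be determined from the result of sorting the attribute value ciphertexts of the target attribute item.
After sorting, the rows of the target sorted table are ordered by the attribute value ciphertexts of the target attribute item from largest to smallest (or from smallest to largest). For example, the row containing the largest attribute value ciphertext of the target attribute item is in row 1, the row containing the second largest is in row 2, and so on, with the row containing the smallest in row N. Conversely, the row containing the smallest attribute value ciphertext may be in row 1, the row containing the second smallest in row 2, and so on, with the row containing the largest in row N.
S240: Obtain the sorting-related data as the query result based on the target sorted table. Once the intermediate party has the target sorted table, it knows the order of the attribute value ciphertexts of the target attribute item. On that basis it performs the logical analysis required by the query instruction to obtain the sorting-related data, that is, the query result of the query instruction, and sends the query result to the initiator of the query instruction.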
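In one implementation of this specification, the sorting-related data is the result of performing a specified calculation on the attribute value ciphertexts at specified X positions in the sorted sequence corresponding to the target attribute item;
S240 may then include step 01: obtaining the target attribute value ciphertexts at the specified X positions from the target sorted table; and step 02: performing the specified calculation on the target attribute value ciphertexts.
The specified calculation includes, but is not limited to, summation, averaging, taking a product, and comparison.
In this case the data owner that initiates the query instruction may be a data owner that does not hold the target attribute item.
For example, the data owners are parties A, B and C, and the target objects are users 1-N. Party A holds the attribute values of the physical condition attribute items height, weight and age of users 1-N; party B holds the attribute values of the salary attribute item (the specific salary amounts) and of the years-of-service attribute item (the specific number of years) of users 1-N; party C holds the attribute values of the loan attribute item (the specific loan amounts) of users 1-N.
The query instruction is sent by party A and queries the total of the top 10 salaries. Accordingly, the sorting-related data is the result of summing the attribute value ciphertexts (salary amount ciphertexts) at the first 10 positions of the sorted sequence corresponding to the salary attribute item. The intermediate party obtains the attribute value ciphertexts of users 1-N uploaded by A, B and C to obtain the ciphertext table; shuffles the ciphertext table row by row to obtain the shuffled table; and sorts the attribute value ciphertexts of the salary attribute item in the shuffled table to obtain the target sorted table. From the attribute value ciphertexts of the salary attribute item in the target sorted table it determines those at the first 10 positions, that is, the top 10 salary amount ciphertexts, as the target attribute value ciphertexts, and sums them in ciphertext form to obtain the total salary ciphertext, which is the query result. The query result is then sent to party A.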
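In another implementation, the sorting-related data is the result of performing a specified calculation on the attribute value ciphertexts of a specified attribute item of the target objects at specified h positions in the sorted sequence corresponding to the target attribute item, where the specified attribute item is different from the target attribute item. In this case the data owner that initiates the query instruction may be a data owner that does not hold the specified attribute item and/or the target attribute item.
In this implementation, the intermediate party determines, from the attribute value ciphertexts of the specified attribute item in the target sorted table, the attribute value ciphertexts at the specified Y positions, and performs the specified calculation on them to obtain the calculation result as the query result. Continuing the example above, party A needs to query the total loans of the users whose salaries rank in the top Y. Accordingly, the query instruction sent by party A queries the sum of the attribute value ciphertexts of the loan attribute item (the total loan amount) of the users ranked in the top 10 by salary. In other words, the query instruction queries the sum of the loan attribute value ciphertexts of the first 10 users in the sorted sequence of the salary attribute item in the corresponding sorted table (the target sorted table).
Accordingly, the intermediate party sorts the attribute value ciphertexts of the salary attribute item in the shuffled table to obtain the sorted table for the salary attribute item, that is, the target sorted table. From the target sorted table it determines the users at the first 10 positions of the sorted sequence of the salary attribute item as the target users, determines the attribute value ciphertexts of the loan attribute item of the target users, and sums those ciphertexts to obtain the sum as the query result.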
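The top-X queries above (sum of the top 10 salaries, or sum of the loans of the 10 highest-paid users) can be sketched in plaintext as follows. This is only an illustration of the S220-S240 control flow: the function name `top_x_aggregate`, the dictionary row layout and the use of `random.shuffle` are assumptions for this sketch, and a real deployment would operate on ciphertexts or secret shares rather than plain integers.

```python
import random

def top_x_aggregate(table, sort_by, aggregate, x, rng=None):
    """Plaintext stand-in: shuffle rows, sort by one column, sum another over the top x rows."""
    rng = rng or random.Random()
    rows = list(table)          # work on a copy of the "ciphertext table"
    rng.shuffle(rows)           # S220: row shuffle (stand-in for the MPC shuffle)
    rows.sort(key=lambda row: row[sort_by], reverse=True)  # S230: largest first
    return sum(row[aggregate] for row in rows[:x])         # S240: specified calculation

# Hypothetical four-user table with a salary column and a loan column.
table = [
    {"salary": 50, "loan": 5},
    {"salary": 90, "loan": 1},
    {"salary": 70, "loan": 3},
    {"salary": 30, "loan": 9},
]
top2_salary_sum = top_x_aggregate(table, "salary", "salary", 2)
top2_loan_sum = top_x_aggregate(table, "salary", "loan", 2)
```

Passing the same column as `sort_by` and `aggregate` corresponds to the first implementation; passing different columns corresponds to the "specified attribute item" variant.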
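In another implementation, the query instruction may also query the result of comparing a first calculated value with a second calculated value. The first calculated value is determined from the attribute value ciphertexts of a specified attribute item at specified h positions in the sorted table of a first attribute item, using a first calculation method; the second calculated value is determined from the attribute value ciphertexts of the specified attribute item at specified h positions in the sorted table of a second attribute item, using the first calculation method. That is, the sorting-related data is the result of a specified calculation on the attribute value ciphertexts of the specified attribute item of the target objects at the specified h positions of the sorted sequence of the first attribute item and those of the target objects at the specified h positions of the sorted sequence of the second attribute item. Both the first attribute item and the second attribute item serve as target attribute items.
The first attribute item, the second attribute item and the specified attribute item are all different. In this case the data owner that initiates the query instruction may be any data owner that does not hold the specified attribute item, the first attribute item and/or the second attribute item.
In this situation, before sorting on each target attribute item (the first attribute item or the second attribute item), the intermediate party must first shuffle the table to be sorted (for example the ciphertext table or the previously sorted table). Specifically, the intermediate party first sorts on the first attribute item as the target attribute item: it sorts the attribute value ciphertexts of the first attribute item in the shuffled table to obtain a first target sorted table; determines from it the target objects at the specified h positions of the sorted sequence of the first attribute item as a first group of objects; determines the attribute value ciphertexts of the specified attribute item of the first group of objects as a first group of attribute value ciphertexts; and obtains the first calculated value from the first group of attribute value ciphertexts and the first calculation method.
It then sorts on the second attribute item as the target attribute item: it shuffles the ciphertext table (or the first target sorted table) row by row again to obtain a shuffled first table and, in response to the query instruction, sorts the attribute value ciphertexts of the second attribute item in the first table to obtain a second target sorted table. From the second target sorted table it determines the target objects at the specified h positions of the sorted sequence of the second attribute item as a second group of objects; determines the attribute value ciphertexts of the specified attribute item of the second group of objects as a second group of attribute value ciphertexts; and obtains the second calculated value from the second group of attribute value ciphertexts and the first calculation method.
The first calculated value and the second calculated value are compared to obtain the comparison result (the sorting-related data) as the query result, which is returned to the initiator of the query instruction.
Continuing the example above, the query instruction indicates that party A needs to compare the total salary of the first 10 users in the sorted sequence of height (the first attribute item) with the total salary of the first 10 users in the sorted sequence of weight (the second attribute item), where both the height attribute item and the weight attribute item are target attribute items. In other words, the query instruction requires sorting on the height attribute item and on the weight attribute item separately, and the sorting-related data is the result of comparing the total salary of the 10 tallest users with the total salary of the 10 heaviest users.
Accordingly, the intermediate party sorts the attribute value ciphertexts of the height attribute item in the shuffled table to obtain the sorted table for height, that is, the first target sorted table; determines from it the attribute value ciphertexts of the salary attribute item of the 10 tallest users as the first group of attribute value ciphertexts; and sums the first group in ciphertext form to obtain a first sum. The intermediate party then shuffles the ciphertext table (or the first target sorted table) row by row to obtain the shuffled first table; in response to the query instruction, sorts the attribute value ciphertexts of the weight attribute item in the first table to obtain the second target sorted table; determines from the second target sorted table the attribute value ciphertexts of the salary attribute item of the 10 heaviest users as the second group of attribute value ciphertexts; sums the second group in ciphertext form to obtain a second sum; and compares the first sum with the second sum to obtain the query result (the sorting-related data).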
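The two-sort comparison can be sketched in plaintext as below, with a fresh shuffle before each sort, which is the point of the scheme. The helper names `shuffled_top_sum` and `compare_top_sums`, the row layout and the sample numbers are hypothetical; the sketch works on plain integers, whereas the described system would do the shuffle, sort, sum and comparison on ciphertexts.

```python
import random

def shuffled_top_sum(table, sort_by, aggregate, x, rng):
    """Shuffle, sort descending by `sort_by`, then sum `aggregate` over the top x rows."""
    rows = list(table)
    rng.shuffle(rows)  # re-shuffle before every sort so row positions cannot be linked
    rows.sort(key=lambda row: row[sort_by], reverse=True)
    return sum(row[aggregate] for row in rows[:x])

def compare_top_sums(table, first_attr, second_attr, aggregate, x, rng=None):
    """Return -1, 0 or 1 as the first sum is smaller than, equal to or larger than the second."""
    rng = rng or random.Random()
    first = shuffled_top_sum(table, first_attr, aggregate, x, rng)
    second = shuffled_top_sum(table, second_attr, aggregate, x, rng)
    return (first > second) - (first < second)

# Hypothetical data: salaries of the 2 tallest users versus the 2 heaviest users.
users = [
    {"height": 180, "weight": 60, "salary": 10},
    {"height": 170, "weight": 90, "salary": 40},
    {"height": 160, "weight": 80, "salary": 30},
    {"height": 150, "weight": 50, "salary": 20},
]
result = compare_top_sums(users, "height", "weight", "salary", 2)
```

Only the comparison outcome leaves the function, mirroring the remark that the sorted tables themselves are normally not returned to the querying party.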
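In this embodiment, the row order of the table is scrambled before sorting on any target attribute item, so the positions of rows cannot be tracked from one sort to the next. This avoids exposing the order relationship of rows across sorted columns (for example, that the target object ranked a-th in the sorted column of a first attribute item is the target object ranked b-th in the sorted column of a second attribute item). Accordingly, the intermediate party, as a third party, can obtain neither the attribute value plaintext nor the order relationship of rows across sorted columns.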
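In another embodiment of this specification, to protect the private data (attribute values) of the data owners, the intermediate party is a ciphertext computing system that includes M executing parties. The executing parties may run in trusted execution environments (TEEs), in which case the ciphertext computing system is a Trusted Cryptographic Computing (TECC) system; alternatively, the executing parties may run in ordinary execution environments.
Each data owner divides each attribute value it holds into M shares to obtain M attribute value shares per attribute value, and sends one share of each attribute value to each executing party. Accordingly, S210 is configured as follows: each of the M executing parties obtains one attribute value share sent by each data owner as an attribute value ciphertext, where each attribute value share is determined by the data owner dividing an attribute value it holds into M shares.
Further, S220 is configured as follows: each executing party, based on the attribute value ciphertexts it holds, shuffles the ciphertext table row by row jointly with the other M-1 executing parties through a secure multi-party computation (MPC) scheme.
In this embodiment each executing party obtains only part of an attribute value (an attribute value share) and cannot obtain the attribute value plaintext, which avoids leaking the plaintext at any executing party.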
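The source does not say how an attribute value is divided into M shares. One common choice is additive secret sharing over a ring, sketched below under that assumption; `MODULUS`, `split_into_shares` and `reconstruct` are hypothetical names introduced only for illustration.

```python
import secrets

MODULUS = 2 ** 64  # assumed ring size; the source does not specify one

def split_into_shares(value, m):
    """Split `value` into m additive shares that sum to `value` modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(m - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def reconstruct(shares):
    """Recombine all shares; any strict subset looks uniformly random."""
    return sum(shares) % MODULUS

# A data owner splits a salary of 12345 for M = 3 executing parties.
salary_shares = split_into_shares(12345, 3)
```

Each executing party would receive exactly one element of `salary_shares`; the plaintext is recoverable only if all M shares are combined.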
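To shuffle the ciphertext table row by row, the intermediate party may use any table shuffling method from the related art, or the shuffling methods provided in this specification, which are described in detail below.
In an embodiment of this specification, S220 may include the following step: shuffling the ciphertext table through at least one shuffle round, where any current shuffle round may include the following steps 11-13:
Step 11: For each target sub-table of the ciphertext table, determine at least one pair of swap rows from the target sub-table.
Step 12: For each pair of swap rows, perform a position swap with a certain probability to obtain the shuffled sub-table corresponding to the target sub-table.
Step 13: Determine the output tables of the current shuffle round based on the shuffled sub-table, the output tables being used to form the shuffled table.
In one implementation, the certain probability is determined by a random number generated for the pair of swap rows. That is, after determining at least one pair of swap rows in the target sub-table, the intermediate party generates a random number for each pair and performs the position swap based on that random number. In one case the random number takes a first value or a second value: the first value may be 1, meaning the positions of the pair of swap rows are swapped, and the second value may be 0, meaning they are not. In another case the random number may be a value between 0 and 1, where a larger value means a higher probability of swapping the positions of the pair. It can be understood that swapping the positions of a pair of swap rows means swapping the target objects' attribute value ciphertexts stored in the two rows. For example, if the positions of the first row and the fourth row are to be swapped, the attribute value ciphertexts of the target object stored in the first row are stored into the fourth row, and those stored in the fourth row are stored into the first row.
In one case the intermediate party is implemented as a ciphertext computing system that includes M executing parties, and the random number generated for each pair of swap rows is determined by each of the M executing parties, based on a random number share it generates, jointly with the other M-1 executing parties through an MPC scheme. Each executing party thus generates only a share of the random number and cannot learn the random number itself, so no executing party can learn whether a given pair of swap rows was actually swapped.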
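A plaintext sketch of the swap decision follows, assuming the 0/1 form of the random number and assuming the M bit shares are combined by XOR (the source only says the parties determine the number jointly). In a real MPC protocol the combined bit would never be revealed to any single party and the swap would be applied obliviously on shares; `joint_random_bit` and `maybe_swap` are hypothetical helper names.

```python
import secrets

def joint_random_bit(m):
    """Combine one random bit share per executing party into a single bit via XOR."""
    bit = 0
    for _ in range(m):
        bit ^= secrets.randbits(1)  # each party contributes one share
    return bit

def maybe_swap(table, a, b, bit):
    """Swap rows a and b in place when bit == 1; leave them untouched when bit == 0."""
    if bit == 1:
        table[a], table[b] = table[b], table[a]

rows = ["row0", "row1", "row2", "row3"]
maybe_swap(rows, 0, 3, joint_random_bit(3))  # first and fourth rows, as in the text
```

Because the XOR of the shares is uniform as long as at least one share is uniform, no single party controls whether the swap happens.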
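In one implementation, the intermediate party may store a preset shuffle round threshold. In the first shuffle round, the ciphertext computing system may take the ciphertext table directly as the target sub-table, or it may determine at least one target sub-table from the ciphertext table, where each target sub-table contains at least two consecutive rows.
Subsequently, for each target sub-table, the intermediate party determines at least one pair of swap rows from the target sub-table and, for each pair, performs a position swap with a certain probability to obtain the shuffled sub-table corresponding to the target sub-table. The intermediate party then collects the shuffled sub-tables of all target sub-tables of the current shuffle round and determines the output tables of the round from them.
If the number of shuffle rounds executed has not reached the shuffle round threshold, then in one case the output table of the current round (the ciphertext table with some rows shuffled) is taken directly as a target sub-table and steps 11-13 are repeated until the number of rounds reaches the threshold and the shuffled table is obtained. In another case, after the shuffled sub-table of a target sub-table is determined, the table obtained after shuffling is split based on one of the determined pairs of swap rows into several sub-tables, which are taken as the output tables of the current round; each output table is a target sub-table of the next shuffle round, and the flow continues until the number of rounds reaches the threshold and the shuffled table is obtained.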
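In another implementation, any current shuffle round, as shown in FIG. 3, may include the following steps:
S310: For each target sub-table of the ciphertext table, determine a loop threshold based on the number of rows of the target sub-table. In one case the loop threshold has the form 2^m, where m and the number of rows n of the target sub-table satisfy: 2^m < n <= 2^(m+1). The intermediate party may treat each target sub-table as a separate table to be shuffled and assign new row labels to its rows, for example by setting the row label of the first row of the target sub-table to 0 (or 1) and increasing the label by 1 for each following row.
S320: Select a pair of swap rows from the target sub-table based at least on the current loop count.
S330: For the pair of swap rows, perform a position swap with a certain probability.
S340: Update the loop count and, based on the updated loop count, judge whether the loop end condition related to the loop threshold is reached. If it is not reached, return to S320;
S350: If the loop end condition related to the loop threshold is reached, obtain the shuffled sub-table corresponding to the target sub-table, and determine the output tables of the current shuffle round based on the shuffled sub-tables of all target sub-tables, the output tables being used to form the shuffled table.
Judging whether the loop end condition is reached may work as follows: the loop count starts at 0 and each update adds one to the current loop count; the end condition is reached when the incremented loop count reaches the loop threshold. Alternatively, the loop count starts at the loop threshold and each update subtracts one; the end condition is reached when the decremented loop count is 0.
In one implementation, the intermediate party may select at least one pair of swap rows from the target sub-table based on the current loop count and the loop threshold. Specifically, it selects from the target sub-table the row whose row label equals the current loop count and the row whose row label equals the sum of the current loop count and the loop threshold, as a pair of swap rows.
In one implementation, S350 may be configured to split the shuffled sub-table using the loop threshold to determine the output tables of the current shuffle round. The row of the shuffled sub-table whose row label is the loop threshold may be taken as the dividing row: the rows up to and including the dividing row form one new table, and the rows after the dividing row form another new table; these new tables are the output tables of the current shuffle round.
In the current shuffle round, S310-S350 are executed for all target sub-tables to obtain all output tables of the round. It can be understood that if, after the shuffled sub-tables of the target sub-tables are split, every new table has fewer than 2 rows, that is, every output table of the current round has fewer than 2 rows, the shuffled table is generated directly from the output tables of the current round.
In one implementation, during the shuffle the intermediate party may record the positional relationship between the ciphertext table and each target sub-table, and between the shuffled sub-table of each target sub-table and the new sub-tables obtained by splitting it. From these relationships the intermediate party can determine the positional relationship among the output tables of each shuffle round. Once every output table of a round has fewer than 2 rows, the shuffled table is generated directly from the positional relationship among the output tables of that round.
To speed up shuffling, the shuffling steps for the target sub-tables within any shuffle round may be executed in parallel.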
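For example, the ciphertext table (table 1) contains 4 rows with row labels 0-3, denoted X[0]-X[3], where X[0] stores the attribute value ciphertexts of all attribute items of target object 1, X[1] those of target object 2, X[2] those of target object 3, and X[3] those of target object 4.
In the first shuffle round, table 1 is the target sub-table. Based on the number of rows of table 1, the loop threshold is determined to be 2 (2^1 < 4 <= 2^2). From table 1, the row whose row label equals the current loop count 0 (X[0]) and the row whose row label equals the sum of the current loop count 0 and the loop threshold 2 (X[2]) are selected as a pair of swap rows. A position swap is performed on X[0] and X[2] with a certain probability; whether the two rows are actually swapped is random. If X[0] and X[2] are swapped, the attribute value ciphertexts of all attribute items of target object 3 are stored into X[0], and those of target object 1 are stored into X[2].
Next, the loop count is incremented by one. From table 1, the row whose row label equals the current loop count 1 (X[1]) and the row whose row label equals the sum of the current loop count 1 and the loop threshold 2 (X[3]) are selected as a pair of swap rows, and a position swap is performed on X[1] and X[3] with a certain probability; whether the swap actually takes place is random. The loop count is incremented by one (to 2), which reaches the loop end condition (2 = 2), and the shuffled sub-table of table 1 (table 2) is obtained. In table 2 it is uncertain which target object's attribute values X[0], X[1], X[2] and X[3] store.
Table 2 is split based on the loop threshold 2 into two new sub-tables, table 3 (containing X[0] and X[1]) and table 4 (containing X[2] and X[3]), which are the output tables of the first shuffle round. The rows of table 4 (and likewise table 3) are given new row labels X[0]' and X[1]', where X[0]' corresponds to X[2] and X[1]' corresponds to X[3].
In the second shuffle round, table 3 and table 4 are each taken as a target sub-table. For table 3 (table 4), based on its 2 rows, the loop threshold is determined to be 1 (m = 0, since 2^0 < 2 <= 2^1). From table 3 (table 4), the row whose row label equals the current loop count 0, X[0] (X[0]'), and the row whose row label equals the sum of the current loop count and the loop threshold, X[1] (X[1]'), are selected as a pair of swap rows. A position swap is performed on X[0] and X[1] (X[0]' and X[1]') with a certain probability to obtain the shuffled sub-table of table 3, namely table 5 (the shuffled sub-table of table 4, namely table 6). Table 5 (table 6) is split based on the loop threshold into two new sub-tables, table 7 and table 8 (table 9 and table 10), which are the output tables of the second shuffle round. Every output table of the second round has 1 row, which is fewer than 2, so the shuffled table is generated from the output tables of the second shuffle round.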
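The FIG. 3 flow can be sketched in plaintext as follows. Several points are assumptions of this sketch rather than statements of the source: the swap probability is taken as 1/2, a sub-table is split into its first 2^m rows and the remaining rows (as in the four-row worked example), and the function names are hypothetical. The sketch also does not claim that the resulting permutation is uniformly distributed.

```python
import random

def loop_threshold(n):
    """Return 2**m with 2**m < n <= 2**(m+1), for n >= 2."""
    t = 1
    while t * 2 < n:
        t *= 2
    return t

def recursive_shuffle(rows, rng=None):
    """Repeat shuffle rounds until every output table has fewer than 2 rows."""
    rng = rng or random.Random()
    parts = [list(rows)]                     # round 1: the whole table is the target sub-table
    while any(len(p) >= 2 for p in parts):
        next_parts = []
        for p in parts:
            if len(p) < 2:
                next_parts.append(p)         # single rows are final
                continue
            t = loop_threshold(len(p))
            for count in range(t):           # loop count runs from 0 to t - 1
                partner = count + t
                if partner < len(p) and rng.random() < 0.5:
                    p[count], p[partner] = p[partner], p[count]
            next_parts.extend([p[:t], p[t:]])  # split at the loop threshold
        parts = next_parts
    return [row for p in parts for row in p]

shuffled = recursive_shuffle(["X0", "X1", "X2", "X3"])
```

For the four-row example this performs the pairs (X[0], X[2]) and (X[1], X[3]) in the first round and one pair inside each two-row half in the second round.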
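In another implementation, any current shuffle round, as shown in FIG. 4, may include the following steps S410-S450:
S410: Determine a loop threshold based on the number of rows of the ciphertext table. In one case the loop threshold has the form m, where m and the number of rows n of the target sub-table satisfy: 2^m < n <= 2^(m+1).
S420: From the ciphertext table, select two rows whose row labels are identical in every bit except the i-th bit as a pair of swap rows, so as to obtain at least one pair of swap rows, where i equals the current loop count.
S430: For each pair of swap rows, perform a position swap with a certain probability, where the certain probability is determined by a random number generated for that pair of swap rows.
S440: Update the loop count and, based on the updated loop count, judge whether the loop end condition related to the loop threshold is reached. If it is not reached, return to S420;
S450: If it is reached, obtain the output table of the current shuffle round and generate the shuffled table from it.
In this embodiment the row label of each row of the ciphertext table is expressed in binary. The number of binary digits can be set as needed, for example 4 or 8. With 4 digits, for example, the first row X[0] has row label 0000, the second row X[1] has row label 0001, and so on.
After the loop threshold is determined, the loop count i may be set to range over the integers in [0, m-1]. i may start from 0, in which case each update adds one to the current loop count and the loop end condition is that the updated loop count reaches the loop threshold. Alternatively, i may start from m-1, in which case each update subtracts one from the current loop count and the loop end condition is that the updated loop count is 0.
In one implementation, to speed up shuffling, the position swaps of the pairs of swap rows within any shuffle round may be executed in parallel.
As shown in FIG. 5, take a ciphertext table (target sub-table) with 4 rows as an example. The row labels of the 4 rows are: 0000 for the first row X[0], 0001 for the second row X[1], 0010 for the third row X[2], and 0011 for the fourth row X[3]. The loop threshold is 2, so i can take the values 0 and 1. When i equals 0, two rows whose row labels are identical in every bit except bit 0 are selected from the ciphertext table as a pair of swap rows, giving: the first row X[0] and the second row X[1] as one pair, and the third row X[2] and the fourth row X[3] as another pair.
A position swap is performed with a certain probability on the first row X[0] and the second row X[1], and on the third row X[2] and the fourth row X[3], to obtain an intermediate table in which the positions of the rows have been rearranged.
i is updated to 1. From the intermediate table (the result of the previous iteration), two rows whose row labels are identical in every bit except bit 1 are selected as a pair of swap rows, giving: the first row X[0] and the third row X[2] as one pair, and the second row X[1] and the fourth row X[3] as another pair. A position swap is performed with a certain probability on the first row X[0] and the third row X[2], and on the second row X[1] and the fourth row X[3].
i is updated to 2, which reaches the loop end condition, and the shuffled table is obtained.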
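The FIG. 4 / FIG. 5 pairing can be sketched in plaintext as below. The sketch assumes the table has a power-of-two number of rows, uses log2(n) iterations (which matches the four-row example, where i takes the values 0 and 1), and assumes a swap probability of 1/2; `pairs_for_round` and `bitwise_shuffle` are hypothetical names.

```python
import random

def pairs_for_round(n, i):
    """Pairs of row labels below n that differ only in bit i (bit 0 = least significant)."""
    return [(idx, idx | (1 << i)) for idx in range(n)
            if idx & (1 << i) == 0 and idx | (1 << i) < n]

def bitwise_shuffle(rows, rng=None):
    """One shuffle round: for each bit position, conditionally swap every bit-i pair."""
    rng = rng or random.Random()
    n = len(rows)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch expects a power-of-two number of rows")
    out = list(rows)
    for i in range(n.bit_length() - 1):   # log2(n) iterations
        for a, b in pairs_for_round(n, i):
            if rng.random() < 0.5:        # stand-in for the jointly generated random bit
                out[a], out[b] = out[b], out[a]
    return out

mixed = bitwise_shuffle(["X0", "X1", "X2", "X3"])
```

The pairs within one iteration touch disjoint rows, which is why the text notes that the swaps of a round can run in parallel.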
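This embodiment can shuffle the ciphertext table with a single shuffle round and scramble the row order of the table, so that the positions of rows cannot be tracked from one sort to the next. This avoids exposing the order relationship of rows across sorted columns (for example, that the target object ranked a-th in the sorted column of a first attribute item is the target object ranked b-th in the sorted column of a second attribute item).
In one implementation of this specification, to shuffle the ciphertext table more effectively, S220 may include the following steps 21-22:
Step 21: Judge whether the total number of rows of the ciphertext table satisfies a preset shuffle condition, where the preset shuffle condition includes: the total number of rows equals an integer power of a preset value. The preset value can be set as needed; in one case it is 2.
Step 22: If the preset shuffle condition is satisfied, shuffle the ciphertext table row by row. When the total number of rows satisfies the preset shuffle condition, each row of the ciphertext table is, to a certain extent, equally likely to be selected as one row of a pair of swap rows, which avoids some rows having only a small probability of being swapped and the shuffle being weakened as a result. For example, if the ciphertext table has 5 rows, the shuffle round shown in FIG. 3 may leave the 5th row out of the shuffle, which creates a risk that some private data of the target object stored in the 5th row is leaked. To reduce this risk and improve the shuffle, when the intermediate party judges that the ciphertext table does not satisfy the preset shuffle condition, S220 may further include steps 23-24: step 23, pad the ciphertext table with specific rows to obtain a padded table whose total number of rows satisfies the preset shuffle condition; step 24, shuffle the padded table row by row to obtain the shuffled table. The row labels of the specific rows differ from the row labels of the rows of the ciphertext table, and a specific row may contain ciphertexts of a designated kind of data.
Subsequently, after the padded table has been shuffled and the corresponding shuffled table obtained, in response to a query instruction that queries sorting-related data for a target attribute item, the attribute value ciphertexts of the target attribute item in the shuffled table are sorted to obtain a sorted table, and the padded specific rows are removed from the sorted table to obtain the target sorted table.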
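A minimal plaintext sketch of the padding and removal steps follows. The sentinel object `PAD` stands in for the "specific row", whose actual contents the source leaves open, and the helper names are hypothetical.

```python
PAD = object()  # hypothetical marker for a padded "specific row"

def pad_to_power(rows, base=2):
    """Append PAD rows until the row count is an integer power of `base`."""
    target = 1
    while target < len(rows):
        target *= base
    return list(rows) + [PAD] * (target - len(rows))

def strip_padding(rows):
    """Remove the padded rows, for example after the table has been sorted."""
    return [row for row in rows if row is not PAD]

padded = pad_to_power(["r0", "r1", "r2", "r3", "r4"])  # 5 rows are padded to 8
restored = strip_padding(padded)
```

In a ciphertext setting the padded rows would have to be recognisable only to the protocol, not to an observer of the table; how that is achieved is not covered by this sketch.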
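In another implementation, when the ciphertext table is judged not to satisfy the preset shuffle condition, it is also possible, after the table has been shuffled row by row, to identify the rows that took part in fewer shuffle steps (for example, rows whose row labels are greater than 2^m) and shuffle those rows in a random manner, for example by randomly shuffling them among themselves or by randomly inserting them among other rows (for example, among the rows whose row labels are not greater than 2^m), so as to better scramble the positions of those rows relative to their neighbours.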
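In the embodiments of this specification, the intermediate party may sort the attribute value ciphertexts of the target attribute item in the shuffled table using various sorting algorithms, for example merge sort or heap sort. The embodiments of this specification also provide a sorting method. Specifically, S230 may include: iteratively executing multiple sorting passes, where any sorting pass includes the following steps 31-33:
Step 31: For each table part of the shuffled table that is currently to be sorted, determine a reference ciphertext from the attribute value ciphertexts of the target attribute item contained in the table part.
Step 32: Based on the reference ciphertext, group the other attribute value ciphertexts contained in the table part to obtain a first row set greater than the reference ciphertext and a second row set less than the reference ciphertext.
Step 33: Take those first row sets and second row sets of the table parts that contain more than 1 row as the table parts to be sorted in the next sorting pass, until the target sorted table is obtained. In one case the target sorted table is obtained when none of the first row sets and second row sets of the table parts to be sorted contains more than 1 row.
In one implementation, after obtaining the shuffled table, the intermediate party iteratively executes multiple sorting passes on it in response to the query instruction. In the first sorting pass the shuffled table is the table part currently to be sorted, and one attribute value ciphertext of the target attribute item contained in it is chosen as the reference ciphertext. The reference ciphertext is compared with the other attribute value ciphertexts of the target attribute item to obtain comparison results, and the other attribute value ciphertexts of the target attribute item in the table part are grouped based on the comparison results to obtain a first row set greater than the reference ciphertext and a second row set less than the reference ciphertext. The first row set contains the rows of the table part whose attribute value ciphertexts of the target attribute item are greater than the reference ciphertext; the second row set contains the rows whose attribute value ciphertexts of the target attribute item are less than the reference ciphertext.
Those first and second row sets of the table part that contain more than 1 row are taken as the table parts to be sorted in the next sorting pass, and steps 31-33 are executed again, until none of the first and second row sets of the table parts to be sorted contains more than 1 row and the target sorted table is obtained.
In one case the grouping also yields a third row set equal to the reference ciphertext. The third row set contains the rows of the table part whose attribute value ciphertexts of the target attribute item are equal to the reference ciphertext. The rows of the third row set need no further sorting on the target attribute item.
In one implementation of this specification, any sorting pass may further include arranging the first row set, the second row set and the third row set of each table part so that the first row set is placed below the third row set and the second row set is placed above the third row set. Arranged in this way, the rows of the shuffled table end up ordered by the attribute value ciphertexts of the target attribute item from smallest to largest. In another implementation, any sorting pass may further include arranging the three row sets of each table part so that the first row set is placed above the third row set and the second row set is placed below it, so that the rows of the shuffled table end up ordered by the attribute value ciphertexts of the target attribute item from largest to smallest.
To speed up sorting, the sorting of the table parts within any sorting pass may be executed in parallel.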
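Steps 31-33 amount to a quicksort with three-way partitioning. The plaintext sketch below follows the ascending arrangement (second row set on top, third in the middle, first at the bottom). Choosing the first row of a table part as the reference, and the names `partition_sort` and `key`, are assumptions of this sketch; in the described system the comparisons would be performed on ciphertexts.

```python
def partition_sort(rows, key):
    """Iterative three-way partition sort, ascending by key(row)."""
    parts = [(list(rows), True)]  # (table part, still needs sorting?)
    while any(todo and len(part) > 1 for part, todo in parts):
        next_parts = []
        for part, todo in parts:
            if not todo or len(part) <= 1:
                next_parts.append((part, False))
                continue
            reference = key(part[0])                            # step 31: pick a reference value
            smaller = [r for r in part if key(r) < reference]   # second row set
            equal = [r for r in part if key(r) == reference]    # third row set, includes the reference row
            larger = [r for r in part if key(r) > reference]    # first row set
            # step 33: only the smaller/larger sets go to the next pass
            next_parts += [(smaller, True), (equal, False), (larger, True)]
        parts = next_parts
    return [row for part, _ in parts for row in part]

# Rows are (salary, user) pairs; sort by salary.
table = [(3, "a"), (1, "b"), (2, "c"), (3, "d")]
ordered = partition_sort(table, key=lambda row: row[0])
```

Because the input was shuffled beforehand, the sequence of comparison outcomes observed during partitioning is not tied to the original row positions, which is the motivation given in the text for shuffling before every sort.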
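Specific embodiments of this specification have been described above; other embodiments fall within the scope of the appended claims. In some cases the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations multitasking and parallel processing are also possible or may be advantageous.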
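Corresponding to the method embodiments above, an embodiment of this specification provides a multi-party data query apparatus 600 for protecting data privacy. The multiple parties include a plurality of data owners, each holding attribute values of several attribute items of N target objects. The apparatus is deployed at an intermediate party other than the multiple parties. Its schematic block diagram is shown in FIG. 6, and it includes:
an obtaining module 610, configured to obtain attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table, where one row of the ciphertext table corresponds to one target object and one column corresponds to one attribute item;
a shuffle module 620, configured to shuffle the ciphertext table row by row to obtain a shuffled table;
a sorting module 630, configured to sort, in response to a query instruction that queries sorting-related data for a target attribute item, the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorted table;
a query module 640, configured to obtain the sorting-related data as a query result based on the target sorted table.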
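In an optional implementation, the intermediate party is a ciphertext computing system that includes M executing parties;
the obtaining module 610 is specifically configured such that each of the M executing parties obtains one attribute value share sent by each data owner as an attribute value ciphertext, where each attribute value share is determined by the data owner dividing an attribute value it holds into M shares;
the shuffle module 620 is specifically configured such that each executing party, based on the attribute value ciphertexts it holds, shuffles the ciphertext table row by row jointly with the other M-1 executing parties through a secure multi-party computation (MPC) scheme.
In an optional implementation, the sorting-related data is the result of performing a specified calculation on the attribute value ciphertexts at specified X positions in the sorted sequence corresponding to the target attribute item;
the query module 640 is specifically configured to obtain the target attribute value ciphertexts at the specified X positions from the target sorted table,
and to perform the specified calculation on the target attribute value ciphertexts.
In an optional implementation, the shuffle module 620 is specifically configured to shuffle the ciphertext table through at least one shuffle round, where the shuffle module 620 executes any current shuffle round through the following units:
a first determining unit (not shown in the figure), configured to determine, for each target sub-table of the ciphertext table, at least one pair of swap rows from the target sub-table;
a position swap unit (not shown in the figure), configured to perform, for each pair of swap rows, a position swap with a certain probability to obtain the shuffled sub-table corresponding to the target sub-table;
a second determining unit (not shown in the figure), configured to determine the output tables of the current shuffle round based on the shuffled sub-table, the output tables being used to form the shuffled table.
In an optional implementation, the certain probability is determined by a random number generated for the pair of swap rows.
In an optional implementation, when the current shuffle round is the first shuffle round, the target sub-table is the ciphertext table; when the current shuffle round is not the first shuffle round, the target sub-table is a table with at least 2 rows among the output tables of the previous shuffle round.
In an optional implementation, the first determining unit is specifically configured to determine a loop threshold based on the number of rows of the target sub-table,
and to iteratively execute a loop multiple times, where each iteration includes selecting at least one pair of swap rows from the target sub-table based at least on the current loop count, until a loop end condition related to the loop threshold is reached.
In an optional implementation, the first determining unit is specifically configured to select, from the target sub-table, the row whose row label equals the current loop count and the row whose row label equals the sum of the current loop count and the loop threshold, as a pair of swap rows.
In an optional implementation, the second determining unit is specifically configured to split the shuffled sub-table using the loop threshold to determine the output tables of the current shuffle round.
In an optional implementation, the loop threshold has the form 2^m, where m and the number of rows n of the target sub-table satisfy: 2^m < n <= 2^(m+1).
In an optional implementation, the row label of each row in the ciphertext table is expressed in binary;
the first determining unit is specifically configured to select, from the target sub-table, two rows whose row labels are identical in every bit except the i-th bit, as a pair of swap rows, where i equals the current loop count and the target sub-table is the ciphertext table itself.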
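In an optional implementation, the sorting module 630 is specifically configured to iteratively execute multiple sorting passes, where any sorting pass includes:
for each table part of the shuffled table that is currently to be sorted, determining a reference ciphertext from the attribute value ciphertexts of the target attribute item contained in the table part;
grouping, based on the reference ciphertext, the other attribute value ciphertexts of the target attribute item contained in the table part, to obtain a first row set greater than the reference ciphertext and a second row set less than the reference ciphertext;
taking those first row sets and second row sets of the table parts that contain more than 1 row as the table parts to be sorted in the next sorting pass, until the target sorted table is obtained.
In an optional implementation, the grouping also yields a third row set equal to the reference ciphertext.
In an optional implementation, the sorting module 630 is further specifically configured to arrange the first row set, the second row set and the third row set of each table part so that the first row set is placed below the third row set and the second row set is placed above the third row set.
In an optional implementation, the shuffle module 620 includes:
a judging unit (not shown in the figure), configured to judge whether the total number of rows of the ciphertext table satisfies a preset shuffle condition, where the preset shuffle condition includes: the total number of rows equals an integer power of a preset value;
a padding unit (not shown in the figure), configured to pad the ciphertext table with specific rows if the preset shuffle condition is not satisfied, to obtain a padded table whose total number of rows satisfies the preset shuffle condition;
a shuffle unit (not shown in the figure), configured to shuffle the padded table row by row to obtain the shuffled table.
In an optional implementation, the sorting module 630 is specifically configured to:
sort the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a sorted table;
remove the padded specific rows from the sorted table to obtain the target sorted table.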
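The apparatus embodiments above correspond to the method embodiments; for details, refer to the description of the method embodiments, which is not repeated here. The apparatus embodiments are derived from the corresponding method embodiments and have the same technical effects.
An embodiment of this specification further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to perform the multi-party data query method for protecting data privacy provided in this specification.
An embodiment of this specification further provides a computing device, including a memory and a processor, where executable code is stored in the memory and the processor implements the multi-party data query method for protecting data privacy provided in this specification when executing the executable code.
The embodiments in this specification are described in a progressive manner; for the parts that are the same or similar, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the storage medium and computing device embodiments are basically similar to the method embodiments, so they are described briefly; for relevant parts, refer to the description of the method embodiments.
A person skilled in the art should be aware that, in one or more of the examples above, the functions described in the embodiments of the present invention may be implemented in hardware, software, firmware or any combination thereof. When implemented in software, the functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
The specific implementations described above further explain the objectives, technical solutions and beneficial effects of the embodiments of the present invention. It should be understood that the foregoing is merely specific implementations of the embodiments of the present invention and is not intended to limit the protection scope of the present invention; any modification, equivalent replacement or improvement made on the basis of the technical solutions of the present invention shall fall within the protection scope of the present invention.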

Claims (18)

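  1. A multi-party data query method for protecting data privacy, wherein the multiple parties comprise a plurality of data owners, each holding attribute values of several attribute items of N target objects, the method is performed by an intermediate party other than the multiple parties, and the method comprises:
    obtaining attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table, wherein one row of the ciphertext table corresponds to one target object and one column corresponds to one attribute item;
    shuffling the ciphertext table row by row to obtain a shuffled table;
    in response to a query instruction that queries sorting-related data for a target attribute item, sorting the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorted table, and obtaining the sorting-related data as a query result based on the target sorted table.
  2. The method according to claim 1, wherein the intermediate party is a ciphertext computing system that comprises M executing parties;
    obtaining the attribute value ciphertexts of the N target objects sent by each data owner comprises:
    each of the M executing parties obtaining one attribute value share sent by each data owner as an attribute value ciphertext, wherein each attribute value share is determined by the data owner dividing an attribute value it holds into M shares;
    shuffling the ciphertext table row by row comprises:
    each executing party, based on the attribute value ciphertexts it holds, shuffling the ciphertext table row by row jointly with the other M-1 executing parties through a secure multi-party computation (MPC) scheme.
  3. The method according to claim 1, wherein the sorting-related data is the result of performing a specified calculation on the attribute value ciphertexts at specified X positions in the sorted sequence corresponding to the target attribute item;
    obtaining the sorting-related data as the query result comprises:
    obtaining the target attribute value ciphertexts at the specified X positions from the target sorted table;
    performing the specified calculation on the target attribute value ciphertexts.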
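  4. The method according to claim 1, wherein shuffling the ciphertext table row by row comprises:
    shuffling the ciphertext table through at least one shuffle round, wherein any current shuffle round comprises:
    for each target sub-table of the ciphertext table, determining at least one pair of swap rows from the target sub-table;
    for each pair of swap rows, performing a position swap with a certain probability to obtain the shuffled sub-table corresponding to the target sub-table, and determining the output tables of the current shuffle round based on the shuffled sub-table, the output tables being used to form the shuffled table.
  5. The method according to claim 4, wherein the certain probability is determined by a random number generated for the pair of swap rows.
  6. The method according to claim 4, wherein when the current shuffle round is the first shuffle round, the target sub-table is the ciphertext table; when the current shuffle round is not the first shuffle round, the target sub-table is a table with at least 2 rows among the output tables of the previous shuffle round.
  7. The method according to claim 4, wherein determining at least one pair of swap rows from the target sub-table comprises:
    determining a loop threshold based on the number of rows of the target sub-table;
    iteratively executing a loop multiple times, wherein each iteration comprises selecting at least one pair of swap rows from the target sub-table based at least on the current loop count, until a loop end condition related to the loop threshold is reached.
  8. The method according to claim 7, wherein selecting at least one pair of swap rows from the target sub-table based at least on the current loop count comprises:
    selecting, from the target sub-table, the row whose row label equals the current loop count and the row whose row label equals the sum of the current loop count and the loop threshold, as a pair of swap rows.
  9. The method according to claim 7, wherein determining the output tables of the current shuffle round based on the shuffled sub-table comprises:
    splitting the shuffled sub-table using the loop threshold to determine the output tables of the current shuffle round.
  10. The method according to claim 7, wherein the loop threshold has the form 2^m, where m and the number of rows n of the target sub-table satisfy: 2^m < n <= 2^(m+1).
  11. The method according to claim 7, wherein the row label of each row in the ciphertext table is expressed in binary;
    selecting at least one pair of swap rows from the target sub-table based at least on the current loop count comprises:
    selecting, from the target sub-table, two rows whose row labels are identical in every bit except the i-th bit, as a pair of swap rows, wherein i equals the current loop count and the target sub-table is the ciphertext table itself.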
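  12. The method according to any one of claims 1-11, wherein sorting the attribute value ciphertexts corresponding to the target attribute item in the shuffled table comprises:
    iteratively executing multiple sorting passes, wherein any sorting pass comprises:
    for each table part of the shuffled table that is currently to be sorted, determining a reference ciphertext from the attribute value ciphertexts of the target attribute item contained in the table part;
    grouping, based on the reference ciphertext, the other attribute value ciphertexts of the target attribute item contained in the table part, to obtain a first row set greater than the reference ciphertext and a second row set less than the reference ciphertext;
    taking those first row sets and second row sets of the table parts that contain more than 1 row as the table parts to be sorted in the next sorting pass, until the target sorted table is obtained.
  13. The method according to claim 12, wherein the grouping also yields a third row set equal to the reference ciphertext.
  14. The method according to claim 13, wherein any sorting pass further comprises arranging the first row set, the second row set and the third row set of each table part so that the first row set is placed below the third row set and the second row set is placed above the third row set.
  15. The method according to any one of claims 1-11, wherein shuffling the ciphertext table row by row comprises:
    judging whether the total number of rows of the ciphertext table satisfies a preset shuffle condition, wherein the preset shuffle condition comprises: the total number of rows equals an integer power of a preset value;
    if the preset shuffle condition is not satisfied, padding the ciphertext table with specific rows to obtain a padded table whose total number of rows satisfies the preset shuffle condition;
    shuffling the padded table row by row to obtain the shuffled table.
  16. The method according to claim 15, wherein obtaining the target sorted table comprises:
    sorting the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a sorted table;
    removing the padded specific rows from the sorted table to obtain the target sorted table.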
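  17. A multi-party data query apparatus for protecting data privacy, wherein the multiple parties comprise a plurality of data owners, each holding attribute values of several attribute items of N target objects, the apparatus is deployed at an intermediate party other than the multiple parties, and the apparatus comprises:
    an obtaining module, configured to obtain attribute value ciphertexts of the N target objects sent by each data owner to obtain a ciphertext table, wherein one row of the ciphertext table corresponds to one target object and one column corresponds to one attribute item;
    a shuffle module, configured to shuffle the ciphertext table row by row to obtain a shuffled table;
    a sorting module, configured to sort, in response to a query instruction that queries sorting-related data for a target attribute item, the attribute value ciphertexts corresponding to the target attribute item in the shuffled table to obtain a target sorted table;
    a query module, configured to obtain the sorting-related data as a query result based on the target sorted table.
  18. A computing device, comprising a memory and a processor, wherein executable code is stored in the memory, and the processor implements the method according to any one of claims 1-16 when executing the executable code.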
PCT/CN2022/125462 2021-12-28 2022-10-14 保护数据隐私的多方数据查询方法及装置 WO2023124400A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22913664.3A EP4345670A1 (en) 2021-12-28 2022-10-14 Multi-party data query method and apparatus for protecting data privacy
US18/400,427 US20240135026A1 (en) 2021-12-28 2023-12-29 Multi-party data query methods and apparatuses for data privacy protection

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111621978.6A CN114003962B (zh) 2021-12-28 2021-12-28 Multi-party data query method and apparatus for protecting data privacy
CN202111621978.6 2021-12-28

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/400,427 Continuation US20240135026A1 (en) 2021-12-28 2023-12-29 Multi-party data query methods and apparatuses for data privacy protection

Publications (1)

Publication Number Publication Date
WO2023124400A1 (zh) 2023-07-06

Family

ID=79932060

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125462 WO2023124400A1 (zh) 2021-12-28 2022-10-14 Multi-party data query method and apparatus for protecting data privacy

Country Status (4)

Country Link
US (1) US20240135026A1 (zh)
EP (1) EP4345670A1 (zh)
CN (1) CN114003962B (zh)
WO (1) WO2023124400A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
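CN114003962B (zh) * 2021-12-28 2022-04-12 支付宝(杭州)信息技术有限公司 Multi-party data query method and apparatus for protecting data privacy
CN114282256B (zh) * 2022-03-04 2022-06-07 支付宝(杭州)信息技术有限公司 Secret-sharing-based sort shuffling method and recovery method
CN114338017B (zh) * 2022-03-04 2022-06-10 支付宝(杭州)信息技术有限公司 Secret-sharing-based sorting method and system
CN114726514B (zh) * 2022-03-21 2024-03-22 支付宝(杭州)信息技术有限公司 Data processing method and apparatus
CN115587382B (zh) * 2022-12-14 2023-04-11 富算科技(上海)有限公司 Fully encrypted-state data processing method, apparatus, device, and medium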

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106802926A (zh) * 2016-12-21 2017-06-06 上海数据交易中心有限公司 Multi-party data query system and method
US20200177366A1 (en) * 2019-06-18 2020-06-04 Alibaba Group Holding Limited Homomorphic data encryption method and apparatus for implementing privacy protection
CN112000979A (zh) * 2019-06-21 2020-11-27 华控清交信息科技(北京)有限公司 Database operation method, system, and storage medium for private data
CN112613077A (zh) * 2021-01-22 2021-04-06 支付宝(杭州)信息技术有限公司 Privacy-preserving multi-party data processing method, apparatus, and system
CN114003962A (zh) * 2021-12-28 2022-02-01 支付宝(杭州)信息技术有限公司 Multi-party data query method and apparatus for protecting data privacy

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9442694B1 (en) * 2015-11-18 2016-09-13 International Business Machines Corporation Method for storing a dataset
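CN112347501A (zh) * 2019-08-06 2021-02-09 中国移动通信集团广东有限公司 Data processing method, apparatus, device, and storage medium
CN112887297B (zh) * 2021-01-22 2022-09-02 支付宝(杭州)信息技术有限公司 Privacy-preserving difference data determination method, apparatus, device, and system
CN112822201B (zh) * 2021-01-22 2023-03-24 支付宝(杭州)信息技术有限公司 Privacy-preserving difference data determination method, apparatus, device, and system
CN112597524B (zh) * 2021-03-03 2021-05-18 支付宝(杭州)信息技术有限公司 Private set intersection method and apparatus
CN113536379B (zh) * 2021-07-19 2022-11-29 建信金融科技有限责任公司 Private data query method, apparatus, and electronic device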

Also Published As

Publication number Publication date
EP4345670A1 (en) 2024-04-03
CN114003962B (zh) 2022-04-12
CN114003962A (zh) 2022-02-01
US20240135026A1 (en) 2024-04-25

Similar Documents

Publication Publication Date Title
WO2023124400A1 (zh) Multi-party data query method and apparatus for protecting data privacy
Mayberry et al. Efficient private file retrieval by combining ORAM and PIR
CA3057854C (en) Method and system for hierarchical cryptographic key management
US10541983B1 (en) Secure storage and searching of information maintained on search systems
US9037860B1 (en) Average-complexity ideal-security order-preserving encryption
US10374807B2 (en) Storing and retrieving ciphertext in data storage
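CN111539535B (zh) Privacy-protection-based joint feature binning method and apparatus
CN110337649A (zh) Method and system for search-pattern-oblivious dynamic symmetric searchable encryption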
Kuznetsov et al. Performance of hash algorithms on gpus for use in blockchain
US9953184B2 (en) Customized trusted computer for secure data processing and storage
Yuan et al. Enabling encrypted rich queries in distributed key-value stores
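CN108829899B (zh) Data table storage, modification, query, and statistics method
CN111401572B (zh) Privacy-protection-based supervised feature binning method and apparatus
WO2022142366A1 (zh) Method and apparatus for updating a machine learning model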
US20130097430A1 (en) Encrypting data and characterization data that describes valid contents of a column
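CN111539009A (zh) Supervised feature binning method and apparatus for protecting private data
CN1413398A (zh) Data processing method for preventing extraction of data by analysis of unintended side-channel signals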
US9361480B2 (en) Anonymization of streaming data
GB2556902A (en) Method and system for securely storing data using a secret sharing scheme
Ren et al. Privacy-preserving ranked multi-keyword search leveraging polynomial function in cloud computing
WO2023179185A1 (zh) Data processing method and apparatus
Riazi et al. PriSearch: Efficient search on private data
Williams et al. Practical oblivious outsourced storage
CN114374521B (zh) Private data protection method, electronic device, and storage medium
Abdelraheem et al. Executing boolean queries on an encrypted bitmap index

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 22913664; Country of ref document: EP; Kind code of ref document: A1
WWE Wipo information: entry into national phase. Ref document number: 22913664.3; Country of ref document: EP. Ref document number: 2022913664; Country of ref document: EP
ENP Entry into the national phase. Ref document number: 2022913664; Country of ref document: EP; Effective date: 20231228