CN115994161A - Data aggregation system and method based on multiparty security calculation - Google Patents

Publication number
CN115994161A
Authority
CN
China
Prior art keywords
data
aggregation
task
batch
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310278687.4A
Other languages
Chinese (zh)
Other versions
CN115994161B (en
Inventor
陈超超
郑小林
刘鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Jinzhita Technology Co ltd
Original Assignee
Hangzhou Jinzhita Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Jinzhita Technology Co ltd filed Critical Hangzhou Jinzhita Technology Co ltd
Priority to CN202310278687.4A priority Critical patent/CN115994161B/en
Publication of CN115994161A publication Critical patent/CN115994161A/en
Application granted granted Critical
Publication of CN115994161B publication Critical patent/CN115994161B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a data aggregation system and method based on multiparty secure computation. The system comprises a first participant and a second participant. The first participant generates a plurality of batch data from first local data, constructs a target matrix corresponding to each batch data, combines the target matrices according to a service aggregation strategy to obtain a global matrix, and sends the global matrix to the second participant. The second participant splits the global matrix according to the service aggregation strategy to obtain the plurality of target matrices, constructs an aggregation task corresponding to each target matrix, and aggregates second local data by executing the aggregation tasks in turn.

Description

Data aggregation system and method based on multiparty security calculation
Technical Field
The present disclosure relates to the field of multiparty secure computing technologies, and in particular, to a data aggregation system and method based on multiparty secure computing.
Background
With the development of internet technology, online services have become increasingly convenient for users, and each service platform accumulates a large amount of user-related data. In practice, user data is highly valuable to each platform. Against this background, privacy computing platforms have gradually come into view as a way to solve the "data island" problem, enabling cross-domain joint data mining and joint modeling among multiple parties while each enterprise's private data stays within its own domain. In application scenarios where data is vertically distributed, multiple institutional participants must be combined for data mining. In the prior art, joint mining of cross-domain relational data generally uses a group aggregation algorithm to obtain statistical information about the data. However, for data security reasons the two parties cannot share data directly, so the statistical information must be obtained by means such as secure multiparty computation, and solving the problem this way inevitably makes group aggregation time-consuming and requires many rounds of communication. An effective solution to these problems is therefore needed.
Disclosure of Invention
In view of this, the present description embodiments provide a data aggregation system based on multiparty security computing. The present specification also relates to a data aggregation method based on multiparty security computation, a computing device, and a computer-readable storage medium, which solve the technical drawbacks existing in the prior art.
According to a first aspect of embodiments of the present specification, there is provided a data aggregation system based on multiparty security computing, the system comprising a first party and a second party, comprising:
the first participant is used for generating a plurality of batch data according to the first local data and constructing a target matrix corresponding to each batch data; combining the target matrixes corresponding to the batch data according to a service aggregation strategy to obtain a global matrix, and sending the global matrix to the second party;
the second party is used for splitting the global matrix according to the service aggregation strategy to obtain a plurality of target matrixes; and constructing an aggregation task corresponding to each target matrix, and aggregating the second local data by executing each aggregation task in turn.
Optionally, the first participant is further configured to obtain first local data, and segment the first local data; grouping the divided first local data according to a preset batch to obtain the batch data.
Optionally, constructing the target matrix corresponding to any one batch data of the plurality of batch data includes:
the first participant is further configured to sort data units included in the batch data to obtain target batch data when the number of data units included in the batch data is greater than a number threshold; and converting the target batch data into a lower triangular matrix as a target matrix corresponding to the batch data.
Optionally, the first participant is further configured to determine the service aggregation policy and combine the lower triangular matrices corresponding to the batch data according to the service aggregation policy to obtain the global matrix; and to invoke a multiparty secure comparison protocol and transmit the global matrix to the second participant according to the multiparty secure comparison protocol.
Optionally, the executing of any one of the plurality of aggregation tasks includes:
the second participant is further configured to determine batch data to be aggregated corresponding to the target matrix in the second local data by executing an aggregation task; and aggregating the batch data to be aggregated according to matrix elements contained in the target matrix.
Optionally, the second participant is further configured to determine an i-th aggregation task among the plurality of aggregation tasks, and determine, by executing the i-th aggregation task, the batch data to be aggregated corresponding to the i-th aggregation task in the second local data; aggregate the batch data to be aggregated according to the target matrix corresponding to the i-th aggregation task; increment i by 1 and return to the step of determining the i-th aggregation task among the plurality of aggregation tasks; and, once i has been incremented to k, determine at least one data cluster based on the aggregation results of the batch data to be aggregated in each aggregation task, as the aggregation processing of the second local data; wherein i is a positive integer and k is the number of aggregation tasks.
Optionally, in the case that the ith aggregation task and the (i+1) th aggregation task have overlapping task elements, the second participant is further configured to determine a sub-data set corresponding to the overlapping task elements according to an aggregation result of batch data to be aggregated in the ith aggregation task; updating the batch data to be aggregated corresponding to the i+1th aggregation task according to the sub data set, and taking the updated batch data to be aggregated as the batch data to be aggregated corresponding to the i+1th aggregation task.
Optionally, the first participant is further configured to receive a data aggregation request submitted by a service demand party for a service task, read first local data in response to the data aggregation request, and perform a step of generating a plurality of batches of data according to the first local data;
the second party is further configured to generate a target data table according to an aggregation result of the second local data, read target data in the target data table according to the data aggregation request, execute the service task according to the target data, and feed back an execution result of the service task to the service demander.
According to a second aspect of embodiments of the present specification, there is provided a data aggregation method based on multiparty security computing, the method being applied to a data aggregation system comprising a first party and a second party, comprising:
the first participant generates a plurality of batch data according to the first local data and constructs a target matrix corresponding to each batch data; combining the target matrixes corresponding to the batch data according to a service aggregation strategy to obtain a global matrix, and sending the global matrix to the second party;
the second party splits the global matrix according to the service aggregation strategy to obtain a plurality of target matrixes; and constructing an aggregation task corresponding to each target matrix, and aggregating the second local data by executing each aggregation task in turn.
According to a third aspect of embodiments of the present specification, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions that, when executed, implement the steps of a data aggregation method based on multiparty security computing.
According to a fourth aspect of embodiments of the present description, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the multiparty security computation based data aggregation method.
In the data aggregation system based on multiparty security computing provided by this specification, in order to complete data aggregation quickly without revealing private data between the participants, the first participant generates a plurality of batch data from the first local data it holds, so that each batch is composed of several data blocks. On this basis, it constructs a target matrix for each batch, combines the target matrices into a global matrix, and sends the global matrix to the second participant. The information needed for aggregation is thus delivered to the second participant in a single communication, and because this information is obtained by transforming the first local data, the first participant's data is not leaked; the number of communication rounds is effectively reduced and security is preserved. The second participant splits the global matrix into the plurality of target matrices and creates an aggregation task for each target matrix, then aggregates the second local data it holds by executing the aggregation tasks in turn. Batch processing improves the running efficiency of data aggregation, since several data items can be aggregated at a time, which effectively improves aggregation efficiency and facilitates subsequent processing by the participants based on the aggregation result.
Drawings
FIG. 1 is a schematic diagram of a data aggregation system based on multiparty security computing provided in an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a data aggregation system based on multiparty security computing according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a linear aggregation process in a multi-party security computing-based data aggregation system according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a batch aggregation process in a data aggregation system based on multiparty security computing according to an embodiment of the present disclosure;
FIG. 5 is a process flow diagram of a data aggregation system based on multiparty security computing provided in accordance with one embodiment of the present disclosure;
FIG. 6 is a flow chart of a method of aggregating data based on multiparty security computing, provided in an embodiment of the present disclosure;
FIG. 7 is a block diagram of a computing device according to one embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present specification. However, this specification can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; this specification is therefore not limited by the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of this specification. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
In the present specification, a data aggregation system based on multiparty security computation is provided, and the present specification relates to a data aggregation method based on multiparty security computation, a computing device, and a computer-readable storage medium, which are described in detail in the following embodiments one by one.
In practical applications, joint mining of cross-domain relational data generally uses a group aggregation algorithm to obtain statistical information about the data. For example, an insurance company may wish to estimate, based on a patient's disease, the amount it will have to pay before the patient submits a claim. The insurance company's insurance records are stored as relational data in a user insurance data table R1 (person), and the medical records are stored as relational data in the hospital's patient information table R2 (person). The two tables belong to different entities. To calculate the payment amount, the coinsurance in R1 must be grouped by the disease in R2 and summed within each group, and the result is then used to calculate the payment amount.
In this process, data security considerations prevent the two parties from sharing data directly, so the computation must be performed by means such as secure multiparty computation. When such problems are solved with secure multiparty computation, the group aggregation step is the more time-consuming one, so an efficient group aggregation method is needed. The group aggregation problem can be simplified to the form shown in FIG. 3(a), where A holds the ordered data Key and B holds the data Value. The task initiator C wishes to group and sum B's Value according to A's Key and to obtain target data T from the grouping result for subsequent service processing. That is, A holds Key {1,2,2,2,3}, B holds Value {10,20,10,15,15,10}, and Key and Value are aligned; grouping and summation yield the data T {1-10; 2-45; 3-10} for later use. During this process, neither A's nor B's data may be inferred by the other party or by any third party.
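As a point of reference, the grouped sum that the protocol must reproduce can be written in a few lines of plaintext Python. This is only a sketch of the target functionality with both lists visible in one place, which is exactly what the secure protocol must avoid. The function name `group_sum` is hypothetical, and the five-element value list is an illustrative choice that is aligned with the five keys and consistent with the result T quoted above.

```python
def group_sum(keys, values):
    """Plaintext reference: sum the values that share the same key."""
    if len(keys) != len(values):
        raise ValueError("keys and values must be aligned one-to-one")
    totals = {}
    for key, value in zip(keys, values):
        totals[key] = totals.get(key, 0) + value
    return totals


keys = [1, 2, 2, 2, 3]           # ordered Key held by A
values = [10, 20, 10, 15, 10]    # illustrative Value held by B (assumption)
print(group_sum(keys, values))   # {1: 10, 2: 45, 3: 10}
```

Any secure variant should produce the same totals while keeping `keys` and `values` on their respective sides.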
For the above group aggregation problem, the prior art commonly uses a linear aggregation method. As shown in FIG. 3(b), the AggSumIfEqual basic operator is applied from beginning to end to compare adjacent keys for equality, so that the values of the same group are aggregated together; that is, two adjacent keys are selected in turn and compared, and the values are aggregated according to the comparison result. When the number of key-value pairs is n, n-1 comparisons of adjacent positions (AggSumIfEqual) are needed. In the end, the data in Value is aggregated within each group according to the keys, the result is stored in the last position of the group, and the remaining positions are 0. The AggSumIfEqual operator updates the values according to the formula below, where $[\mathrm{key}_1 = \mathrm{key}_2]$ denotes the result of the equality comparison. When $\mathrm{key}_1$ equals $\mathrm{key}_2$, $[\mathrm{key}_1 = \mathrm{key}_2]$ is 1, so $\mathrm{Val}_1$ becomes 0 and $\mathrm{Val}_2$ becomes $\mathrm{Val}_1 + \mathrm{Val}_2$. When $\mathrm{key}_1$ is not equal to $\mathrm{key}_2$, $[\mathrm{key}_1 = \mathrm{key}_2]$ is 0, so $\mathrm{Val}_1$ and $\mathrm{Val}_2$ keep their original values. Both right-hand sides use the values held before the update. The formula is as follows:
$$\mathrm{Val}_1 = (1 - [\mathrm{key}_1 = \mathrm{key}_2]) \cdot \mathrm{Val}_1$$
$$\mathrm{Val}_2 = \mathrm{Val}_2 + [\mathrm{key}_1 = \mathrm{key}_2] \cdot \mathrm{Val}_1$$
It can be seen that although the linear aggregation scan achieves group aggregation, adjacent comparisons are data-dependent, so the scan cannot be parallelized and its actual running efficiency is low. In addition, because each round of comparison uses a different pair of keys, the participants must communicate many times. An effective scheme is therefore needed to solve these problems.
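The linear aggregation scan described above can be sketched in plaintext as follows. This is a simulation under stated assumptions, not the secure protocol: the equality indicator is computed in the clear here, whereas in the setting of this specification it would come from a secure comparison between the parties. The function names `agg_sum_if_equal` and `linear_aggregate` and the sample values are illustrative.

```python
def agg_sum_if_equal(key1, key2, val1, val2):
    """One AggSumIfEqual step; both outputs use the values held before the update."""
    eq = 1 if key1 == key2 else 0      # the indicator [key1 = key2]
    new_val1 = (1 - eq) * val1         # cleared when the keys match
    new_val2 = val2 + eq * val1        # accumulates when the keys match
    return new_val1, new_val2


def linear_aggregate(keys, values):
    """Scan adjacent pairs from beginning to end: n - 1 dependent steps."""
    vals = list(values)
    for i in range(len(keys) - 1):
        # Step i needs the value written by step i - 1, so the loop is sequential.
        vals[i], vals[i + 1] = agg_sum_if_equal(
            keys[i], keys[i + 1], vals[i], vals[i + 1]
        )
    return vals


print(linear_aggregate([1, 2, 2, 2, 3], [10, 20, 10, 15, 10]))
# [10, 0, 0, 45, 10]
```

Each group total ends up in the last position of its group and the other positions are 0. The loop body reads a value written by the previous iteration, which is the data dependence that prevents parallelization.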
Referring to the schematic diagram shown in FIG. 1, in the data aggregation system based on multiparty security computing provided by this specification, in order to complete data aggregation quickly without revealing private data between the participants, the first participant generates a plurality of batch data from the first local data it holds, so that each batch is composed of several data blocks. On this basis, it constructs a target matrix for each batch, combines the target matrices into a global matrix, and sends the global matrix to the second participant. The information needed for aggregation is thus delivered to the second participant in a single communication, and because this information is obtained by transforming the first local data, the first participant's data is not leaked; the number of communication rounds is effectively reduced and security is preserved. The second participant splits the global matrix into the plurality of target matrices and creates an aggregation task for each target matrix, then aggregates the second local data it holds by executing the aggregation tasks in turn. Batch processing improves the running efficiency of data aggregation, since several data items can be aggregated at a time, which effectively improves aggregation efficiency and facilitates subsequent processing by the participants based on the aggregation result.
FIG. 2 is a schematic diagram of a data aggregation system based on multiparty security computing according to one embodiment of the present disclosure. The data aggregation system 200 based on multiparty security computing includes a first participant 210 and a second participant 220;
the first participant 210 is configured to generate a plurality of batch data according to the first local data, and construct a target matrix corresponding to each batch data; combining the target matrixes corresponding to the batch data according to a service aggregation strategy to obtain a global matrix, and sending the global matrix to the second party;
the second participant 220 is configured to split the global matrix according to the service aggregation policy to obtain a plurality of target matrices; and constructing an aggregation task corresponding to each target matrix, and aggregating the second local data by executing each aggregation task in turn.
The data aggregation system based on multiparty security calculation provided in this embodiment may be applied to aggregation processing of text data, aggregation processing of image data, aggregation processing of medical data, aggregation processing of user data, etc., and in practical application, the data type may be determined according to the data type held by the participant, which is not limited in this embodiment. The system can be completed by two or more participants, and the data aggregation processing operation between any two parties can be referred to the same or corresponding description in this embodiment, and the processing of more than two parties is not repeated.
In this embodiment, taking the case that the first participant holds the user insurance data and the second participant holds the patient medical data as an example, the multiparty security calculation-based data aggregation system is described, and descriptions of other scenes can be referred to the same or similar descriptions in this embodiment.
Specifically, the first party refers to a party holding first local data, and the data held by the party can group and aggregate the data held by the second party, that is, the first local data can be understood as a Key in a relational data type and is ordered data; for example, in the scenario of insurance payment amount calculation, the first local data may be insurance data of an insurance purchased by the user; correspondingly, the second party specifically refers to a party holding the second local data, and the data held by the party can be grouped and aggregated by the data held by the first party, that is, the second local data can be understood as Value in the relationship data type, for example, in the scene of insurance payment amount calculation, and the second local data can be patient medical data held by a hospital.
Correspondingly, batch data refers to a data set obtained by dividing the first local data into batches. For example, if the first local data is divided into n data blocks and k data blocks are aggregated per batch, grouping the n data blocks by k yields n/k batch data. The target matrix refers to the matrix obtained by vectorizing each batch of data. The service aggregation policy refers to the policy for merging target matrices; it is deployed on both the first participant and the second participant, so that after the first participant merges the target matrices, the second participant can recover the plurality of target matrices from the global matrix by the reverse operation. The global matrix is the matrix obtained by combining the plurality of target matrices. The aggregation task refers to the task of aggregating the portion of the second local data that corresponds to a given target matrix; it is executed according to the matrix elements of that target matrix, so each aggregation task aggregates part of the second local data. Once every aggregation task has been executed, aggregation of the second local data held by the second participant is complete, which facilitates subsequent processing based on the aggregation result of the second local data.
It should be noted that, considering that the first participant and the second participant in the system need to cooperate to implement aggregation of local data, and use the aggregated data for specific service processing, it needs to ensure that the first local data and the second local data are cross-domain relational data, so as to implement grouping aggregation of data held by the second participant in combination with data held by the first participant in different cross-domain scenarios, so as to be used for service processing, and avoid data leakage in the process, and ensure data security.
Based on the above, in service processing the data held by each participant is kept confidential from the other participants, and sharing it directly would reveal private data. At the same time, service processing requires data processing to be completed accurately and efficiently on the basis of the data held by every participant, so it can be realized through secure multiparty computation. Conventional secure multiparty computation, however, is inefficient because it requires many rounds of communication, which greatly affects the normal operation of service processing.
In view of this, in the multiparty security calculation-based data aggregation system provided in this embodiment, when the first participant and the second participant need to cooperate to implement data packet aggregation, the first participant may divide the first local data held by the first participant and integrate the first local data by batches, so as to integrate a plurality of data blocks into one batch of data, which is convenient for the second participant to complete aggregation by batches when executing the aggregation task, and effectively improves the aggregation efficiency.
After the first party obtains the data of a plurality of batches, in order to avoid data leakage, a target matrix corresponding to each batch is firstly constructed, then the plurality of target matrices are combined according to a pre-deployed service aggregation strategy, so that a global matrix representing the first local data can be obtained, and then the global matrix is sent to the second party, so that the information for grouping and aggregating the second local data can be transmitted to the second party through one-time communication, the number of communication rounds can be effectively reduced, and bandwidth resources can be saved.
After the second party receives the global matrix sent by the first party, the global matrix is a result of merging matrices corresponding to all batch data and cannot be directly reused, so that in order to obtain a target matrix corresponding to each batch data and for creating an aggregation task, the global matrix can be split according to the service aggregation strategy which is the same as that of the first party, so that the target matrix corresponding to each batch data, namely a plurality of target matrices, can be obtained. On the basis, since each batch of data is composed of a plurality of data blocks, and the corresponding data blocks can be represented by mapping to the target matrix, an aggregation task can be constructed based on each target matrix, so that when the aggregation task is executed, the data in the second local data can be aggregated according to matrix elements of the plurality of data blocks corresponding to each aggregation task, and the aggregation processing of the second local data is completed according to batches, so that after all the aggregation tasks are executed, the subsequent business processing operation is carried out according to the aggregation result of the second local data.
For example, the second participant is an insurance company holding a table R1 of users' insurance data, and the first participant is a hospital holding a table R2 of users' medical data on their diseases. To estimate the claim amount from a patient's disease before the patient submits a claim, the insurance data table R1 of the insurance company must be grouped using the medical data table R2 of the hospital and the data in each group summed, so that an insurance data set corresponding to each disease is obtained from the calculation result and the claim amount is calculated on that basis.
Based on this, firstly, the hospital will divide the medical data contained in the medical data table R2 to obtain n data blocks; secondly, grouping n data blocks according to a batch processing strategy k to obtain n/k=m batch data; then, constructing target matrixes corresponding to the m batches of data respectively to obtain m target matrixes; and merging the m target matrixes according to a preset aggregation strategy to obtain a global matrix Q, and sending the global matrix Q to an insurance company side.
Further, after receiving the global matrix Q, the insurance company side splits the global matrix Q according to a preset aggregation policy, and obtains m target matrices according to the splitting result. At this time, m aggregation tasks are constructed for m matrixes, then each aggregation task is sequentially executed, so that data in an insurance data table R1 held by an insurance company is grouped and aggregated, and because each target matrix corresponds to k data blocks, aggregation processing is finished in batches when data aggregation is carried out according to the aggregation tasks; and obtaining an insurance data set corresponding to the patient after all the aggregation tasks are completed, and then calculating the claim settlement amount according to the insurance data contained in the insurance data set.
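The flow in the example above (build m target matrices, merge them into Q, split Q, run one aggregation task per matrix) can be sketched in plaintext. Everything specific in this sketch is an assumption made for illustration: the target matrix is taken to be a lower triangular 0/1 matrix whose entry (i, j) is 1 when j <= i and the two keys are equal, the merge step simply concatenates rows and records batch sizes, and all names are hypothetical. No cryptography is performed; in the secure setting the matrix entries would not be visible in the clear.

```python
def build_target_matrix(batch_keys):
    """Assumed form: entry (i, j) is 1 when j <= i and key_j == key_i."""
    size = len(batch_keys)
    return [
        [1 if j <= i and batch_keys[j] == batch_keys[i] else 0 for j in range(size)]
        for i in range(size)
    ]


def combine(target_matrices):
    """Hypothetical merge policy: concatenate rows and record each batch size."""
    return {
        "sizes": [len(m) for m in target_matrices],
        "rows": [row for m in target_matrices for row in m],
    }


def split(global_matrix):
    """Reverse of combine: recover the per-batch target matrices."""
    matrices, start = [], 0
    for size in global_matrix["sizes"]:
        matrices.append(global_matrix["rows"][start:start + size])
        start += size
    return matrices


def run_aggregation_task(matrix, batch_values):
    """Matrix-vector product: running sum of each group inside one batch."""
    return [sum(m * v for m, v in zip(row, batch_values)) for row in matrix]


def first_party(keys, k):
    batches = [keys[i:i + k] for i in range(0, len(keys), k)]
    return combine([build_target_matrix(b) for b in batches])


def second_party(global_matrix, values):
    results, start = [], 0
    for matrix in split(global_matrix):
        size = len(matrix)
        results.append(run_aggregation_task(matrix, values[start:start + size]))
        start += size
    return results


global_q = first_party([1, 2, 2, 2, 3], k=4)            # single message to the other side
print(second_party(global_q, [10, 20, 10, 15, 10]))     # [[10, 20, 30, 45], [10]]
```

Within a batch, the last row of each group carries the group total, and the rows of one batch are independent of each other, unlike the steps of the linear scan. A group that straddles a batch boundary is not merged by this sketch; that case would need a carry-over between consecutive tasks.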
In the specific implementation, any party providing data as a Key may be a first party, any party providing data as a Value may be a second party, and the data packet aggregation process may be the same or corresponding to the description of this embodiment, which is not repeated herein.
In the data aggregation system based on multiparty security computing provided by this specification, in order to complete data aggregation quickly without revealing private data between the participants, the first participant generates a plurality of batch data from the first local data it holds, so that each batch is composed of several data blocks. On this basis, it constructs a target matrix for each batch, combines the target matrices into a global matrix, and sends the global matrix to the second participant. The information needed for aggregation is thus delivered to the second participant in a single communication, and because this information is obtained by transforming the first local data, the first participant's data is not leaked; the number of communication rounds is effectively reduced and security is preserved. The second participant splits the global matrix into the plurality of target matrices and creates an aggregation task for each target matrix, then aggregates the second local data it holds by executing the aggregation tasks in turn. Batch processing improves the running efficiency of data aggregation, since several data items can be aggregated at a time, which effectively improves aggregation efficiency and facilitates subsequent processing by the participants based on the aggregation result.
Further, considering that the volume of first local data held by the first participant is large, performing grouped aggregation at the second participant one data block at a time, according to the matrix element corresponding to each data block, may take considerable time. To improve grouped-aggregation efficiency, batch data may therefore be constructed at the first participant so that the second participant completes data aggregation batch by batch. In this embodiment, the first participant is further configured to obtain the first local data and perform segmentation processing on the first local data; and to group the divided first local data according to a preset batch to obtain the batch data.
Specifically, the segmentation processing refers to the operation of dividing the first local data into a plurality of data blocks, and the preset batch refers to the batch size, set according to actual requirements, by which the divided data blocks are grouped.
Based on the above, when the first party needs to cooperate with the second party to perform data segmentation, the first party will acquire the first local data, and segment the first local data to obtain a plurality of data blocks according to the segmentation result, and then group the segmented first local data according to the preset batch, so as to obtain a plurality of batch data, so that the subsequent use is convenient.
In practical applications, the preset batch may be set to a value greater than 2; when the preset batch is less than or equal to 2, subsequent processing can be performed directly by linear aggregation. Setting a value greater than 2 can therefore increase the processing speed compared with linear aggregation.
In summary, batch data is constructed from the divided first local data according to the preset batch, so that the second party can complete grouped data aggregation batch by batch, which effectively improves data aggregation efficiency.
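The segmentation and batching step described above can be sketched as follows. This is a minimal plain-text illustration, not the patented implementation; the function name `make_batches` and the sample key list are assumptions introduced only for this example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def make_batches(blocks: Sequence[T], batch_size: int) -> List[List[T]]:
    """Group already-divided data blocks into consecutive batches.

    `batch_size` plays the role of the preset batch k (hypothetical name).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    return [list(blocks[s:s + batch_size])
            for s in range(0, len(blocks), batch_size)]


# Hypothetical first local data: n = 9 ordered key blocks, preset batch k = 3.
keys = [1, 1, 2, 2, 2, 3, 4, 4, 5]
batches = make_batches(keys, 3)  # m = n / k = 3 batches
```

With n = 9 blocks and k = 3, the sketch yields m = 3 batches, matching the n/k = m relation used in the example of this specification.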
Further, when constructing the target matrix for each batch of data, considering that the target matrix is the basis on which the second party constructs and executes the aggregation tasks, a lower triangular matrix may be adopted for the convenience of the second party. In this embodiment, constructing the target matrix corresponding to any one batch data of the plurality of batch data includes: the first participant is further configured to sort the data units contained in the batch data to obtain target batch data when the number of data units contained in the batch data is greater than a number threshold; and to convert the target batch data into a lower triangular matrix as the target matrix corresponding to the batch data.
Specifically, the number of data units refers to the number of data blocks contained in the batch data, and the number threshold refers to a threshold set according to actual requirements, whose value is to be greater than 2. Correspondingly, the target batch data refers to the batch data obtained after the data units are sorted in order, and the lower triangular matrix refers to a matrix constructed from the data units contained in the target batch data.
Based on the above, when constructing a target matrix corresponding to any batch of data, it is required to determine whether the data units contained in each batch of data are greater than a quantity threshold, and if the data units are less than or equal to the quantity threshold, then it is required to perform aggregation processing according to a linear aggregation mode; if the number of the data units is larger than the number threshold, the data aggregation can be completed in batches, and meanwhile, in order to ensure that the aggregation is completed quickly, the data units contained in the batch data can be sequenced to obtain target batch data; and then converting the target batch data into a lower triangular matrix which is used as a target matrix corresponding to the batch data so as to facilitate the subsequent use.
In summary, by constructing the lower triangular matrix as the target matrix, the subsequent aggregation processing can be performed by utilizing the characteristics of the lower triangular matrix, so that the aggregation efficiency can be effectively improved.
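As an illustration of converting one batch into a lower triangular target matrix, the sketch below assumes that entry (i, j) with j <= i is 1 when the i-th and j-th sorted keys are equal and 0 otherwise. This encoding is an assumption made for the example; the specification only states that the sorted batch is converted into a lower triangular matrix. The comparison is done in plain text here, whereas the described system would use a secure comparison.

```python
from typing import List, Sequence, Tuple


def equality_lower_triangular(keys: Sequence[int]) -> Tuple[List[int], List[List[int]]]:
    """Sort the keys of one batch and build a lower triangular equality matrix.

    Assumed encoding: M[i][j] = 1 if j <= i and key_i == key_j, else 0.
    """
    ordered = sorted(keys)
    k = len(ordered)
    matrix = [[1 if j <= i and ordered[i] == ordered[j] else 0
               for j in range(k)]
              for i in range(k)]
    return ordered, matrix


ordered, M = equality_lower_triangular([2, 1, 2])
for row in M:
    print(row)
```

Every entry above the diagonal is 0, and within each row the 1s mark the earlier positions that share the row's key.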
On the basis, in order to ensure that all information used by aggregation can be transmitted at one time, the communication turn is reduced, and meanwhile, the data security is improved, and the method can be completed by combining a multiparty security comparison protocol; in this embodiment, the first participant is further configured to determine the service aggregation policy, and combine the lower triangular matrices corresponding to each batch of data according to the service aggregation policy to obtain the global matrix; and calling a multiparty safety comparison protocol, and transmitting the global matrix to the second party according to the multiparty safety comparison protocol.
The multiparty security comparison protocol is a protocol used when information is transmitted among a plurality of participants, so that mutual transmission of the same information can be completed on the premise of data security, and subsequent business processing is facilitated on the basis.
Based on the above, after the first participant obtains the lower triangular matrix corresponding to each batch of data, in order to achieve that the transmission of all the contents can be completed by one-time communication, a pre-deployed service aggregation policy can be determined first, and then the lower triangular matrices corresponding to each batch of data are combined according to the service aggregation policy, so that a global matrix is obtained; and at the moment, calling the multiparty safety comparison protocol, and sending the global matrix to the second party according to the multiparty safety comparison protocol, so that the second party can conveniently use the global matrix to carry out subsequent aggregation processing operation of the second local data.
That is, the first party may split the data sequence into multiple small data blocks to support batch aggregation processing at the second party. Through vectorization, multiple aggregation operations within a batch can be completed in a single operation. In each batch, the comparison of adjacent Keys is converted into a lower triangular matrix M, and M is multiplied by the batch vector Value formed from the attribute values to be aggregated at the second participant, so that the aggregation result corresponding to each batch of data is obtained.
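The matrix multiplication of M with the batch vector Value can be sketched as below, assuming sum aggregation and the equality encoding M[i][j] = 1 when keys i and j (j <= i) are equal. Under these assumptions the product gives running sums within each group of equal keys, and the last row of each group holds the group total. The helper names `mat_vec` and `group_totals` are hypothetical, and the arithmetic is done in plain text rather than under a secure multiplication protocol.

```python
from typing import List


def mat_vec(matrix: List[List[int]], vector: List[int]) -> List[int]:
    """Plain matrix-vector product; stands in for the secure multiplication."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def group_totals(matrix: List[List[int]], running: List[int]) -> List[int]:
    """Keep the running sum at the last row of each group of equal keys."""
    k = len(matrix)
    return [running[r] for r in range(k)
            if r == k - 1 or matrix[r + 1][r] == 0]


# Batch with sorted keys [1, 2, 2] and the second party's values [10, 20, 30].
M = [[1, 0, 0],
     [0, 1, 0],
     [0, 1, 1]]
values = [10, 20, 30]
running = mat_vec(M, values)       # one vectorized operation per batch
totals = group_totals(M, running)  # one total per distinct key
```

Here `running` is `[10, 20, 50]` and `totals` is `[10, 50]`: key 1 aggregates to 10 and key 2 aggregates to 20 + 30 = 50.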
In the process of multiparty security calculation, a comparison protocol based on multiparty security calculation is used, and the step is a key step affecting the operation efficiency. Therefore, considering that the equality comparison among batches does not have data dependence, a round of communication transmission global matrix can be performed in a matrix merging mode. That is, the matrix merging vectorization can be performed on the lower triangular matrix of equality corresponding to each batch of data in the manner shown in (a) of fig. 4, and then the merged matrix is subjected to element-based multiparty safety comparison protocol, so that the calculation of the equality matrix of all batches can be completed by one round of communication. Therefore, in the subsequent processing, the second party can complete grouping aggregation in a single batch only by carrying out multiparty secure multiplication protocol of each batch, namely executing each aggregation task, thereby reducing multi-round communication into one-round communication, improving aggregation efficiency and reducing consumption of bandwidth resources.
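The idea of merging the per-batch matrices so that all equality comparisons are done in one round can be sketched as follows. This is a plain-text simulation under stated assumptions: `elementwise_equal` is a hypothetical stand-in for one invocation of an element-wise multiparty secure comparison protocol, and the row-by-row flattening order is chosen only for the example.

```python
from typing import List, Sequence, Tuple


def comparison_pairs(batch: Sequence[int]) -> List[Tuple[int, int]]:
    """List the (key_i, key_j) pairs, j <= i, of one batch row by row."""
    return [(batch[i], batch[j])
            for i in range(len(batch)) for j in range(i + 1)]


def elementwise_equal(pairs: Sequence[Tuple[int, int]]) -> List[int]:
    """Stand-in for ONE round of an element-wise secure comparison."""
    return [1 if a == b else 0 for a, b in pairs]


def merge_and_compare(batches: Sequence[Sequence[int]]) -> List[int]:
    """Merge the pairs of all batches and compare them in a single call."""
    merged = [pair for batch in batches for pair in comparison_pairs(batch)]
    return elementwise_equal(merged)


def split_global(flat: Sequence[int], batch_sizes: Sequence[int]) -> List[List[List[int]]]:
    """Rebuild one lower triangular matrix per batch from the merged result."""
    matrices, pos = [], 0
    for k in batch_sizes:
        matrix = [[0] * k for _ in range(k)]
        for i in range(k):
            for j in range(i + 1):
                matrix[i][j] = flat[pos]
                pos += 1
        matrices.append(matrix)
    return matrices


batches = [[1, 1, 2], [2, 2, 3]]
flat = merge_and_compare(batches)       # one comparison call for all batches
matrices = split_global(flat, [3, 3])   # per-batch target matrices
```

Because the comparisons of different batches do not depend on each other, the single merged call replaces one comparison round per batch; the later splitting only needs the batch sizes.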
Continuing the above example, the hospital side first divides the medical data contained in the medical data table R2 to obtain n data blocks; it then groups the n data blocks according to the batch processing strategy k to obtain n/k = m batch data. When the number of data blocks contained in each of the m batch data is greater than 2, that is, k is greater than 2, a lower triangular matrix M can be constructed for each of the m batch data, yielding m lower triangular matrices M. The m lower triangular matrices M are merged according to the preset aggregation strategy to obtain the global matrix Q, which is sent to the insurance company side for subsequent processing.
In summary, by adopting the multiparty security comparison protocol to communicate, the communication security can be ensured, and the second party can be supported to split the global matrix into a plurality of target matrices after receiving the global matrix, so that the aggregation task is created and executed, and the data aggregation efficiency is improved.
Furthermore, after splitting the global matrix into a plurality of target matrices and constructing a plurality of aggregation tasks, the second participant needs to complete aggregation of the second local data by executing the aggregation tasks, and in this process, the second participant performs batch aggregation according to the target matrices because the target matrices correspond to batch data; in this embodiment, execution of any one of a plurality of aggregation tasks includes: the second participant is further configured to determine batch data to be aggregated corresponding to the target matrix in the second local data by executing an aggregation task; and aggregating the batch data to be aggregated according to matrix elements contained in the target matrix.
Specifically, the batch data to be aggregated refers to the batch data in the second local data that corresponds to the target matrix of the current aggregation task. It is composed of a plurality of data blocks in the second local data, and the number of data blocks it contains is the same as the number of data blocks contained in the batch data corresponding to the target matrix. Correspondingly, a matrix element refers to an element contained in the target matrix; it serves as the Key by which the data serving as Value in the batch data to be aggregated is aggregated.
Based on the method, after the second participant receives the global matrix, the global matrix can be split into a plurality of target matrices according to a pre-deployed service aggregation strategy, and as each target matrix corresponds to one batch of data, an aggregation task corresponding to each target matrix can be created, so that data aggregation processing is completed in batches at the second participant. When any aggregation task is executed, the purpose is to aggregate the data with the same Key together, so that the data to be aggregated corresponding to the aggregation task is required to be determined in the second local data according to the target matrix, namely, a data set consisting of values corresponding to the Key in the target matrix one by one, and then the data in the data to be aggregated can be aggregated according to matrix elements in the target matrix, so that an aggregation result corresponding to each aggregation task is obtained, and after all the aggregation tasks are executed, an aggregation result corresponding to the second local data can be obtained, so that the subsequent use is convenient.
In summary, the aggregation processing operation can be completed in batches by performing the aggregation processing on the batch data to be aggregated corresponding to each matrix according to the matrix elements contained in the target matrix, so that the aggregation processing efficiency is improved.
When executing the aggregation tasks to aggregate the second local data, the second participant in fact reads the batch data to be aggregated in sequence. In this embodiment, the second participant is further configured to: determine an i-th aggregation task among the plurality of aggregation tasks, and, by executing the i-th aggregation task, determine the batch data to be aggregated corresponding to the i-th aggregation task in the second local data; aggregate the batch data to be aggregated according to the target matrix corresponding to the i-th aggregation task; increment i by 1 and return to the step of determining the i-th aggregation task among the plurality of aggregation tasks; and, once i has been incremented to k, determine at least one data cluster based on the aggregation result of the batch data to be aggregated in each aggregation task, as the aggregation processing result of the second local data; wherein i is a positive integer and k is the number of tasks in the plurality of aggregation tasks.
Because the target matrix corresponding to each of the plurality of aggregation tasks is obtained by converting ordered batch data, executing the tasks sequentially keeps aggregation both fast and accurate. Therefore, after the plurality of aggregation tasks are created, they can be ordered by priority, and the i-th aggregation task is selected from the ordered tasks and executed to determine the batch data to be aggregated corresponding to the i-th aggregation task in the second local data. The current batch data to be aggregated is then aggregated according to the target matrix corresponding to the i-th aggregation task, so that the local data is aggregated batch by batch. Thereafter, i is incremented by 1 and the step of determining the i-th aggregation task among the plurality of aggregation tasks is executed again. When i has been incremented to k, all aggregation tasks have been executed, that is, the second local data has been aggregated in a batch aggregation manner. At least one data cluster can then be determined based on the aggregation result of the batch data to be aggregated in each aggregation task, as the aggregation processing result of the second local data, which subsequent services can use for service processing. Here i is a positive integer and k is the number of tasks in the plurality of aggregation tasks.
Continuing the above example, after receiving the global matrix Q, the insurance company side may split it to obtain m lower triangular matrices M, and then create m aggregation tasks, one for each lower triangular matrix M. The m1-th aggregation task is then selected, and by executing it the batch data to be aggregated corresponding to the m1-th aggregation task is determined in the insurance data table R2; this batch data corresponds to the same batch as the lower triangular matrix M of the m1-th aggregation task. The Values in the batch data to be aggregated can then be aggregated according to the Keys in the lower triangular matrix M. After this aggregation is completed, the next aggregation task is selected and executed, until all m aggregation tasks have been executed and the aggregation processing of the insurance data table R2 is complete. The claim settlement amount can then be calculated from the insurance data set obtained from the aggregation result.
In summary, by aggregating the second local data in a batch aggregation manner, the aggregation result can be obtained with fewer aggregation operations, which effectively improves aggregation efficiency.
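The sequential execution of aggregation tasks can be sketched as a simple loop. The sketch assumes sum aggregation, plain-text arithmetic, and target matrices that use a 0/1 equality encoding; `aggregate_batch` and `run_tasks` are hypothetical names, and cross-batch overlap of keys is deliberately not handled here.

```python
from typing import List


def aggregate_batch(matrix: List[List[int]], batch_values: List[int]) -> List[int]:
    """Execute one aggregation task: multiply, then keep one total per group."""
    running = [sum(m * v for m, v in zip(row, batch_values)) for row in matrix]
    k = len(matrix)
    return [running[r] for r in range(k)
            if r == k - 1 or matrix[r + 1][r] == 0]


def run_tasks(target_matrices: List[List[List[int]]],
              values: List[int], batch_size: int) -> List[List[int]]:
    """Execute the aggregation tasks one after another (task index i)."""
    results = []
    for i, matrix in enumerate(target_matrices):
        # Batch data to be aggregated for the i-th task in the second local data.
        batch_values = values[i * batch_size:(i + 1) * batch_size]
        results.append(aggregate_batch(matrix, batch_values))
    return results


# Target matrices for the key batches [1, 1, 2] and [2, 2, 3].
tri = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
per_task = run_tasks([tri, tri], [5, 7, 1, 2, 3, 4], batch_size=3)
```

In this example the key 2 appears at the end of the first batch and at the start of the second, so its value is split between two per-task results; this is the overlap situation that the specification addresses separately.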
In addition, while the aggregation tasks are processed, adjacent aggregation tasks may contain the same Key; that is, the matrix elements in the target matrix of the previous aggregation task may overlap with the matrix elements in the target matrix of the next aggregation task. If aggregation were performed without regard to this, the same Key could correspond to two aggregation results, which would affect the aggregation result of the second local data. To avoid this problem, the batch data to be aggregated of an aggregation task that has overlapping task elements can be updated. In this embodiment, in the case that overlapping task elements exist between the i-th aggregation task and the (i+1)-th aggregation task, the second participant is further configured to determine the sub-data set corresponding to the overlapping task elements according to the aggregation result of the batch data to be aggregated in the i-th aggregation task; and to update the batch data to be aggregated corresponding to the (i+1)-th aggregation task according to the sub-data set, taking the updated batch data as the batch data to be aggregated corresponding to the (i+1)-th aggregation task.
Specifically, the overlapping task element refers to a Key shared by the i-th aggregation task and the (i+1)-th aggregation task, and is determined according to the matrix elements. Correspondingly, the sub-data set refers to the aggregation result corresponding to the overlapping task element, that is, the aggregation result of the overlapping Key, obtained when the i-th aggregation task is executed. To avoid aggregation errors, the sub-data set corresponding to the overlapping task elements in the i-th aggregation task can be used as part of the batch data to be aggregated corresponding to the overlapping task elements in the (i+1)-th aggregation task; that is, the batch data to be aggregated corresponding to the (i+1)-th aggregation task is obtained by combining the batch data corresponding to the other task elements with the sub-data set.
Based on this, to avoid aggregation overlap, when it is determined that overlapping task elements exist between the i-th aggregation task and the (i+1)-th aggregation task, the second participant first determines the sub-data set corresponding to the overlapping task elements according to the aggregation result of the batch data to be aggregated in the i-th aggregation task. The batch data to be aggregated corresponding to the (i+1)-th aggregation task is then updated according to the sub-data set, so that the sub-data set is integrated into it, and the updated batch data is used as the batch data to be aggregated corresponding to the (i+1)-th aggregation task.
That is, assume that the batch size for data selection is k, that the batch data corresponding to the target matrix of the (i-1)-th aggregation task is selected as [Figure SMS_1], and that the batch data corresponding to the target matrix of the i-th aggregation task is selected as [Figure SMS_2]. Because the keys are ordered, if [Figure SMS_3], then the additional selection of [Figure SMS_4] in the batch data corresponding to the target matrix of the i-th aggregation task does not affect the aggregation result.
However, when a key in the batch data corresponding to the target matrix of the i-th aggregation task is the same as a key in the batch data corresponding to the target matrix of the (i-1)-th aggregation task (denote its value as a), then necessarily [Figure SMS_5], and the problem of aggregation overlap occurs. To avoid aggregation overlap, after the (i-1)-th aggregation task is completed, [Figure SMS_6] is used to record the aggregation result, in the (i-1)-th aggregation task, of the key whose value is a. At this time, [Figure SMS_7] is added to the i-th aggregation task. Because the batch data corresponding to the target matrix of the i-th aggregation task already selects [Figure SMS_8], during aggregation the aggregation result of the preceding batch can be accumulated, as [Figure SMS_9], into the aggregation result of the same group in the i-th aggregation task, thereby ensuring the correctness of the aggregation result.
As shown in the schematic diagram of fig. 4 (b), assume that the batch data of the target matrix corresponding to the i-th aggregation task includes 3 data blocks. After the 1st aggregation task is completed, the result is [Figure SMS_10], that is, key_1 = val_1, key_2 = val_2 and key_3 = val_3; since key_2 and key_3 are equal, val_2 and val_3 are aggregated together to form [Figure SMS_11]. When the 2nd aggregation task is executed, if [Figure SMS_12], then [Figure SMS_13] needs to be taken as [Figure SMS_14], the basis on which val_4 is aggregated, to obtain [Figure SMS_15], while [Figure SMS_16]. The remaining tasks proceed in the same way, and the aggregation processing of the second participant's data is complete once all aggregation tasks have been executed.
When overlapping task elements exist for adjacent aggregation tasks, the sub-data set of the previous adjacent aggregation task is used as a part of batch data to be aggregated of the next aggregation task in order to avoid aggregation errors, so that subsequent aggregation processing is facilitated, and aggregation accuracy is ensured.
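The carry-over of the sub-data set between adjacent aggregation tasks can be sketched as follows, assuming sum aggregation. The flag list `overlaps_next` is a hypothetical input that marks whether the last key of a task equals the first key of the next task; how this flag is derived from the matrix elements is not modeled here, and the final flag is assumed to be False.

```python
from typing import List


def aggregate_with_carry(task_totals: List[List[int]],
                         overlaps_next: List[bool]) -> List[int]:
    """Combine per-task group totals into final data clusters.

    task_totals[i]   : group totals produced by the i-th aggregation task.
    overlaps_next[i] : True if the last group of task i continues in task i+1.
    """
    clusters: List[int] = []
    carry = 0  # sub-data set handed over from the previous task
    for totals, overlap in zip(task_totals, overlaps_next):
        totals = list(totals)
        totals[0] += carry  # fold the previous result into the same group
        # If the last group continues in the next task, hand it over
        # instead of emitting it as a finished cluster.
        carry = totals.pop() if overlap else 0
        clusters.extend(totals)
    return clusters


# Key batches [1, 1, 2] and [2, 2, 3]: key 2 overlaps the two tasks.
clusters = aggregate_with_carry([[12, 1], [5, 4]], [True, False])
```

For the example, the partial total 1 of key 2 from the first task is added to the partial total 5 from the second task, giving the clusters `[12, 6, 4]`, so each key ends up with exactly one aggregation result.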
In an actual service scenario, a service demand party is required to trigger an aggregation task, then the aggregation task is processed, and finally an execution result is fed back to the service demand party, in this embodiment, the first party is further configured to receive a data aggregation request submitted by the service demand party for the service task, read first local data in response to the data aggregation request, and execute a step of generating a plurality of batches of data according to the first local data;
the second party is further configured to generate a target data table according to an aggregation result of the second local data, read target data in the target data table according to the data aggregation request, execute the service task according to the target data, and feed back an execution result of the service task to the service demander.
Specifically, the service demander specifically refers to a party that needs to perform a service task, for example, in an insurance claim scene, the service demander can be a claim settlement user; correspondingly, the business task is the task of calculating the claim amount; correspondingly, the execution result is the calculated claim amount.
Based on the above, after receiving a data aggregation request submitted by a business demand party for a business task, the first participant reads first local data in response to the data aggregation request and executes a step of generating a plurality of batches of data according to the first local data; and the second party can generate a target data table according to the aggregation result of the second local data, read target data in the target data table according to the data aggregation request, execute the service task according to the target data and feed back the execution result of the service task to the service requiring party.
Along the above example, after receiving the global matrix Q, the insurance company side splits the global matrix Q according to a preset aggregation policy, and obtains m target matrices according to the splitting result. At this time, m aggregation tasks are constructed for m matrixes, then each aggregation task is sequentially executed, so that data in an insurance data table R1 held by an insurance company is grouped and aggregated, and because each target matrix corresponds to k data blocks, aggregation processing is finished in batches when data aggregation is carried out according to the aggregation tasks; and obtaining an insurance data set corresponding to the patient after all the aggregation tasks are completed, calculating the claim settlement amount according to the insurance data contained in the insurance data set, determining the claim settlement amount as S, and feeding back the claim settlement amount to the user.
In the data aggregation system based on multiparty security calculation provided in this specification, in order to complete data aggregation quickly without revealing private data between the participants, the first participant generates a plurality of batch data from the first local data it holds, each batch data being composed of data blocks. On this basis, a target matrix corresponding to each batch data is constructed, and the target matrices are merged into a global matrix and sent to the second participant, so that all information needed for aggregation reaches the second participant in a single communication. Because this information is obtained by converting the first local data, the data of the first participant is not leaked; the number of communication rounds is reduced and security is improved. The second participant can then split the global matrix into a plurality of target matrices and create an aggregation task for each target matrix, so that the second local data it holds is aggregated by executing the aggregation tasks in turn. Batch processing improves the running efficiency of data aggregation, since multiple data items are aggregated in a single operation, and makes it convenient for the participants to perform subsequent processing based on the aggregation result.
The application of the data aggregation system based on multiparty security computation in the text data batch processing scenario provided in the present specification is taken as an example, and the data aggregation system based on multiparty security computation will be further described below with reference to fig. 5. Fig. 5 shows a process flow diagram of a data aggregation system based on multiparty security computation according to an embodiment of the present disclosure, which specifically includes the following steps:
Step S502, the first participant acquires the first local data, performs segmentation processing on the first local data, and groups the divided first local data according to a preset batch to obtain a plurality of batch data.
In step S504, the first participant constructs a target matrix corresponding to each batch of data.
The construction of the target matrix corresponding to any one batch of data in the plurality of batches of data comprises the following steps: sequencing the data units contained in the batch data under the condition that the number of the data units contained in the batch data is larger than a number threshold value, so as to obtain target batch data; and converting the target batch data into a lower triangular matrix as a target matrix corresponding to the batch data.
Step S506, the first participant determines a service aggregation strategy, and merges the lower triangular matrixes corresponding to each batch of data according to the service aggregation strategy to obtain a global matrix.
Step S508, the first party invokes the multiparty security comparison protocol and sends the global matrix to the second party according to the multiparty security comparison protocol.
In step S510, the second party splits the global matrix according to the service aggregation policy to obtain a plurality of target matrices.
In step S512, the second participant constructs an aggregation task corresponding to each target matrix, and aggregates the second local data by sequentially executing each aggregation task.
The execution of any one of a plurality of aggregate tasks, comprising: determining batch data to be aggregated corresponding to a target matrix in the second local data by executing an aggregation task; and aggregating the batch data to be aggregated according to matrix elements contained in the target matrix.
That is, the second participant is further configured to: determine an i-th aggregation task among the plurality of aggregation tasks, and, by executing the i-th aggregation task, determine the batch data to be aggregated corresponding to the i-th aggregation task in the second local data; aggregate the batch data to be aggregated according to the target matrix corresponding to the i-th aggregation task; increment i by 1 and return to the step of determining the i-th aggregation task among the plurality of aggregation tasks; and, once i has been incremented to k, determine at least one data cluster based on the aggregation result of the batch data to be aggregated in each aggregation task, as the aggregation processing result of the second local data; wherein i is a positive integer and k is the number of tasks in the plurality of aggregation tasks.
Under the condition that overlapping task elements exist in an ith aggregation task and an (i+1) th aggregation task, the second participant is further configured to determine a sub-data set corresponding to the overlapping task elements according to an aggregation result of batch data to be aggregated in the ith aggregation task; updating the batch data to be aggregated corresponding to the i+1th aggregation task according to the sub data set, and taking the updated batch data to be aggregated as the batch data to be aggregated corresponding to the i+1th aggregation task.
In summary, in order to complete data aggregation quickly without revealing private data between the participants, the first participant generates a plurality of batch data from the first local data it holds, each batch data being composed of data blocks. On this basis, a target matrix corresponding to each batch data is constructed, and the target matrices are merged into a global matrix and sent to the second participant, so that all information needed for aggregation reaches the second participant in a single communication. Because this information is obtained by converting the first local data, the data of the first participant is not leaked; the number of communication rounds is reduced and security is improved. The second participant can then split the global matrix into a plurality of target matrices and create an aggregation task for each target matrix, so that the second local data it holds is aggregated by executing the aggregation tasks in turn. Batch processing improves the running efficiency of data aggregation, since multiple data items are aggregated in a single operation, and makes it convenient for the participants to perform subsequent processing based on the aggregation result.
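Steps S502 to S512 can be tied together in a compact plain-text sketch. It is an illustration under several assumptions, not the patented implementation: keys are already ordered and aligned with the second participant's values, aggregation is a sum, the global matrix is modeled as a dictionary, the secure comparison and multiplication protocols are replaced by ordinary Python operations, and all function names are hypothetical.

```python
from typing import Dict, List


def lower_triangular(keys: List[int]) -> List[List[int]]:
    """Assumed encoding: M[i][j] = 1 if j <= i and key_i == key_j."""
    k = len(keys)
    return [[1 if j <= i and keys[i] == keys[j] else 0 for j in range(k)]
            for i in range(k)]


def first_participant(keys: List[int], batch_size: int) -> Dict[str, list]:
    """S502 to S508: batch the keys, build matrices, merge them for sending."""
    batches = [keys[s:s + batch_size] for s in range(0, len(keys), batch_size)]
    matrices = [lower_triangular(batch) for batch in batches]
    # Hypothetical overlap flags: last key of batch t equals first key of t + 1.
    overlaps = [batches[t][-1] == batches[t + 1][0]
                for t in range(len(batches) - 1)] + [False]
    return {"matrices": matrices, "overlaps": overlaps}


def second_participant(global_matrix: Dict[str, list],
                       values: List[int], batch_size: int) -> List[int]:
    """S510 to S512: split, then execute one aggregation task per matrix."""
    clusters: List[int] = []
    carry = 0
    tasks = zip(global_matrix["matrices"], global_matrix["overlaps"])
    for t, (matrix, overlap) in enumerate(tasks):
        batch = values[t * batch_size:(t + 1) * batch_size]
        running = [sum(m * v for m, v in zip(row, batch)) for row in matrix]
        k = len(matrix)
        totals = [running[r] for r in range(k)
                  if r == k - 1 or matrix[r + 1][r] == 0]
        totals[0] += carry                      # reuse the previous sub-data set
        carry = totals.pop() if overlap else 0  # hand over an overlapping group
        clusters.extend(totals)
    return clusters


keys = [1, 1, 2, 2, 2, 3, 4, 4, 5]      # held by the first participant
values = [5, 7, 1, 2, 3, 4, 6, 8, 9]    # held by the second participant
Q = first_participant(keys, batch_size=3)
result = second_participant(Q, values, batch_size=3)
```

For these sample inputs the result is `[12, 6, 4, 14, 9]`, one aggregated value per distinct key, obtained with three batch-level tasks instead of nine element-level steps.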
Corresponding to the above system embodiments, the present disclosure further provides an embodiment of a data aggregation method based on multiparty security computation. Fig. 6 shows a flowchart of a data aggregation method based on multiparty security computation provided in an embodiment of the present disclosure. As shown in fig. 6, the method is applied to a data aggregation system that includes a first participant and a second participant, and the method includes:
step S602, the first participant generates a plurality of batch data according to the first local data, and constructs a target matrix corresponding to each batch data; combining the target matrixes corresponding to the batch data according to a service aggregation strategy to obtain a global matrix, and sending the global matrix to the second party;
step S604, the second participant splits the global matrix according to the service aggregation policy to obtain a plurality of target matrices; and constructing an aggregation task corresponding to each target matrix, and aggregating the second local data by executing each aggregation task in turn.
In an optional embodiment, the first participant acquires first local data and performs segmentation processing on the first local data; grouping the divided first local data according to a preset batch to obtain the batch data.
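The segmentation and grouping step can be sketched as follows. This is a minimal illustration that assumes the "preset batch" is a fixed batch size and that the first local data is an ordered sequence of data units; the function name `make_batches` and the parameter `batch_size` are hypothetical and do not come from the specification.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def make_batches(local_data: Sequence[T], batch_size: int) -> List[List[T]]:
    """Split local_data into consecutive batches of at most batch_size units.

    Assumption: the "preset batch" is a fixed size; the last batch may be shorter.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        list(local_data[start:start + batch_size])
        for start in range(0, len(local_data), batch_size)
    ]


# Five data units grouped into batches of two.
batches = make_batches(["a", "b", "a", "c", "b"], batch_size=2)
```

Other grouping rules (for example, grouping by a key range) would fit the same description; a fixed size is simply the easiest case to show.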
In an optional embodiment, constructing the target matrix corresponding to any one of the plurality of batch data includes:
the first participant sorts the data units contained in the batch data under the condition that the number of the data units contained in the batch data is larger than a number threshold value, so as to obtain target batch data; and converting the target batch data into a lower triangular matrix as a target matrix corresponding to the batch data.
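A possible shape for this step is sketched below. The specification does not state here what the matrix entries mean, so the sketch assumes that entry (i, j) with j <= i is 1 when data units i and j of the sorted batch are equal, and 0 otherwise. The name `build_target_matrix` and the parameter `count_threshold` are hypothetical.

```python
from typing import List, Sequence


def build_target_matrix(batch: Sequence[str], count_threshold: int = 1) -> List[List[int]]:
    """Return a lower triangular 0/1 matrix for one batch.

    Assumption (not from the specification): entry [i][j] with j <= i is 1
    when units i and j of the (sorted) batch are equal; entries above the
    diagonal are always 0.
    """
    # Sort only when the batch holds more units than the threshold.
    units = sorted(batch) if len(batch) > count_threshold else list(batch)
    n = len(units)
    return [
        [1 if j <= i and units[i] == units[j] else 0 for j in range(n)]
        for i in range(n)
    ]


# Sorted batch is ["a", "b", "b"]; the last two units are equal.
matrix = build_target_matrix(["b", "a", "b"])
```

Under this assumption the diagonal is always 1, and a 1 below the diagonal marks two units that belong together. How the second participant's rows are aligned with the sorted order is outside the scope of this sketch.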
In an optional embodiment, the first participant determines the service aggregation policy, and merges the lower triangular matrices corresponding to each batch of data according to the service aggregation policy to obtain the global matrix; and invokes a multiparty secure comparison protocol, and sends the global matrix to the second participant according to the multiparty secure comparison protocol.
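The merge performed by the first participant and the matching split performed by the second participant can be illustrated as below. The sketch assumes that the service aggregation policy amounts to an agreed list of batch sizes and that merging places the per-batch matrices on the diagonal of one larger matrix; both are assumptions, and the secure comparison protocol used for transmission is not modeled. `merge_blocks` and `split_blocks` are hypothetical names.

```python
from typing import List

Matrix = List[List[int]]


def merge_blocks(blocks: List[Matrix]) -> Matrix:
    """Place each square block on the diagonal of one global matrix."""
    total = sum(len(block) for block in blocks)
    global_matrix = [[0] * total for _ in range(total)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                global_matrix[offset + i][offset + j] = value
        offset += len(block)
    return global_matrix


def split_blocks(global_matrix: Matrix, sizes: List[int]) -> List[Matrix]:
    """Recover the diagonal blocks, given the agreed block sizes."""
    blocks = []
    offset = 0
    for size in sizes:
        blocks.append(
            [row[offset:offset + size] for row in global_matrix[offset:offset + size]]
        )
        offset += size
    return blocks


blocks = [[[1, 0], [1, 1]], [[1]]]
merged = merge_blocks(blocks)
restored = split_blocks(merged, sizes=[2, 1])
```

Because both sides apply the same policy, the second participant recovers exactly the target matrices that the first participant built, after a single transmission.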
In an alternative embodiment, the execution of any one of a plurality of aggregate tasks includes:
the second participant determines batch data to be aggregated corresponding to a target matrix in the second local data by executing an aggregation task; and aggregating the batch data to be aggregated according to matrix elements contained in the target matrix.
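One way to read "aggregating according to matrix elements" is sketched below, in plaintext. It assumes a lower triangular 0/1 matrix whose diagonal is 1 and whose entry (i, j) is 1 when rows i and j belong to the same group, and it uses summation as the aggregation function. In an actual multiparty setting the matrix would arrive in protected form, which this sketch does not model. `aggregate_batch` is a hypothetical name.

```python
from typing import Dict, List, Sequence


def aggregate_batch(target_matrix: List[List[int]], values: Sequence[float]) -> Dict[int, float]:
    """Sum the values of rows that the matrix marks as belonging together.

    Assumption: each row's group is identified by the smallest column index j
    (j <= i) with target_matrix[i][j] == 1; the diagonal is always 1.
    """
    if len(target_matrix) != len(values):
        raise ValueError("matrix and batch must have the same number of rows")
    sums: Dict[int, float] = {}
    for i, row in enumerate(target_matrix):
        leader = next(j for j in range(i + 1) if row[j] == 1)
        sums[leader] = sums.get(leader, 0) + values[i]
    return sums


# Rows 1 and 2 belong together; row 0 is alone.
result = aggregate_batch([[1, 0, 0], [0, 1, 0], [0, 1, 1]], [10, 20, 30])
```

The keys of the returned dictionary are row positions inside the batch, not the first participant's original values, so the second participant learns only which of its own rows were combined.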
In an optional embodiment, the second participant determines an i-th aggregation task among the plurality of aggregation tasks and, by executing the i-th aggregation task, determines in the second local data the batch data to be aggregated corresponding to the i-th aggregation task; aggregates the batch data to be aggregated according to the target matrix corresponding to the i-th aggregation task; increments i by 1 and returns to the step of determining the i-th aggregation task among the plurality of aggregation tasks; and, once i has been incremented to k, determines at least one data cluster based on the aggregation results of the batch data to be aggregated in each aggregation task, as the aggregation processing of the second local data; wherein i is a positive integer, and k is the number of aggregation tasks.
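The loop over i = 1, ..., k can be sketched as follows. The task layout is an assumption: each task is modeled as a pair of a start row in the second local data and a target matrix, with matrix entry (r, c) equal to 1 when rows r and c of the batch belong to the same cluster. `run_tasks` and the `Task` alias are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical task layout: (start row in the second local data, target matrix).
Task = Tuple[int, List[List[int]]]


def run_tasks(tasks: List[Task], second_local_data: Sequence[int]) -> List[List[int]]:
    """Execute aggregation tasks 1..k in turn and collect the data clusters."""
    clusters: List[List[int]] = []
    k = len(tasks)
    i = 1
    while i <= k:
        start, matrix = tasks[i - 1]
        # Batch data to be aggregated for the i-th task.
        batch = second_local_data[start:start + len(matrix)]
        groups: Dict[int, List[int]] = {}
        for r, row in enumerate(matrix):
            leader = next(c for c in range(r + 1) if row[c] == 1)
            groups.setdefault(leader, []).append(batch[r])
        clusters.extend(groups[leader] for leader in sorted(groups))
        i += 1  # move on to the next aggregation task
    return clusters


tasks = [(0, [[1, 0], [1, 1]]), (2, [[1, 0], [0, 1]])]
clusters = run_tasks(tasks, [1, 2, 3, 4])
```

Here the first task groups rows 0 and 1 into one cluster, while the second task leaves rows 2 and 3 as separate clusters.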
In an optional embodiment, in a case that an overlapping task element exists between an i-th aggregation task and an i+1-th aggregation task, the second participant determines a sub-data set corresponding to the overlapping task element according to an aggregation result of batch data to be aggregated in the i-th aggregation task; updating the batch data to be aggregated corresponding to the i+1th aggregation task according to the sub data set, and taking the updated batch data to be aggregated as the batch data to be aggregated corresponding to the i+1th aggregation task.
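The update of the next task's batch can be illustrated under one possible reading, in which an "overlapping task element" is a row that appears in both the i-th and the (i+1)-th task. In the sketch, rows are identified by index, the result of task i is a list of clusters of row indices, and each overlapping row in task i+1 is replaced by the sub-data set (cluster) it already belongs to. This interpretation and the name `update_next_batch` are assumptions.

```python
from typing import Dict, List


def update_next_batch(clusters_i: List[List[int]], next_rows: List[int]) -> List[List[int]]:
    """Replace rows shared with task i by the cluster they already belong to.

    clusters_i: aggregation result of task i, as clusters of row indices.
    next_rows:  row indices originally assigned to task i+1.
    Returns the updated batch for task i+1, one sub-data set per entry.
    """
    owner: Dict[int, List[int]] = {
        row: cluster for cluster in clusters_i for row in cluster
    }
    # Rows not seen in task i stay as single-row sub-data sets.
    return [list(owner.get(row, [row])) for row in next_rows]


# Row 2 is shared by both tasks and was clustered with row 1 in task i.
updated = update_next_batch([[0], [1, 2]], [2, 3, 4])
```

Carrying the whole sub-data set forward means that any row grouped with row 2 in task i+1 is also grouped with row 1, so clusters that span a batch boundary are not split.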
In an optional embodiment, the first participant receives a data aggregation request submitted by a service demander for a service task, reads the first local data in response to the data aggregation request, and performs the step of generating a plurality of batch data according to the first local data;
the second participant generates a target data table according to the aggregation result of the second local data, reads target data in the target data table according to the data aggregation request, executes the service task according to the target data, and feeds back the execution result of the service task to the service demander.
In summary, to let the participants complete data aggregation quickly without revealing private data, the first participant generates a plurality of batch data from the first local data it holds, each batch being formed from data blocks. It then constructs a target matrix for each batch, merges the target matrices into a global matrix, and sends the global matrix to the second participant. The information needed for aggregation therefore reaches the second participant in a single communication, and because that information is obtained by transforming the first local data, the first participant's data is not leaked; the number of communication rounds is effectively reduced and security is improved. The second participant splits the global matrix into a plurality of target matrices and creates an aggregation task for each target matrix, so that the second local data it holds is aggregated by executing the aggregation tasks in turn. Batch processing improves the running efficiency of data aggregation, and multiple data items can be aggregated in a single pass, which effectively improves aggregation efficiency and facilitates subsequent processing by the participants based on the aggregation result.
The foregoing is a schematic solution of the data aggregation method based on multiparty security computation of this embodiment. It should be noted that the technical solution of the method and the technical solution of the data aggregation system based on multiparty security computation belong to the same concept; for details of the method that are not described here, refer to the description of the technical solution of the system.
Fig. 7 illustrates a block diagram of a computing device 700 provided in accordance with an embodiment of the present specification. The components of computing device 700 include, but are not limited to, memory 710 and processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.
Computing device 700 also includes access device 740, which enables computing device 700 to communicate via one or more networks 760. Examples of such networks include a public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 740 may include one or more of any type of wired or wireless network interface, for example a network interface card (NIC), an IEEE 802.11 wireless local area network (WLAN) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC) interface, and so forth.
In one embodiment of the present application, the above-described components of computing device 700, as well as other components not shown in FIG. 7, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 7 is for exemplary purposes only and is not intended to limit the scope of the present application. Those skilled in the art may add or replace other components as desired.
Computing device 700 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., a tablet, personal digital assistant, laptop, notebook, or netbook), a mobile phone (e.g., a smartphone), a wearable computing device (e.g., a smart watch or smart glasses), or another type of mobile device, or a stationary computing device such as a desktop computer or personal computer (PC). Computing device 700 may also be a mobile or stationary server.
Wherein the processor 720 is configured to execute the following computer-executable instructions:
the first participant generates a plurality of batch data according to the first local data and constructs a target matrix corresponding to each batch data; combining the target matrixes corresponding to the batch data according to a service aggregation strategy to obtain a global matrix, and sending the global matrix to the second party;
The second party splits the global matrix according to the service aggregation strategy to obtain a plurality of target matrixes; and constructing an aggregation task corresponding to each target matrix, and aggregating the second local data by executing each aggregation task in turn.
The foregoing is a schematic solution of the computing device of this embodiment. It should be noted that the technical solution of the computing device and the technical solution of the data aggregation method based on multiparty security computation belong to the same concept; for details of the computing device that are not described here, refer to the description of the technical solution of the method.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, are configured to:
the first participant generates a plurality of batch data according to the first local data and constructs a target matrix corresponding to each batch data; combining the target matrixes corresponding to the batch data according to a service aggregation strategy to obtain a global matrix, and sending the global matrix to the second party;
the second party splits the global matrix according to the service aggregation strategy to obtain a plurality of target matrixes; and constructing an aggregation task corresponding to each target matrix, and aggregating the second local data by executing each aggregation task in turn.
The foregoing is a schematic solution of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the data aggregation method based on multiparty security computation belong to the same concept; for details of the storage medium that are not described here, refer to the description of the technical solution of the method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, and so on. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but those skilled in the art should understand that the embodiments of this specification are not limited by the described order of actions, because according to the embodiments of this specification some steps may be performed in another order or simultaneously. Further, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the embodiments of this specification.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts that are not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to help explain the present specification. The alternative embodiments do not describe every detail exhaustively, nor do they limit the invention to the specific implementations described. Obviously, many modifications and variations are possible in light of the content of this specification. These embodiments were chosen and described in order to best explain the principles of the specification and its practical application, so that others skilled in the art can best understand and utilize it. This specification is limited only by the claims and their full scope and equivalents.

Claims (11)

1. A data aggregation system based on multiparty security computing, the system comprising a first party and a second party, comprising:
the first participant is used for generating a plurality of batch data according to the first local data and constructing a target matrix corresponding to each batch data; combining the target matrixes corresponding to the batch data according to a service aggregation strategy to obtain a global matrix, and sending the global matrix to the second party;
the second party is used for splitting the global matrix according to the service aggregation strategy to obtain a plurality of target matrixes; and constructing an aggregation task corresponding to each target matrix, and aggregating the second local data by executing each aggregation task in turn.
2. The system of claim 1, wherein the first party is further configured to obtain first local data and perform a segmentation process on the first local data; grouping the divided first local data according to a preset batch to obtain the batch data.
3. The system of claim 1, wherein the constructing of the target matrix corresponding to any one of the plurality of batch data comprises:
the first participant is further configured to sort data units included in the batch data to obtain target batch data when the number of data units included in the batch data is greater than a number threshold; and converting the target batch data into a lower triangular matrix as a target matrix corresponding to the batch data.
4. The system of claim 3, wherein the first party is further configured to determine the service aggregation policy, and combine the lower triangular matrices corresponding to each batch of data according to the service aggregation policy to obtain the global matrix; and invoking a multiparty secure comparison protocol, and sending the global matrix to the second party according to the multiparty secure comparison protocol.
5. The system of claim 1, wherein the execution of any one of a plurality of aggregate tasks comprises:
the second participant is further configured to determine batch data to be aggregated corresponding to the target matrix in the second local data by executing an aggregation task; and aggregating the batch data to be aggregated according to matrix elements contained in the target matrix.
6. The system of claim 1, wherein the second participant is further configured to determine an i-th aggregation task among a plurality of aggregation tasks, and determine batch data to be aggregated corresponding to the i-th aggregation task in the second local data by executing the i-th aggregation task; aggregate the batch data to be aggregated according to the target matrix corresponding to the i-th aggregation task; increment i by 1, and return to the step of determining an i-th aggregation task among the plurality of aggregation tasks; and, once i has been incremented to k, determine at least one data cluster based on the aggregation results of the batch data to be aggregated in each aggregation task, as the aggregation processing of the second local data; wherein i is a positive integer, and k is the number of tasks of the plurality of aggregation tasks.
7. The system of claim 6, wherein, in a case where an overlapping task element exists between an i-th aggregation task and an i+1-th aggregation task, the second participant is further configured to determine a sub-data set corresponding to the overlapping task element according to an aggregation result of batch data to be aggregated in the i-th aggregation task; updating the batch data to be aggregated corresponding to the i+1th aggregation task according to the sub data set, and taking the updated batch data to be aggregated as the batch data to be aggregated corresponding to the i+1th aggregation task.
8. The system of any of claims 1-7, wherein the first party is further configured to receive a data aggregation request submitted by a service demander for a service task, read first local data in response to the data aggregation request, and perform the step of generating a plurality of batch data according to the first local data;
the second party is further configured to generate a target data table according to an aggregation result of the second local data, read target data in the target data table according to the data aggregation request, execute the service task according to the target data, and feed back an execution result of the service task to the service demander.
9. A method of data aggregation based on multiparty security computing, the method being applied to a data aggregation system, the system comprising a first party and a second party, comprising:
the first participant generates a plurality of batch data according to the first local data and constructs a target matrix corresponding to each batch data; combining the target matrixes corresponding to the batch data according to a service aggregation strategy to obtain a global matrix, and sending the global matrix to the second party;
The second party splits the global matrix according to the service aggregation strategy to obtain a plurality of target matrixes; and constructing an aggregation task corresponding to each target matrix, and aggregating the second local data by executing each aggregation task in turn.
10. A computing device comprising a memory and a processor; the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions to perform the steps of the method of claim 9.
11. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method of claim 9.
CN202310278687.4A 2023-03-21 2023-03-21 Data aggregation system and method based on multiparty security calculation Active CN115994161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310278687.4A CN115994161B (en) 2023-03-21 2023-03-21 Data aggregation system and method based on multiparty security calculation

Publications (2)

Publication Number Publication Date
CN115994161A true CN115994161A (en) 2023-04-21
CN115994161B CN115994161B (en) 2023-06-06

Family

ID=85993667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310278687.4A Active CN115994161B (en) 2023-03-21 2023-03-21 Data aggregation system and method based on multiparty security calculation

Country Status (1)

Country Link
CN (1) CN115994161B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116451279A (en) * 2023-06-20 2023-07-18 腾讯科技(深圳)有限公司 Data processing method, device, equipment and readable storage medium
CN116451279B (en) * 2023-06-20 2023-08-15 腾讯科技(深圳)有限公司 Data processing method, device, equipment and readable storage medium
CN116484432A (en) * 2023-06-21 2023-07-25 杭州金智塔科技有限公司 Longitudinal joint query method and device based on multiparty security calculation
CN116484432B (en) * 2023-06-21 2023-09-19 杭州金智塔科技有限公司 Longitudinal joint query method and device based on multiparty security calculation

Also Published As

Publication number Publication date
CN115994161B (en) 2023-06-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant