WO2021204271A1 - Data privacy protected joint training of service prediction model by two parties

Info

Publication number: WO2021204271A1
Application number: PCT/CN2021/086273
Authority: WO
Inventors: 陈超超; 王力; 王磊; 周俊
Original assignee: 支付宝(杭州)信息技术有限公司
Priority date: 2020-04-10
Filing date: 2021-04-09
Publication date: 2021-10-14
Also published as: CN111178549B; CN111178549A

Abstract

Provided are a method and apparatus for data privacy protected joint training of a service prediction model by two parties. Two parties each have a portion of feature data. In a model iteration process, the two parties obtain two product fragments of a product result of a total feature matrix X and a total parameter matrix W by means of security matrix multiplication. A second party with a label performs secret sharing on a label vector Y, so that the two parties obtain two label fragments. Hence, the two parties respectively calculate a corresponding error fragment according to the product fragment and label fragment held thereby. Then, the two parties obtain a corresponding gradient fragment on the basis of respective error fragments and feature matrices and by means of secret sharing and security matrix multiplication. Afterwards, a first party updates, by means of the gradient fragment thereof, a parameter fragment maintained thereby, and the second party updates, by means of the gradient fragment thereof, a parameter fragment maintained thereby. In this way, data privacy protected security joint training is implemented.

Description

Joint training of a business prediction model by two parties with data privacy protection

Technical field

One or more embodiments of this specification relate to the fields of data security and machine learning, specifically, to the joint training of business prediction models by both parties.

Background

The data needed for machine learning often involves multiple fields. For example, in a business classification analysis scenario based on machine learning, the electronic payment platform owns the merchant's transaction flow data, the e-commerce platform stores the merchant's sales data, and the banking institution owns the merchant's loan data. Data often exists in the form of islands. Due to industry competition, data security, user privacy and other issues, data integration is facing great resistance. It is difficult to integrate data scattered on various platforms to train machine learning models. Under the premise of ensuring that data is not leaked, the use of multi-party data to jointly train machine learning models has become a major challenge at present.

Commonly used machine learning models include logistic regression models, linear regression models, and neural network models. Logistic regression models can effectively perform tasks such as sample classification and prediction. Linear regression models can effectively predict the regression value of a sample. Neural network models can perform various prediction tasks through combinations of multiple layers of neurons. The training process of each of these models involves obtaining a prediction result from calculations between the feature data and the model parameter data, determining a gradient according to the prediction result, and then adjusting the model parameters. When multiple parties jointly train a machine learning model, how to perform these operations collaboratively at each stage without revealing the private data of each party (including feature data and model parameter data) is a practical problem to be solved. It is therefore desirable to provide an improved solution that ensures that the private data of each party is not leaked, and that data security is maintained, when two parties jointly train a business prediction model.

Summary of the invention

One or more embodiments of this specification describe a method and device for the joint training of a business prediction model by two parties. By splitting the parameters into fragments during the iterative process, the method prevents private data from being leaked and ensures the security of private data in joint training.

According to a first aspect, there is provided a method for two parties to jointly train a business prediction model while protecting data privacy. The two parties include a first party and a second party. The first party stores a first feature matrix X_A formed by the first feature parts of a plurality of business objects; the second party stores a second feature matrix X_B formed by the second feature parts of the plurality of business objects, and a label vector Y formed by label values. The method is applied to the second party and includes performing multiple iterations of model parameter updates. Each iteration includes: calculating a second product fragment, based on the locally maintained first-parameter second fragment and second-parameter second fragment, through local matrix multiplication and secure matrix multiplication with the first party, where the first-parameter second fragment is the second fragment of the first parameter part W_A used to process the first feature part, and the second-parameter second fragment is the second fragment of the second parameter part W_B used to process the second feature part; secretly sharing the label vector Y to obtain a second label fragment, and obtaining a second error fragment by subtracting the second label fragment based on the second product fragment; locally calculating the product of the second error fragment and the second feature matrix X_B to obtain the first part of the second gradient; performing secure matrix multiplication using the second feature matrix X_B and the first error fragment held by the first party to obtain the second fragment of the second part of the second gradient, and receiving the second fragment of the second part of the first gradient from the first party; updating the second-parameter second fragment according to the first part of the second gradient and the second fragment of the second part of the second gradient; and updating the first-parameter second fragment according to the second fragment of the second part of the first gradient.

In one embodiment, before performing the multiple iterations of model parameter updates, the method further includes: initializing the second parameter part W_B, and splitting it through secret sharing into the second-parameter first fragment and the second-parameter second fragment; retaining the second-parameter second fragment and sending the second-parameter first fragment to the first party; and receiving from the first party the first-parameter second fragment obtained by secret sharing of the first parameter part W_A.

In an embodiment, after performing the multiple iterations of model parameter updates, the method further includes: sending the first-parameter second fragment updated in the last iteration to the first party, and receiving the updated second-parameter first fragment from the first party; and combining the second-parameter second fragment updated in the last iteration with the received second-parameter first fragment to obtain the second parameter part W_B of the trained business prediction model.

In an embodiment, the business object includes one of the following: users, merchants, commodities, and events; the business prediction model is used to predict the classification or regression value of the business object.

In one embodiment, the business prediction model is a linear regression model, and obtaining the second error fragment by subtracting the second label fragment based on the second product fragment includes: calculating the difference between the second product fragment and the second label fragment as the second error fragment.

In another embodiment, the business prediction model is a logistic regression model, and obtaining the second error fragment by subtracting the second label fragment based on the second product fragment includes: obtaining a second prediction result fragment from the second product fragment according to the Taylor expansion form of the sigmoid function, and calculating the difference between the second prediction result fragment and the second label fragment as the second error fragment.

In a specific embodiment, obtaining the second prediction result fragment from the second product fragment according to the Taylor expansion form of the sigmoid function includes: calculating, according to the multi-order Taylor expansion form of the sigmoid function, multiple powers of the second product fragment to obtain second-fragment powers; using the second product fragment and the second-fragment powers to perform multiple secure matrix multiplication operations with the first product fragment and the first-fragment powers held by the first party, to obtain a plurality of second multi-order product fragments; and determining the second prediction result fragment from the second product fragment, the second-fragment powers, and the plurality of second multi-order product fragments.
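To make the role of the cross terms concrete, the following Python sketch works through a third-order expansion, sigmoid(z) ≈ 1/2 + z/4 − z³/48, on additively shared values. It is illustrative only and not taken from the specification: the expansion order, the helper names `split` and `secure_mul_stub`, the use of element-wise products, and the choice of which party adds the constant 1/2 are all assumptions. The stub computes the product in the clear and merely mimics the output format of a secure multiplication.

```python
import numpy as np

def split(value, rng):
    """Split value into two additive fragments (real-valued masks; assumption)."""
    mask = rng.normal(size=np.shape(value))
    return mask, value - mask

def secure_mul_stub(a, b, rng):
    """Stand-in for a secure multiplication between the two parties.
    NOT secure: it multiplies in the clear and returns fragments of a * b."""
    return split(a * b, rng)

rng = np.random.default_rng(3)
z = rng.uniform(-1.0, 1.0, size=8)   # product XW, never held by either party
z_1, z_2 = split(z, rng)             # product fragments of the two parties

# (z_1 + z_2)^3 = z_1^3 + 3 z_1^2 z_2 + 3 z_1 z_2^2 + z_2^3.
# The pure powers are local; the cross terms need both parties.
c21_1, c21_2 = secure_mul_stub(z_1 ** 2, z_2, rng)   # fragments of z_1^2 * z_2
c12_1, c12_2 = secure_mul_stub(z_1, z_2 ** 2, rng)   # fragments of z_1 * z_2^2

cube_1 = z_1 ** 3 + 3 * c21_1 + 3 * c12_1            # first party's fragment of z^3
cube_2 = z_2 ** 3 + 3 * c21_2 + 3 * c12_2            # second party's fragment of z^3

# Prediction result fragments; the constant 1/2 is added by one party only.
p_1 = z_1 / 4 - cube_1 / 48
p_2 = 0.5 + z_2 / 4 - cube_2 / 48
```

The sum p_1 + p_2 equals the third-order approximation evaluated on z, although neither party ever sees z itself. The approximation is only close to the sigmoid for small |z|.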

In one embodiment, calculating the second product fragment includes: using the first-parameter second fragment to perform secure matrix multiplication with the first feature matrix X_A held by the first party, to obtain the second fragment of the first-feature second processing result; locally calculating the product of the second feature matrix X_B and the second-parameter second fragment to obtain the second-feature first processing result; using the second feature matrix X_B to perform secure matrix multiplication with the second-parameter first fragment held by the first party, to obtain the second fragment of the second-feature second processing result; and adding the second fragment of the first-feature second processing result, the second-feature first processing result, and the second fragment of the second-feature second processing result to obtain the second product fragment.

In one embodiment, updating the second-parameter second fragment according to the first part of the second gradient and the second fragment of the second part of the second gradient includes: taking as an adjustment amount the product of a preset step size and the sum of the first part of the second gradient and the second fragment of the second part of the second gradient, and updating the second-parameter second fragment by subtracting the adjustment amount.

According to a second aspect, there is provided a method for two parties to jointly train a business prediction model while protecting data privacy. The two parties include a first party and a second party. The first party stores a first feature matrix X_A formed by the first feature parts of a plurality of business objects; the second party stores a second feature matrix X_B formed by the second feature parts of the plurality of business objects, and a label vector Y formed by label values. The method is applied to the first party and includes performing multiple iterations of model parameter updates. Each iteration includes: calculating a first product fragment, based on the locally maintained first-parameter first fragment and second-parameter first fragment, through local matrix multiplication and secure matrix multiplication with the second party, where the first-parameter first fragment is the first fragment of the first parameter part W_A used to process the first feature part, and the second-parameter first fragment is the first fragment of the second parameter part W_B used to process the second feature part; receiving from the second party a first label fragment obtained by secret sharing of the label vector Y, and obtaining a first error fragment by subtracting the first label fragment based on the first product fragment; locally calculating the product of the first error fragment and the first feature matrix X_A to obtain the first part of the first gradient; performing secure matrix multiplication using the first feature matrix X_A and the second error fragment held by the second party to obtain the first fragment of the second part of the first gradient, and receiving the first fragment of the second part of the second gradient from the second party; updating the first-parameter first fragment according to the first part of the first gradient and the first fragment of the second part of the first gradient; and updating the second-parameter first fragment according to the first fragment of the second part of the second gradient.

According to a third aspect, there is provided an apparatus for two parties to jointly train a business prediction model while protecting data privacy. The two parties include a first party and a second party. The first party stores a first feature matrix X_A formed by the first feature parts of a plurality of business objects; the second party stores a second feature matrix X_B formed by the second feature parts of the plurality of business objects, and a label vector Y formed by label values. The apparatus is deployed on the second party and includes an iteration unit for performing multiple iterations of model parameter updates, which further includes: a product fragment determining unit configured to calculate a second product fragment, based on the locally maintained first-parameter second fragment and second-parameter second fragment, through local matrix multiplication and secure matrix multiplication with the first party, where the first-parameter second fragment is the second fragment of the first parameter part W_A used to process the first feature part, and the second-parameter second fragment is the second fragment of the second parameter part W_B used to process the second feature part; an error fragment determining unit configured to secretly share the label vector Y to obtain a second label fragment, and to obtain a second error fragment by subtracting the second label fragment based on the second product fragment; a gradient fragment determining unit configured to locally calculate the product of the second error fragment and the second feature matrix X_B to obtain the first part of the second gradient, to perform secure matrix multiplication using the second feature matrix X_B and the first error fragment held by the first party to obtain the second fragment of the second part of the second gradient, and to receive the second fragment of the second part of the first gradient from the first party; and a parameter updating unit configured to update the second-parameter second fragment according to the first part of the second gradient and the second fragment of the second part of the second gradient, and to update the first-parameter second fragment according to the second fragment of the second part of the first gradient.

According to a fourth aspect, there is provided an apparatus for two parties to jointly train a business prediction model while protecting data privacy. The two parties include a first party and a second party. The first party stores a first feature matrix X_A formed by the first feature parts of a plurality of business objects; the second party stores a second feature matrix X_B formed by the second feature parts of the plurality of business objects, and a label vector Y formed by label values. The apparatus is deployed on the first party and includes an iteration unit for performing multiple iterations of model parameter updates, which further includes: a product fragment determining unit configured to calculate a first product fragment, based on the locally maintained first-parameter first fragment and second-parameter first fragment, through local matrix multiplication and secure matrix multiplication with the second party, where the first-parameter first fragment is the first fragment of the first parameter part W_A used to process the first feature part, and the second-parameter first fragment is the first fragment of the second parameter part W_B used to process the second feature part; an error fragment determining unit configured to receive from the second party a first label fragment obtained by secret sharing of the label vector Y, and to obtain a first error fragment by subtracting the first label fragment based on the first product fragment; a gradient fragment determining unit configured to locally calculate the product of the first error fragment and the first feature matrix X_A to obtain the first part of the first gradient, to perform secure matrix multiplication using the first feature matrix X_A and the second error fragment held by the second party to obtain the first fragment of the second part of the first gradient, and to receive the first fragment of the second part of the second gradient from the second party; and a parameter updating unit configured to update the first-parameter first fragment according to the first part of the first gradient and the first fragment of the second part of the first gradient, and to update the second-parameter first fragment according to the first fragment of the second part of the second gradient.

According to a fifth aspect, there is provided a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method of the first aspect or the second aspect.

According to a sixth aspect, there is provided a computing device, including a memory and a processor, where the memory stores executable code, and when the processor executes the executable code, the method of the first aspect or the second aspect is implemented.

According to the method and device provided in the embodiments of this specification, the two parties participating in the joint training each hold a part of the feature data. In the iterative process of joint training, the two parties not only do not exchange feature data in plaintext, but also split the model parameter parts into parameter fragments, and each party only maintains the iterative update of the parameter fragments it holds; the model parameters are not reconstructed until the iteration ends. During the iterations, each party only maintains parameter fragments and exchanges some fragment results, from which it is almost impossible to infer useful information about the private data. This greatly enhances the security of private data in the joint training process.

Description of the drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the following will briefly introduce the accompanying drawings that need to be used in the description of the embodiments. The drawings in the following description are only some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative work.

Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification;

Fig. 2 shows a schematic diagram of a process of joint training of a linear regression model by two parties according to an embodiment;

Fig. 3 shows part of the implementation process of the first sub-phase in an embodiment;

Fig. 4 shows a schematic diagram of a process of joint training of a logistic regression model by two parties according to another embodiment;

Fig. 5 shows a schematic block diagram of a joint training device deployed in a second party according to an embodiment;

Fig. 6 shows a schematic block diagram of a joint training device deployed in a first party according to an embodiment.

Detailed description

The following describes the solutions provided in this specification with reference to the accompanying drawings.

As mentioned above, the training process of a typical machine learning model includes a process of obtaining a prediction result from calculations between feature data and model parameter data, determining the gradient according to the prediction result, and then adjusting the model parameters according to the gradient.

Specifically, assume that the training data set used to train the machine learning model has n samples, the sample feature of each sample is denoted x (x may be a vector), and the label is denoted y. The training data set can then be expressed as:

$$(X, Y) = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$$

Through a calculation on the sample feature x of each sample and the model parameter w, the predicted value $\hat{y}$ of the sample can be obtained.

If the machine learning model is a linear regression model, the predicted value can be expressed as:

$$\hat{y} = xw$$

If the machine learning model is a logistic regression model, the predicted value can be expressed as:

$$\hat{y} = \frac{1}{1 + e^{-xw}}$$

In the case of using maximum likelihood probability and stochastic gradient descent, the obtained gradient can be expressed as:

$$G = (\hat{y} - y)^T x$$

where $\hat{y}$ is the predicted value, y is the label value, the superscript T denotes transposition, and x is the feature. The parameter w can then be updated according to the gradient to achieve model training.

As can be seen from the above process, the training process includes several core operations: calculating the product xw of the sample feature x and the model parameter w, which is used to determine the predicted value $\hat{y}$; obtaining the prediction error E through $E = \hat{y} - y$; and then obtaining the gradient from the product of the prediction error E and x.
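As a point of reference, the single-party (plaintext) form of these core operations can be sketched in Python with NumPy. This is an illustrative sketch rather than part of the specification: the function names, the step size `lr`, the number of steps, and the synthetic data are all assumptions.

```python
import numpy as np

def predict(x, w, model="linear"):
    """Predicted values for features x (n x d) and parameters w (d,)."""
    z = x @ w                              # core operation: the product xw
    if model == "linear":
        return z                           # linear regression
    return 1.0 / (1.0 + np.exp(-z))        # logistic regression (sigmoid of xw)

def train_step(x, y, w, lr, model="linear"):
    """One plaintext gradient-descent step; returns the updated parameters."""
    e = predict(x, w, model) - y           # core operation: prediction error E
    g = e @ x                              # core operation: gradient E^T x
    return w - lr * g                      # parameter update

# Synthetic, noise-free linear data (assumption, for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = x @ w_true

w = np.zeros(3)
for _ in range(500):
    w = train_step(x, y, w, lr=0.002)
```

In joint training, each of the three marked operations has to be carried out without either party seeing the other party's features, labels, or full parameters.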

In the case of a single party training a model independently, the above calculations can be performed easily. However, when multiple parties jointly train a machine learning model, the features of the same sample may be distributed among different participants, and each participant maintains part of the model parameters. How to perform the above operations without revealing the plaintext data of any party is the core challenge in protecting data privacy during joint training.

In response to the above problems, the inventors propose that, in a scenario where two parties jointly train a machine learning model, the model parameters of each party are split into secure parameter fragments. With the help of secret sharing and secure matrix multiplication, each of the above operations is correspondingly decomposed into secure, secret fragment operations. The two parties exchange and jointly compute the results of these fragment operations to carry out the above operations, thereby achieving secure collaborative training.

Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. As shown in Fig. 1, the joint training scenario involves participant A and participant B, also called the first party and the second party. Each participant can be implemented as any device, platform, server, or device cluster with computing and processing capabilities. The two parties are to jointly train a business prediction model while protecting data privacy.

The first party A stores part of the features of the n business objects in the training sample set, called the first feature part. Assuming that the first feature part of each business object is a d1-dimensional vector, the first feature parts of the n business objects constitute an n×d1 first feature matrix X_A. The second party B stores the second feature parts of the n business objects. Assuming that the second feature part of each business object is a d2-dimensional vector, the second feature parts of the n business objects constitute an n×d2 second feature matrix X_B. It is assumed that the label values of the n business objects are also stored by the second party, and the n label values constitute a label vector Y.

In an exemplary scenario, the first party A and the second party B are an electronic payment platform and a banking institution, respectively, and the two parties need to jointly train a business prediction model to evaluate users' credit ratings. In this case, the business object is the user. Each party maintains part of the users' feature data. For example, the electronic payment platform maintains features related to users' electronic payments and transfers, which constitute the first feature matrix; the banking institution maintains features related to users' credit records, which constitute the second feature matrix. In addition, the banking institution holds the labels Y for users' credit ratings.

In another example, the first party A and the second party B are an e-commerce platform and an electronic payment platform, respectively, and the two parties need to jointly train a business prediction model to assess merchants' fraud risk. In this case, the business object is the merchant. Each party maintains part of the merchants' feature data. For example, the e-commerce platform stores the sales data of sample merchants as one part of the sample features, which constitutes the first feature matrix; the electronic payment platform maintains the merchants' transaction flow data as another part of the sample features, which constitutes the second feature matrix. The electronic payment platform also maintains the labels of the sample merchants (whether or not each is a fraudulent merchant), which constitute the label vector Y.

In other scenario examples, the business object may also be other objects to be evaluated, such as commodities, interaction events (for example, transaction events, login events, click events, purchase events), and so on. Correspondingly, the participating parties may be different business parties that maintain different characteristic parts of the above-mentioned business objects. The business prediction model may be a model that performs classification prediction or regression prediction for the corresponding business object.

It should be understood that the business object features maintained by each party are private data and must not be exchanged in plaintext during joint training, so as to protect the security of private data. Ultimately, the first party A wants to obtain through training the model parameter part used to process the first feature part, called the first parameter part W_A; the second party B wants to obtain through training the second parameter part W_B used to process the second feature part. These two parameter parts together constitute the business prediction model.

In order to perform joint training of the model without leaking private data, according to the embodiments of this specification, as shown in Fig. 1, the first party A and the second party B secretly share the initialized first parameter part W_A and second parameter part W_B to be trained, splitting them into parameter fragments. The first party thus obtains the first-parameter first fragment <W_A>₁ and the second-parameter first fragment <W_B>₁, and the second party obtains the first-parameter second fragment <W_A>₂ and the second-parameter second fragment <W_B>₂.

In the iterative training process of the model, the two parties obtain, through secure matrix multiplication, the encrypted fragments Z₁ and Z₂ of the product of the total feature matrix X and the total parameter matrix W. The second party, which holds the labels, secretly shares the label vector Y, so that the two parties obtain label fragments Y₁ and Y₂ respectively. Each party then calculates its error fragment, E₁ or E₂, from the product fragment and the label fragment that it holds. Further, based on the error fragments and their respective feature matrices, the two parties obtain the corresponding gradient fragments G₁ and G₂ through secret sharing and secure matrix multiplication. The first party then uses its gradient fragment G₁ to update the parameter fragments <W_A>₁ and <W_B>₁ that it maintains, and the second party uses its gradient fragment G₂ to update the parameter fragments <W_A>₂ and <W_B>₂ that it maintains.

When the entire iterative process ends, the two parties exchange their parameter fragments and perform parameter reconstruction. The first party reconstructs the trained first parameter part W_A from the first-parameter first fragment <W_A>₁ that it maintains and the first-parameter second fragment <W_A>₂ sent by the second party; the second party reconstructs the trained second parameter part W_B from the second-parameter second fragment <W_B>₂ that it maintains and the second-parameter first fragment <W_B>₁ sent by the first party.

During the entire training process, not only did the two parties not exchange the feature data in plaintext, but the model parameters were also split into parameter shards, and each only maintained the iterative update of the sharding parameters. The model parameters would not be reconstructed until the end of the iteration. In this way, the security of private data in the joint training process is greatly enhanced.
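The error, gradient, update, and reconstruction steps outlined above can be checked numerically with a small simulation for the linear regression case. The sketch below is illustrative only: both parties are simulated in one process, masks are real-valued rather than elements of a finite ring, the product fragments are produced by a placeholder, and `secure_matmul_stub` is a hypothetical stand-in that multiplies in the clear and only mimics the output format of a secure matrix multiplication. All names, dimensions, and the step size are assumptions.

```python
import numpy as np

def split(value, rng):
    """Split value into two additive fragments (real-valued masks; assumption)."""
    mask = rng.normal(size=np.shape(value))
    return mask, value - mask

def secure_matmul_stub(left, right, rng):
    """Stand-in for secure matrix multiplication. NOT secure: it multiplies
    in the clear and returns two additive fragments of left @ right."""
    return split(left @ right, rng)

rng = np.random.default_rng(1)
n, d1, d2, step = 6, 3, 2, 0.05
x_a, x_b = rng.normal(size=(n, d1)), rng.normal(size=(n, d2))  # held by A, B
y = rng.normal(size=n)                                         # held by B
w_a, w_b = rng.normal(size=d1), rng.normal(size=d2)
wa_1, wa_2 = split(w_a, rng)   # A keeps wa_1, B keeps wa_2
wb_1, wb_2 = split(w_b, rng)   # A keeps wb_1, B keeps wb_2

# Product fragments Z1, Z2 (placeholder for the secure product computation).
z_1, z_2 = split(x_a @ w_a + x_b @ w_b, rng)

# B secretly shares the labels and sends Y1 to A.
y_1, y_2 = split(y, rng)

# Error fragments, each computed locally.
e_1, e_2 = z_1 - y_1, z_2 - y_2

# Gradient for W_A: E1^T X_A is local to A; E2^T X_A needs both parties.
ga_cross_1, ga_cross_2 = secure_matmul_stub(e_2, x_a, rng)
# Gradient for W_B: E2^T X_B is local to B; E1^T X_B needs both parties.
gb_cross_1, gb_cross_2 = secure_matmul_stub(e_1, x_b, rng)

# Each party updates only the fragments it maintains.
wa_1 = wa_1 - step * (e_1 @ x_a + ga_cross_1)   # party A
wb_1 = wb_1 - step * gb_cross_1                 # party A
wb_2 = wb_2 - step * (e_2 @ x_b + gb_cross_2)   # party B
wa_2 = wa_2 - step * ga_cross_2                 # party B

# Reconstruction after the final iteration.
w_a_new, w_b_new = wa_1 + wa_2, wb_1 + wb_2
```

The reconstructed parameters coincide with one ordinary gradient-descent step on the plaintext data, while within the simulated protocol each party only ever handles its own features and fragments.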

The following describes the specific process of the two parties jointly conducting model training.

Fig. 2 shows a schematic diagram of a process of joint training of a linear regression model by two parties according to an embodiment. The data holding status of the first party A and the second party B in the scenario of FIG. 2 is the same as that of FIG. 1, and will not be repeated here. In the scenario in Figure 2, the two parties jointly train a linear regression model as a business prediction model.

First, in the model initialization stage, the first party A and the second party B initialize the model parameters and secretly share them, so that each party maintains parameter fragments.

Specifically, in step S11, the first party A initializes the first parameter part W_A used to process the first feature part. The first parameter part W_A may be initialized by random generation. Then, in S12, the first party A secretly shares the first parameter part, that is, splits it into the first-parameter first fragment <W_A>₁ and the first-parameter second fragment <W_A>₂, keeps the first-parameter first fragment <W_A>₁, and sends the first-parameter second fragment <W_A>₂ to the second party B. It can be understood that the sum of the two parameter fragments is the first parameter part, namely: W_A = <W_A>₁ + <W_A>₂.

Correspondingly, in step S13, the second party B initializes the second parameter part W_B used to process the second feature part. The second parameter part W_B may be initialized by random generation. Then, in S14, the second party B secretly shares the second parameter part, splitting it into the second-parameter first fragment <W_B>₁ and the second-parameter second fragment <W_B>₂, keeps the second-parameter second fragment <W_B>₂, and sends the second-parameter first fragment <W_B>₁ to the first party A. Correspondingly, the sum of these two parameter fragments is the second parameter part, namely: W_B = <W_B>₁ + <W_B>₂.

It should be understood that steps S11-S12 and steps S13-S14 can be executed in parallel or in any order, which is not limited here.

After the above initialization and secret sharing, the first party A maintains the first parameter first fragment <W_A>₁ and the second parameter first fragment <W_B>₁, and the second party B maintains the first parameter second fragment <W_A>₂ and the second parameter second fragment <W_B>₂.
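The initialization and secret sharing of steps S11 to S14 can be sketched as follows. This is only an illustration: the helper names `share` and `add`, the feature counts, and the use of plain integers with a bounded random mask are assumptions made for readability. The text does not fix a sharing scheme, and a practical system would typically share over a finite ring or field.

```python
import random

def share(matrix, bound=1000):
    """Additively split an integer matrix into two fragments whose sum is the matrix.
    Illustrative only: a bounded integer mask is NOT a secure sharing scheme."""
    first = [[random.randint(-bound, bound) for _ in row] for row in matrix]
    second = [[v - m for v, m in zip(row, mrow)] for row, mrow in zip(matrix, first)]
    return first, second

def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

random.seed(0)
# Hypothetical sizes: party A holds 2 features, party B holds 3 features.
W_A = [[random.randint(-5, 5)] for _ in range(2)]   # S11: A initializes W_A
W_B = [[random.randint(-5, 5)] for _ in range(3)]   # S13: B initializes W_B

WA1, WA2 = share(W_A)   # S12: A keeps <W_A>1 and sends <W_A>2 to B
WB1, WB2 = share(W_B)   # S14: B keeps <W_B>2 and sends <W_B>1 to A

party_A = {"WA1": WA1, "WB1": WB1}   # fragments maintained by A
party_B = {"WA2": WA2, "WB2": WB2}   # fragments maintained by B
```

Neither dictionary alone determines W_A or W_B; only the sum of matching fragments does.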

Next, enter the model iteration stage, which generally includes multiple iterations. In one embodiment, the number of iterations is a preset hyperparameter. In another embodiment, the number of iterations is not preset, but the iteration is stopped when a certain convergence condition is met. The above convergence conditions may be, for example, that the error is low enough, the gradient is small enough, and so on.

Each iteration can include four sub-phases: calculating the product fragments <Z>₁ and <Z>₂; calculating the error fragments <E>₁ and <E>₂; calculating the gradient G; and updating the parameters. The specific implementation of each sub-phase is described below.

In the first sub-phase, in step S21, the first party A and the second party B, through local matrix multiplication and secure matrix multiplication between the two parties, obtain the first product fragment <Z>₁ and the second product fragment <Z>₂ respectively, such that the sum of the two fragments equals the product of the total feature matrix X and the total parameter W, that is, the sum of the first product of the first feature matrix X_A and the first parameter part W_A and the second product of the second feature matrix X_B and the second parameter part W_B.

Figure 3 shows a part of the implementation process of the first sub-phase in one embodiment.

Specifically, in step S211, the first party A locally calculates the product of the first feature matrix X_A and the first parameter first fragment <W_A>₁ to obtain the first feature first processing result <Z_A>₁, that is:

<Z_A>₁ = X_A · <W_A>₁

In step S212, the first party A uses the first feature matrix X_A that it holds to perform secure matrix multiplication with the first parameter second fragment <W_A>₂ held by the second party B. The secure matrix multiplication can be implemented by homomorphic encryption, secret sharing, or other secure computation methods, which is not limited here. The product of the first feature matrix X_A and the first parameter second fragment <W_A>₂ is denoted as the first feature second processing result <Z_A>₂, namely:

<Z_A>₂ = X_A · <W_A>₂

In the context of this article, the result of processing with local parameters is referred to as the first processing result, and the result of processing with the other party's parameters through secure matrix multiplication is referred to as the second processing result.

Through the secure matrix multiplication in step S212, the first party A obtains the first fragment <<Z_A>₂>₁ of the first feature second processing result <Z_A>₂, and the second party B obtains the second fragment <<Z_A>₂>₂ of that result; the sum of the two fragments is the first feature second processing result.

In step S213, the second party B locally calculates the product of the second feature matrix X_B and the second parameter second fragment <W_B>₂ to obtain the second feature first processing result <Z_B>₁, namely:

<Z_B>₁ = X_B · <W_B>₂

In step S214, the second party B uses the second feature matrix X_B that it holds to perform secure matrix multiplication with the second parameter first fragment <W_B>₁ held by the first party A, and the product is denoted as the second feature second processing result <Z_B>₂, namely:

<Z_B>₂ = X_B · <W_B>₁

Through the secure matrix multiplication in step S214, the first party A obtains the first fragment <<Z_B>₂>₁ of the second feature second processing result <Z_B>₂, and the second party B obtains the second fragment <<Z_B>₂>₂ of that result; the sum of the two fragments is the second feature second processing result.

It should be understood that the above steps S211-S214 can be performed in any order.

Then, in step S215, the first party A adds up the processing-result fragments that it has obtained, namely the first feature first processing result <Z_A>₁, the first fragment <<Z_A>₂>₁ of the first feature second processing result, and the first fragment <<Z_B>₂>₁ of the second feature second processing result, to obtain the first product fragment <Z>₁, namely:

<Z>₁ = <Z_A>₁ + <<Z_A>₂>₁ + <<Z_B>₂>₁

Correspondingly, in step S216, the second party B adds up the processing-result fragments that it has obtained, namely the second fragment <<Z_A>₂>₂ of the first feature second processing result, the second feature first processing result <Z_B>₁, and the second fragment <<Z_B>₂>₂ of the second feature second processing result, to obtain the second product fragment <Z>₂, namely:

<Z>₂ = <Z_B>₁ + <<Z_A>₂>₂ + <<Z_B>₂>₂

It can be verified that the sum of the first product fragment <Z>₁ and the second product fragment <Z>₂ is the product of the total feature matrix X and the total parameter W, that is, the sum of the first product of the first feature matrix X_A and the first parameter part W_A and the second product of the second feature matrix X_B and the second parameter part W_B:

<Z>₁ + <Z>₂ = <Z_A>₁ + <<Z_A>₂>₁ + <<Z_B>₂>₁ + <Z_B>₁ + <<Z_A>₂>₂ + <<Z_B>₂>₂

= <Z_A>₁ + (<<Z_A>₂>₁ + <<Z_A>₂>₂) + <Z_B>₁ + (<<Z_B>₂>₁ + <<Z_B>₂>₂)

= X_A · <W_A>₁ + X_A · <W_A>₂ + X_B · <W_B>₂ + X_B · <W_B>₁

= X_A · W_A + X_B · W_B

At this point, through joint secure computation in the first sub-phase of the iteration, the first party A and the second party B have obtained the first product fragment <Z>₁ and the second product fragment <Z>₂, respectively.
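Steps S211 to S216 can be sketched as follows. The function `secure_matmul` is a hypothetical stand-in: it computes the product in the clear and then splits it additively, which mimics only the input/output behavior of a secure matrix multiplication and provides no privacy. All names and the small integer data are assumptions for illustration.

```python
import random

def matmul(a, b):
    """Plain matrix product of a (n x d) and b (d x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(*ms):
    """Element-wise sum of several matrices of equal shape."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*ms)]

def share(m, bound=1000):
    """Additive split of an integer matrix into two fragments."""
    mask = [[random.randint(-bound, bound) for _ in row] for row in m]
    return mask, [[v - k for v, k in zip(row, krow)] for row, krow in zip(m, mask)]

def secure_matmul(left, right):
    """Hypothetical stand-in for secure matrix multiplication (NOT secure):
    returns one fragment of left*right for each party."""
    return share(matmul(left, right))

random.seed(1)
X_A = [[1, 2], [3, 4], [5, 6]]               # 3 samples, 2 features at A
X_B = [[1, 0, 2], [0, 1, 1], [2, 2, 0]]      # 3 samples, 3 features at B
W_A, W_B = [[2], [-1]], [[1], [3], [-2]]
WA1, WA2 = share(W_A)                        # A holds WA1, B holds WA2
WB1, WB2 = share(W_B)                        # A holds WB1, B holds WB2

ZA1 = matmul(X_A, WA1)                       # S211: local at A
ZA2_1, ZA2_2 = secure_matmul(X_A, WA2)       # S212: fragments of <Z_A>2
ZB1 = matmul(X_B, WB2)                       # S213: local at B
ZB2_1, ZB2_2 = secure_matmul(X_B, WB1)       # S214: fragments of <Z_B>2
Z1 = add(ZA1, ZA2_1, ZB2_1)                  # S215: first product fragment
Z2 = add(ZB1, ZA2_2, ZB2_2)                  # S216: second product fragment
```

Adding `Z1` and `Z2` recovers X_A · W_A + X_B · W_B, matching the verification chain in the text.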

Next, the second sub-phase is entered, in which the error fragments <E>₁ and <E>₂ are calculated.

In step S31 of the second sub-phase, the second party B secret-shares the label vector Y that it holds, that is, splits it into the first label fragment <Y>₁ and the second label fragment <Y>₂, keeps <Y>₂, and sends <Y>₁ to the first party A. The sum of the two label fragments is the label vector, namely:

Y = <Y>₁ + <Y>₂.

Then, in step S32, the second party B subtracts the second label fragment <Y>₂ on the basis of the second product fragment <Z>₂ to obtain the second error fragment <E>₂. In addition, in step S33, the first party A subtracts the first label fragment <Y>₁ on the basis of the first product fragment <Z>₁ to obtain the first error fragment <E>₁.

In the scenario of the linear regression model shown in Figure 2, the predicted value is

Ŷ = X*W

Therefore, the prediction error

E = Ŷ - Y

can be expressed as the difference between the product result X*W of the feature matrix and the model parameters and the label vector Y. The product result obtained so far corresponds to the first product fragment <Z>₁ and the second product fragment <Z>₂ held by the first party A and the second party B respectively, and the label vector Y corresponds to the first label fragment <Y>₁ and the second label fragment <Y>₂ held by the first party A and the second party B respectively. Thus, the second party B can subtract the second label fragment <Y>₂ from the second product fragment <Z>₂ and use the resulting second difference as the second error fragment <E>₂, and the first party A can subtract the first label fragment <Y>₁ from the first product fragment <Z>₁ and use the resulting first difference as the first error fragment <E>₁.

It can be verified that the sum of the first error fragment <E>₁ and the second error fragment <E>₂ is the difference between the product of the total feature matrix X and the total parameter W and the label vector Y:

<E>₁ + <E>₂ = <Z>₁ - <Y>₁ + <Z>₂ - <Y>₂

= (<Z>₁ + <Z>₂) - (<Y>₁ + <Y>₂)

= X*W - Y

At this point, through joint secure computation in the second sub-phase of the iteration, the first party A and the second party B have obtained the first error fragment <E>₁ and the second error fragment <E>₂, respectively.
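For the linear regression case, steps S31 to S33 reduce to purely local subtractions once the label vector has been shared. The sketch below assumes small hypothetical integer vectors for Z = X*W and Y, and an illustrative `share` helper that is not part of the text.

```python
import random

def share(v, bound=1000):
    """Additive split of an integer vector into two fragments (illustrative only)."""
    mask = [random.randint(-bound, bound) for _ in v]
    return mask, [x - m for x, m in zip(v, mask)]

random.seed(2)
Z = [7, -3, 12]           # hypothetical value of X*W, never held in the clear by either party
Y = [5, 0, 10]            # label vector held by B

Z1, Z2 = share(Z)         # stand-in for the outputs of the first sub-phase
Y1, Y2 = share(Y)         # S31: B keeps <Y>2 and sends <Y>1 to A

E2 = [z - y for z, y in zip(Z2, Y2)]   # S32: computed locally by B
E1 = [z - y for z, y in zip(Z1, Y1)]   # S33: computed locally by A

E = [a + b for a, b in zip(E1, E2)]    # only for checking; not computed in the protocol
```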

Next, the third sub-phase is entered, in which the gradient G is calculated. According to the aforementioned formula (1), the gradient calculation involves multiplying the error vector by the feature matrix. However, both the error vector and the feature matrix are still distributed between the first party A and the second party B. Therefore, the gradient must again be computed fragment by fragment to obtain each gradient fragment.

Specifically, in step S41, the first party A locally calculates the product of the transpose <E>₁^T of the first error fragment <E>₁ and the first feature matrix X_A to obtain the first part of the first gradient <G_A>₁, namely:

<G_A>₁ = <E>₁^T · X_A

In step S42, the first party A uses the first feature matrix X_A that it holds to perform secure matrix multiplication with the second error fragment <E>₂ held by the second party B. The secure matrix multiplication can be implemented by homomorphic encryption, secret sharing, or other secure computation methods. The product of the transpose <E>₂^T of the second error fragment <E>₂ and the first feature matrix X_A is denoted as the second part of the first gradient <G_A>₂, namely:

<G_A>₂ = <E>₂^T · X_A

Through the secure matrix multiplication in step S42, the first party A obtains the first fragment <<G_A>₂>₁ of the second part of the first gradient <G_A>₂, and the second party B obtains the second fragment <<G_A>₂>₂; the sum of the two fragments is the second part of the first gradient.

In step S43, the second party B locally calculates the product of the transpose <E>₂^T of the second error fragment <E>₂ and the second feature matrix X_B to obtain the first part of the second gradient <G_B>₁, namely:

<G_B>₁ = <E>₂^T · X_B

In step S44, the second party B uses the second feature matrix X_B that it holds to perform secure matrix multiplication with the first error fragment <E>₁ held by the first party A. The secure matrix multiplication can be implemented by homomorphic encryption, secret sharing, or other secure computation methods. The product of the transpose <E>₁^T of the first error fragment <E>₁ and the second feature matrix X_B is denoted as the second part of the second gradient <G_B>₂, namely:

<G_B>₂ = <E>₁^T · X_B

Through the secure matrix multiplication in step S44, the second party B obtains the second fragment <<G_B>₂>₂ of the second part of the second gradient <G_B>₂, and the first party A obtains the first fragment <<G_B>₂>₁; the sum of the two fragments is the second part of the second gradient.

It should be understood that the above steps S41-S44 can be performed in any order.
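Steps S41 to S44 can be sketched as follows. As before, `secure_vec_mat` is a hypothetical stand-in that computes the product in the clear and splits it, so it shows only which party ends up with which fragment and offers no privacy. The data and helper names are assumptions.

```python
import random

def vec_mat(e, X):
    """Compute e^T * X for a length-n vector e and an n x d matrix X."""
    return [sum(ei * row[j] for ei, row in zip(e, X)) for j in range(len(X[0]))]

def vadd(*vs):
    """Element-wise sum of several vectors."""
    return [sum(t) for t in zip(*vs)]

def share(v, bound=1000):
    """Additive split of an integer vector into two fragments."""
    mask = [random.randint(-bound, bound) for _ in v]
    return mask, [x - m for x, m in zip(v, mask)]

def secure_vec_mat(e, X):
    """Hypothetical stand-in for secure matrix multiplication (NOT secure)."""
    return share(vec_mat(e, X))

random.seed(3)
X_A = [[1, 2], [3, 4], [5, 6]]
X_B = [[1, 0, 2], [0, 1, 1], [2, 2, 0]]
E = [2, -3, 2]                 # full error vector, used here only for checking
E1, E2 = share(E)              # A holds <E>1, B holds <E>2

GA1 = vec_mat(E1, X_A)                     # S41: local at A
GA2_1, GA2_2 = secure_vec_mat(E2, X_A)     # S42: A gets GA2_1, B gets GA2_2
GB1 = vec_mat(E2, X_B)                     # S43: local at B
GB2_1, GB2_2 = secure_vec_mat(E1, X_B)     # S44: A gets GB2_1, B gets GB2_2
```

The three fragments for each feature block sum to E^T · X_A and E^T · X_B respectively, although no single party ever holds those sums.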

At this point, the gradient fragments have been calculated. Next, the fourth sub-phase of the iteration, parameter update, is entered. In this sub-phase, each party updates the parameter fragments that it maintains according to the gradient fragments that it has obtained. The parameter update sub-phase includes the following steps.

In step S51, the first party A updates the first parameter first fragment <W_A>₁ according to the first part of the first gradient <G_A>₁ calculated in step S41 and the first fragment <<G_A>₂>₁ of the second part of the first gradient obtained in step S42.

Specifically, the sum of the first part of the first gradient <G_A>₁ and the first fragment <<G_A>₂>₁ of the second part of the first gradient is multiplied by the preset step size α to give the adjustment amount, and the first parameter first fragment <W_A>₁ is updated by subtracting the adjustment amount, which can be expressed as:

<W_A>₁ ← <W_A>₁ - α(<G_A>₁ + <<G_A>₂>₁)

In step S52, the first party A updates the second parameter first fragment <W_B>₁ according to the first fragment <<G_B>₂>₁ of the second part of the second gradient obtained in step S44, which can be expressed as:

<W_B>₁ ← <W_B>₁ - α<<G_B>₂>₁

In step S53, the second party B updates the second parameter second fragment <W_B>₂ according to the first part of the second gradient <G_B>₁ calculated in step S43 and the second fragment <<G_B>₂>₂ of the second part of the second gradient obtained in step S44.

Specifically, the sum of the first part of the second gradient <G_B>₁ and the second fragment <<G_B>₂>₂ of the second part of the second gradient is multiplied by the preset step size α to give the adjustment amount, and the second parameter second fragment <W_B>₂ is updated by subtracting the adjustment amount, which can be expressed as:

<W_B>₂ ← <W_B>₂ - α(<G_B>₁ + <<G_B>₂>₂)

In step S54, the second party B updates the first parameter second fragment <W_A>₂ according to the second fragment <<G_A>₂>₂ of the second part of the first gradient obtained in step S42, which can be expressed as:

<W_A>₂ ← <W_A>₂ - α<<G_A>₂>₂

It can be understood that the above steps S51-S54 can be executed in any order, or executed in parallel.

It can be seen that the update of the first parameter part W_A is completed jointly by both parties: the first party A updates the first parameter first fragment <W_A>₁, and the second party B updates the first parameter second fragment <W_A>₂. The sum of the gradient terms applied by the two parties is:

<G_A>₁ + <<G_A>₂>₁ + <<G_A>₂>₂ = <G_A>₁ + <G_A>₂

= <E>₁^T · X_A + <E>₂^T · X_A

= E^T · X_A

That is, the product of the transpose of the error vector and the first feature matrix X_A.

The update of the second parameter part W_B is likewise completed jointly by both parties: the second party B updates the second parameter second fragment <W_B>₂, and the first party A updates the second parameter first fragment <W_B>₁. The sum of the gradient terms applied by the two parties is:

<G_B>₁ + <<G_B>₂>₂ + <<G_B>₂>₁ = <G_B>₁ + <G_B>₂

= <E>₂^T · X_B + <E>₁^T · X_B

= E^T · X_B

That is, the product of the transpose of the error vector and the second feature matrix X_B.
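The effect of steps S51 and S54 on the first parameter part can be checked with a small worked example. All fragment values below are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is used only so that the check is exact; the text does not prescribe a number representation.

```python
from fractions import Fraction

alpha = Fraction(1, 10)                 # preset step size (hypothetical value)

# Hypothetical parameter fragments: W_A = WA1 + WA2 = [2, -1]
WA1, WA2 = [4, -7], [-2, 6]

# Hypothetical gradient pieces for the first feature block
GA1 = [3, 5]                            # <G_A>1, computed locally by A in S41
GA2_1, GA2_2 = [10, -4], [-8, 1]        # fragments of <G_A>2 = [2, -3] from S42

# S51: A updates its fragment with its local part plus its fragment of <G_A>2
WA1_new = [w - alpha * (g + h) for w, g, h in zip(WA1, GA1, GA2_1)]
# S54: B updates its fragment with its fragment of <G_A>2 only
WA2_new = [w - alpha * g for w, g in zip(WA2, GA2_2)]

# For checking only: the reconstructed parameter equals W_A - alpha * (G_A1 + G_A2)
W_A_new = [a + b for a, b in zip(WA1_new, WA2_new)]
```

Here the full gradient is [3, 5] + [2, -3] = [5, 2], so the reconstructed parameter is [2, -1] - 0.1 · [5, 2].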

After each round of iteration, the two parties do not need to exchange the updated parameter fragments. Instead, they proceed directly to the next iteration, that is, return to step S21 and execute the first sub-phase again based on the updated parameter fragments. In this way, during the iterative process, neither party holds the complete model parameters or exchanges plaintext information of the feature matrices, which strongly protects the security of the private data.

When the entire iteration process ends, for example when the preset number of iterations or the predetermined convergence condition is reached, the model reconstruction phase is entered.

In the model reconstruction phase, the first party A sends the second parameter first fragment <W_B>₁ that it has iteratively maintained to the second party B, and the second party B sends the first parameter second fragment <W_A>₂ that it has iteratively maintained to the first party A.

The first party A reconstructs the trained first parameter part W_A based on the first parameter first fragment <W_A>₁ maintained by itself and the first parameter second fragment <W_A>₂ sent by the second party B.

The second party B reconstructs the trained second parameter part W_B based on the second parameter second fragment <W_B>₂ maintained by itself and the second parameter first fragment <W_B>₁ sent by the first party A.

Thus, the first party A and the second party B have jointly completed the training of the linear regression model, and respectively obtain the model parameter parts W_A and W_B used to process their corresponding feature parts.
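The whole linear regression procedure, from initialization through reconstruction, can be simulated in one process to check that it reproduces ordinary gradient descent. This is a sketch under stated assumptions: `secure` is a hypothetical stand-in that computes in the clear and then splits the result (no privacy), parameters start from a shared zero vector rather than random values, and exact rationals are used so that the comparison is exact.

```python
import random
from fractions import Fraction

def matvec(X, w):                       # X (n x d) times w (length d)
    return [sum(x * wi for x, wi in zip(row, w)) for row in X]

def vecmat(e, X):                       # e^T (length n) times X (n x d)
    return [sum(ei * row[j] for ei, row in zip(e, X)) for j in range(len(X[0]))]

def vadd(*vs):
    return [sum(t) for t in zip(*vs)]

def share(v):                           # additive split into two fragments
    mask = [Fraction(random.randint(-1000, 1000)) for _ in v]
    return mask, [x - m for x, m in zip(v, mask)]

def secure(op, a, b):                   # hypothetical stand-in, NOT secure
    return share(op(a, b))

def train(X_A, X_B, Y, alpha, rounds):
    WA1, WA2 = share([Fraction(0)] * len(X_A[0]))      # S11-S12
    WB1, WB2 = share([Fraction(0)] * len(X_B[0]))      # S13-S14
    for _ in range(rounds):
        ZA2_1, ZA2_2 = secure(matvec, X_A, WA2)        # S212
        ZB2_1, ZB2_2 = secure(matvec, X_B, WB1)        # S214
        Z1 = vadd(matvec(X_A, WA1), ZA2_1, ZB2_1)      # S211, S215
        Z2 = vadd(matvec(X_B, WB2), ZA2_2, ZB2_2)      # S213, S216
        Y1, Y2 = share(Y)                              # S31
        E1 = [z - y for z, y in zip(Z1, Y1)]           # S33
        E2 = [z - y for z, y in zip(Z2, Y2)]           # S32
        GA1, GB1 = vecmat(E1, X_A), vecmat(E2, X_B)    # S41, S43
        GA2_1, GA2_2 = secure(vecmat, E2, X_A)         # S42
        GB2_1, GB2_2 = secure(vecmat, E1, X_B)         # S44
        WA1 = [w - alpha * (g + h) for w, g, h in zip(WA1, GA1, GA2_1)]  # S51
        WB1 = [w - alpha * g for w, g in zip(WB1, GB2_1)]                # S52
        WB2 = [w - alpha * (g + h) for w, g, h in zip(WB2, GB1, GB2_2)]  # S53
        WA2 = [w - alpha * g for w, g in zip(WA2, GA2_2)]                # S54
    return vadd(WA1, WA2), vadd(WB1, WB2)              # model reconstruction

def plain(X, Y, alpha, rounds):
    """Reference: ordinary gradient descent on the concatenated features."""
    w = [Fraction(0)] * len(X[0])
    for _ in range(rounds):
        e = [z - y for z, y in zip(matvec(X, w), Y)]
        w = [wi - alpha * g for wi, g in zip(w, vecmat(e, X))]
    return w

random.seed(4)
X_A, X_B, Y = [[1, 2], [0, 1], [1, 0]], [[1], [2], [0]], [3, 1, 2]
W_A, W_B = train(X_A, X_B, Y, Fraction(1, 20), 5)
```

Concatenating `W_A` and `W_B` gives the same parameters as `plain` run on the row-wise concatenation of `X_A` and `X_B`, which is what the verification steps in the text imply.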

Looking back at the entire training process, the two parties not only do not exchange the plaintext of the feature data, but also split the model parameters into parameter fragments, with each party maintaining only the iterative update of its own fragments; the model parameters are not reconstructed until the iteration ends. During the iterative process, each party only maintains parameter fragments and exchanges some fragment results, and it is almost impossible to infer useful information about the private data from these fragment results. This greatly enhances the security of private data during joint training.

The joint training of the linear regression model in Figure 2 has been described in detail above. The scenario of the logistic regression model is described below. Those skilled in the art understand that when a logistic regression model is used as the business prediction model, the predicted value can be expressed as:

ŷ = 1/(1+e^(-wx))

It can be seen that the predicted value of the logistic regression model is based on the non-linear sigmoid function, and non-linear functions are not conducive to secure computation such as secret sharing.

Therefore, in the case of a logistic regression model, in order to facilitate linear computation, the sigmoid function can be expanded as a Taylor series. Specifically, the sigmoid function 1/(1+e^(-x)) has the following Taylor expansion:

1/(1+e^(-x)) = 1/2 + x/4 - x³/48 + x⁵/480 - ...

Correspondingly, the predicted value of logistic regression can be expanded into:

ŷ = 1/2 + wx/4 - (wx)³/48 + (wx)⁵/480 - ...

Substituting the above predicted value expansion into formula (1), the gradient form can be obtained. For example, under the first-order expansion, the gradient form is

G = (1/2 + X*W/4 - Y)^T · X

The gradient form of the third-order expansion is

G = (1/2 + X*W/4 - (X*W)³/48 - Y)^T · X

In this way, through Taylor expansion, the predicted value of logistic regression is converted into a form to which homomorphic encryption can be applied. Therefore, the process shown in Figure 2 can be slightly modified to make the training process suitable for the logistic regression model.

Fig. 4 shows a schematic diagram of a process of joint training of a logistic regression model by two parties according to another embodiment. The training process of Fig. 4 is basically the same as that of Fig. 2, except that in step S32 and step S33, when calculating the error fragments, the first part and the second part of the prediction result are obtained from the first product fragment <Z>₁ and the second product fragment <Z>₂ respectively, according to the Taylor expansion form of the sigmoid function, and the first label fragment <Y>₁ and the second label fragment <Y>₂ are then subtracted from them correspondingly to obtain the first error fragment <E>₁ and the second error fragment <E>₂.

In the case of the first-order Taylor expansion, according to formula (4), the prediction result can be expressed as 0.5+0.25(<Z>₁+<Z>₂). The prediction result can accordingly be split into the first part 0.25+0.25<Z>₁ and the second part 0.25+0.25<Z>₂, which gives the first error fragment <E>₁ = 0.25+0.25<Z>₁-<Y>₁ and the second error fragment <E>₂ = 0.25+0.25<Z>₂-<Y>₂. It can be understood that the constant 0.5 can be divided between the two parts in other ways, such as -0.1+0.6 or 0+0.5. In this way, the error fragments of the approximate error vector under logistic regression are obtained.
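The first-order split can be checked numerically. The fragment values below are hypothetical, and exact rationals are used only to make the check exact.

```python
from fractions import Fraction

quarter = Fraction(1, 4)

# Hypothetical fragments: Z = Z1 + Z2 = [2, -2, 4], Y = Y1 + Y2 = [1, 0, 1]
Z1, Z2 = [9, -14, 3], [-7, 12, 1]
Y1, Y2 = [5, -3, 8], [-4, 3, -7]

# Each party applies 0.25 + 0.25 * <Z>_i - <Y>_i to its own fragments only
E1 = [quarter + quarter * z - y for z, y in zip(Z1, Y1)]   # computed by A
E2 = [quarter + quarter * z - y for z, y in zip(Z2, Y2)]   # computed by B

# For checking only: the fragments sum to 0.5 + 0.25 * Z - Y
E = [a + b for a, b in zip(E1, E2)]
```

Any other split of the constant 0.5 between the two parties gives the same sum `E`.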

The other training steps are the same as in Figure 2.

In the case of a multi-order Taylor expansion, the higher-order results of wx must also be obtained, that is, fragments of the multi-order product result Z^k. Specifically, first, according to the multi-order Taylor expansion form of the sigmoid function, the first party A calculates powers of the first product fragment <Z>₁ to obtain the first fragment powers {<Z>₁^k | k>2, k∈N*} (where k is the order of the multi-order Taylor expansion), and the second party B calculates powers of the second product fragment <Z>₂ to obtain the second fragment powers {<Z>₂^k | k>2, k∈N*}. Then, the first party A uses the first product fragment <Z>₁ and a part of the first fragment powers {<Z>₁^(k-1) | k>2, k∈N*}, together with the second product fragment <Z>₂ and a part of the second fragment powers {<Z>₂^(k-1) | k>2, k∈N*} held by the second party B, to perform multiple secure matrix multiplications. As a result, the first party A obtains multiple first multi-order product fragments corresponding to the multiple matrix multiplication results, and the second party B obtains multiple second multi-order product fragments corresponding to the multiple matrix multiplication results. Finally, the first party A determines the first part of the prediction result according to the first product fragment <Z>₁, the first fragment powers {<Z>₁^k | k>2, k∈N*}, and the multiple first multi-order product fragments, and then subtracts the first label fragment <Y>₁ from it to obtain the first error fragment <E>₁; the second party B determines the second part of the prediction result according to the second product fragment <Z>₂, the second fragment powers {<Z>₂^k | k>2, k∈N*}, and the multiple second multi-order product fragments, and then subtracts the second label fragment <Y>₂ from it to obtain the second error fragment <E>₂.

Specifically, for example, when the third-order expansion is adopted, that is, k=3, according to the third-order Taylor expansion:

(<Z>₁+<Z>₂)³ = <Z>₁³ + 3<Z>₁²·<Z>₂ + 3<Z>₁·<Z>₂² + <Z>₂³,

After the first party A obtains the first product fragment <Z>₁, it also needs to calculate <Z>₁² and <Z>₁³ locally; after the second party B obtains the second product fragment <Z>₂, it also needs to calculate <Z>₂² and <Z>₂³ locally. Then, the first party A uses <Z>₁² and the second party B uses <Z>₂ to perform secure matrix multiplication, from which the two parties respectively obtain the multi-order product fragments <<Z>₁²·<Z>₂>₁ and <<Z>₁²·<Z>₂>₂; and the first party A uses <Z>₁ and the second party B uses <Z>₂² to perform secure matrix multiplication, from which the two parties respectively obtain the multi-order product fragments <<Z>₁·<Z>₂²>₁ and <<Z>₁·<Z>₂²>₂.

Further, the first party A can calculate <E>₁ by the following formula:

<E>₁ = 1/2 + <Z>₁/4 - (<Z>₁³ + 3<<Z>₁²·<Z>₂>₁ + 3<<Z>₁·<Z>₂²>₁)/48 - <Y>₁;

The second party B calculates <E>₂ by the following formula:

<E>₂ = <Z>₂/4 - (<Z>₂³ + 3<<Z>₁²·<Z>₂>₂ + 3<<Z>₁·<Z>₂²>₂)/48 - <Y>₂.

In this way, in the case of multi-order Taylor expansion, the first error fragment <E> ₁ and the second error fragment <E> ₂ can be calculated.
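The third-order case can be checked in the same way. In this sketch the product fragments are treated as vectors and the cross terms as element-wise products, which is an assumption about how the matrix multiplication applies per sample. `secure_elementwise` is a hypothetical stand-in that computes in the clear and then splits the result, so it is not secure. In line with the binomial expansion above, party B uses its own cube <Z>₂³.

```python
import random
from fractions import Fraction

def share(v, bound=1000):
    """Additive split of a vector into two fragments (illustrative only)."""
    mask = [Fraction(random.randint(-bound, bound)) for _ in v]
    return mask, [x - m for x, m in zip(v, mask)]

def secure_elementwise(a, b):
    """Hypothetical stand-in for the secure multiplication (NOT secure)."""
    return share([x * y for x, y in zip(a, b)])

random.seed(5)
# Hypothetical fragments: Z = Z1 + Z2 = [1, 1], Y = Y1 + Y2 = [1, 0]
Z1, Z2 = [Fraction(3), Fraction(-1)], [Fraction(-2), Fraction(2)]
Y1, Y2 = [Fraction(4), Fraction(-6)], [Fraction(-3), Fraction(6)]

Z1_sq = [z ** 2 for z in Z1]                      # local at A
Z2_sq = [z ** 2 for z in Z2]                      # local at B
c21_1, c21_2 = secure_elementwise(Z1_sq, Z2)      # fragments of Z1^2 * Z2
c12_1, c12_2 = secure_elementwise(Z1, Z2_sq)      # fragments of Z1 * Z2^2

half = Fraction(1, 2)
E1 = [half + z / 4 - (z ** 3 + 3 * a + 3 * b) / 48 - y
      for z, a, b, y in zip(Z1, c21_1, c12_1, Y1)]          # computed by A
E2 = [z / 4 - (z ** 3 + 3 * a + 3 * b) / 48 - y
      for z, a, b, y in zip(Z2, c21_2, c12_2, Y2)]          # computed by B

E = [a + b for a, b in zip(E1, E2)]               # for checking only
```

The sum `E` equals 1/2 + Z/4 - Z³/48 - Y evaluated on the full vectors.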

It can be understood that the higher the order of Taylor expansion, the more accurate the result, but the higher the computational complexity. In this way, for the business prediction model implemented by the logistic regression model, the two-party joint training to protect data privacy can be realized through the method described above.
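The accuracy trade-off can be seen by comparing the truncated expansions with the exact sigmoid at a few points. The function names and sample points below are illustrative choices.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def taylor1(x):
    """First-order expansion around 0."""
    return 0.5 + x / 4

def taylor3(x):
    """Third-order expansion around 0."""
    return 0.5 + x / 4 - x ** 3 / 48

# Absolute errors of both approximations at a few sample points
errors = {x: (abs(taylor1(x) - sigmoid(x)), abs(taylor3(x) - sigmoid(x)))
          for x in (0.5, 1.0, 2.0, 4.0)}

for x, (e1, e3) in errors.items():
    print(f"x={x}: first-order error={e1:.4f}, third-order error={e3:.4f}")
```

For the points near zero the third-order error is smaller than the first-order error. A truncated Taylor series is a local approximation, however: at x = 4 the third-order value is further from the sigmoid than the first-order one, so the gain from a higher order depends on the inputs staying in a moderate range.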

The above training method is also applicable to business prediction models implemented by neural networks. For a typical feedforward fully connected neural network, each neuron is connected to each neuron in the previous layer with different weights. Therefore, the output of each neuron in the previous layer can be regarded as feature data, and this feature data is distributed between the two parties; the connection weights can be regarded as the model parameter parts, which are used to process the corresponding feature data by linear combination. Therefore, the aforementioned training process can be applied to the parameter training of each neuron in the neural network, so as to realize secure joint training of the neural network model by the two parties.

In general, the training method described above can be used for various business prediction models based on linear combinations of feature data and model parameters. In this training method, maintaining the parameters in fragments strongly ensures that private data is not leaked or inferred, so that data security is ensured.

According to another embodiment, there is provided an apparatus for two parties to jointly train a business prediction model while protecting data privacy. The apparatus can be deployed in the aforementioned second party, which can be implemented as any device, platform, or device cluster with computing and processing capabilities. Fig. 5 shows a schematic block diagram of a joint training apparatus deployed in the second party according to an embodiment. As shown in Fig. 5, the apparatus 500 includes an iteration unit 510 for performing multiple iterations of model parameter update. The iteration unit 510 further includes:

The product fragment determination unit 511 is configured to calculate the second product fragment through local matrix multiplication and secure matrix multiplication with the first party, based on the locally maintained first parameter second fragment and second parameter second fragment; wherein the first parameter second fragment is the second fragment of the first parameter part W_A used to process the first feature part, and the second parameter second fragment is the second fragment of the second parameter part W_B used to process the second feature part.

The error fragment determination unit 512 is configured to secret-share the label vector Y to obtain a second label fragment, and to subtract the second label fragment on the basis of the second product fragment to obtain a second error fragment.

The gradient fragment determination unit 513 is configured to locally calculate the product of the second error fragment and the second feature matrix X_B to obtain the first part of the second gradient; to use the second feature matrix X_B to perform secure matrix multiplication with the first error fragment in the first party to obtain the second fragment of the second part of the second gradient; and to receive the second fragment of the second part of the first gradient from the first party.

The parameter update unit 514 is configured to update the second parameter second fragment according to the first part of the second gradient and the second fragment of the second part of the second gradient, and to update the first parameter second fragment according to the second fragment of the second part of the first gradient.

In one embodiment, the above apparatus 500 further includes an initialization unit 520 configured to: initialize the second parameter part W_B and split it through secret sharing into a second parameter first fragment and a second parameter second fragment; keep the second parameter second fragment and send the second parameter first fragment to the first party; and receive from the first party the first parameter second fragment obtained by secret-sharing the first parameter part W_A.

In an embodiment, the above apparatus 500 further includes a parameter reconstruction unit 530 configured to: send the first parameter second fragment updated in the last iteration to the first party, and receive the updated second parameter first fragment from the first party; and combine the second parameter second fragment updated in the last iteration with the received second parameter first fragment to obtain the trained second parameter part W_B of the business prediction model.

In different embodiments, the foregoing business objects include one of the following: users, merchants, commodities, and events; the business prediction model is used to predict the classification or regression value of the business objects.

In one embodiment, the business prediction model is a linear regression model; in this case, the error fragment determination unit 512 is configured to calculate the difference between the second product fragment and the second label fragment as the second error fragment.

In another embodiment, the business prediction model is a logistic regression model; in this case, the error fragment determination unit 512 is configured to obtain a second prediction result fragment from the second product fragment according to the Taylor expansion form of the sigmoid function, and to calculate the difference between the second prediction result fragment and the second label fragment as the second error fragment.

Further, in a specific embodiment, the product fragment determination unit 511 is further configured to calculate powers of the second product fragment to obtain second fragment powers, and to use the second product fragment and the second fragment powers to perform multiple secure matrix multiplications with the first product fragment and the first fragment powers in the first party, to obtain multiple second multi-order product fragments. Correspondingly, the error fragment determination unit 512 is configured to determine the second prediction result fragment from the second product fragment, the second fragment powers, and the multiple second multi-order product fragments according to the multi-order Taylor expansion form of the sigmoid function, and to calculate the difference between the second prediction result fragment and the second label fragment as the second error fragment.

In one embodiment, the above product fragment determination unit 511 is specifically configured to: use the first parameter second fragment to perform secure matrix multiplication with the first feature matrix X_A in the first party to obtain the second fragment of the first feature second processing result; locally calculate the product of the second feature matrix X_B and the second parameter second fragment to obtain the second feature first processing result; use the second feature matrix X_B to perform secure matrix multiplication with the second parameter first fragment in the first party to obtain the second fragment of the second feature second processing result; and add the second fragment of the first feature second processing result, the second feature first processing result, and the second fragment of the second feature second processing result to obtain the second product fragment.

In a specific embodiment, the above parameter update unit 514 is configured to multiply the sum of the first part of the second gradient and the second fragment of the second part of the second gradient by the preset step size to give the adjustment amount, and to update the second parameter second fragment by subtracting the adjustment amount.

According to another embodiment, there is provided an apparatus for two parties to jointly train a business prediction model. The apparatus can be deployed in the aforementioned first party, and the first party can be implemented as any device, platform, or device cluster with computing and processing capabilities. As mentioned above, the first party stores the first feature matrix X_A formed by the first feature parts of multiple business objects; the second party stores the second feature matrix X_B formed by the second feature parts of the multiple business objects, and the label vector Y composed of label values. Fig. 6 shows a schematic block diagram of a joint training apparatus deployed in the first party according to an embodiment. As shown in Fig. 6, the apparatus 600 includes an iteration unit 610 for performing multiple iterations of model parameter update. The iteration unit 610 further includes:

The product fragment determining unit 611 is configured to calculate a first product fragment, based on the locally maintained first fragment of the first parameter and first fragment of the second parameter, through local matrix multiplication and secure matrix multiplication with the second party; wherein the first fragment of the first parameter is the first fragment of the first parameter part W_A used to process the first feature part, and the first fragment of the second parameter is the first fragment of the second parameter part W_B used to process the second feature part.

The error fragment determining unit 612 is configured to receive, from the second party, a first label fragment obtained by secret sharing of the label vector Y, and to obtain a first error fragment by subtracting the first label fragment from a result based on the first product fragment.

The gradient fragment determining unit 613 is configured to locally calculate the product of the first error fragment and the first feature matrix X_A to obtain a first part of a first gradient; to use the first feature matrix X_A to perform secure matrix multiplication with the second error fragment in the second party, obtaining a first fragment of a second part of the first gradient; and to receive, from the second party, a first fragment of a second part of a second gradient.

The parameter updating unit 614 is configured to update the first fragment of the first parameter according to the first part of the first gradient and the first fragment of the second part of the first gradient, and to update the first fragment of the second parameter according to the first fragment of the second part of the second gradient.
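The way the gradient pieces fit together can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data values are hypothetical, the feature matrices are transposed so that dimensions match, and `insecure_secure_matmul` is a hypothetical stand-in that computes the product in the clear and splits it into two random additive fragments. It is not a secure protocol and only shows which party ends up holding which piece.

```python
import random

rng = random.Random(4)  # fixed seed so the toy run is repeatable

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def add(*Ms):
    """Elementwise sum of equally shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def insecure_secure_matmul(A, B):
    """Hypothetical stand-in for secure matrix multiplication: returns two
    additive fragments of A @ B. The product is computed in the clear,
    so this is NOT secure."""
    product = matmul(A, B)
    frag1 = [[rng.uniform(-10.0, 10.0) for _ in row] for row in product]
    frag2 = [[p - r for p, r in zip(prow, rrow)] for prow, rrow in zip(product, frag1)]
    return frag1, frag2

# Hypothetical toy data: 3 samples, 2 features at the first party, 1 at the second.
X_A = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]]
X_B = [[2.0], [1.0], [0.0]]
E1 = [[0.5], [-1.0], [0.25]]    # first error fragment (held by the first party)
E2 = [[-0.25], [0.5], [0.75]]   # second error fragment (held by the second party)

# First gradient (for W_A): a local part plus a secret-shared cross part.
gA_part1 = matmul(transpose(X_A), E1)  # computed locally by the first party
gA_part2_frag1, gA_part2_frag2 = insecure_secure_matmul(transpose(X_A), E2)

# Second gradient (for W_B): a local part plus a secret-shared cross part.
gB_part1 = matmul(transpose(X_B), E2)  # computed locally by the second party
gB_part2_frag1, gB_part2_frag2 = insecure_secure_matmul(transpose(X_B), E1)

# Check only (never done in a real run): the pieces sum to X^T (E1 + E2).
E = add(E1, E2)
grad_A = add(gA_part1, gA_part2_frag1, gA_part2_frag2)
grad_B = add(gB_part1, gB_part2_frag1, gB_part2_frag2)
plain_A = matmul(transpose(X_A), E)
plain_B = matmul(transpose(X_B), E)
```

In this sketch the first party ends up with `gA_part1`, `gA_part2_frag1` and `gB_part2_frag1`, while the second party holds `gB_part1`, `gB_part2_frag2` and `gA_part2_frag2`.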

In one embodiment, the above-mentioned apparatus 600 further includes an initialization unit 620 configured to: initialize the first parameter part W_A and split it, through secret sharing, into a first fragment of the first parameter and a second fragment of the first parameter; retain the first fragment of the first parameter and send the second fragment of the first parameter to the second party; and receive, from the second party, the first fragment of the second parameter obtained by secret sharing of the second parameter part W_B.
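As a minimal sketch of this initialization, additive secret sharing can be simulated as follows. The parameter values, the mask range and the use of floating-point numbers are assumptions made for readability; practical protocols usually share fixed-point values over a finite ring.

```python
import random

rng = random.Random(0)  # fixed seed so the toy run is repeatable

def share(values):
    """Split each value v into (r, v - r) with a random mask r,
    so that the two fragments sum back to v."""
    first = [rng.uniform(-10.0, 10.0) for _ in values]
    second = [v - r for v, r in zip(values, first)]
    return first, second

# First party: initialize W_A (hypothetical values) and share it.
W_A = [0.1, -0.2, 0.3]
W_A1, W_A2 = share(W_A)   # keep W_A1, send W_A2 to the second party

# Second party: initialize W_B (hypothetical values) and share it.
W_B = [0.5, -0.4]
W_B1, W_B2 = share(W_B)   # send W_B1 to the first party, keep W_B2

# State held by each party after initialization.
first_party_state = {"W_A1": W_A1, "W_B1": W_B1}
second_party_state = {"W_A2": W_A2, "W_B2": W_B2}
```

Neither party's state alone determines W_A or W_B, since each fragment is masked by a value known only to the party that generated it.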

In one embodiment, the above-mentioned apparatus 600 further includes a parameter reconstruction unit 630 configured to: send the first fragment of the second parameter updated in the last iteration to the second party, and receive the updated second fragment of the first parameter from the second party; and combine the first fragment of the first parameter updated in the last iteration with the received second fragment of the first parameter to obtain the first parameter part W_A after the business prediction model is trained.
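The reconstruction step amounts to exchanging the fragments that belong to the other party and adding. The sketch below assumes additive fragments and uses hypothetical final values.

```python
def reconstruct(first_fragment, second_fragment):
    """Recombine two additive fragments into the plaintext parameter."""
    return [a + b for a, b in zip(first_fragment, second_fragment)]

# Hypothetical fragments held after the last iteration.
W_A1, W_B1 = [1.5, -2.0], [0.75]   # held by the first party
W_A2, W_B2 = [-1.0, 2.25], [0.25]  # held by the second party

# The first party sends W_B1 and receives W_A2; the second party does the reverse.
W_A = reconstruct(W_A1, W_A2)  # trained W_A, known to the first party
W_B = reconstruct(W_B1, W_B2)  # trained W_B, known to the second party
```

After the exchange each party learns only the parameter part that processes its own features.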

In one embodiment, the business prediction model is a linear regression model; in this case, the error fragment determining unit 612 is configured to calculate the difference between the first product fragment and the first label fragment as the first error fragment.
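For the linear regression case, the following sketch shows why subtracting fragments locally is enough: the two error fragments sum to the plaintext error Z - Y. All values are hypothetical, and the fragments are assumed to be additive.

```python
import random

rng = random.Random(1)  # fixed seed so the toy run is repeatable

# Hypothetical product fragments already held by the two parties (Z = Z1 + Z2).
Z1 = [0.7, -1.2, 2.0]   # first party
Z2 = [0.3, 1.7, -0.5]   # second party
Y = [1.0, 0.0, 2.0]     # labels, known only to the second party

# The second party secret-shares Y and sends Y1 to the first party.
Y1 = [rng.uniform(-10.0, 10.0) for _ in Y]
Y2 = [y - r for y, r in zip(Y, Y1)]

# Each party subtracts its label fragment from its product fragment.
E1 = [z - y for z, y in zip(Z1, Y1)]  # first error fragment
E2 = [z - y for z, y in zip(Z2, Y2)]  # second error fragment

# Check only: the fragments sum to the plaintext error Z - Y.
total_error = [e1 + e2 for e1, e2 in zip(E1, E2)]
```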

In another embodiment, the business prediction model is a logistic regression model; in this case, the error fragment determining unit 612 is configured to obtain a first prediction result fragment from the first product fragment according to a Taylor expansion of the sigmoid function, and to calculate the difference between the first prediction result fragment and the first label fragment as the first error fragment.
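With a first-order Taylor expansion around zero, sigmoid(z) is approximated by 1/2 + z/4, which is linear in z and can therefore be evaluated on fragments without interaction. The sketch below assumes additive fragments and, as a further assumption, splits the constant 1/2 evenly between the parties; the values are hypothetical.

```python
import random

rng = random.Random(5)  # fixed seed so the toy run is repeatable

Z1 = [0.4, -0.9, 1.1]   # first product fragment (first party)
Z2 = [0.1, 0.6, -0.1]   # second product fragment (second party)
Y = [1.0, 0.0, 1.0]     # labels, known only to the second party

# Secret-share the labels.
Y1 = [rng.uniform(-10.0, 10.0) for _ in Y]
Y2 = [y - r for y, r in zip(Y, Y1)]

# First-order Taylor expansion: sigmoid(z) ~ 1/2 + z/4.
# Assumption: each party contributes half of the constant term.
P1 = [0.25 + z / 4 for z in Z1]  # first prediction result fragment
P2 = [0.25 + z / 4 for z in Z2]  # second prediction result fragment

E1 = [p - y for p, y in zip(P1, Y1)]  # first error fragment
E2 = [p - y for p, y in zip(P2, Y2)]  # second error fragment

# Check only: the fragments sum to (1/2 + Z/4) - Y.
total_error = [a + b for a, b in zip(E1, E2)]
```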

Further, in a specific embodiment, the product fragment determining unit 611 is further configured to: calculate multi-order powers of the first product fragment to obtain first-fragment multi-order powers; and use the first product fragment and the first-fragment multi-order powers to perform multiple secure matrix multiplication operations with the second product fragment and the second-fragment multi-order powers in the second party, obtaining multiple first multi-order product fragments. Correspondingly, the error fragment determining unit 612 is configured to determine the first prediction result fragment from the first product fragment, the first-fragment multi-order powers and the multiple first multi-order product fragments, according to a multi-order Taylor expansion of the sigmoid function.
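To make the multi-order case concrete, consider a third-order expansion, sigmoid(z) ~ 1/2 + z/4 - z^3/48. With z = z1 + z2, the cube expands to z1^3 + 3*z1^2*z2 + 3*z1*z2^2 + z2^3, so the two cross terms need a secure multiplication between the parties. The sketch below is one possible reading of the text, assuming elementwise products; `insecure_secure_mul` is a hypothetical stand-in that is not secure, and the values are made up.

```python
import random

rng = random.Random(2)  # fixed seed so the toy run is repeatable

def insecure_secure_mul(a, b):
    """Hypothetical stand-in for a secure elementwise product of a (first
    party) and b (second party), returned as two additive fragments.
    The product is computed in the clear, so this is NOT secure."""
    product = [x * y for x, y in zip(a, b)]
    frag1 = [rng.uniform(-10.0, 10.0) for _ in product]
    frag2 = [p - r for p, r in zip(product, frag1)]
    return frag1, frag2

Z1 = [0.4, -0.9]   # first product fragment (first party)
Z2 = [0.1, 0.6]    # second product fragment (second party)

# Each party computes the powers of its own fragment locally.
Z1_sq, Z1_cu = [z ** 2 for z in Z1], [z ** 3 for z in Z1]
Z2_sq, Z2_cu = [z ** 2 for z in Z2], [z ** 3 for z in Z2]

# Cross terms z1^2 * z2 and z1 * z2^2, each obtained as two fragments.
c12_1, c12_2 = insecure_secure_mul(Z1_sq, Z2)
c21_1, c21_2 = insecure_secure_mul(Z1, Z2_sq)

# Fragments of (z1 + z2)^3.
cube1 = [a + 3 * b + 3 * c for a, b, c in zip(Z1_cu, c12_1, c21_1)]
cube2 = [a + 3 * b + 3 * c for a, b, c in zip(Z2_cu, c12_2, c21_2)]

# Third-order Taylor expansion, constant 1/2 split evenly (an assumption).
P1 = [0.25 + z / 4 - c / 48 for z, c in zip(Z1, cube1)]
P2 = [0.25 + z / 4 - c / 48 for z, c in zip(Z2, cube2)]

# Check only: recombined prediction.
P = [p1 + p2 for p1, p2 in zip(P1, P2)]
```

The recombined `P` equals the third-order polynomial evaluated at Z1 + Z2, which stays close to the true sigmoid for inputs near zero.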

In one embodiment, the above-mentioned product fragment determining unit 611 is specifically configured to: use the first fragment of the second parameter to perform secure matrix multiplication with the second feature matrix X_B in the second party, obtaining the first fragment of the second-feature second processing result; locally calculate the product of the first feature matrix X_A and the first fragment of the first parameter, obtaining the first-feature first processing result; use the first feature matrix X_A to perform secure matrix multiplication with the second fragment of the first parameter in the second party, obtaining the first fragment of the first-feature second processing result; and add the first fragment of the second-feature second processing result, the first-feature first processing result, and the first fragment of the first-feature second processing result to obtain the first product fragment.
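The decomposition behind this step is X·W = X_A·W_A + X_B·W_B with W_A = W_A1 + W_A2 and W_B = W_B1 + W_B2: two of the four products are local and two require secure multiplication. The following sketch checks that bookkeeping on hypothetical data; `insecure_secure_matmul` is again a non-secure stand-in used only to produce additive fragments.

```python
import random

rng = random.Random(3)  # fixed seed so the toy run is repeatable

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(*Ms):
    """Elementwise sum of equally shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def insecure_secure_matmul(A, B):
    """Hypothetical stand-in for secure matrix multiplication: returns two
    additive fragments of A @ B. Computed in the clear, so NOT secure."""
    product = matmul(A, B)
    frag1 = [[rng.uniform(-10.0, 10.0) for _ in row] for row in product]
    frag2 = [[p - r for p, r in zip(prow, rrow)] for prow, rrow in zip(product, frag1)]
    return frag1, frag2

# Hypothetical data: the first party holds X_A, W_A1, W_B1;
# the second party holds X_B, W_A2, W_B2.
X_A = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]]
X_B = [[2.0], [1.0], [0.0]]
W_A1, W_A2 = [[0.5], [-1.0]], [[0.25], [1.5]]
W_B1, W_B2 = [[2.0]], [[-1.0]]

# Cross products, each returned as (first party's fragment, second party's fragment).
xa_wa2_1, xa_wa2_2 = insecure_secure_matmul(X_A, W_A2)
xb_wb1_1, xb_wb1_2 = insecure_secure_matmul(X_B, W_B1)

# First product fragment: local X_A W_A1 plus the first party's two fragments.
Z1 = add(matmul(X_A, W_A1), xa_wa2_1, xb_wb1_1)
# Second product fragment: local X_B W_B2 plus the second party's two fragments.
Z2 = add(matmul(X_B, W_B2), xa_wa2_2, xb_wb1_2)

# Check only: Z1 + Z2 equals X_A W_A + X_B W_B.
Z = add(Z1, Z2)
Z_plain = add(matmul(X_A, add(W_A1, W_A2)), matmul(X_B, add(W_B1, W_B2)))
```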

In a specific embodiment, the aforementioned parameter update unit 614 is configured to take, as an adjustment amount, the product of a preset step size and the sum of the first part of the first gradient and the first fragment of the second part of the first gradient, and to update the first fragment of the first parameter by subtracting the adjustment amount.
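The update rule itself is ordinary gradient descent applied to a fragment. A minimal sketch, with a hypothetical step size and hypothetical gradient values:

```python
def update_fragment(param_fragment, gradient_pieces, step):
    """Subtract step * (sum of the gradient pieces) from a parameter fragment."""
    adjustment = [step * sum(parts) for parts in zip(*gradient_pieces)]
    return [w - a for w, a in zip(param_fragment, adjustment)]

step = 0.5                       # preset step size (hypothetical value)
W_A1 = [1.0, -2.0]               # first fragment of the first parameter
grad_part1 = [0.5, 1.0]          # first part of the first gradient (local)
grad_part2_frag1 = [1.5, -3.0]   # first fragment of the second part

W_A1_new = update_fragment(W_A1, [grad_part1, grad_part2_frag1], step)
```

Because the other party subtracts the same step size times its own fragment of the gradient, the recombined parameter moves by the step size times the full gradient.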

Through the above devices deployed in the first party and the second party, secure joint training by the two parties is realized while data privacy is protected.
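As an end-to-end illustration, the sketch below simulates both parties in one process for the linear regression case and compares the recombined parameters with plain gradient descent on the concatenated data. Everything here is an assumption made for illustration: the toy data, the step size, the iteration count, floating-point fragments, and the `insecure_secure_matmul` stand-in, which computes products in the clear and is not secure.

```python
import random

rng = random.Random(7)  # fixed seed so the toy run is repeatable

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def add(*Ms):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(M, c):
    return [[c * v for v in row] for row in M]

def share(M):
    """Split M into two additive fragments."""
    first = [[rng.uniform(-1.0, 1.0) for _ in row] for row in M]
    return first, sub(M, first)

def insecure_secure_matmul(A, B):
    """Hypothetical, NOT secure stand-in: fragments of A @ B."""
    return share(matmul(A, B))

# Hypothetical vertically partitioned data.
X_A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]   # first party
X_B = [[1.0], [2.0], [0.0], [1.0]]                        # second party
Y = [[1.0], [2.0], [1.5], [3.0]]                          # second party
step, n_iter = 0.05, 50

# Initialization: each party shares its own parameter part.
W_A1, W_A2 = share([[0.0], [0.0]])
W_B1, W_B2 = share([[0.0]])

# Plaintext reference on the concatenated features, for comparison only.
X = [ra + rb for ra, rb in zip(X_A, X_B)]
W_ref = [[0.0], [0.0], [0.0]]

for _ in range(n_iter):
    # Product fragments.
    a1, a2 = insecure_secure_matmul(X_A, W_A2)
    b1, b2 = insecure_secure_matmul(X_B, W_B1)
    Z1 = add(matmul(X_A, W_A1), a1, b1)
    Z2 = add(matmul(X_B, W_B2), a2, b2)
    # Error fragments from secret-shared labels.
    Y1, Y2 = share(Y)
    E1, E2 = sub(Z1, Y1), sub(Z2, Y2)
    # Gradient pieces: local parts and secret-shared cross parts.
    gA_local = matmul(transpose(X_A), E1)
    gA_1, gA_2 = insecure_secure_matmul(transpose(X_A), E2)
    gB_local = matmul(transpose(X_B), E2)
    gB_1, gB_2 = insecure_secure_matmul(transpose(X_B), E1)
    # Fragment updates (first party: W_A1, W_B1; second party: W_A2, W_B2).
    W_A1 = sub(W_A1, scale(add(gA_local, gA_1), step))
    W_B1 = sub(W_B1, scale(gB_1, step))
    W_A2 = sub(W_A2, scale(gA_2, step))
    W_B2 = sub(W_B2, scale(add(gB_local, gB_2), step))
    # Plaintext reference step.
    E = sub(matmul(X, W_ref), Y)
    W_ref = sub(W_ref, scale(matmul(transpose(X), E), step))

# Reconstruction and comparison with the plaintext run.
W_A, W_B = add(W_A1, W_A2), add(W_B1, W_B2)
max_gap = max(abs(j[0] - p[0]) for j, p in zip(W_A + W_B, W_ref))
```

In this simulation `max_gap` stays at floating-point rounding level, which reflects that the fragment-wise updates reproduce plain gradient descent on the combined data.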

According to another embodiment, there is also provided a computer-readable storage medium on which a computer program is stored. When the computer program is executed in the computer, the computer is caused to execute the method described in conjunction with FIG. 2 to FIG. 4.

According to an embodiment of still another aspect, there is also provided a computing device including a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, the method described in conjunction with FIGS. 2 to 4 is implemented.

Those skilled in the art should be aware that, in one or more of the above examples, the functions described in the present invention can be implemented by hardware, software, firmware, or any combination thereof. When implemented by software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or codes on the computer-readable medium.

The specific embodiments described above further describe the purpose, technical solutions and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made on the basis of the technical solution of the present invention shall be included in the scope of protection of the present invention.

Claims

A method for two parties to jointly train a business prediction model while protecting data privacy, the two parties comprising a first party and a second party, wherein the first party stores a first feature matrix X_A composed of first feature parts of multiple business objects, and the second party stores a second feature matrix X_B composed of second feature parts of the multiple business objects and a label vector Y composed of label values; the method is applied to the second party and comprises performing multiple iterations of model parameter updating, wherein each iteration comprises:

calculating a second product fragment, based on a locally maintained second fragment of a first parameter and second fragment of a second parameter, through local matrix multiplication and secure matrix multiplication with the first party; wherein the second fragment of the first parameter is the second fragment of a first parameter part W_A used to process the first feature part, and the second fragment of the second parameter is the second fragment of a second parameter part W_B used to process the second feature part;

performing secret sharing on the label vector Y to obtain a second label fragment, and obtaining a second error fragment by subtracting the second label fragment from a result based on the second product fragment;

locally calculating the product of the second error fragment and the second feature matrix X_B to obtain a first part of a second gradient; using the second feature matrix X_B to perform secure matrix multiplication with a first error fragment in the first party to obtain a second fragment of a second part of the second gradient; and receiving, from the first party, a second fragment of a second part of a first gradient;

updating the second fragment of the second parameter according to the first part of the second gradient and the second fragment of the second part of the second gradient; and updating the second fragment of the first parameter according to the second fragment of the second part of the first gradient.
The method according to claim 1, further comprising, before performing the multiple iterations of model parameter updating:

initializing the second parameter part W_B, splitting it through secret sharing into a first fragment of the second parameter and the second fragment of the second parameter, retaining the second fragment of the second parameter, and sending the first fragment of the second parameter to the first party;

receiving, from the first party, the second fragment of the first parameter obtained by secret sharing of the first parameter part W_A.
The method according to claim 1, further comprising, after performing the multiple iterations of model parameter updating:

sending the second fragment of the first parameter updated in the last iteration to the first party, and receiving the updated first fragment of the second parameter from the first party;

combining the second fragment of the second parameter updated in the last iteration with the received first fragment of the second parameter to obtain the second parameter part W_B after the business prediction model is trained.
The method according to claim 1, wherein the business object includes one of the following: users, merchants, commodities, and events; and the business prediction model is used to predict the classification or regression value of the business object.
The method according to claim 1, wherein the business prediction model is a linear regression model, and obtaining the second error fragment comprises:

calculating the difference between the second product fragment and the second label fragment as the second error fragment.
The method according to claim 1, wherein the business prediction model is a logistic regression model, and obtaining the second error fragment comprises:

obtaining a second prediction result fragment based on the second product fragment according to a Taylor expansion of the sigmoid function, and calculating the difference between the second prediction result fragment and the second label fragment as the second error fragment.
The method according to claim 6, wherein, before obtaining the second error fragment, the method further comprises:

calculating multi-order powers of the second product fragment to obtain second-fragment multi-order powers;

using the second product fragment and the second-fragment multi-order powers to perform multiple secure matrix multiplication operations with the first product fragment and first-fragment multi-order powers of the first party, obtaining multiple second multi-order product fragments;

wherein obtaining the second prediction result fragment based on the second product fragment comprises:

determining the second prediction result fragment from the second product fragment, the second-fragment multi-order powers and the multiple second multi-order product fragments, according to a multi-order Taylor expansion of the sigmoid function.
The method according to claim 1, wherein calculating the second product fragment comprises:

using the second fragment of the first parameter to perform secure matrix multiplication with the first feature matrix X_A in the first party to obtain a second fragment of a first-feature second processing result;

locally calculating the product of the second feature matrix X_B and the second fragment of the second parameter to obtain a second-feature first processing result;

using the second feature matrix X_B to perform secure matrix multiplication with the first fragment of the second parameter in the first party to obtain a second fragment of a second-feature second processing result;

adding the second fragment of the first-feature second processing result, the second-feature first processing result, and the second fragment of the second-feature second processing result to obtain the second product fragment.
The method according to claim 1, wherein updating the second fragment of the second parameter according to the first part of the second gradient and the second fragment of the second part of the second gradient comprises:

taking, as an adjustment amount, the product of a preset step size and the sum of the first part of the second gradient and the second fragment of the second part of the second gradient, and updating the second fragment of the second parameter by subtracting the adjustment amount.
A method for two parties to jointly train a business prediction model while protecting data privacy, the two parties comprising a first party and a second party, wherein the first party stores a first feature matrix X_A composed of first feature parts of multiple business objects, and the second party stores a second feature matrix X_B composed of second feature parts of the multiple business objects and a label vector Y composed of label values; the method is applied to the first party and comprises performing multiple iterations of model parameter updating, wherein each iteration comprises:

calculating a first product fragment, based on a locally maintained first fragment of a first parameter and first fragment of a second parameter, through local matrix multiplication and secure matrix multiplication with the second party; wherein the first fragment of the first parameter is the first fragment of a first parameter part W_A used to process the first feature part, and the first fragment of the second parameter is the first fragment of a second parameter part W_B used to process the second feature part;

receiving, from the second party, a first label fragment obtained by secret sharing of the label vector Y, and obtaining a first error fragment by subtracting the first label fragment from a result based on the first product fragment;

locally calculating the product of the first error fragment and the first feature matrix X_A to obtain a first part of a first gradient; using the first feature matrix X_A to perform secure matrix multiplication with a second error fragment in the second party to obtain a first fragment of a second part of the first gradient; and receiving, from the second party, a first fragment of a second part of a second gradient;

updating the first fragment of the first parameter according to the first part of the first gradient and the first fragment of the second part of the first gradient; and updating the first fragment of the second parameter according to the first fragment of the second part of the second gradient.
The method according to claim 10, further comprising, before performing the multiple iterations of model parameter updating:

initializing the first parameter part W_A, splitting it through secret sharing into the first fragment of the first parameter and a second fragment of the first parameter, retaining the first fragment of the first parameter, and sending the second fragment of the first parameter to the second party;

receiving, from the second party, the first fragment of the second parameter obtained by secret sharing of the second parameter part W_B.
The method according to claim 10, further comprising, after performing the multiple iterations of model parameter updating:

sending the first fragment of the second parameter updated in the last iteration to the second party, and receiving the updated second fragment of the first parameter from the second party;

combining the first fragment of the first parameter updated in the last iteration with the received second fragment of the first parameter to obtain the first parameter part W_A after the business prediction model is trained.
The method according to claim 10, wherein calculating the first product fragment comprises:

using the first fragment of the second parameter to perform secure matrix multiplication with the second feature matrix X_B in the second party to obtain a first fragment of a second-feature second processing result;

locally calculating the product of the first feature matrix X_A and the first fragment of the first parameter to obtain a first-feature first processing result;

using the first feature matrix X_A to perform secure matrix multiplication with the second fragment of the first parameter in the second party to obtain a first fragment of a first-feature second processing result;

adding the first fragment of the second-feature second processing result, the first-feature first processing result, and the first fragment of the first-feature second processing result to obtain the first product fragment.
The method according to claim 10, wherein updating the first fragment of the first parameter according to the first part of the first gradient and the first fragment of the second part of the first gradient comprises:

taking, as an adjustment amount, the product of a preset step size and the sum of the first part of the first gradient and the first fragment of the second part of the first gradient, and updating the first fragment of the first parameter by subtracting the adjustment amount.
A device for two parties to jointly train a business prediction model while protecting data privacy, the two parties comprising a first party and a second party, wherein the first party stores a first feature matrix X_A composed of first feature parts of multiple business objects, and the second party stores a second feature matrix X_B composed of second feature parts of the multiple business objects and a label vector Y composed of label values; the device is deployed in the second party and comprises an iteration unit for performing multiple iterations of model parameter updating, the iteration unit further comprising:

a product fragment determining unit, configured to calculate a second product fragment, based on a locally maintained second fragment of a first parameter and second fragment of a second parameter, through local matrix multiplication and secure matrix multiplication with the first party; wherein the second fragment of the first parameter is the second fragment of a first parameter part W_A used to process the first feature part, and the second fragment of the second parameter is the second fragment of a second parameter part W_B used to process the second feature part;

an error fragment determining unit, configured to perform secret sharing on the label vector Y to obtain a second label fragment, and to obtain a second error fragment by subtracting the second label fragment from a result based on the second product fragment;

a gradient fragment determining unit, configured to locally calculate the product of the second error fragment and the second feature matrix X_B to obtain a first part of a second gradient; to use the second feature matrix X_B to perform secure matrix multiplication with a first error fragment in the first party to obtain a second fragment of a second part of the second gradient; and to receive, from the first party, a second fragment of a second part of a first gradient;

a parameter updating unit, configured to update the second fragment of the second parameter according to the first part of the second gradient and the second fragment of the second part of the second gradient, and to update the second fragment of the first parameter according to the second fragment of the second part of the first gradient.
The device according to claim 15, further comprising an initialization unit configured to:

initialize the second parameter part W_B, split it through secret sharing into a first fragment of the second parameter and the second fragment of the second parameter, retain the second fragment of the second parameter, and send the first fragment of the second parameter to the first party;

receive, from the first party, the second fragment of the first parameter obtained by secret sharing of the first parameter part W_A.
The device according to claim 15, further comprising a parameter reconstruction unit configured to:

send the second fragment of the first parameter updated in the last iteration to the first party, and receive the updated first fragment of the second parameter from the first party;

combine the second fragment of the second parameter updated in the last iteration with the received first fragment of the second parameter to obtain the second parameter part W_B after the business prediction model is trained.
A device for two parties to jointly train a business prediction model while protecting data privacy, the two parties comprising a first party and a second party, wherein the first party stores a first feature matrix X_A composed of first feature parts of multiple business objects, and the second party stores a second feature matrix X_B composed of second feature parts of the multiple business objects and a label vector Y composed of label values; the device is deployed in the first party and comprises an iteration unit for performing multiple iterations of model parameter updating, the iteration unit further comprising:

a product fragment determining unit, configured to calculate a first product fragment, based on a locally maintained first fragment of a first parameter and first fragment of a second parameter, through local matrix multiplication and secure matrix multiplication with the second party; wherein the first fragment of the first parameter is the first fragment of a first parameter part W_A used to process the first feature part, and the first fragment of the second parameter is the first fragment of a second parameter part W_B used to process the second feature part;

an error fragment determining unit, configured to receive, from the second party, a first label fragment obtained by secret sharing of the label vector Y, and to obtain a first error fragment by subtracting the first label fragment from a result based on the first product fragment;

a gradient fragment determining unit, configured to locally calculate the product of the first error fragment and the first feature matrix X_A to obtain a first part of a first gradient; to use the first feature matrix X_A to perform secure matrix multiplication with a second error fragment in the second party to obtain a first fragment of a second part of the first gradient; and to receive, from the second party, a first fragment of a second part of a second gradient;

a parameter updating unit, configured to update the first fragment of the first parameter according to the first part of the first gradient and the first fragment of the second part of the first gradient, and to update the first fragment of the second parameter according to the first fragment of the second part of the second gradient.
A computer-readable storage medium with a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method of any one of claims 1-14.
A computing device, comprising a memory and a processor, characterized in that executable code is stored in the memory, and when the processor executes the executable code, the method of any one of claims 1-14 is implemented.