CN114462626B - Federal model training method and device, terminal equipment and storage medium - Google Patents
- Publication number
- CN114462626B (application CN202210363190.8A)
- Authority
- CN
- China
- Prior art keywords
- participant
- ciphertext
- model
- prediction output
- plaintext
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2107—File encryption
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a federated model training method and apparatus, a terminal device and a storage medium. A first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, and generates and sends a first participant prediction output ciphertext to a second participant based on the first model weight ciphertext and first training data. The second participant obtains a joint prediction output and a predicted value according to the first participant prediction output and the second participant prediction output. The second participant generates a joint prediction output gradient based on the predicted value, encrypts the joint prediction output gradient to obtain a joint prediction output gradient ciphertext, sends the ciphertext to the first participant, and updates the second plaintext model weight according to the joint prediction output gradient. The first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext. The invention improves the security of federated model training.
Description
Technical Field
The invention relates to the technical field of federated learning, and in particular to a federated model training method and apparatus, a terminal device, and a storage medium.
Background
With the deep development of the digital economy, big data has become a new production element and strategic resource. In the artificial intelligence era, obtaining machine learning models, and deep learning models in particular, requires a large amount of training data as a prerequisite. In many business scenarios, however, the training data for a model is often scattered across different business teams, departments, and even different companies. Due to privacy protection, data security, business competition, and the like, data owned by different organizations is difficult to integrate, so that so-called "data islands" form between them. Federated learning allows multiple parties to participate jointly in the same model training task, carrying out model training on the premise that the data never leaves its owner's domain, and can thus address both data-fusion applications and privacy protection.
In the related art, each participant in federated learning uses sample data with the same sample identifier for model training, and one participant additionally holds the sample labels. The participant holding the sample labels is responsible for decrypting the encrypted model calculation results of every participant to obtain the gradient multiplier of the model and sending it to the other participants, and each participant updates its model based on the gradient multiplier.
However, because each of the other participants holds its own plaintext model weights, it can independently calculate its own prediction output, which may leak information held by the participant owning the sample labels.
Therefore, a solution for improving the security of federated model training is needed.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide a federated model training method and apparatus, a terminal device and a storage medium, with the aim of improving the security of federated model training.
In order to achieve the above object, the present invention provides a federated model training method applied to a federated learning system, wherein the federated learning system includes a first participant and a second participant, the second participant holds a sample label, and the federated model training method includes the following steps:
the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant by encrypting a second plaintext model random number and is sent to the first participant;
the first participant generates and sends a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output;
the second participant calculates to obtain joint prediction output according to the first participant prediction output and the second participant prediction output, and calculates to obtain a predicted value according to the joint prediction output, wherein the second participant prediction output is obtained by the second participant according to second training data and second plaintext model weight;
the second participant generates a combined prediction output gradient based on the prediction value and the sample label, encrypts the combined prediction output gradient to obtain a combined prediction output gradient ciphertext and sends the combined prediction output gradient ciphertext to the first participant, and updates the weight of the second plaintext model according to the combined prediction output gradient;
and the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
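The five steps above can be sketched end to end. A real deployment would use an additively homomorphic scheme such as Paillier; this sketch instead substitutes a transparent `Cipher` wrapper whose operators mirror homomorphic addition and scalar multiplication, so only the message flow and the algebra are shown. All names (`training_round`, `Cipher`, the identity link, single-feature weights) are illustrative assumptions, not taken from the patent:

```python
from dataclasses import dataclass

@dataclass
class Cipher:
    """Transparent stand-in for an additively homomorphic ciphertext."""
    v: float
    def __add__(self, other):            # homomorphic / scalar addition
        return Cipher(self.v + (other.v if isinstance(other, Cipher) else other))
    def __mul__(self, scalar):           # scalar multiplication by a plaintext
        return Cipher(self.v * scalar)

def training_round(x_a, w_a_cipher, x_b, w_b, y, lr):
    # Forward: A computes its prediction under encryption and sends it;
    # B "decrypts" it (stand-in: reading .v) to get the first participant prediction output
    z_a = w_a_cipher * x_a               # first participant prediction output ciphertext
    z_b = w_b * x_b                      # second participant prediction output (plaintext)
    z = z_a.v + z_b                      # joint prediction output, held only by B
    y_pred = z                           # identity link (linear-regression case assumed)
    # Backward: B forms the joint prediction output gradient, encrypts it for A,
    # and updates its own plaintext weight
    g = y_pred - y
    g_cipher = Cipher(g)                 # joint prediction output gradient ciphertext
    w_b_new = w_b - lr * g * x_b
    # A updates its weight ciphertext purely with homomorphic operations
    w_a_cipher_new = w_a_cipher + g_cipher * (-lr * x_a)
    return w_a_cipher_new, w_b_new
```

Note that A never handles a plaintext weight or gradient: its update is a scalar multiplication of the gradient ciphertext followed by a homomorphic addition.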
Optionally, before the first participant performs homomorphic property calculation based on the first plaintext model weight random number and the second plaintext model random number ciphertext to generate the first model weight ciphertext, the method further includes:
the first participant generates the first plaintext model weight random number;
the second party generates the second plaintext model weight and the second plaintext model random number, and generates a second plaintext model random number ciphertext according to the second plaintext model random number encryption;
and the second participant sends the second model weight ciphertext to the first participant so that the first participant can perform homomorphic property calculation based on the first plaintext model weight random number and the second plaintext model random number ciphertext.
Optionally, the step of performing homomorphic property calculation by the first participant according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updating the first model weight ciphertext according to the first model gradient ciphertext includes:
the first participant receives the joint prediction output gradient ciphertext sent by the second participant, and performs scalar multiplication on the joint prediction output gradient ciphertext and the first training data to obtain the first model gradient ciphertext;
and the first participant updates the first model weight ciphertext according to the first model gradient ciphertext.
Optionally, the federated model training method includes the following steps:
the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant by encrypting a second plaintext model random number and is sent to the first participant;
the second participant performs homomorphic property calculation based on a second plaintext model weight random number and a first plaintext model random number ciphertext to generate a second model weight ciphertext, wherein the first plaintext model random number ciphertext is generated by the first participant by encrypting a first plaintext model random number and is sent to the second participant;
the first participant performs forward calculation on the basis of the first model weight ciphertext and first training data to generate a first part of prediction output ciphertext and sends the first part of prediction output ciphertext to the second participant;
the second participant performs forward calculation based on the second model weight ciphertext and second training data to generate a second part of prediction output ciphertext, and sends the second part of prediction output ciphertext to the first participant;
the first participant decrypts the second part of prediction output ciphertext to obtain a second part of prediction output, generates a first prediction output result according to the second part of prediction output, and sends the first prediction output result to the second participant;
the second participant decrypts the first part of prediction output ciphertext to obtain a first part of prediction output, generates a second prediction output result according to the first part of prediction output, calculates according to the first prediction output result and the second prediction output result to obtain joint prediction output, and calculates according to the joint prediction output to obtain a prediction value;
the second participant generates a joint prediction output gradient based on the predicted value and the sample label, updates the second model weight ciphertext according to the joint prediction output gradient, encrypts the joint prediction output gradient to generate a joint prediction output gradient ciphertext, and sends it to the first participant;
and the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
Optionally, before the first participant performs homomorphic property calculation based on the first plaintext model weight random number and the second plaintext model random number ciphertext to generate the first model weight ciphertext, the method further includes:
the first participant generates the first plaintext model weight random number and a first plaintext model random number, and encrypts the first plaintext model random number to obtain a first plaintext model random number ciphertext;
and the second participant generates the second plaintext model weight random number and a second plaintext model random number, and encrypts the second plaintext model random number to obtain a second plaintext model random number ciphertext.
Optionally, the step of the first participant performing forward computation based on the first model weight ciphertext and first training data to generate a first part of prediction output ciphertext, and sending the first part of prediction output ciphertext to the second participant includes:
the first participant generating a first model noise;
the first participant calculates according to the first model noise, the first model weight ciphertext and the first training data to generate the first part of prediction output ciphertext, and sends the first part of prediction output ciphertext to the second participant;
the step of the second participant performing forward calculation based on the second model weight ciphertext and second training data to generate a second part of prediction output ciphertext, and sending the second part of prediction output ciphertext to the first participant includes:
the second participant generating a second model noise;
and the second participant calculates according to the second model noise, the second model weight ciphertext and the second training data to generate a second part of prediction output ciphertext, and sends the second part of prediction output ciphertext to the first participant.
Optionally, the step of decrypting, by the first participant, the second part of the prediction output ciphertext to obtain a second part of the prediction output, generating a first prediction output result according to the second part of the prediction output, and sending the first prediction output result to the second participant includes:
the first participant receives and decrypts a second part of prediction output ciphertext sent by the second participant to obtain the second part of prediction output;
the first participant obtains the first prediction output result according to the second part prediction output and the first model noise, and sends the first prediction output result to the second participant;
the second participant decrypts the first part of the prediction output ciphertext to obtain a first part of the prediction output, generates a second prediction output result according to the first part of the prediction output, calculates a joint prediction output according to the first prediction output result and the second prediction output result, and calculates a prediction value according to the joint prediction output, wherein the step of:
the second participant receives and decrypts the first part of prediction output ciphertext sent by the first participant to obtain the first part of prediction output;
the second participant obtains a second prediction output result according to the first part prediction output and the second model noise;
the second participant carries out scalar addition calculation according to the first prediction output result and the second prediction output result to obtain the joint prediction output;
and the second party obtains the predicted value according to the joint prediction output.
In addition, in order to achieve the above object, the present invention further provides a federated model training apparatus, including:
the first participant module is used for performing homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant by encrypting a second plaintext model random number and is sent to the first participant;
the first participant module is further configured to generate and send a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output;
the second participant module is used for calculating according to the first participant prediction output and a second participant prediction output to obtain a joint prediction output and calculating according to the joint prediction output to obtain a predicted value, wherein the second participant prediction output is obtained by the second participant according to second training data and second plaintext model weight;
the second participant module is further configured to generate a joint prediction output gradient based on the prediction value and the sample tag, encrypt the joint prediction output gradient to obtain a joint prediction output gradient ciphertext and send the joint prediction output gradient ciphertext to the first participant, and update the second plaintext model weight according to the joint prediction output gradient;
the first participant module is further configured to perform homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and update the first model weight ciphertext according to the first model gradient ciphertext.
In addition, in order to achieve the above object, the present invention further provides a terminal device, where the terminal device includes a memory, a processor, and a federated model training program stored in the memory and executable on the processor, and the federated model training program, when executed by the processor, implements the steps of the federated model training method described above.
In addition, to achieve the above object, the present invention further provides a computer-readable storage medium on which a federated model training program is stored, wherein the federated model training program, when executed by a processor, implements the steps of the federated model training method described above.
According to the federated model training method and apparatus, terminal device, and storage medium provided by the embodiments of the invention, a first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant by encrypting a second plaintext model random number and is sent to the first participant; the first participant generates and sends a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output; the second participant calculates a joint prediction output according to the first participant prediction output and the second participant prediction output, and calculates a predicted value according to the joint prediction output, wherein the second participant prediction output is obtained by the second participant according to second training data and the second plaintext model weight; the second participant generates a joint prediction output gradient based on the predicted value and the sample label, encrypts the joint prediction output gradient to obtain a joint prediction output gradient ciphertext and sends it to the first participant, and updates the second plaintext model weight according to the joint prediction output gradient; and the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
In the initialization process, the first participant obtains the first model weight ciphertext through homomorphic property calculation, so the model weight of the first participant is protected; in the reverse calculation process, the first participant's model is updated through homomorphic operations, and the first participant never holds its own plaintext model weight or plaintext model gradient, thereby solving the label-leakage problem of the existing scheme.
Drawings
FIG. 1 is a schematic flow chart of an initialization phase in the prior art;
FIG. 2 is a flow chart of a forward computing stage in the prior art;
FIG. 3 is a flow chart of a reverse calculation stage in the prior art;
FIG. 4 is a schematic diagram of the functional modules of a terminal device to which the federated model training apparatus of the present invention belongs;
FIG. 5 is a schematic flow chart diagram of an exemplary embodiment of a federated model training method in accordance with the present invention;
FIG. 6 is a flow chart of an initialization phase of a first modification in the embodiment of the present invention;
FIG. 7 is a flow chart of a forward computing stage of a first modification of the embodiment of the present invention;
FIG. 8 is a flow chart of a reverse calculation stage of a first modification of the embodiment of the present invention;
FIG. 9 is a schematic flow chart diagram of another exemplary embodiment of a federated model training method in accordance with the present invention;
FIG. 10 is a flow chart of an initialization phase of a second modification of the embodiment of the present invention;
FIG. 11 is a flow chart of a forward computing stage of a second modification of the embodiment of the present invention;
fig. 12 is a schematic flow chart of a reverse calculation stage of the second modification in the embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The main solution of the embodiment of the invention is as follows: the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant by encrypting a second plaintext model random number and is sent to the first participant; the first participant generates and sends a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output; the second participant calculates a joint prediction output according to the first participant prediction output and the second participant prediction output, and calculates a predicted value according to the joint prediction output, wherein the second participant prediction output is obtained by the second participant according to second training data and the second plaintext model weight; the second participant generates a joint prediction output gradient based on the predicted value and the sample label, encrypts the joint prediction output gradient to obtain a joint prediction output gradient ciphertext and sends it to the first participant, and updates the second plaintext model weight according to the joint prediction output gradient; and the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
In the initialization process, the first participant obtains the first model weight ciphertext through homomorphic property calculation, so the model weight of the first participant is protected; in the reverse calculation process, the first participant's model is updated through homomorphic operations, and the first participant never holds its own plaintext model weight or plaintext model gradient, thereby solving the label-leakage problem of the existing scheme.
The technical terms related to the embodiment of the invention are as follows:
FL (Federated Learning): on the premise that the data does not leave its owner's domain, a machine learning model is trained jointly over the data sources of multiple participants, and a model inference service is provided. Federated learning can make full use of the data sources of multiple participants to improve the performance of a machine learning model while protecting user privacy and data security. It enables cross-department, cross-company, and even cross-industry collaboration on data while meeting the requirements of data-protection laws and regulations. Federated learning can be divided into three categories: horizontal federated learning, vertical federated learning, and federated transfer learning. Vertical federated learning targets the case where the training sample identifiers of the participants overlap heavily while their data features overlap little.
GLM (Generalized Linear Model): an extension of the linear model that establishes, through a link function, the relationship between the mathematical expectation of the response variable and a linear combination of the predictor variables. It is characterized by not forcibly changing the natural measure of the data, allowing the data to have a non-linear and non-constant variance structure. It develops the linear model toward non-normally distributed response values while retaining a concise, direct linear transformation.
Send: send;
Recv: receive;
Gen: generate, generally used for the generation of random numbers or random-number matrices;
pk: public key;
sk: private key (secret key);
Enc (Encrypt): given a value (plaintext) V, an encryption operation is performed with the public key to obtain a ciphertext, i.e. [[V]] = Enc(pk, V), where pk is the public key.
Dec (Decrypt): given a ciphertext [[V]], a decryption operation is performed with the private key to recover the value (plaintext), i.e. V = Dec(sk, [[V]]), where sk is the private key.
Initialization: in the initialization phase, a public key pk and a private key sk are generated, where the public key may be disclosed and the private key must not be disclosed;
Homomorphic addition: given two ciphertexts [[U]] and [[V]], a new ciphertext is obtained by homomorphic addition, i.e. [[W]] = [[U]] ⊕ [[V]], satisfying W = U + V;
Scalar addition: given a ciphertext [[U]] and a plaintext V, a new ciphertext is obtained by a scalar addition operation, i.e. [[W]] = [[U]] ⊕ V, satisfying W = U + V; it is worth noting that in most cases scalar addition is completed by one encryption operation Enc(pk, V) followed by one homomorphic addition [[U]] ⊕ [[V]];
Scalar multiplication: given a ciphertext [[U]] and a plaintext V, a new ciphertext is obtained by scalar multiplication, i.e. [[W]] = [[U]] ⊗ V, satisfying W = U · V;
Matrix encryption and decryption: suppose there is a matrix M; we write [[M]] for the matrix composed of ciphertexts, each ciphertext being the encryption of the value at the corresponding position of M, that is, the ciphertext at row i and column j of [[M]] is [[M(i,j)]]; symmetrically, we can use the private key sk to decrypt [[M]] and obtain the plaintext matrix M;
Matrix addition: given two ciphertext matrices [[U]] and [[V]] of the same shape, [[W]] = [[U]] ⊕ [[V]] is denoted as matrix addition in ciphertext space, where the ciphertext at row i and column j is calculated as [[W(i,j)]] = [[U(i,j)]] ⊕ [[V(i,j)]];
Matrix multiplication: given a ciphertext matrix [[U]] and a plaintext matrix V of compatible shapes, [[W]] = [[U]] ⊗ V is denoted as matrix multiplication in ciphertext space, where the ciphertext at row i and column j is calculated as [[W(i,j)]] = ⊕_k ( [[U(i,k)]] ⊗ V(k,j) );
For convenience of description, we will use [[V]]_A to represent a ciphertext of V encrypted with party A's public key pkA, i.e. [[V]]_A = Enc(pkA, V); decryption must be performed with party A's private key skA. Correspondingly, [[V]]_B represents the ciphertext of V encrypted with party B's public key pkB, which must be decrypted with party B's private key skB. When describing an arbitrary participant, we simply use [[V]].
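The primitives defined above (Enc, Dec, homomorphic addition, scalar addition, scalar multiplication) can be demonstrated with a minimal toy Paillier implementation. This is an illustrative sketch only: the tiny fixed primes offer no security, and a real system would use a vetted library with a 2048-bit or larger modulus.

```python
import math
import random

def keygen(p=103, q=107):
    """Toy Paillier key pair from fixed small primes (illustration only)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)              # valid because the generator g = n + 1
    return (n,), (lam, mu, n)         # public key, private key

def enc(pk, m):
    """Enc: [[m]] = g^m * r^n mod n^2, with random r coprime to n."""
    (n,) = pk
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def dec(sk, c):
    """Dec: recover m = L(c^lambda mod n^2) * mu mod n, L(x) = (x-1)/n."""
    lam, mu, n = sk
    n2 = n * n
    return (pow(c, lam, n2) - 1) // n * mu % n

def he_add(pk, c1, c2):
    """Homomorphic addition: [[u]] (+) [[v]] = [[u + v]]."""
    (n,) = pk
    return c1 * c2 % (n * n)

def scalar_add(pk, c, v):
    """Scalar addition: encrypt the plaintext, then homomorphically add."""
    return he_add(pk, c, enc(pk, v))

def scalar_mul(pk, c, v):
    """Scalar multiplication: [[u]] (x) v = [[u * v]] via exponentiation."""
    (n,) = pk
    return pow(c, v, n * n)
```

Matrix addition and matrix multiplication in ciphertext space are then just these element-wise operations combined per the formulas above.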
The generalized linear model is a common machine learning model, including the linear regression model, the logistic regression model, the multi-classification logistic regression model, and so on. In fact, the scheme provided by the present invention is also applicable to other generalized linear models, such as Poisson regression. For different generalized linear models, the main difference lies in the link function and the loss function, but the scheme proposed by the present invention can be adapted to any link function and loss function.
Assume that the dimensions of the input features of party A and party B are I_A and I_B respectively, and that the output dimension of the model is OUT; the models of the two parties are then W_A (of shape I_A × OUT) and W_B (of shape I_B × OUT). Assume a small-batch input feature is X_A (of shape BS × I_A) for party A and X_B (of shape BS × I_B) for party B, where BS is the abbreviation of batch size, i.e. the number of samples in each small batch. The objectives of the forward calculation and backward calculation of the generalized linear model are as follows:
forward calculation: two participants sample small batch data respectively, wherein the participant A samples a small batch characteristic XAParticipant B samples the small lot features XBAnd a label y. Two participating joint computation
And the participant B finally obtains the plaintext information of Z, and the participant A cannot obtain the plaintext information of Z.
Participant B calculates the predicted value ŷ = f(Z) according to the selected model. The function f used to calculate the predicted value is also known as the Link Function, for example:
(1) in the linear regression model, party B uses the identity function to calculate ŷ = Z;
(2) in the logistic regression model, party B uses the Sigmoid function to calculate ŷ = Sigmoid(Z);
(3) in the multi-classification logistic regression model, party B uses the Softmax function to calculate ŷ = Softmax(Z).
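The link functions for these generalized linear models can be sketched in a few lines. Shown here are the identity and Softmax functions named above, plus the Sigmoid conventionally paired with the logistic regression model (an assumption consistent with the Logistic Loss mentioned below); the scalar/list signatures are illustrative:

```python
import math

def identity(z):
    """Link function for the linear regression model: y_pred = Z."""
    return z

def sigmoid(z):
    """Link function for the logistic regression model."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Link function for the multi-classification logistic regression model."""
    m = max(zs)                          # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]
```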
Reverse calculation: participant B calculates the loss value l(y, ŷ) according to the label y, the predicted value ŷ, and the model selected. The function l used to calculate the loss value is also known as the Loss Function, for example:
(1) in the linear regression model, party B uses Square Loss;
(2) in the Logistic regression model, participant B uses Logistic Loss;
(3) in the multi-class logistic regression model, party B uses Cross-entropy Loss.
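The three losses can be sketched as follows (per-sample form; y in {0, 1} for the logistic case, one-hot for the multi-class case):

```python
import math

def square_loss(y, y_hat):
    """(1) Linear regression: Square Loss."""
    return 0.5 * (y_hat - y) ** 2

def logistic_loss(y, y_hat):
    """(2) Logistic regression: Logistic Loss, y in {0,1}, y_hat in (0,1)."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def cross_entropy_loss(y_onehot, y_hat):
    """(3) Multi-class logistic regression: Cross-entropy Loss."""
    return -sum(t * math.log(p) for t, p in zip(y_onehot, y_hat) if t > 0)

print(square_loss(1.0, 0.5))                                 # 0.125
print(round(logistic_loss(1, 0.5), 4))                       # 0.6931
print(round(cross_entropy_loss([0, 1, 0], [0.2, 0.5, 0.3]), 4))  # 0.6931
```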
Participant B calculates the partial derivative of the loss value l with respect to Z, i.e., ∂l/∂Z. Since the partial derivatives with respect to the model weights can be calculated from ∂l/∂Z, for convenience of explanation the present invention is described in terms of calculating ∂l/∂Z. The partial derivative can be calculated according to the derivation formulas and the chain rule, that is, through the derivative functions of the link function f and the loss function l. Since the scheme provided by the invention is applicable to any link function and loss function, the partial-derivative formula of each specific model is not described in detail here.
The two participants then each update their own model: W ← W − η·∇W, where ∇W is the model gradient and η is the step size of the gradient descent, also called the learning rate. The training process of the algorithm usually uses the mini-batch stochastic gradient descent (mini-batch SGD) algorithm to perform the model update.
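For the canonical link/loss pairs listed above (identity with square loss, sigmoid with logistic loss, softmax with cross-entropy), the partial derivative ∂l/∂Z simplifies to ŷ − y, which makes a plaintext mini-batch update easy to sketch. The values and single output dimension below are illustrative (logistic regression):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sgd_step(W, X, y, lr=0.1):
    """One mini-batch update: grad_Z = y_hat - y (canonical link),
    grad_W[j] = mean_i X[i][j] * grad_Z[i], then W <- W - lr * grad_W."""
    bs, d = len(X), len(W)
    grad_z = [sigmoid(sum(x[j] * W[j] for j in range(d))) - yi
              for x, yi in zip(X, y)]
    grad_w = [sum(X[i][j] * grad_z[i] for i in range(bs)) / bs for j in range(d)]
    return [W[j] - lr * grad_w[j] for j in range(d)]

W = [0.0, 0.0]                     # d = 2 features, single output
X = [[1.0, 2.0], [1.0, -1.0]]      # BS = 2 samples
y = [1, 0]
W = sgd_step(W, X, y)
print(W)                           # ≈ [0.0, 0.075]
```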
The existing longitudinal federated algorithm protocol is divided into three stages: initialization, forward calculation, and reverse calculation. Referring to fig. 1, fig. 1 is a schematic flow chart of the initialization stage in the prior art. As shown in fig. 1, the initialization flow of the existing scheme is very simple: the two participants each generate their own model weights.
Referring to fig. 2, fig. 2 is a schematic diagram of a forward computing stage in the prior art, and as shown in fig. 2, a forward computing process of the prior art includes three steps:
step 1, the two participants each sample a small batch (mini-batch) of training data, namely X_A and X_B;
referring to fig. 3, fig. 3 is a schematic diagram of a reverse calculation stage in the prior art, and as shown in fig. 3, the reverse calculation process of the prior art is divided into four steps:
step 2, participant A generates a random number, performs a calculation based on the homomorphic property according to the generated random number, and sends the result to participant B; participant B decrypts the received data to obtain the corresponding plaintext;
and step 4, the two participants each update their own model.
As described above, in the existing algorithm protocols, party A holds the plaintext model weight W_A of its own side, and can therefore independently calculate its own predicted output Z_A = X_A·W_A. This may result in leakage of party B's label.
The present invention provides a solution that protects the model weight and model gradient of participant A in the longitudinal federated generalized linear model algorithm through a privacy protection protocol based on homomorphic encryption and secret sharing technologies, which can effectively solve the label leakage problem in the existing algorithm protocols.
In the two-party longitudinal federated training or inference process, for example in the initialization, forward calculation, and backward propagation processes, the data privacy of the participants is ensured through homomorphic encryption and secret sharing technologies. In particular, compared with the existing scheme, the technical scheme further protects the model weight and model gradient of participant A, protects the label information y of participant B, and improves the security of the algorithm protocol. The public key and private key for homomorphic encryption are managed as follows:
party a has a homomorphic encrypted private key skA and public key pkA, party B has public key pkA, and party a does not disclose private key skA to party B.
Correspondingly, party B has a homomorphic encrypted private key skB and public key pkB, party a has public key pkB, and party B does not disclose private key skB to party a.
The public key is used for encrypting the intermediate calculation results in the model training or reasoning process. The encryption used is additive homomorphic encryption, such as the Paillier homomorphic encryption scheme, the Okamoto-Uchiyama homomorphic encryption scheme, and the like.
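As a concrete illustration of the additive property such schemes provide, the following is a toy Paillier sketch with tiny hard-coded primes (insecure, for exposition only): multiplying two ciphertexts yields a ciphertext of the plaintext sum, and raising a ciphertext to a plaintext power yields a ciphertext of a scalar multiple.

```python
import math
import random

# Toy Paillier cryptosystem with tiny hard-coded primes.
# Illustrates the additive homomorphism ONLY, it is NOT secure.
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
mu = pow(lam, -1, n)                                # valid because g = n + 1

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (((pow(c, lam, n2) - 1) // n) * mu) % n

ca, cb = encrypt(123), encrypt(456)
print(decrypt((ca * cb) % n2))    # 579: product of ciphertexts = sum of plaintexts
print(decrypt(pow(ca, 3, n2)))    # 369: ciphertext power = plaintext scalar multiple
```

Note that `pow(lam, -1, n)` (modular inverse) requires Python 3.8 or later.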
Specifically, referring to fig. 4, fig. 4 is a schematic diagram of a functional module of a terminal device to which the federal model training apparatus of the present invention belongs. The federal model training device can be a device which is independent of the terminal equipment and can carry out the federal model training, and the device can be borne on the terminal equipment in a form of hardware or software. The terminal device can be an intelligent mobile terminal with a data processing function, such as a mobile phone and a tablet personal computer, and can also be a fixed terminal device or a server with a data processing function.
In this embodiment, the terminal device to which the federal model training apparatus belongs at least includes an output module 110, a processor 120, a memory 130 and a communication module 140.
The memory 130 stores an operating system and a federated model training program. The federated model training device may store in the memory 130 information such as: the first model weight ciphertext generated by the first participant through homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext; the first participant prediction output ciphertext generated based on the first model weight ciphertext and first training data; the predicted value obtained from the joint prediction output computed from the first participant prediction output and the second participant prediction output; the joint prediction output gradient generated by the second participant based on the predicted value, and the joint prediction output gradient ciphertext obtained by encrypting it; and the first model gradient ciphertext obtained by the first participant from the joint prediction output gradient ciphertext. The output module 110 may be a display screen or the like. The communication module 140 may include a WIFI module, a mobile communication module, a bluetooth module, and the like, and communicates with an external device or server through the communication module 140.
Wherein the federated model training program in the memory 130, when executed by the processor, implements the following steps:
the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant by encrypting a second plaintext model random number and is sent to the first participant;
the first participant generates and sends a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output;
the second participant calculates to obtain joint prediction output according to the first participant prediction output and the second participant prediction output, and calculates to obtain a predicted value according to the joint prediction output, wherein the second participant prediction output is obtained by the second participant according to second training data and second plaintext model weight;
the second participant generates a combined prediction output gradient based on the prediction value and the sample label, encrypts the combined prediction output gradient to obtain a combined prediction output gradient ciphertext and sends the combined prediction output gradient ciphertext to the first participant, and updates the weight of the second plaintext model according to the combined prediction output gradient;
and the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
Further, the federated model training program in memory 130, when executed by the processor, further implements the following steps:
the first participant generates the first plaintext model weight random number;
the second participant generates the second plaintext model weight and the second plaintext model random number, and generates a second plaintext model random number ciphertext by encrypting the second plaintext model random number;
and the second participant sends the second plaintext model random number ciphertext to the first participant, so that the first participant performs homomorphic property calculation based on the first plaintext model weight random number and the second plaintext model random number ciphertext.
Further, the federated model training program in memory 130, when executed by the processor, further implements the following steps:
the first participant receives the combined prediction output gradient ciphertext sent by the second participant, and scalar multiplication is carried out on the combined prediction output gradient ciphertext and first training data to obtain a first model gradient ciphertext;
and the first participant updates the first model weight ciphertext according to the first model gradient ciphertext.
Further, the federated model training program in memory 130, when executed by the processor, further implements the following steps:
the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant according to the second plaintext model random number encryption and is sent to the first participant;
the second participant performs homomorphic property calculation based on a second plaintext model weight random number and a first plaintext model random number ciphertext to generate a second model weight ciphertext, wherein the first plaintext model random number ciphertext is generated by the first participant according to the first plaintext model random number encryption and is sent to the second participant;
the first participant performs forward calculation on the basis of the first model weight ciphertext and first training data to generate a first part of prediction output ciphertext, and sends the first part of prediction output ciphertext to the second participant;
the second participant performs forward calculation based on the second model weight ciphertext and second training data to generate a second part of prediction output ciphertext, and sends the second part of prediction output ciphertext to the first participant;
the first participant decrypts the second part of prediction output ciphertext to obtain a second part of prediction output, generates a first prediction output result according to the second part of prediction output, and sends the first prediction output result to the second participant;
the second participant decrypts the first part of prediction output ciphertext to obtain a first part of prediction output, generates a second prediction output result according to the first part of prediction output, calculates according to the first prediction output result and the second prediction output result to obtain joint prediction output, and calculates according to the joint prediction output to obtain a prediction value;
the second participant generates a combined prediction output gradient based on the prediction value and the sample label, updates the second model weight ciphertext according to the combined prediction output gradient, encrypts the combined prediction output gradient to generate a combined prediction output gradient ciphertext, and sends the combined prediction output gradient ciphertext to the first participant;
and the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
Further, the federated model training program in memory 130, when executed by the processor, further implements the following steps:
the first participant generates the first plaintext model weight random number and a first plaintext model random number, and encrypts the first plaintext model random number to obtain a first plaintext model random number ciphertext;
and the second participant generates a second plaintext model weight random number and a second plaintext model random number, and encrypts the second plaintext model random number to obtain a second plaintext model random number ciphertext.
Further, the federated model training program in memory 130, when executed by the processor, further implements the following steps:
the first participant generating a first model noise;
the first participant calculates according to the first model noise, the first model weight ciphertext and the first training data to generate the first part of prediction output ciphertext and sends the first part of prediction output ciphertext to the second participant;
the second participant generating a second model noise;
and the second participant calculates according to the second model noise, the second model weight ciphertext and the second training data to generate a second part of prediction output ciphertext and sends the second part of prediction output ciphertext to the first participant.
Further, the federated model training program in memory 130, when executed by the processor, further implements the following steps:
the first participant receives and decrypts a second part of prediction output ciphertext sent by the second participant to obtain the second part of prediction output;
the first participant obtains the first prediction output result according to the second part prediction output and the first model noise, and sends the first prediction output result to the second participant;
the second participant receives and decrypts the first part of prediction output ciphertext sent by the first participant to obtain the first part of prediction output;
the second participant obtains a second prediction output result according to the first part prediction output and the second model noise;
the second participant carries out scalar addition calculation according to the first prediction output result and the second prediction output result to obtain the joint prediction output;
and the second party obtains the predicted value according to the joint prediction output.
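The random-number masking used throughout these steps is, in spirit, two-party additive secret sharing: a value is split into two shares that are individually uniformly random but sum to the secret. A minimal sketch (the ring modulus is an illustrative choice):

```python
import random

MOD = 2 ** 32   # shares live in a finite ring so each share alone is uniform

def share(secret):
    """Split `secret` into two additive shares: share_a + share_b = secret (mod MOD)."""
    share_a = random.randrange(MOD)
    share_b = (secret - share_a) % MOD
    return share_a, share_b

def reconstruct(share_a, share_b):
    return (share_a + share_b) % MOD

w = 123456
a, b = share(w)
print(reconstruct(a, b))          # 123456

# Shares of two values can be added locally before reconstruction:
w2 = 111
a2, b2 = share(w2)
print(reconstruct((a + a2) % MOD, (b + b2) % MOD))   # 123567
```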
In this embodiment, with the above scheme, specifically: the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, where the second plaintext model random number ciphertext is generated by the second participant by encrypting a second plaintext model random number and is sent to the first participant; the first participant generates a first participant prediction output ciphertext based on the first model weight ciphertext and first training data and sends it to the second participant, so that the second participant decrypts it to obtain a first participant prediction output; the second participant calculates a joint prediction output from the first participant prediction output and the second participant prediction output, and calculates a predicted value from the joint prediction output, where the second participant prediction output is calculated by the second participant from second training data and the second plaintext model weight; the second participant generates a joint prediction output gradient based on the predicted value and the sample label, encrypts it to obtain a joint prediction output gradient ciphertext, sends the ciphertext to the first participant, and updates the second plaintext model weight according to the joint prediction output gradient; and the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
In the initialization process, the first participant obtains a first model weight ciphertext through homomorphic property calculation, so the model weight of the first participant is protected; in the reverse calculation process, the first participant's model is updated through homomorphic operations, so the first participant never holds its own plaintext model weight or plaintext model gradient, and the label leakage problem of the existing scheme is thereby solved.
Based on the above terminal device architecture but not limited to the above architecture, embodiments of the method of the present invention are presented.
The execution subject of the method of this embodiment may be a federal model training device or a terminal device, and the federal model training device is used as an example in this embodiment.
Referring to fig. 5, fig. 5 is a flowchart illustrating an exemplary embodiment of a federal model training method according to the present invention. In this embodiment, the federal model training method is applied to a federal learning system, the federal learning system includes a first participant and a second participant, the second participant possesses a sample label, and the federal model training method includes:
step S10, the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant by encrypting a second plaintext model random number and is sent to the first participant;
One of the core differences between federated learning and a general machine learning task is that the training participants change from one party to two or even more parties. Federated learning completes the model training task by having multiple parties jointly participate in the same training task, without moving data out of their own repositories and while protecting data privacy, thereby breaking data silos. For example, in two-party vertical federated learning, party A (e.g., an advertising company) and party B (e.g., a social networking platform) cooperate to jointly train one or more deep-learning-based personalized recommendation models. Party A owns part of the data features, e.g., (X1, X2, …, X40), 40-dimensional data features in total; party B owns another part of the data features, e.g., (X41, X42, …, X100), 60-dimensional data features in total. Together, participants A and B have more data features (e.g., 100 dimensions combined), so the feature dimension of the training data is significantly expanded. For supervised deep learning, participant A and/or participant B also possess the label information y of the training data.
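The vertical feature split described above can be sketched with stand-in values: the same aligned sample is held as two disjoint feature slices by the two parties.

```python
# Hypothetical vertically partitioned record: one aligned sample whose
# features X1 .. X100 are split between party A and party B.
full_features = list(range(1, 101))      # stand-ins for X1 .. X100

features_a = full_features[:40]          # party A holds X1 .. X40
features_b = full_features[40:]          # party B holds X41 .. X100

print(len(features_a), len(features_b))  # 40 60
```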
Before the step in which the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate the first model weight ciphertext, the method further comprises the following steps:
the first participant generates the first plaintext model weight random number;
the second party generates the second plaintext model weight and the second plaintext model random number, and generates a second plaintext model random number ciphertext according to the second plaintext model random number encryption;
and the second participant sends the second plaintext model random number ciphertext to the first participant so that the first participant can perform homomorphic property calculation based on the first plaintext model weight random number and the second plaintext model random number ciphertext.
Specifically, referring to fig. 6, fig. 6 is a schematic flow chart of the initialization stage of the first improvement in the embodiment of the present invention. As shown in fig. 6, the first participant A generates the first plaintext model weight random number W_A for its own side; the second participant B generates the second plaintext model weight W_B for its own side and a second plaintext model random number R_A for the other side. The second participant B then encrypts R_A and sends the ciphertext [[R_A]]_B to the first participant A, so that the first participant A can generate the first model weight ciphertext through homomorphic property calculation.
In the process, the model weight of the participant A is protected through homomorphic property calculation, namely, the model weight of the participant A is protected by using homomorphic encryption, so that the participant A can only obtain a ciphertext of the model weight.
Step S20, the first participant generates and sends a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and the first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output;
Before this, secure sample alignment is performed. Assume participants A and B have datasets D_A and D_B respectively. The two participants of longitudinal federated learning need to align the training data owned by the two sides and screen out the intersection of the sample IDs of the data owned by both parties, i.e., find the intersection of datasets D_A and D_B, while the sample ID information of the non-intersecting parts must not be disclosed. This step is two-party secure sample alignment, and existing solutions can be used, e.g., Blind-RSA-based algorithms, Diffie-Hellman-based algorithms, Freedman-protocol-based algorithms, etc.
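The following shows only WHAT sample alignment computes, as a plain set intersection on illustrative IDs; it deliberately omits the privacy-preserving HOW, which is what the Blind-RSA, Diffie-Hellman, and Freedman-style protocols mentioned above provide.

```python
# Plain intersection of sample IDs (NOT a private set intersection protocol).
ids_a = {"u01", "u02", "u05", "u07"}   # sample IDs in party A's dataset D_A
ids_b = {"u02", "u03", "u05", "u09"}   # sample IDs in party B's dataset D_B

aligned = sorted(ids_a & ids_b)        # the two parties train on these samples
print(aligned)                         # ['u02', 'u05']
```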
It should be noted that, in order to reduce the amount of computation and obtain better training effect, usually before each iteration, participant a and participant B respectively extract a mini-batch (mini-batch) of data from the data intersection, for example, each mini-batch includes 128 samples. In this case, it is necessary for party a and party B to coordinate the batching of data and the selection of the minibatches so that the samples in the minibatches selected by both parties in each iteration are also aligned.
For convenience of description, the small-batch features of the two participants are denoted as X_A and X_B in this application, and the small-batch label of party B is denoted as y.
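One simple way the mini-batch coordination described above can be realized is a pre-agreed random seed; the shared-seed scheme below is an assumption for illustration, as the text only requires that the two parties' selections stay aligned.

```python
import random

def minibatch_indices(n_samples, batch_size, seed, step):
    """Derive a mini-batch index set deterministically from a shared seed and
    the iteration counter, so both parties pick the same aligned samples."""
    rng = random.Random(seed * 1_000_003 + step)   # same inputs => same batch
    return rng.sample(range(n_samples), batch_size)

batch_a = minibatch_indices(1000, 128, seed=42, step=0)  # computed by party A
batch_b = minibatch_indices(1000, 128, seed=42, step=0)  # computed by party B
print(batch_a == batch_b)   # True
```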
Specifically, referring to fig. 7, fig. 7 is a schematic flow chart of the forward calculation stage of the first improvement in the embodiment of the present invention. As shown in fig. 7, the two participants each sample a mini-batch of training data, namely X_A and X_B. The first participant A computes the first participant prediction output ciphertext [[Z_A]]_B = X_A·[[W_A]]_B based on the first model weight ciphertext [[W_A]]_B and the first training data X_A, and sends it to the second participant B. The second participant B decrypts the first participant prediction output ciphertext to obtain the first participant prediction output.
Step S30, the second participant calculates according to the first participant prediction output and the second participant prediction output to obtain a joint prediction output, and calculates according to the joint prediction output to obtain a prediction value, wherein the second participant prediction output is calculated by the second participant according to the second training data and the second plaintext model weight;
Further, after receiving the first participant prediction output ciphertext [[Z_A]]_B, the second participant B decrypts it to obtain the first participant prediction output Z_A, and calculates the second participant prediction output Z_B = X_B·W_B from the second training data X_B and the second plaintext model weight W_B. It then performs the forward calculation from the first participant prediction output Z_A and the second participant prediction output Z_B to obtain the joint prediction output Z, i.e., Z = Z_A + Z_B. After the joint prediction output Z is obtained, the predicted value can be calculated through the link function, i.e., ŷ = f(Z).
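With the encryption and decryption steps omitted, the forward calculation above amounts to the following plaintext computation (toy values, single output dimension, sigmoid assumed as the link function):

```python
import math

def matvec_rows(X, W):
    """Per-sample dot products: Z[i] = X[i] . W (single output dimension)."""
    return [sum(xj * wj for xj, wj in zip(x, W)) for x in X]

# Party A's and party B's halves of the same two aligned samples.
X_A, W_A = [[1.0, 0.5], [0.0, 2.0]], [0.2, -0.4]
X_B, W_B = [[3.0], [1.0]], [0.1]

Z_A = matvec_rows(X_A, W_A)                       # A's partial prediction output
Z_B = matvec_rows(X_B, W_B)                       # B's partial prediction output
Z = [za + zb for za, zb in zip(Z_A, Z_B)]         # joint output: Z = Z_A + Z_B
y_hat = [1.0 / (1.0 + math.exp(-z)) for z in Z]   # link function (sigmoid)
print(Z)                                          # ≈ [0.3, -0.7]
```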
Step S40, the second participant generates a combined prediction output gradient based on the prediction value and the sample label, encrypts the combined prediction output gradient to obtain a combined prediction output gradient ciphertext and sends the combined prediction output gradient ciphertext to the first participant, and updates the second plaintext model weight according to the combined prediction output gradient;
Further, referring to fig. 8, fig. 8 is a schematic flow chart of the reverse calculation stage of the first improvement in the embodiment of the present invention. As shown in fig. 8, after the predicted value ŷ is calculated through the link function, the second participant B calculates the joint prediction output gradient ∂l/∂Z from the predicted value ŷ and the sample label y, encrypts it to obtain the joint prediction output gradient ciphertext [[∂l/∂Z]]_B, and sends the ciphertext to the first participant A. Meanwhile, the second participant B calculates the second model weight gradient from the joint prediction output gradient and the second training data, and then updates the second plaintext model weight W_B according to the second model weight gradient.
And step S50, the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
The first participant receives the combined prediction output gradient ciphertext sent by the second participant, and scalar multiplication is carried out on the combined prediction output gradient ciphertext and first training data to obtain a first model gradient ciphertext;
and the first participant updates the first model weight ciphertext according to the first model gradient ciphertext.
Specifically, after the first participant A receives the joint prediction output gradient ciphertext [[∂l/∂Z]]_B, the first model gradient ciphertext can be obtained through homomorphic property calculation with the first training data X_A, and the first model weight ciphertext [[W_A]]_B is then updated according to the first model gradient ciphertext. Thus the two participants each update their own model, where the model update of the first participant A is performed through homomorphic operations; for the second participant B, the model update in this embodiment is the same as in the prior art scheme.
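The dataflow of this update can be sketched with a mock ciphertext class that mimics the two homomorphic operations party A needs (ciphertext addition and plaintext-scalar multiplication) while carrying values in the clear for illustration; a real additive-HE scheme such as Paillier would additionally require fixed-point encoding of the real-valued scalars.

```python
class MockCipher:
    """Stand-in for an additively homomorphic ciphertext; NOT encryption,
    it only mirrors the operations available under additive HE."""
    def __init__(self, v):
        self.v = v
    def __add__(self, other):
        return MockCipher(self.v + other.v)     # ciphertext + ciphertext
    def scale(self, k):
        return MockCipher(self.v * k)           # plaintext scalar * ciphertext

# Party B sends [[grad_Z]] for a batch of BS = 2 samples (1 output dim).
grad_Z = [MockCipher(-0.5), MockCipher(0.5)]
X_A = [[1.0, 2.0], [1.0, -1.0]]                 # party A's features (d = 2)
W_A = [MockCipher(0.0), MockCipher(0.0)]        # [[W_A]]: encrypted weights
lr, bs = 0.1, len(X_A)

# [[grad_W_A]][j] = sum_i X_A[i][j] * [[grad_Z]][i] / BS, all under "encryption"
grad_W = [grad_Z[0].scale(X_A[0][j] / bs) + grad_Z[1].scale(X_A[1][j] / bs)
          for j in range(2)]
W_A = [w + g.scale(-lr) for w, g in zip(W_A, grad_W)]  # [[W_A]] -= lr*[[grad]]
print([w.v for w in W_A])                       # ≈ [0.0, 0.075]
```

Party A never sees the plaintext gradient or weight; it only manipulates ciphertext handles.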
In this embodiment, a first model weight ciphertext is generated by the first participant performing homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext, where the second plaintext model random number ciphertext is generated by the second participant by encrypting a second plaintext model random number and is sent to the first participant; the first participant generates a first participant prediction output ciphertext based on the first model weight ciphertext and first training data and sends it to the second participant, so that the second participant decrypts it to obtain a first participant prediction output; the second participant calculates a joint prediction output from the first participant prediction output and the second participant prediction output, and calculates a predicted value from the joint prediction output, where the second participant prediction output is calculated by the second participant from second training data and the second plaintext model weight; the second participant generates a joint prediction output gradient based on the predicted value and the sample label, encrypts it to obtain a joint prediction output gradient ciphertext, sends the ciphertext to the first participant, and updates the second plaintext model weight according to the joint prediction output gradient; and the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
In the initialization process, the first participant obtains a first model weight ciphertext through homomorphic property calculation, so the model weight of the first participant is protected; in the reverse calculation process, the first participant's model is updated through homomorphic operations, so the first participant never holds its own plaintext model weight or plaintext model gradient, and the label leakage problem of the existing scheme is thereby solved.
Referring to fig. 9, fig. 9 is a flowchart illustrating a federal model training method according to another exemplary embodiment of the present invention. The federal model training method comprises the following steps:
step A10, the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant according to a second plaintext model random number encryption and is sent to the first participant;
Before the step in which the first participant performs homomorphic property calculation based on the first plaintext model weight random number and the second plaintext model random number ciphertext to generate the first model weight ciphertext, the method further comprises:
the first participant generates the first plaintext model weight random number and a first plaintext model random number, and encrypts the first plaintext model random number to obtain a first plaintext model random number ciphertext;
and the second participant generates the second plaintext model weight random number and a second plaintext model random number, and encrypts the second plaintext model random number to obtain a second plaintext model random number ciphertext.
Specifically, referring to fig. 10, fig. 10 is a schematic flow chart of the initialization stage of the second improvement in the embodiment of the present invention. As shown in fig. 10, the first participant A generates a first plaintext model weight random number W_A for its own side and a first plaintext model random number R_B for the other side. The first participant A further encrypts the first plaintext model random number R_B to obtain the first plaintext model random number ciphertext [[R_B]]_A and sends it to the second participant B, receives the second plaintext model random number ciphertext [[R_A]]_B sent by the second participant B, and then performs homomorphic property calculation based on the first plaintext model weight random number W_A and the second plaintext model random number ciphertext [[R_A]]_B to generate the first model weight ciphertext.
Step A20, the second participant performs homomorphic property calculation based on a second plaintext model weight random number and a first plaintext model random number ciphertext to generate a second model weight ciphertext, wherein the first plaintext model random number ciphertext is generated by the first participant according to the first plaintext model random number in an encryption manner and is sent to the second participant;
Symmetrically, the second participant B generates a second plaintext model weight random number W_B for its own party and a second plaintext model random number R_A for the other party. The second participant B further encrypts the second plaintext model random number R_A to obtain a second plaintext model random number ciphertext [[R_A]], sends it to the first participant A, and receives the first plaintext model random number ciphertext [[R_B]] sent by the first participant A. The second participant B then performs homomorphic property calculation on the second plaintext model weight random number W_B and the first plaintext model random number ciphertext [[R_B]] to generate the second model weight ciphertext, i.e. [[W_B + R_B]] = W_B + [[R_B]].
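The masked-weight initialization exchange above can be sketched as follows. This is an illustrative sketch only: the `Ct` class merely simulates the additive homomorphic property of a scheme such as Paillier and is not a secure encryption; the variable names follow the W_A / R_A notation of this description.

```python
import random
from dataclasses import dataclass

@dataclass
class Ct:
    """Toy stand-in for an additively homomorphic ciphertext (NOT secure)."""
    v: float  # a real scheme would store an encryption, not the plaintext
    def __add__(self, o): return Ct(self.v + (o.v if isinstance(o, Ct) else o))
    __radd__ = __add__
    def __mul__(self, k): return Ct(self.v * k)
    __rmul__ = __mul__

def enc(x): return Ct(float(x))   # stand-in for homomorphic encryption
def dec(c): return c.v            # stand-in for decryption

random.seed(0)
# Party A: weight random number W_A for itself, random number R_B for party B.
W_A, R_B = random.random(), random.random()
# Party B: weight random number W_B for itself, random number R_A for party A.
W_B, R_A = random.random(), random.random()

ct_R_B = enc(R_B)   # A -> B : first plaintext model random number ciphertext [[R_B]]
ct_R_A = enc(R_A)   # B -> A : second plaintext model random number ciphertext [[R_A]]

# Homomorphic property: plaintext + ciphertext yields a ciphertext of the sum.
ct_W_A = W_A + ct_R_A   # first model weight ciphertext  [[W_A + R_A]], held by A
ct_W_B = W_B + ct_R_B   # second model weight ciphertext [[W_B + R_B]], held by B
```

Neither party learns the initial model weight in plaintext: A knows only W_A and holds [[W_A + R_A]], while R_A was drawn by B.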
Step A30, the first participant performs forward calculation based on the first model weight ciphertext and first training data to generate a first part of prediction output ciphertext, and sends the first part of prediction output ciphertext to the second participant;
the first participant generating a first model noise;
the first participant calculates according to the first model noise, the first model weight ciphertext and the first training data to generate the first part of prediction output ciphertext, and sends the first part of prediction output ciphertext to the second participant;
Before this, the two participants each sample a small batch (mini-batch) of training data, i.e. the first training data X_A and the second training data X_B. Further, referring to fig. 11, fig. 11 is a schematic flow chart of a forward calculation stage of a second improvement in the embodiment of the present invention. As shown in fig. 11, the first participant A generates a first model noise ε_A, then performs calculation according to the first model noise ε_A, the first model weight ciphertext [[W_A + R_A]] and the first training data X_A to generate the first part of prediction output ciphertext [[Z_A]], and sends the first part of prediction output ciphertext [[Z_A]] to the second participant B.
Step A40, the second participant performs forward calculation based on the second model weight ciphertext and second training data to generate a second part of prediction output ciphertext, and sends the second part of prediction output ciphertext to the first participant;
the second participant generating a second model noise;
and the second participant calculates according to the second model noise, the second model weight ciphertext and the second training data to generate a second part of prediction output ciphertext and sends the second part of prediction output ciphertext to the first participant.
Symmetrically, the second participant B generates a second model noise ε_B, then performs calculation according to the second model noise ε_B, the second model weight ciphertext [[W_B + R_B]] and the second training data X_B to generate the second part of prediction output ciphertext [[Z_B]], and sends the second part of prediction output ciphertext [[Z_B]] to the first participant A.
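The forward calculation of both parties can be sketched as below, for a single sample with one feature per party. The `Ct` class again only simulates the additive homomorphic property; the concrete numbers, and the exact placement of the noise term (added after the ciphertext-times-data product), are illustrative assumptions, since the figure formulas are not reproduced in this text.

```python
from dataclasses import dataclass

@dataclass
class Ct:
    """Toy stand-in for an additively homomorphic ciphertext (NOT secure)."""
    v: float
    def __add__(self, o): return Ct(self.v + (o.v if isinstance(o, Ct) else o))
    __radd__ = __add__
    def __mul__(self, k): return Ct(self.v * k)
    __rmul__ = __mul__

def enc(x): return Ct(float(x))
def dec(c): return c.v

# Masked model weight ciphertexts from the initialization phase (illustrative values).
ct_W_A = enc(0.7)    # [[W_A + R_A]], held by party A
ct_W_B = enc(-0.2)   # [[W_B + R_B]], held by party B
X_A, X_B = 1.5, 2.0          # mini-batch of one sample, one feature per party
eps_A, eps_B = 0.31, -0.12   # per-iteration model noises

# Forward: plaintext data times weight ciphertext, plus plaintext noise,
# all under the homomorphic operations — no weight is ever decrypted here.
ct_Z_A = X_A * ct_W_A + eps_A   # first part prediction output ciphertext, sent to B
ct_Z_B = X_B * ct_W_B + eps_B   # second part prediction output ciphertext, sent to A
```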
Step A50, the first participant decrypts the second part of the prediction output ciphertext to obtain a second part of the prediction output, generates a first prediction output result according to the second part of the prediction output, and sends the first prediction output result to the second participant;
the first participant receives and decrypts a second part of prediction output ciphertext sent by the second participant to obtain the second part of prediction output;
the first participant obtains the first prediction output result according to the second part prediction output and the first model noise, and sends the first prediction output result to the second participant;
Further, after the first participant A receives the second part of prediction output ciphertext [[Z_B]] sent by the second participant B, it decrypts the ciphertext to obtain the second part of prediction output, and then obtains the first prediction output result Z'_A according to the second part of prediction output and the first model noise ε_A. The first participant A then sends the first prediction output result Z'_A to the second participant B, for the second participant B to calculate the joint prediction output Z and the predicted value ŷ.
Step A60, the second participant decrypts the first part of prediction output ciphertext to obtain a first part of prediction output, generates a second prediction output result according to the first part of prediction output, calculates according to the first prediction output result and the second prediction output result to obtain a joint prediction output, and calculates according to the joint prediction output to obtain a prediction value;
the second participant receives and decrypts the first part of prediction output ciphertext sent by the first participant to obtain the first part of prediction output;
the second participant obtains a second prediction output result according to the first part prediction output and the second model noise;
the second participant performs scalar addition calculation according to the first prediction output result and the second prediction output result to obtain the joint prediction output;
and the second party obtains the predicted value according to the joint prediction output.
Symmetrically, after the second participant B receives the first part of prediction output ciphertext [[Z_A]] sent by the first participant A, it decrypts the ciphertext to obtain the first part of prediction output, and then obtains the second prediction output result Z'_B according to the first part of prediction output and the second model noise ε_B. The second participant B then performs calculation according to the first prediction output result Z'_A and the second prediction output result Z'_B to obtain the joint prediction output Z and the predicted value ŷ.
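The decrypt-and-combine stage can be sketched as follows. One consistent choice of noise handling — each party subtracts its own model noise, so the two noises cancel in the scalar addition — is assumed here, since the patent text does not spell out the formula; likewise, computing the predicted value with a sigmoid assumes a logistic model.

```python
import math

# True partial scores and model noises (illustrative values).
Z_A, eps_A = 1.05, 0.31    # party A's noiseless partial prediction and its noise
Z_B, eps_B = -0.40, -0.12  # party B's noiseless partial prediction and its noise

# What each party recovers by decrypting the other party's noisy ciphertext.
dec_at_B = Z_A + eps_A     # B decrypts [[Z_A + eps_A]] -> first part prediction output
dec_at_A = Z_B + eps_B     # A decrypts [[Z_B + eps_B]] -> second part prediction output

# Assumed noise handling: each party removes its OWN noise from the value it
# decrypted, so eps_A and eps_B cancel once the two results are added.
Z_prime_A = dec_at_A - eps_A   # first prediction output result, A sends this to B
Z_prime_B = dec_at_B - eps_B   # second prediction output result, computed by B

Z = Z_prime_A + Z_prime_B              # joint prediction output (scalar addition)
y_hat = 1.0 / (1.0 + math.exp(-Z))     # predicted value, assuming a logistic model
```

Note that `Z` equals `Z_A + Z_B` exactly: the noises protect the partial outputs in transit but do not bias the joint prediction.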
Step A70, the second participant generates a combined prediction output gradient based on the prediction value and the sample label, updates the second model weight ciphertext according to the combined prediction output gradient, encrypts the combined prediction output gradient to generate a combined prediction output gradient ciphertext and sends the combined prediction output gradient ciphertext to the first participant;
Referring to fig. 12, fig. 12 is a schematic diagram of a flow of a reverse calculation stage of a second modification in the embodiment of the present invention. As shown in fig. 12, the second participant B performs calculation according to the predicted value ŷ and the sample label y to obtain the combined prediction output gradient δ, encrypts δ to obtain the combined prediction output gradient ciphertext [[δ]], and then sends the combined prediction output gradient ciphertext [[δ]] to the first participant A. Meanwhile, the second participant B can perform calculation according to the combined prediction output gradient δ and the second training data X_B to obtain the second model weight gradient g_B, and further update the second model weight ciphertext [[W_B + R_B]] according to the second model weight gradient g_B.
Step A80, the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
After the first participant A receives the combined prediction output gradient ciphertext [[δ]] sent by the second participant B, it performs homomorphic property calculation on the combined prediction output gradient ciphertext [[δ]] and the first training data X_A to obtain the first model gradient ciphertext [[g_A]], and further updates the first model weight ciphertext [[W_A + R_A]] according to the first model gradient ciphertext [[g_A]].
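The reverse calculation stage can be sketched as below. Taking the gradient as ŷ − y assumes a logistic loss, which the patent does not fix; the `Ct` class again only simulates the additive homomorphic property, and the numeric values are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Ct:
    """Toy stand-in for an additively homomorphic ciphertext (NOT secure)."""
    v: float
    def __add__(self, o): return Ct(self.v + (o.v if isinstance(o, Ct) else o))
    __radd__ = __add__
    def __mul__(self, k): return Ct(self.v * k)
    __rmul__ = __mul__

def enc(x): return Ct(float(x))
def dec(c): return c.v

y_hat, y = 0.657, 1.0      # predicted value and sample label (illustrative)
X_A, X_B = 1.5, 2.0        # the parties' training data for this mini-batch
lr = 0.1                   # learning rate
ct_W_A = enc(0.7)          # first model weight ciphertext, held by A
ct_W_B = enc(-0.2)         # second model weight ciphertext, held by B

# Party B: combined prediction output gradient (y_hat - y assumes logistic loss).
delta = y_hat - y
ct_delta = enc(delta)               # [[delta]], sent to party A

# Party A: homomorphic scalar multiplication with its own training data;
# the gradient and the updated weight never leave ciphertext on A's side.
ct_g_A = X_A * ct_delta             # first model gradient ciphertext [[X_A * delta]]
ct_W_A = ct_W_A + (-lr) * ct_g_A    # gradient-descent update, still encrypted

# Party B: plaintext gradient, homomorphic update of its weight ciphertext.
g_B = X_B * delta
ct_W_B = ct_W_B + (-lr * g_B)
```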
In this embodiment, with the above scheme, specifically, a first model weight ciphertext is generated by performing homomorphic property calculation on the basis of a first plaintext model weight random number and a second plaintext model random number ciphertext by the first party, where the second plaintext model random number ciphertext is generated by the second party according to encryption of a second plaintext model random number and is sent to the first party; the second participant performs homomorphic property calculation based on a second plaintext model weight random number and a first plaintext model random number ciphertext to generate a second model weight ciphertext, wherein the first plaintext model random number ciphertext is generated by the first participant according to the first plaintext model random number encryption and is sent to the second participant; the first participant performs forward calculation on the basis of the first model weight ciphertext and first training data to generate a first part of prediction output ciphertext, and sends the first part of prediction output ciphertext to the second participant; the second participant performs forward calculation based on the second model weight ciphertext and second training data to generate a second part of prediction output ciphertext, and sends the second part of prediction output ciphertext to the first participant; the first participant decrypts the second part of prediction output ciphertext to obtain a second part of prediction output, generates a first prediction output result according to the second part of prediction output, and sends the first prediction output result to the second participant; the second participant decrypts the first part of prediction output ciphertext to obtain a first part of prediction output, generates a second prediction output result according to the first part of prediction output, calculates according to the first prediction output result and the 
second prediction output result to obtain a joint prediction output, and calculates according to the joint prediction output to obtain a prediction value; the second participant generates a combined prediction output gradient based on the prediction value and the sample label, updates the second model weight ciphertext according to the combined prediction output gradient, encrypts the combined prediction output gradient to generate a combined prediction output gradient ciphertext, and sends the combined prediction output gradient ciphertext to the first participant; and the first participant performs homomorphic property calculation according to the combined prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext. Similar to the embodiment shown in fig. 5, the first participant cannot obtain its own plaintext model weight or plaintext model gradient, which solves the problem of label leakage in the existing scheme. Moreover, in this embodiment the second participant cannot obtain plaintext intermediate results during the forward calculation, which provides better protection for the data security of the first participant, thereby further improving the security of the federal model training.
In addition, the federal model training method provided in the embodiment of the present invention may be applied to a federal model inference service. The model inference stage performs model inference (also referred to as model prediction) on data other than the training data, for example during a model performance test or when serving model inference requests. Model inference requires only the forward calculation phase described above. Therefore, the privacy protection protocol provided by the present invention is also suitable for a vertical federated inference service.
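Since inference reuses only the forward calculation phase, it can be sketched as a single joint prediction. As before, the `Ct` class is a toy stand-in for an additively homomorphic ciphertext, the noise-cancellation rule (each party subtracts its own noise) and the logistic output are assumptions, and all inputs are illustrative.

```python
import math
from dataclasses import dataclass

@dataclass
class Ct:
    """Toy stand-in for an additively homomorphic ciphertext (NOT secure)."""
    v: float
    def __add__(self, o): return Ct(self.v + (o.v if isinstance(o, Ct) else o))
    __radd__ = __add__
    def __mul__(self, k): return Ct(self.v * k)
    __rmul__ = __mul__

def enc(x): return Ct(float(x))
def dec(c): return c.v

def federated_predict(ct_W_A, ct_W_B, x_A, x_B, eps_A, eps_B):
    """One joint inference: only the forward phase of the training protocol."""
    ct_Z_A = x_A * ct_W_A + eps_A      # A -> B : noisy partial prediction ciphertext
    ct_Z_B = x_B * ct_W_B + eps_B      # B -> A : noisy partial prediction ciphertext
    Z_prime_A = dec(ct_Z_B) - eps_A    # A decrypts and removes its own noise
    Z_prime_B = dec(ct_Z_A) - eps_B    # B decrypts and removes its own noise
    Z = Z_prime_A + Z_prime_B          # the noises cancel in the sum
    return 1.0 / (1.0 + math.exp(-Z))  # predicted value (logistic assumption)

p = federated_predict(enc(0.75), enc(-0.13), x_A=0.9, x_B=1.1, eps_A=0.2, eps_B=-0.4)
```

Neither party ever sees the other party's data or the plaintext model weights during this exchange, which is why the same protocol serves both training-time forward passes and inference.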
In addition, an embodiment of the present invention further provides a federal model training apparatus, where the federal model training apparatus includes:
the first participant module is used for performing homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant according to the second plaintext model random number in an encryption manner and is sent to the first participant;
the first participant module is further configured to generate and send a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output;
the second participant module is used for calculating according to the first participant prediction output and a second participant prediction output to obtain a joint prediction output and calculating according to the joint prediction output to obtain a predicted value, wherein the second participant prediction output is obtained by the second participant according to second training data and second plaintext model weight;
the second participant module is further configured to generate a joint prediction output gradient based on the prediction value and the sample tag, encrypt the joint prediction output gradient to obtain a joint prediction output gradient ciphertext and send the joint prediction output gradient ciphertext to the first participant, and update the second plaintext model weight according to the joint prediction output gradient;
the first participant module is further configured to perform homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and update the first model weight ciphertext according to the first model gradient ciphertext.
For the principle and implementation process of federal model training, please refer to the above embodiments, which are not described herein again.
In addition, an embodiment of the present invention further provides a terminal device, where the terminal device includes a memory, a processor, and a federal model training program stored on the memory and capable of running on the processor, and the federal model training program implements the steps of the federal model training method described above when executed by the processor.
Since the federal model training program is executed by the processor, it adopts all the technical solutions of the foregoing embodiments and therefore achieves at least all the beneficial effects brought by those technical solutions, which are not described in detail herein.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where a federal model training program is stored on the computer-readable storage medium, and when being executed by a processor, the computer-readable storage medium implements the steps of the federal model training method as described above.
Since the federal model training program is executed by the processor, it adopts all the technical solutions of the foregoing embodiments and therefore achieves at least all the beneficial effects brought by those technical solutions, which are not described in detail herein.
Compared with the prior art, the federal model training method, the apparatus, the terminal device and the storage medium provided by the embodiment of the invention generate a first model weight ciphertext by performing homomorphic property calculation on the basis of a first plaintext model weight random number and a second plaintext model random number ciphertext by the first participant, wherein the second plaintext model random number ciphertext is generated by the second participant according to a second plaintext model random number in an encryption manner and is sent to the first participant; the first participant generates and sends a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output; the second participant calculates to obtain joint prediction output according to the first participant prediction output and the second participant prediction output, and calculates to obtain a prediction value according to the joint prediction output, wherein the second participant prediction output is obtained by the second participant according to second training data and second plaintext model weight; the second participant generates a combined prediction output gradient based on the prediction value and the sample label, encrypts the combined prediction output gradient to obtain a combined prediction output gradient ciphertext and sends the combined prediction output gradient ciphertext to the first participant, and updates the weight of the second plaintext model according to the combined prediction output gradient; and the first participant performs homomorphic property calculation according to the combined prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first 
model gradient ciphertext. In the initialization process, the first participant obtains the first model weight ciphertext through homomorphic property calculation, which protects the model weight of the first participant; in the reverse calculation process, the first participant's model is updated through homomorphic property calculation, so that the first participant cannot obtain its own plaintext model weight or plaintext model gradient, thereby solving the problem of label leakage in the existing scheme.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or system that includes the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, a controlled terminal, or a network device) to execute the method of each embodiment of the present application.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (10)
1. A federated model training method is applied to a federated learning system, the federated learning system comprises a first participant and a second participant, the second participant possesses a sample label, and the federated model training method comprises the following steps:
the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant according to the second plaintext model random number encryption and is sent to the first participant;
the first participant generates and sends a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output;
the second participant calculates to obtain joint prediction output according to the first participant prediction output and the second participant prediction output, and calculates to obtain a predicted value according to the joint prediction output, wherein the second participant prediction output is obtained by the second participant according to second training data and second plaintext model weight;
the second participant generates a combined prediction output gradient based on the prediction value and the sample label, encrypts the combined prediction output gradient to obtain a combined prediction output gradient ciphertext and sends the combined prediction output gradient ciphertext to the first participant, and updates the weight of the second plaintext model according to the combined prediction output gradient;
and the first participant performs homomorphism calculation according to the combined prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
2. The federal model training method of claim 1, wherein the first party performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext, and the step of generating the first model weight ciphertext further comprises:
the first participant generates the first plaintext model weight random number;
the second party generates the second plaintext model weight and the second plaintext model random number, and generates a second plaintext model random number ciphertext according to the second plaintext model random number encryption;
and the second party sends the second plaintext model random number ciphertext to the first party, so that the first party performs homomorphic property calculation based on the first plaintext model weight random number and the second plaintext model random number ciphertext.
3. The federated model training method of claim 1, wherein the step of the first participant performing homomorphic property calculations based on the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updating the first model weight ciphertext based on the first model gradient ciphertext comprises:
the first participant receives the combined prediction output gradient ciphertext sent by the second participant, and scalar multiplication is carried out on the combined prediction output gradient ciphertext and first training data to obtain a first model gradient ciphertext;
and the first participant updates the first model weight ciphertext according to the first model gradient ciphertext.
4. A federated model training method is applied to a federated learning system, the federated learning system comprises a first participant and a second participant, the second participant possesses a sample label, and the federated model training method comprises the following steps:
the first participant performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant according to a second plaintext model random number encryption and is sent to the first participant;
the second participant performs homomorphic property calculation based on a second plaintext model weight random number and a first plaintext model random number ciphertext to generate a second model weight ciphertext, wherein the first plaintext model random number ciphertext is generated by the first participant through encryption according to the first plaintext model random number and is sent to the second participant;
the first participant performs forward calculation on the basis of the first model weight ciphertext and first training data to generate a first part of prediction output ciphertext, and sends the first part of prediction output ciphertext to the second participant;
the second participant performs forward calculation based on the second model weight ciphertext and second training data to generate a second part of prediction output ciphertext, and sends the second part of prediction output ciphertext to the first participant;
the first participant decrypts the second part of prediction output ciphertext to obtain a second part of prediction output, generates a first prediction output result according to the second part of prediction output, and sends the first prediction output result to the second participant;
the second participant decrypts the first part of prediction output ciphertext to obtain a first part of prediction output, generates a second prediction output result according to the first part of prediction output, calculates according to the first prediction output result and the second prediction output result to obtain joint prediction output, and calculates according to the joint prediction output to obtain a prediction value;
the second participant generates a combined prediction output gradient based on the prediction value and the sample label, updates the second model weight ciphertext according to the combined prediction output gradient, encrypts the combined prediction output gradient to generate a combined prediction output gradient ciphertext, and sends the combined prediction output gradient ciphertext to the first participant;
and the first participant performs homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and updates the first model weight ciphertext according to the first model gradient ciphertext.
5. The federal model training method as in claim 4, wherein the first party performs homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext, and wherein before the step of generating the first model weight ciphertext the method further comprises:
the first participant generates the first plaintext model weight random number and a first plaintext model random number, and encrypts the first plaintext model random number to obtain a first plaintext model random number ciphertext;
and the second participant generates the second plaintext model weight random number and a second plaintext model random number, and encrypts the second plaintext model random number to obtain a second plaintext model random number ciphertext.
6. The federated model training method of claim 4, wherein the first participant performs a forward computation based on the first model weight ciphertext and first training data to generate a first portion of the prediction output ciphertext, and sends the first portion of the prediction output ciphertext to the second participant comprises:
the first participant generating a first model noise;
the first participant calculates according to the first model noise, the first model weight ciphertext and the first training data to generate the first part of prediction output ciphertext, and sends the first part of prediction output ciphertext to the second participant;
the step of the second participant performing forward calculation based on the second model weight ciphertext and second training data to generate a second part of prediction output ciphertext, and sending the second part of prediction output ciphertext to the first participant includes:
the second participant generating a second model noise;
and the second participant calculates according to the second model noise, the second model weight ciphertext and the second training data to generate a second part of prediction output ciphertext, and sends the second part of prediction output ciphertext to the first participant.
7. The federal model training method as in claim 6, wherein the step of the first party decrypting the second part of the prediction output ciphertext to obtain a second part of the prediction output, generating a first prediction output result according to the second part of the prediction output, and sending the first prediction output result to the second party comprises:
the first participant receives and decrypts a second part of prediction output ciphertext sent by the second participant to obtain the second part of prediction output;
the first participant obtains the first prediction output result according to the second part prediction output and the first model noise, and sends the first prediction output result to the second participant;
the second participant decrypts the first part of the prediction output ciphertext to obtain a first part of the prediction output, generates a second prediction output result according to the first part of the prediction output, calculates a joint prediction output according to the first prediction output result and the second prediction output result, and calculates a prediction value according to the joint prediction output, wherein the step of:
the second participant receives and decrypts the first part of prediction output ciphertext sent by the first participant to obtain the first part of prediction output;
the second participant obtains a second prediction output result according to the first part prediction output and the second model noise;
the second participant carries out scalar addition calculation according to the first prediction output result and the second prediction output result to obtain the joint prediction output;
and the second party obtains the predicted value according to the joint prediction output.
8. A federal model training device, wherein the federal model training device comprises:
the first participant module is used for performing homomorphic property calculation based on a first plaintext model weight random number and a second plaintext model random number ciphertext to generate a first model weight ciphertext, wherein the second plaintext model random number ciphertext is generated by the second participant module according to the second plaintext model random number in an encryption manner and is sent to the first participant;
the first participant module is further configured to generate and send a first participant prediction output ciphertext to the second participant based on the first model weight ciphertext and first training data, so that the second participant decrypts the first participant prediction output ciphertext to obtain a first participant prediction output;
the second participant module is configured to calculate a joint prediction output according to the first participant prediction output and a second participant prediction output, and calculate a prediction value according to the joint prediction output, where the second participant prediction output is calculated by the second participant according to second training data and a second plaintext model weight;
the second participant module is further configured to generate a joint prediction output gradient based on the prediction value and a sample label, encrypt the joint prediction output gradient to obtain a joint prediction output gradient ciphertext, send the joint prediction output gradient ciphertext to the first participant, and update the second plaintext model weight according to the joint prediction output gradient;
the first participant module is further configured to perform homomorphic property calculation according to the joint prediction output gradient ciphertext to obtain a first model gradient ciphertext, and update the first model weight ciphertext according to the first model gradient ciphertext.
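The gradient-ciphertext flow between the two modules can be sketched under an additively homomorphic scheme. This is an illustrative assumption, not the patent's specified construction: the cryptosystem (a toy Paillier here), the integer gradient values, the learning rate handling, and all variable names are hypothetical, and signed values are encoded mod n:

```python
import random
from math import lcm

# Toy additively homomorphic (Paillier) setup -- illustrative only.
p, q = 999983, 1000003
n, n2 = p * q, (p * q) ** 2
lam = lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                      # works because g = n + 1

def encrypt(m):
    r = random.randrange(1, n)
    return pow(1 + n, m, n2) * pow(r, n, n2) % n2

def decrypt(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

def to_plain(m):                          # map Z_n back to signed integers
    return m - n if m > n // 2 else m

# Second participant module: encrypts the joint prediction output gradient
# (one value per sample) and sends the ciphertexts to the first participant.
d = [3, -1, 2]
enc_d = [encrypt(v % n) for v in d]

# First participant module: combines its plaintext feature column with the
# gradient ciphertexts to obtain the first model gradient ciphertext,
# without ever seeing the gradient in the clear.
x1 = [5, 4, 1]
enc_g1 = 1
for cj, xj in zip(enc_d, x1):
    enc_g1 = enc_g1 * pow(cj, xj, n2) % n2   # homomorphically adds xj * dj

# First participant module: updates its weight entirely in the ciphertext
# domain: Enc(w1) * Enc(g1)^(-eta) = Enc(w1 - eta * g1).
eta = 2
enc_w1 = encrypt(10)                      # current first model weight ciphertext
enc_w1 = enc_w1 * pow(enc_g1, (-eta) % n, n2) % n2

# Only the key holder (the second participant) could decrypt; done here
# purely to check the arithmetic: g1 = 5*3 + 4*(-1) + 1*2 = 13.
assert to_plain(decrypt(enc_g1)) == 13
assert to_plain(decrypt(enc_w1)) == 10 - eta * 13    # -16
```

The point of the design, as claimed, is that the first participant's weights exist only as ciphertexts on its side, while the second participant never observes the first participant's training data.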
9. A terminal device comprising a memory, a processor, and a federated model training program stored in the memory and executable on the processor, wherein the federated model training program, when executed by the processor, implements the federated model training method as recited in any one of claims 1-3 or 4-7.
10. A computer-readable storage medium having stored thereon a federated model training program that, when executed by a processor, implements the federated model training method as recited in any one of claims 1-3 or 4-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210363190.8A CN114462626B (en) | 2022-04-08 | 2022-04-08 | Federal model training method and device, terminal equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114462626A CN114462626A (en) | 2022-05-10 |
CN114462626B (en) | 2022-07-19
Family
ID=81417638
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210363190.8A Active CN114462626B (en) | 2022-04-08 | 2022-04-08 | Federal model training method and device, terminal equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114462626B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116541870B (en) * | 2023-07-04 | 2023-09-05 | 北京富算科技有限公司 | Method and device for evaluating federal learning model |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399742A (en) * | 2019-07-29 | 2019-11-01 | 深圳前海微众银行股份有限公司 | Training and prediction method and apparatus for a federated transfer learning model |
CN110572253A (en) * | 2019-09-16 | 2019-12-13 | 济南大学 | Method and system for enhancing privacy of federated learning training data |
CN113037460A (en) * | 2021-03-03 | 2021-06-25 | 北京工业大学 | Federal learning privacy protection method based on homomorphic encryption and secret sharing |
CN113505882A (en) * | 2021-05-14 | 2021-10-15 | 深圳市腾讯计算机系统有限公司 | Data processing method based on federal neural network model, related equipment and medium |
CN113657685A (en) * | 2021-08-25 | 2021-11-16 | 深圳前海微众银行股份有限公司 | Federal model training method, device, equipment, storage medium and program |
CN114139721A (en) * | 2021-11-26 | 2022-03-04 | 北京科技大学 | Distributed learning ciphertext calculation efficiency improving method based on homomorphic encryption |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113505894B (en) * | 2021-06-02 | 2023-12-15 | 北京航空航天大学 | Longitudinal federal learning linear regression and logistic regression model training method and device |
- 2022-04-08: Application CN202210363190.8A filed in China (CN); patent CN114462626B granted, legal status Active
Non-Patent Citations (1)
Title |
---|
VF2Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning; Fangcheng Fu et al.; ACM Digital Library; 2021-06-25; full text *
Also Published As
Publication number | Publication date |
---|---|
CN114462626A (en) | 2022-05-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021197037A1 (en) | Method and apparatus for jointly performing data processing by two parties | |
CN113516256B (en) | Third-party-free federal learning method and system based on secret sharing and homomorphic encryption | |
CN111512589B (en) | Method for fast secure multiparty inner product with SPDZ | |
EP3075098B1 (en) | Server-aided private set intersection (psi) with data transfer | |
Siddiqui et al. | A highly nonlinear substitution-box (S-box) design using action of modular group on a projective line over a finite field | |
CN112989368B (en) | Method and device for processing private data by combining multiple parties | |
WO2020015478A1 (en) | Model-based prediction method and device | |
WO2022247576A1 (en) | Data processing method and apparatus, device, and computer-readable storage medium | |
CN113033828B (en) | Model training method, using method, system, credible node and equipment | |
CN111723404A (en) | Method and device for jointly training business model | |
CN113127916A (en) | Data set processing method, data processing device and storage medium | |
TWI720622B (en) | Security model prediction method and device based on secret sharing | |
CN113609781B (en) | Method, system, equipment and medium for optimizing automobile production die based on federal learning | |
JPWO2015155896A1 (en) | Support vector machine learning system and support vector machine learning method | |
Erkin et al. | Privacy-preserving distributed clustering | |
Attaullah et al. | Cryptosystem techniques based on the improved Chebyshev map: an application in image encryption | |
Qiu et al. | Quantum digital signature for the access control of sensitive data in the big data era | |
CN116502732B (en) | Federal learning method and system based on trusted execution environment | |
Ahamed et al. | SMS encryption and decryption using modified vigenere cipher algorithm | |
Zhang et al. | PPNNP: A privacy-preserving neural network prediction with separated data providers using multi-client inner-product encryption | |
CN112818369A (en) | Combined modeling method and device | |
CN114462626B (en) | Federal model training method and device, terminal equipment and storage medium | |
Dai et al. | Vertical federated DNN training | |
CN114492850A (en) | Model training method, device, medium, and program product based on federal learning | |
CN117521102A (en) | Model training method and device based on federal learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||