CN109325584B - Federated modeling method and device based on a neural network, and readable storage medium - Google Patents


Info

Publication number
CN109325584B
CN109325584B (application CN201810913868.9A)
Authority
CN
China
Prior art keywords
neural network
terminal
trained
model
gradient
Prior art date
Legal status
Active
Application number
CN201810913868.9A
Other languages
Chinese (zh)
Other versions
CN109325584A (en)
Inventor
刘洋
陈天健
范涛
成柯葳
杨强
Current Assignee
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN201810913868.9A
Publication of CN109325584A
Application granted
Publication of CN109325584B

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/045 — Combinations of networks
    • G06N3/08 — Learning methods

Abstract

The invention discloses a federated modeling method, device, and readable storage medium based on a neural network. The method comprises: a first terminal inputs labeled first sample data into a first neural network of a model to be trained for iteration, and homomorphically encrypts the first output of the iterated first neural network; the first terminal receives a homomorphically encrypted second output sent by a second terminal; the first terminal calculates an encrypted loss value and gradient value from the two homomorphically encrypted outputs and transmits them to a third terminal; the third terminal decrypts the encrypted loss value and judges, from the decrypted loss value, whether the model to be trained has converged; if so, training ends and the trained model is obtained. The invention improves both the privacy and the utilization rate of each party's sample data.

Description

Federated modeling method and device based on a neural network, and readable storage medium
Technical Field
The invention relates to the technical field of machine learning, and in particular to a federated modeling method, device, and readable storage medium based on a neural network.
Background
With the rapid development of machine learning, it can be applied in many fields, such as data mining, computer vision, natural language processing, biometric identification, medical diagnosis, credit card fraud detection, stock market analysis, and DNA sequencing. Machine learning comprises a learning part and an executing part: the learning part uses sample data to modify the system's knowledge base so as to improve the efficiency with which the executing part completes its tasks, while the executing part completes tasks according to the knowledge base and feeds the information it obtains back to the learning part.
At present, because the parties' sample data are closely related, a model learned from only one party's data is inaccurate. To address this, single-layer simple models such as logistic regression or decision trees have been applied to machine learning over the combined sample data of all parties. However, because the parties' sample data must be pooled, one party's data may become known to the other; moreover, a single-layer simple model cannot effectively exploit each party's sample data.
Therefore, how to improve the privacy and the utilization rate of each party's sample data is a problem that urgently needs to be solved.
Disclosure of Invention
The invention mainly aims to provide a federated modeling method, device, and readable storage medium based on a neural network, with the goal of improving the privacy and the utilization rate of each party's sample data.
To achieve the above object, the invention provides a federated modeling method based on a neural network, comprising the following steps:
a first terminal inputs labeled first sample data into a first neural network of a model to be trained for iteration, and homomorphically encrypts the first output of the iterated first neural network;
the first terminal receives a homomorphically encrypted second output sent by a second terminal, where the second terminal inputs second sample data into a second neural network of the model to be trained for iteration, homomorphically encrypts the second output of the iterated second neural network, and transmits it to the first terminal;
the first terminal calculates an encrypted loss value and gradient value from the homomorphically encrypted first and second outputs, and transmits them to a third terminal;
the third terminal decrypts the encrypted loss value and gradient value and judges, from the decrypted loss value, whether the model to be trained has converged; if so, training ends and the trained model is obtained.
Further, the step in which the third terminal decrypts the encrypted loss value and gradient value and judges from the decrypted loss value whether the model to be trained has converged comprises:
the third terminal receives the encrypted loss value sent by the first terminal and obtains the historical loss value previously sent by the first terminal;
the third terminal decrypts the encrypted loss value and the historical loss value with a prestored private key, and judges from the two decrypted values whether the model to be trained has converged.
Further, the step of judging from the decrypted loss value and the historical loss value whether the model to be trained has converged comprises:
calculating the difference between the decrypted loss value and the historical loss value, and judging whether the difference is less than or equal to a preset threshold;
if the difference is less than or equal to the preset threshold, determining that the model to be trained has converged; otherwise, determining that it has not converged.
Further, after the third terminal decrypts the encrypted loss value and gradient value and judges from the decrypted loss value whether the model to be trained has converged, the method further comprises:
if the model to be trained has not converged, the third terminal decrypts the gradient of the model's objective function with respect to the first output and returns it to the first terminal;
the first terminal back-propagates to adjust the local gradients of the first neural network according to the decrypted gradient of the objective function with respect to the first output returned by the third terminal;
the third terminal likewise decrypts the gradient of the objective function with respect to the second output and returns it to the second terminal;
the second terminal back-propagates to adjust the local gradients of the second neural network according to the decrypted gradient of the objective function with respect to the second output returned by the third terminal.
Further, the step in which the first terminal back-propagates to adjust the local gradients of the first neural network according to the decrypted gradient of the objective function with respect to the first output returned by the third terminal comprises:
the first terminal applies polynomial fitting to the gradient function of the model to be trained;
the first terminal receives the decrypted gradient of the objective function with respect to the first output returned by the third terminal;
the first terminal back-propagates to adjust the local gradients of the first neural network according to the polynomially fitted gradient function and the gradient of the objective function with respect to the first output.
Further, the step of homomorphically encrypting the first output of the iterated first neural network comprises:
receiving a public key sent by the third terminal and storing it in a preset area;
homomorphically encrypting the first output of the iterated first neural network with the public key in the preset area.
Further, the federated modeling method based on a neural network further comprises:
when a configuration instruction for initial weights is detected, counting the number of synapses in the first neural network, invoking a preset random number generator, and generating a set of random numbers matching the number of synapses;
configuring the initial weight of each synapse in the first neural network according to the generated set of random numbers.
Further, the step of configuring the initial weight of each synapse in the first neural network according to the generated set of random numbers comprises:
in order of magnitude of the generated random numbers, selecting one random number at a time as an initial weight and assigning it to a synapse in the first neural network, such that each synapse is assigned an initial weight exactly once.
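The counting-and-assignment procedure above can be sketched as follows for a fully connected network. The layer sizes, the weight range, and the ascending-magnitude assignment order are illustrative assumptions, not details fixed by the patent:

```python
import random

def init_synapse_weights(layer_sizes, seed=42):
    # Count the synapses of a fully connected network, draw one random number
    # per synapse, and assign them in order of magnitude, one per synapse.
    rng = random.Random(seed)
    n_synapses = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # Sorting realises the "in order of magnitude" assignment from the claim
    return sorted(rng.uniform(-0.5, 0.5) for _ in range(n_synapses))

weights = init_synapse_weights([4, 8, 1])
assert len(weights) == 4 * 8 + 8 * 1 == 40
assert all(-0.5 <= w <= 0.5 for w in weights)
```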
In addition, to achieve the above object, the invention further provides a federated modeling device based on a neural network, comprising a memory, a processor, and a neural network-based federated modeling program stored on the memory and runnable on the processor; when executed by the processor, the program implements the steps of the neural network-based federated modeling method described above.
The invention also provides a readable storage medium on which a neural network-based federated modeling program is stored; when executed by a processor, the program implements the steps of the neural network-based federated modeling method described above.
The invention provides a federated modeling method, device, and readable storage medium based on a neural network. The labeled sample data of one party is input into one neural network of the model to be trained, and the sample data of the other party is input into the other neural network. When both parties' sample data have passed through their respective networks to the last layer, the outputs of both networks are homomorphically encrypted, and one party's encrypted output is transmitted to the other. That party combines the two homomorphically encrypted outputs to calculate an encrypted loss value and gradient value, which are transmitted to a third party. After the third party decrypts them, whether the model to be trained has converged is judged from the decrypted loss value; if so, training ends and the trained model is obtained. Because the data the two parties must exchange is homomorphically encrypted and joint training can be carried out in encrypted form, the privacy of each party's sample data is effectively improved; at the same time, because the parties' multi-layer neural networks are combined for machine learning, each party's sample data can be exploited effectively, improving its utilization rate.
Drawings
FIG. 1 is a schematic diagram of an apparatus architecture of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of the federated modeling method based on a neural network of the present invention;
FIG. 3 is a diagram illustrating training a model to be trained by combining sample data of two parties according to a first embodiment of the present invention;
FIG. 4 is a schematic flow chart of a second embodiment of the federated modeling method based on a neural network of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present invention.
The federated modeling device based on a neural network may be a fixed terminal device such as a PC (personal computer), or a mobile terminal device with a display function such as a smartphone, tablet computer, or portable computer.
As shown in fig. 1, the neural network-based federated modeling apparatus may include: a processor 1001 (such as a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables communication among these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, it may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be high-speed RAM or non-volatile memory (e.g., magnetic disk storage); optionally, it may be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the neural network-based federated modeling architecture shown in FIG. 1 does not constitute a limitation on neural network-based federated modeling apparatus, and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.
As shown in fig. 1, the memory 1005, which is a type of computer storage medium, may include an operating system, a network communication module, a user interface module, and a neural network-based federated modeling program therein.
In the federated modeling device based on a neural network shown in fig. 1, the network interface 1004 is mainly used for connecting to, and exchanging data with, a background server; the user interface 1003 is mainly used for connecting to, and exchanging data with, a client (user side); and the processor 1001 may be configured to invoke the neural network-based federated modeling program stored in the memory 1005 and perform the following steps:
a first terminal inputs labeled first sample data into a first neural network of a model to be trained for iteration, and homomorphically encrypts the first output of the iterated first neural network;
the first terminal receives a homomorphically encrypted second output sent by a second terminal, where the second terminal inputs second sample data into a second neural network of the model to be trained for iteration, homomorphically encrypts the second output of the iterated second neural network, and transmits it to the first terminal;
the first terminal calculates an encrypted loss value and gradient value from the homomorphically encrypted first and second outputs, and transmits them to a third terminal;
the third terminal decrypts the encrypted loss value and gradient value and judges, from the decrypted loss value, whether the model to be trained has converged; if so, training ends and the trained model is obtained.
Further, the processor 1001 may be configured to invoke the neural network-based federated modeling program stored in the memory 1005, and further perform the following steps:
the third terminal receives the encrypted loss value sent by the first terminal and obtains the historical loss value previously sent by the first terminal;
the third terminal decrypts the encrypted loss value and the historical loss value with a prestored private key, and judges from the two decrypted values whether the model to be trained has converged.
Further, the processor 1001 may be configured to invoke the neural network-based federated modeling program stored in the memory 1005, and further perform the following steps:
calculating the difference between the decrypted loss value and the historical loss value, and judging whether the difference is less than or equal to a preset threshold;
if the difference is less than or equal to the preset threshold, determining that the model to be trained has converged; otherwise, determining that it has not converged.
Further, the processor 1001 may be configured to invoke the neural network-based federated modeling program stored in the memory 1005, and further perform the following steps:
if the model to be trained has not converged, the third terminal decrypts the gradient of the model's objective function with respect to the first output and returns it to the first terminal;
the first terminal back-propagates to adjust the local gradients of the first neural network according to the decrypted gradient of the objective function with respect to the first output returned by the third terminal;
the third terminal likewise decrypts the gradient of the objective function with respect to the second output and returns it to the second terminal;
the second terminal back-propagates to adjust the local gradients of the second neural network according to the decrypted gradient of the objective function with respect to the second output returned by the third terminal.
Further, the processor 1001 may be configured to invoke the neural network-based federated modeling program stored in the memory 1005, and further perform the following steps:
the first terminal applies polynomial fitting to the gradient function of the model to be trained;
the first terminal receives the decrypted gradient of the objective function with respect to the first output returned by the third terminal;
the first terminal back-propagates to adjust the local gradients of the first neural network according to the polynomially fitted gradient function and the gradient of the objective function with respect to the first output.
Further, the processor 1001 may be configured to invoke the neural network-based federated modeling program stored in the memory 1005, and further perform the following steps:
receiving a public key sent by the third terminal and storing it in a preset area;
homomorphically encrypting the first output of the iterated first neural network with the public key in the preset area.
Further, the processor 1001 may be configured to invoke the neural network-based federated modeling program stored in the memory 1005, and further perform the following steps:
when a configuration instruction for initial weights is detected, counting the number of synapses in the first neural network, invoking a preset random number generator, and generating a set of random numbers matching the number of synapses;
configuring the initial weight of each synapse in the first neural network according to the generated set of random numbers.
Further, the processor 1001 may be configured to invoke the neural network-based federated modeling program stored in the memory 1005, and further perform the following steps:
in order of magnitude of the generated random numbers, selecting one random number at a time as an initial weight and assigning it to a synapse in the first neural network, such that each synapse is assigned an initial weight exactly once.
The specific embodiments of the federated modeling device based on a neural network of the present invention are substantially the same as the specific embodiments of the federated modeling method based on a neural network described below, and are not repeated here.
Referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of the federated modeling method based on a neural network of the present invention.
Step S101: a first terminal inputs labeled first sample data into a first neural network of a model to be trained for iteration, and homomorphically encrypts the first output of the iterated first neural network;
in this embodiment, the model to be trained includes at least two machine learning models, at least one of the at least two machine models is a neural network model, and the model to be trained is described below by taking two neural networks and a joint neural network as an example, as shown in fig. 3, the two neural networks are a first neural network and a second neural network, respectively, the first neural network is deployed at a first terminal, and the second neural network is deployed at a second terminal. It should be noted that the network parameters of the first neural network and the second neural network can be set by those skilled in the art based on actual conditions, and this embodiment is not limited in particular. The network parameters include, but are not limited to, the number of network nodes in each layer, the number of hidden layers, the initial weight of each synapse, a learning rate, dynamic parameters, an allowable error, the number of iterations, and an activation function.
In this embodiment, after the model to be trained is determined, a developer deploys its first neural network at the first terminal and its second neural network at the second terminal, and the public keys required by the first and second terminals are held by a third terminal. Labeled first sample data is stored locally at the first terminal, and unlabeled second sample data is stored locally at the second terminal. During training, the first terminal inputs its local, labeled first sample data into the first neural network and homomorphically encrypts the first output of the iterated network, that is, the value of each neuron in the last layer of the first neural network together with the parameter values those neurons contribute to the loss and gradient computations in the loss function and gradient function of the model to be trained. In other words, when the first sample data has been processed through to the last layer of the first neural network, the first output is homomorphically encrypted with the public key obtained from the third terminal. With homomorphic encryption, processing the encrypted data and then decrypting the result yields the same result as applying the same processing to the unencrypted original data.
Specifically, the first terminal receives the public key sent by the third terminal and stores it in a preset area; after the labeled first sample data is input into the first neural network, the first output of the iterated network is homomorphically encrypted with the public key in the preset area.
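The homomorphic property this step relies on can be illustrated with a toy Paillier cryptosystem, an additively homomorphic scheme commonly used in federated learning. The tiny fixed primes and function names below are illustrative assumptions only; the patent does not name a scheme, and a real deployment would use a vetted library with large keys:

```python
import math
import random

def keygen(p=10007, q=10009):
    # Toy Paillier keypair with small fixed primes -- NOT secure, illustration only
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    # With g = n + 1, L(g^lam mod n^2) = lam mod n, so mu = lam^-1 mod n
    mu = pow((pow(n + 1, lam, n * n) - 1) // n, -1, n)
    return n, (lam, mu, n)          # public key n, private key (lam, mu, n)

def encrypt(n, m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:      # r must be invertible mod n
        r = random.randrange(1, n)
    return pow(n + 1, m, n * n) * pow(r, n, n * n) % (n * n)

def decrypt(priv, c):
    lam, mu, n = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n

n, priv = keygen()
c1, c2 = encrypt(n, 12), encrypt(n, 30)
# Multiplying ciphertexts adds the underlying plaintexts: E(12) * E(30) -> E(42)
assert decrypt(priv, c1 * c2 % (n * n)) == 42
```

This is the property that lets the first terminal combine its own encrypted output with the second terminal's without either side seeing the other's plaintext values.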
Step S102: receiving the homomorphically encrypted second output sent by the second terminal, where the second terminal inputs second sample data into the second neural network of the model to be trained for iteration, homomorphically encrypts the second output of the iterated network, and transmits it to the first terminal;
in this embodiment, in the process of training the model to be trained, the second terminal inputs the second sample data located locally into the second neural network of the model to be trained, and outputs the second sample data of the second neural network after iteration, namely, the neuron values of the last layer of the second neural network and the parameter values of the neuron values of the last layer, which are needed by the loss value calculation and the gradient value calculation in the gradient function and the loss function of the model to be trained, are homomorphic encrypted and then transmitted to the first terminal, i.e. the second sample data is processed by the second neural network to the last layer of the second neural network, homomorphically encrypting a second output of the second neural network using a public key obtained from a third terminal, and transmitting the homomorphic encrypted second output of the second neural network to the first terminal, and receiving the homomorphic encrypted second output sent by the second terminal by the first terminal.
Step S103: calculating the encrypted loss value and gradient value from the homomorphically encrypted first and second outputs, and transmitting them to a third terminal;
in this embodiment, the first terminal calculates the encrypted loss value and gradient value based on the homomorphic encrypted first output and second output, and transmits the encrypted loss value and gradient value to the third terminal, i.e. combining the first output of the first neural network, i.e. the neuron values of the last layer of the first neural network, and parameter values required by the participation of each neuron in the gradient function and the loss function of the model to be trained in the computation of the loss value and the computation of the gradient value and a second output of the second neural network, namely, the neuron values of the last layer of the second neural network and the parameter values required by the neuron values of the last layer in the gradient function and the loss function of the model to be trained to participate in the computation of the loss value and the computation of the gradient value, and calculating the loss value and the gradient value in a ciphertext mode, wherein the calculated loss value and gradient value are in an encryption state.
In a specific implementation, to further improve the security of both parties' data, the first and second terminals periodically obtain a fresh public key from the third terminal during training to replace the locally prestored one. Specifically, a timer is set in the third terminal and starts when training begins; whenever the timer reaches a preset duration, the third terminal generates a new public/private key pair, sends the public key to the first and second terminals (which update their locally prestored copies), and restarts the timer. It should be noted that the preset duration can be set by those skilled in the art according to actual conditions, and this embodiment places no particular limitation on it.
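The timer-driven refresh on the third terminal can be sketched as below. The class name, interval, and versioning are assumptions; distributing the new public key to the other terminals is omitted:

```python
import time

class KeyRotator:
    """Sketch of the third terminal's periodic key refresh."""
    def __init__(self, interval_s):
        self.interval_s = interval_s
        self.last_rotation = time.monotonic()
        self.key_version = 0

    def maybe_rotate(self, now=None):
        # Returns True when the preset duration has elapsed and a new
        # keypair should be generated and pushed to the terminals
        now = time.monotonic() if now is None else now
        if now - self.last_rotation >= self.interval_s:
            self.last_rotation = now
            self.key_version += 1
            return True
        return False

rot = KeyRotator(interval_s=600)
t0 = rot.last_rotation
assert not rot.maybe_rotate(t0 + 1)    # too early: keep the current key
assert rot.maybe_rotate(t0 + 600)      # preset duration reached: rotate
assert rot.key_version == 1
```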
Step S104: the third terminal decrypts the encrypted loss value and gradient value and judges from the decrypted loss value whether the model to be trained has converged; if so, training ends and the trained model is obtained.
In this embodiment, the third terminal receives the encrypted loss value and gradient value sent by the first terminal, decrypts them, and then judges from the decrypted loss value whether the model to be trained has converged. That is, it identifies the public key under which the loss value and gradient value were encrypted, obtains the corresponding private key, decrypts them with that private key, and judges convergence from the decrypted loss value. Specifically, when the third terminal receives the encrypted loss value sent by the first terminal, it obtains the historical loss value the first terminal sent the previous time, decrypts both with the corresponding private key, and judges convergence from the two decrypted values: it calculates the difference between the decrypted loss value and the historical loss value and checks whether the difference is less than or equal to a preset threshold. If the difference is less than or equal to the threshold, the model to be trained is judged to have converged; if it is greater, the model is judged not to have converged. It should be noted that the preset threshold can be set by those skilled in the art according to actual conditions, and this embodiment places no particular limitation on it.
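The convergence test described above reduces to comparing successive decrypted loss values against a threshold; a minimal sketch (the threshold value is an arbitrary assumption):

```python
def has_converged(loss, previous_loss, threshold=1e-4):
    # Converged when the round-to-round change in loss is at or below the threshold
    return abs(loss - previous_loss) <= threshold

assert has_converged(0.50012, 0.50009)   # change of ~3e-5 <= 1e-4
assert not has_converged(0.6, 0.5)       # change of 0.1 > 1e-4
```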
Further, if the model to be trained has not converged, the third terminal decrypts the gradient of the model's objective function with respect to the first output and returns it to the first terminal, and the first terminal back-propagates to adjust the local gradients of the first neural network according to this decrypted gradient. The third terminal likewise decrypts the gradient of the objective function with respect to the second output and returns it to the second terminal, and the second terminal back-propagates to adjust the local gradients of the second neural network according to that decrypted gradient.
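Once a party has received its decrypted gradient, the local adjustment is an ordinary gradient-descent update; a one-layer sketch, where the learning rate and the flat weight layout are assumptions for illustration:

```python
def sgd_step(weights, grads, lr=0.1):
    # Each party updates its local network's weights from the decrypted gradient
    return [w - lr * g for w, g in zip(weights, grads)]

updated = sgd_step([1.0, 2.0], [0.5, -0.5])
assert abs(updated[0] - 0.95) < 1e-12 and abs(updated[1] - 2.05) < 1e-12
```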
Specifically, when the gradient function or the loss function of the model to be trained cannot be computed under encryption, the first terminal performs polynomial fitting on the gradient function of the model to be trained, receives the gradient of the target function of the model to be trained with respect to the first output, decrypted and returned by the third terminal, and then back-propagates to adjust the local gradient of the first neural network according to the polynomial-fitted gradient function and that gradient. The second terminal adjusts the local gradient of the second neural network in the same manner. Approximating the activation function with a fitted polynomial solves the problem that a nonlinear gradient function or loss function cannot be homomorphically encrypted: after fitting, a gradient function that could not be computed under encryption becomes compatible with homomorphic operations. It should be noted that, in specific implementations, other methods may also be used to make the nonlinear gradient function homomorphically computable.
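The polynomial-fitting idea can be illustrated with the sigmoid activation: its low-order Taylor expansion around zero uses only additions and multiplications, exactly the operations an additively and multiplicatively homomorphic scheme supports. The cubic approximation below is a standard textbook choice, not a formula taken from the patent:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_poly(x):
    """Degree-3 Taylor approximation of the sigmoid around 0.
    Only + and * appear, so it can be evaluated on homomorphically
    encrypted inputs, unlike exp()."""
    return 0.5 + x / 4.0 - x ** 3 / 48.0

# close agreement on a small interval around 0, where normalized
# network outputs typically lie
max_err = max(abs(sigmoid(x / 10.0) - sigmoid_poly(x / 10.0))
              for x in range(-10, 11))
assert max_err < 0.01
```

The approximation degrades quickly outside a small interval, which is why inputs are usually scaled before the fitted polynomial is applied.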
In this embodiment, the labeled sample data of one party is input into one neural network in the model to be trained, and the sample data of the other party is input into the other neural network. When the sample data of both parties has passed through the respective neural networks to the last layer, the outputs of the neural networks are homomorphically encrypted, and the homomorphically encrypted output of one party is transmitted to the other party. The encrypted loss value and gradient value are then calculated by combining the homomorphically encrypted outputs of the two parties' neural networks and transmitted to the third party. After the third party decrypts the encrypted loss value and gradient value and returns the results, whether the model to be trained has converged is judged according to the decrypted loss value; if it has converged, training ends and the trained model is obtained. Because homomorphic encryption protects the data that the two parties need to transmit, joint training can be carried out in encrypted form, which effectively improves the privacy of each party's sample data. Meanwhile, machine learning is performed by combining the multilayer neural networks of each party, so that each party's sample data can be effectively utilized and the utilization rate of that data is improved.
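The patent does not name a particular homomorphic scheme; additively homomorphic cryptosystems such as Paillier are the usual fit when parties must combine encrypted intermediate values that only a third key-holder can decrypt. A toy Paillier sketch with deliberately small primes (illustrative only; real deployments use moduli of 2048 bits or more, and this is not presented as the patent's scheme):

```python
import math
import random

# Toy Paillier keypair (the third terminal would hold lam and mu).
p, q = 293, 433
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = math.lcm(p - 1, q - 1)          # Python 3.9+

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)   # modular inverse (Python 3.8+)

def encrypt(m):
    # a random r coprime to n blinds the ciphertext
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# additive homomorphism: multiplying ciphertexts adds the plaintexts,
# so encrypted partial results from two parties can be combined
c1, c2 = encrypt(7), encrypt(35)
assert decrypt((c1 * c2) % n2) == 7 + 35
```

This is the property the embodiment relies on: each party contributes an encrypted quantity, the combination happens without decryption, and only the third terminal, which holds the private key, sees the plaintext result.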
Further, referring to fig. 4, a second embodiment of the neural network-based federated modeling method of the present invention is proposed based on the above-mentioned first embodiment, which differs from the foregoing embodiment in that the neural network-based federated modeling method further includes:
step 105, when a configuration instruction of the initial weight is detected, counting the number of synapses in the first neural network, calling a preset random number generator, and generating a group of random numbers corresponding to the number of synapses;
in this embodiment, before training a model to be trained, an initial weight of each synapse in the model to be trained needs to be configured, and when a configuration instruction of the initial weight is detected, a first terminal counts the number of synapses in a first neural network, invokes a preset random number generator to generate a set of random numbers corresponding to the number of synapses, and simultaneously a second terminal counts the number of synapses in a second neural network, invokes the preset random number generator to generate another set of random numbers corresponding to the number of synapses. It should be noted that the value range of the random number can be set by a person skilled in the art based on actual situations, and this embodiment is not particularly limited to this, and preferably, the value range of the random number is-0.5 to + 0.5.
Step 106, configuring an initial weight of each synapse in the first neural network according to the generated set of random numbers.
In this embodiment, the first terminal configures the initial weight of each synapse in the first neural network according to the generated set of random numbers; that is, following the order of the generated set of random numbers, it selects one random number at a time as an initial weight and assigns it to a synapse in the first neural network. Similarly, the second terminal configures the initial weight of each synapse in the second neural network according to the other generated set of random numbers, selecting one random number at a time in order and assigning it to a synapse in the second neural network, with each synapse configured with an initial weight exactly once.
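Steps 105 and 106 amount to drawing one uniform random number per synapse in the stated range and assigning them in order. A short sketch (the function name and seed parameter are illustrative, not from the patent):

```python
import random

def init_weights(num_synapses, low=-0.5, high=0.5, seed=None):
    """Generate one random initial weight per synapse, using the
    preferred range of -0.5 to +0.5 mentioned in the embodiment."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(num_synapses)]

weights = init_weights(8, seed=42)
assert len(weights) == 8
assert all(-0.5 <= w <= 0.5 for w in weights)
```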
In this embodiment, the random number generator configures random initial weights for the synapses of the first neural network and the second neural network in the model to be trained. This prevents the synapses from starting with identical initial weights, which would cause their weights to remain equal throughout training, and thus effectively improves the accuracy of the trained model.
In addition, an embodiment of the present invention further provides a readable storage medium, where the readable storage medium stores a federal modeling program based on a neural network, and when the program is executed by a processor, the program performs the following steps:
the method comprises the steps that a first terminal inputs labeled first sample data into a first neural network of a model to be trained for iteration, and homomorphic encryption is carried out on first output of the first neural network after iteration;
receiving homomorphic encrypted second output sent by a second terminal, wherein the second terminal inputs second sample data into a second neural network of a model to be trained for iteration, homomorphic encrypts the second output of the second neural network after iteration and transmits the second output to the first terminal;
calculating an encrypted loss value and a gradient value according to the first output and the second output which are encrypted in the same state, and transmitting the encrypted loss value and the encrypted gradient value to a third terminal;
and after the third terminal decrypts the encrypted loss value and the gradient value, judging whether the model to be trained is converged according to the decrypted loss value, and if the model to be trained is converged, ending the training to obtain the model to be trained.
Further, when executed by the processor, the neural network-based federated modeling program further performs the following steps:
after the third terminal decrypts the encrypted loss value and gradient value, the step of judging whether the model to be trained converges according to the decrypted loss value comprises the following steps:
the third terminal acquires the historical loss value sent by the first terminal last time when receiving the encrypted loss value sent by the first terminal;
and decrypting the encrypted loss value and the historical loss value according to a prestored private key, and judging whether the model to be trained is converged according to the decrypted loss value and the historical loss value.
Further, when executed by the processor, the neural network-based federated modeling program further performs the following steps:
the step of judging whether the model to be trained is converged according to the decrypted loss value and the historical loss value comprises the following steps:
calculating a difference value between the decrypted loss value and the historical loss value, and judging whether the difference value is smaller than or equal to a preset threshold value;
and if the difference is smaller than or equal to a preset threshold value, determining that the model to be trained is converged, otherwise, determining that the model to be trained is not converged.
Further, when executed by the processor, the neural network-based federated modeling program further performs the following steps:
if the model to be trained is not converged, the third terminal decrypts the gradient of the first output by the target function of the model to be trained, and then returns the gradient to the first terminal;
the first terminal reversely propagates and adjusts the local gradient of the first neural network according to the gradient of the target function of the model to be trained, which is decrypted and returned by the third terminal and is used for the first output;
the third terminal decrypts the gradient of the second output of the target function of the model to be trained, and then returns the gradient to the second terminal;
and the second terminal reversely propagates and adjusts the local gradient of the second neural network according to the gradient of the target function of the model to be trained with respect to the second output, which is decrypted and returned by the third terminal.
Further, when executed by the processor, the neural network-based federated modeling program further performs the following steps:
the first terminal carries out polynomial fitting processing on the gradient function of the model to be trained;
receiving the gradient of the target function of the model to be trained, which is decrypted and returned by the third terminal, to the first output;
and adjusting the local gradient of the first neural network according to the gradient function subjected to polynomial fitting and the gradient of the target function of the model to be trained on the first output by back propagation.
Further, when executed by the processor, the neural network-based federated modeling program further performs the following steps:
receiving a public key sent by the third terminal, and storing the public key in a preset area;
and homomorphically encrypting the first output of the first neural network after iteration according to the public key in the preset area.
Further, when executed by the processor, the neural network-based federated modeling program further performs the following steps:
when a configuration instruction of an initial weight is detected, counting the number of synapses in the first neural network, calling a preset random number generator, and generating a group of random numbers corresponding to the number of synapses;
and configuring initial weight values of synapses in the first neural network according to the generated group of random numbers.
Further, when executed by the processor, the neural network-based federated modeling program further performs the following steps:
and according to the magnitude sequence of the generated random numbers, sequentially selecting a random number from the generated random numbers as an initial weight value, and configuring the random number to a synapse in the first neural network, wherein each synapse is configured with the initial weight value once.
The specific embodiment of the readable storage medium of the present invention is basically the same as the embodiments of the federal modeling method based on a neural network, and is not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. The federal modeling method based on the neural network is characterized by comprising the following steps of:
the method comprises the steps that a first terminal inputs labeled first sample data into a first neural network of a model to be trained for iteration, and homomorphic encryption is carried out on first output of the first neural network after iteration;
receiving homomorphic encrypted second output sent by a second terminal, wherein the second terminal inputs second sample data into a second neural network of a model to be trained for iteration, homomorphic encrypts the second output of the second neural network after iteration and transmits the second output to the first terminal;
calculating an encrypted loss value and a gradient value according to the first output and the second output which are encrypted in the same state, and transmitting the encrypted loss value and the encrypted gradient value to a third terminal;
and after the third terminal decrypts the encrypted loss value and the gradient value, judging whether the model to be trained is converged according to the decrypted loss value, and if the model to be trained is converged, ending the training to obtain the model to be trained.
2. The federal modeling method based on a neural network as claimed in claim 1, wherein the step of determining whether the model to be trained converges according to the decrypted loss value after the third terminal decrypts the encrypted loss value and gradient value comprises:
the third terminal acquires the historical loss value sent by the first terminal last time when receiving the encrypted loss value sent by the first terminal;
and decrypting the encrypted loss value and the historical loss value according to a prestored private key, and judging whether the model to be trained is converged according to the decrypted loss value and the historical loss value.
3. The neural network-based federated modeling method of claim 2, wherein the step of determining whether the model to be trained converges based on the decrypted loss value and the historical loss value comprises:
calculating a difference value between the decrypted loss value and the historical loss value, and judging whether the difference value is smaller than or equal to a preset threshold value;
and if the difference is smaller than or equal to a preset threshold value, determining that the model to be trained is converged, otherwise, determining that the model to be trained is not converged.
4. The federal modeling method based on a neural network as claimed in claim 1, wherein after the step of decrypting the encrypted loss value and gradient value by the third terminal and determining whether the model to be trained converges according to the decrypted loss value, the method further comprises:
if the model to be trained is not converged, the third terminal decrypts the gradient of the first output by the target function of the model to be trained, and then returns the gradient to the first terminal;
the first terminal reversely propagates and adjusts the local gradient of the first neural network according to the gradient of the target function of the model to be trained, which is decrypted and returned by the third terminal and is used for the first output;
the third terminal decrypts the gradient of the second output of the target function of the model to be trained, and then returns the gradient to the second terminal;
and the second terminal reversely propagates and adjusts the local gradient of the second neural network according to the gradient of the target function of the model to be trained, which is decrypted and returned by the third terminal, to the second output.
5. The neural network-based federated modeling method of claim 4, wherein the step of back-propagating, by the first terminal, the gradient of the first output of the objective function of the model to be trained returned from decryption by a third terminal, the step of adjusting the local gradient of the first neural network comprises:
the first terminal carries out polynomial fitting processing on the gradient function of the model to be trained;
receiving the gradient of the target function of the model to be trained, which is decrypted and returned by the third terminal, to the first output;
and adjusting the local gradient of the first neural network according to the gradient function subjected to polynomial fitting and the gradient of the target function of the model to be trained on the first output by back propagation.
6. The neural network-based federated modeling method of any of claims 1-5, wherein the step of homomorphically encrypting the iterated first output of the first neural network includes:
receiving a public key sent by the third terminal, and storing the public key in a preset area;
and homomorphically encrypting the first output of the first neural network after iteration according to the public key in the preset area.
7. The neural network-based federated modeling method of any of claims 1-5, further comprising:
when a configuration instruction of an initial weight is detected, counting the number of synapses in the first neural network, calling a preset random number generator, and generating a group of random numbers corresponding to the number of synapses;
and configuring initial weight values of synapses in the first neural network according to the generated group of random numbers.
8. The neural network-based federated modeling method of claim 7, wherein the step of configuring initial weights for synapses in the first neural network in accordance with the generated set of random numbers comprises:
and according to the magnitude sequence of the generated random numbers, sequentially selecting a random number from the generated random numbers as an initial weight value, and configuring the random number to a synapse in the first neural network, wherein each synapse is configured with the initial weight value once.
9. A federated modeling apparatus based on a neural network, characterized in that the federated modeling apparatus based on the neural network comprises: a memory, a processor, and a neural network-based federated modeling program stored on the memory and operable on the processor, the neural network-based federated modeling program when executed by the processor implementing the steps of the neural network-based federated modeling method of any of claims 1-8.
10. A readable storage medium having stored thereon a neural network-based federated modeling program that, when executed by a processor, performs the steps of the neural network-based federated modeling method of any of claims 1 to 8.
CN201810913868.9A 2018-08-10 2018-08-10 Federal modeling method and device based on neural network and readable storage medium Active CN109325584B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810913868.9A CN109325584B (en) 2018-08-10 2018-08-10 Federal modeling method and device based on neural network and readable storage medium


Publications (2)

Publication Number Publication Date
CN109325584A CN109325584A (en) 2019-02-12
CN109325584B true CN109325584B (en) 2021-06-25

Family

ID=65263630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810913868.9A Active CN109325584B (en) 2018-08-10 2018-08-10 Federal modeling method and device based on neural network and readable storage medium

Country Status (1)

Country Link
CN (1) CN109325584B (en)

Families Citing this family (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165725B (en) * 2018-08-10 2022-03-29 深圳前海微众银行股份有限公司 Neural network federal modeling method, equipment and storage medium based on transfer learning
CN109886417B (en) * 2019-03-01 2024-05-03 深圳前海微众银行股份有限公司 Model parameter training method, device, equipment and medium based on federal learning
CN110162995B (en) * 2019-04-22 2023-01-10 创新先进技术有限公司 Method and device for evaluating data contribution degree
CN110084063B (en) * 2019-04-23 2022-07-15 中国科学技术大学 Gradient descent calculation method for protecting private data
CN112085206B (en) * 2019-06-13 2024-04-09 北京百度网讯科技有限公司 Combined logistic regression modeling method, device and terminal
CN110263919A (en) * 2019-06-20 2019-09-20 福州数据技术研究院有限公司 A kind of reverse transmittance nerve network training method based on medical treatment & health data safety
CN112149174B (en) * 2019-06-28 2024-03-12 北京百度网讯科技有限公司 Model training method, device, equipment and medium
CN112182635B (en) * 2019-07-03 2024-02-23 北京百度网讯科技有限公司 Method, device, equipment and medium for realizing joint modeling
CN110674941B (en) * 2019-09-25 2023-04-18 南开大学 Data encryption transmission method and system based on neural network
CN110751291B (en) * 2019-10-29 2021-02-12 支付宝(杭州)信息技术有限公司 Method and device for realizing multi-party combined training neural network of security defense
CN110852430B (en) * 2019-10-29 2022-09-09 清华大学 Neural network encryption method and device for nonvolatile computing system
CN112749812A (en) * 2019-10-29 2021-05-04 华为技术有限公司 Joint learning system, training result aggregation method and equipment
CN111222628B (en) * 2019-11-20 2023-09-26 深圳前海微众银行股份有限公司 Method, device, system and readable storage medium for optimizing training of recurrent neural network
CN110955907B (en) * 2019-12-13 2022-03-25 支付宝(杭州)信息技术有限公司 Model training method based on federal learning
CN111144576A (en) * 2019-12-13 2020-05-12 支付宝(杭州)信息技术有限公司 Model training method and device and electronic equipment
CN111125735B (en) * 2019-12-20 2021-11-02 支付宝(杭州)信息技术有限公司 Method and system for model training based on private data
CN111143878B (en) * 2019-12-20 2021-08-03 支付宝(杭州)信息技术有限公司 Method and system for model training based on private data
CN111210003B (en) * 2019-12-30 2021-03-19 深圳前海微众银行股份有限公司 Longitudinal federated learning system optimization method, device, equipment and readable storage medium
CN111260061B (en) * 2020-03-09 2022-07-19 厦门大学 Differential noise adding method and system in federated learning gradient exchange
WO2021184347A1 (en) * 2020-03-20 2021-09-23 云图技术有限公司 Data processing method and apparatus for realizing privacy protection
CN111460478B (en) * 2020-03-30 2022-05-13 西安电子科技大学 Privacy protection method for collaborative deep learning model training
US11764941B2 (en) 2020-04-30 2023-09-19 International Business Machines Corporation Decision tree-based inference on homomorphically-encrypted data without bootstrapping
CN111695675B (en) * 2020-05-14 2024-05-07 平安科技(深圳)有限公司 Federal learning model training method and related equipment
CN113723604B (en) * 2020-05-26 2024-03-26 杭州海康威视数字技术股份有限公司 Neural network training method and device, electronic equipment and readable storage medium
CN111915004A (en) * 2020-06-17 2020-11-10 北京迈格威科技有限公司 Neural network training method and device, storage medium and electronic equipment
CN111800265B (en) * 2020-07-07 2021-06-25 上海大学 Privacy protection-based material reverse design method and system
CN113988254B (en) * 2020-07-27 2023-07-14 腾讯科技(深圳)有限公司 Method and device for determining neural network model for multiple environments
CN112016632B (en) * 2020-09-25 2024-04-26 北京百度网讯科技有限公司 Model joint training method, device, equipment and storage medium
CN112565254B (en) * 2020-12-04 2023-03-31 深圳前海微众银行股份有限公司 Data transmission method, device, equipment and computer readable storage medium
CN112613577A (en) * 2020-12-31 2021-04-06 上海商汤智能科技有限公司 Neural network training method and device, computer equipment and storage medium
CN113536667B (en) * 2021-06-22 2024-03-01 同盾科技有限公司 Federal model training method, federal model training device, readable storage medium and federal model training device
CN115169589B (en) * 2022-09-06 2023-01-24 北京瑞莱智慧科技有限公司 Parameter updating method, data processing method and related equipment
CN116151370B (en) * 2023-04-24 2023-07-21 西南石油大学 Model parameter optimization selection system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9946970B2 (en) * 2014-11-07 2018-04-17 Microsoft Technology Licensing, Llc Neural networks for encrypted data
US10095552B2 (en) * 2016-02-05 2018-10-09 Sas Institute Inc. Automated transfer of objects among federated areas
CN106021364B (en) * 2016-05-10 2017-12-12 百度在线网络技术(北京)有限公司 Foundation, image searching method and the device of picture searching dependency prediction model
CN107688493B (en) * 2016-08-05 2021-06-18 阿里巴巴集团控股有限公司 Method, device and system for training deep neural network
US20180089587A1 (en) * 2016-09-26 2018-03-29 Google Inc. Systems and Methods for Communication Efficient Distributed Mean Estimation
CN108229646A (en) * 2017-08-08 2018-06-29 北京市商汤科技开发有限公司 neural network model compression method, device, storage medium and electronic equipment


Similar Documents

Publication Publication Date Title
CN109325584B (en) Federal modeling method and device based on neural network and readable storage medium
CN109165725B (en) Neural network federal modeling method, equipment and storage medium based on transfer learning
CN110633806B (en) Longitudinal federal learning system optimization method, device, equipment and readable storage medium
CN109255444B (en) Federal modeling method and device based on transfer learning and readable storage medium
CN109284313B (en) Federal modeling method, device and readable storage medium based on semi-supervised learning
CN110633805B (en) Longitudinal federal learning system optimization method, device, equipment and readable storage medium
US11301571B2 (en) Neural-network training using secure data processing
US11902413B2 (en) Secure machine learning analytics using homomorphic encryption
CN110601814B (en) Federal learning data encryption method, device, equipment and readable storage medium
US11580417B2 (en) System and method for processing data and managing information
CN109033854B (en) Model-based prediction method and device
CN110674528A (en) Federal learning privacy data processing method, device, system and storage medium
CN111210003B (en) Longitudinal federated learning system optimization method, device, equipment and readable storage medium
US11588804B2 (en) Providing verified claims of user identity
CN112148801B (en) Method and device for predicting business object by combining multiple parties for protecting data privacy
US11843586B2 (en) Systems and methods for providing a modified loss function in federated-split learning
EP3566384B1 (en) Pinocchio / trinocchio on authenticated data
US20170063535A1 (en) Generating Cryptographic Function Parameters From a Puzzle
CN112818369A (en) Combined modeling method and device
CN116502732B (en) Federal learning method and system based on trusted execution environment
CN112801307A (en) Block chain-based federal learning method and device and computer equipment
CN114547684A (en) Method and device for protecting multi-party joint training tree model of private data
CN112054891B (en) Block chain-based common user determination method, electronic device and storage medium
KR20190112959A (en) Operating method for machine learning model using encrypted data and apparatus based on machine learning model
CN114168295A (en) Hybrid architecture system and task scheduling method based on historical task effect

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant