CN111723947A - Method and device for training federated learning model

Method and device for training federated learning model

Info

Publication number
CN111723947A
Authority
CN
China
Prior art keywords
model parameters
iteration
local
kth
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010564409.1A
Other languages
Chinese (zh)
Inventor
刘楠
王玥琪
李晓丽
陈川
郑子彬
严强
李辉忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
WeBank Co Ltd
Original Assignee
Sun Yat Sen University
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.) 2020-06-19
Filing date 2020-06-19
Publication date 2020-09-29
Application filed by Sun Yat Sen University, WeBank Co Ltd filed Critical Sun Yat Sen University
Priority to CN202010564409.1A priority Critical patent/CN111723947A/en
Publication of CN111723947A publication Critical patent/CN111723947A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and a device for training a federated learning model. A client obtains the global model parameters of the (k-1)th iteration broadcast by a server, where k is a positive integer. The client then uses the global model parameters as local model parameters with a regularization constraint and performs the kth iteration of training on its local data to obtain the local model parameters of the kth iteration of training, where the regularization constraint is determined from the global model parameters of the server and the local model parameters of the client. The regularization constraint optimizes the gradients in the model, which reduces the influence of extreme data on local model parameter training and improves the accuracy of the local model parameters when training on non-independent and identically distributed (non-IID) data. The client then sends the local model parameters of the kth iteration of training to the server, so that the server updates the global model parameters of the kth iteration and improves the accuracy of the global model parameters when training on non-IID data.

Description

Method and device for training federated learning model
Technical Field
The invention relates to the field of financial technology (Fintech), and in particular to a method and a device for training a federated learning model.
Background
With the development of computer technology, more and more technologies (such as blockchain, cloud computing, and big data) are applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology. Big data technology is no exception; however, the security and real-time requirements of the financial and payment industries also place higher demands on big data technology.
In the prior art, federated learning performs communication among nodes by transmitting parameters, and the information provided by the data of each node is integrated by parameter averaging during training. In the federated learning process, the data need to be trained: a node is randomly selected each time, the global model is issued to that node, iterations are carried out on the node's data, the trained model parameters are sent back to a central server, and the central server averages the model parameters trained by each node to determine the model for the next iteration.
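For reference, a minimal sketch of the prior-art aggregation step described above, in which the central server simply averages the model parameters returned by the selected nodes; the function name and signature are illustrative only and are not part of the patent:

```python
import numpy as np
from typing import List

def average_aggregate(client_params: List[np.ndarray]) -> np.ndarray:
    # Prior-art federated averaging: the central server takes the mean of the
    # model parameters trained by each node as the model for the next iteration.
    return np.mean(np.stack(client_params), axis=0)
```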
However, in prior-art federated learning, a model trained on non-independent and identically distributed (non-IID) data has low accuracy and poor performance. A method is therefore needed to improve the accuracy of model parameters trained on non-IID data.
Disclosure of Invention
The embodiment of the invention provides a method and a device for training a federated learning model, which are used to improve the accuracy of a model trained on non-independent and identically distributed (non-IID) data and to optimize such a model.
In a first aspect, an embodiment of the present invention provides a method for training a federated learning model, including:
the client acquires global model parameters of the (k-1) th iteration broadcasted by the server; k is a positive integer;
the client uses the global model parameters as local model parameters with a regularization constraint, and performs the kth iteration of training using local data to obtain the local model parameters of the kth iteration of training; the regularization constraint is determined according to the global model parameters of the server and the local model parameters of the client;
and the client sends the local model parameters of the kth iteration of training to the server, so that the server updates the global model parameters of the kth iteration.
In this technical scheme, the client acquires the global model parameters of the (k-1)th iteration. When k is 1, the global model parameters are the initial global model parameters, and the initial global model parameters are used as the local model parameters with a regularization constraint, so that the global model parameters and the local model parameters are bound by the regularization constraint during the iteration process. When the local model parameters are trained on local data, the loss function of the local model parameters is constrained and the gradient of the local model parameters is optimized, which reduces the influence of extreme data in the local data on the training result of the local model parameters and improves the accuracy of the local model parameters when training on non-independent and identically distributed (non-IID) data. The local model of the kth iteration of training is thus obtained, and the local model parameters of the kth iteration of training are sent to the server, so that the server updates the global model parameters of the kth iteration and further improves the accuracy of the global model parameters on non-IID data.
Optionally, determining the regularization constraint according to the global model parameters of the server and the local model parameters of the client includes:
performing an F-norm calculation on the difference between the global model parameters and the local model parameters to obtain the regularization constraint.
In this technical scheme, the regularization constraint is obtained from the global model parameters and the local model parameters, and is used to optimize the loss function of the local model parameters and to improve the accuracy of the local model parameters when training on non-IID data.
Optionally, the final loss function of the local model parameters is determined according to the following formula (1):

\tilde{J}(W^{(i)}) = J(W^{(i)}) + \beta J_{T-S}(W^{(i)})    (1)

where \tilde{J}(W^{(i)}) is the final loss function of the local model parameters, J(W^{(i)}) is the loss function of the local model parameters, J_{T-S}(W^{(i)}) is the regularization constraint, and \beta is the coefficient of the regularization constraint.
Optionally, the local model parameters of the kth iteration of training are determined according to the following formula (2):

W_k^{(i)} = W_{k-1}^{(i)} - \alpha_k \left( \nabla J(W_{k-1}^{(i)}) + \beta \nabla J_{T-S}(W_{k-1}^{(i)}) \right)    (2)

where W_k^{(i)} is the local model parameter of the kth iteration, W_{k-1}^{(i)} is the local model parameter of the (k-1)th iteration, \alpha_k is the learning rate of the kth iteration, \nabla J(W_{k-1}^{(i)}) is the gradient of the original loss function of the local model parameters, \nabla J_{T-S}(W_{k-1}^{(i)}) is the gradient of the regularization constraint, and W_{k-1}^{(0)} is the global model parameter of the (k-1)th iteration, which enters the regularization constraint.
In a second aspect, an embodiment of the present invention provides a method for training a federated learning model, including:
the server obtains the local model parameters of the kth iteration sent by a plurality of clients; the local model parameters of the kth iteration are obtained by each client training the global model parameters of the (k-1)th iteration, used as local model parameters with a regularization constraint; the regularization constraint is determined by the client according to the global model parameters of the server and the local model parameters of the client;
the server determines the global model parameters of the kth iteration according to the local model parameters of the kth iteration and the global model parameters of the (k-1)th iteration;
and the server broadcasts the global model parameters of the kth iteration to the plurality of clients so that the plurality of clients perform the (k + 1) th iteration training.
In this technical scheme, the server obtains the local model parameters of the kth iteration sent by the plurality of clients and determines the global model parameters of the kth iteration. After the local model parameters of the clients are trained on local data, the accuracy of the local model parameters on non-IID data is improved; after the global model parameters are updated from these local model parameters, the accuracy of the global model parameters on non-IID data is also improved. The global model parameters of the kth iteration are then broadcast to the plurality of clients for the next iteration.
Optionally, the global model parameters of the kth iteration are determined according to the following formula (3):

W_k^{(0)} = W_{k-1}^{(0)} + \alpha_k \sum_{i=1}^{N} \left( W_k^{(i)} - W_{k-1}^{(0)} \right)    (3)

where W_k^{(0)} is the global model parameter of the kth iteration, W_{k-1}^{(0)} is the global model parameter of the (k-1)th iteration, \alpha_k is the learning rate of the kth iteration, W_k^{(i)} is the local model parameter of the kth iteration of the ith client, and N is the number of clients.
In a third aspect, an embodiment of the present invention provides a training apparatus for a federated learning model, including:
the acquisition module is used for acquiring global model parameters of the (k-1) th iteration broadcasted by the server; k is a positive integer;
the processing module is used for taking the global model parameters as local model parameters with a regularization constraint, and performing the kth iteration of training using local data to obtain the local model parameters of the kth iteration of training; the regularization constraint is determined according to the global model parameters of the server and the local model parameters of the client;
and for sending the local model parameters of the kth iteration of training to the server, so that the server updates the global model parameters of the kth iteration.
Optionally, the processing module is specifically configured to:
and F norm calculation is carried out on the difference value of the global model parameter and the local model parameter to obtain the regularization constraint.
Optionally, the processing module is specifically configured to:
determine the final loss function of the local model parameters according to the following formula (1):

\tilde{J}(W^{(i)}) = J(W^{(i)}) + \beta J_{T-S}(W^{(i)})    (1)

where \tilde{J}(W^{(i)}) is the final loss function of the local model parameters, J(W^{(i)}) is the loss function of the local model parameters, J_{T-S}(W^{(i)}) is the regularization constraint, and \beta is the coefficient of the regularization constraint.
Optionally, the processing module is specifically configured to:
determine the local model parameters of the kth iteration according to the following formula (2):

W_k^{(i)} = W_{k-1}^{(i)} - \alpha_k \left( \nabla J(W_{k-1}^{(i)}) + \beta \nabla J_{T-S}(W_{k-1}^{(i)}) \right)    (2)

where W_k^{(i)} is the local model parameter of the kth iteration, W_{k-1}^{(i)} is the local model parameter of the (k-1)th iteration, \alpha_k is the learning rate of the kth iteration, \nabla J(W_{k-1}^{(i)}) is the gradient of the original loss function of the local model parameters, \nabla J_{T-S}(W_{k-1}^{(i)}) is the gradient of the regularization constraint, and W_{k-1}^{(0)} is the global model parameter of the (k-1)th iteration, which enters the regularization constraint.
In a fourth aspect, an embodiment of the present invention provides a training apparatus for a federated learning model, including:
the acquisition unit is used for acquiring the local model parameters of the kth iteration sent by a plurality of clients; the local model parameters of the kth iteration are obtained by each client training the global model parameters of the (k-1)th iteration, used as local model parameters with a regularization constraint; the regularization constraint is determined by the client according to the global model parameters of the server and the local model parameters of the client;
the processing unit is used for determining the global model parameters of the kth iteration according to the local model parameters of the kth iteration and the global model parameters of the (k-1)th iteration;
broadcasting the global model parameters of the kth iteration to the plurality of clients so that the plurality of clients perform the (k + 1) th iteration training.
Optionally, the processing unit is specifically configured to:
determine the global model parameters of the kth iteration according to the following formula (3):

W_k^{(0)} = W_{k-1}^{(0)} + \alpha_k \sum_{i=1}^{N} \left( W_k^{(i)} - W_{k-1}^{(0)} \right)    (3)

where W_k^{(0)} is the global model parameter of the kth iteration, W_{k-1}^{(0)} is the global model parameter of the (k-1)th iteration, \alpha_k is the learning rate of the kth iteration, W_k^{(i)} is the local model parameter of the kth iteration of the ith client, and N is the number of clients.
In a fifth aspect, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the above method for training the federated learning model according to the obtained program.
In a sixth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are configured to cause a computer to execute the above method for training the federated learning model.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a system architecture diagram according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a method for training a federated learning model according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a method for training a federated learning model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a device for training a federated learning model according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a training device of a federated learning model according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 illustrates an exemplary system architecture to which an embodiment of the present invention is applicable, which includes a server 100 and a client 200.
The server 100 is configured to connect with the client 200 and send the global model parameters of the (k-1) th iteration to the client 200, and it should be noted that fig. 1 only illustrates an exemplary client 200, and may actually be multiple clients 200, which is not limited herein.
The client 200 is configured to obtain global model parameters of the k-1 th iteration sent by the server 100, use the global model parameters of the k-1 th iteration as local model parameters with regularization constraints, perform training using local data to obtain local model parameters of the k-th iteration training, and send the local model parameters of the k-th iteration training to the server 100, so that the server 100 updates the global model parameters of the k-th iteration.
It should be noted that the structure shown in fig. 1 is only an example, and the embodiment of the present invention is not limited thereto.
Based on the above description, fig. 2 exemplarily shows a flow of a method for training a federated learning model according to an embodiment of the present invention, where the flow may be performed by a training apparatus of the federated learning model.
As shown in fig. 2, the process specifically includes:
step 201, a client acquires global model parameters of the (k-1) th iteration broadcasted by a server; and k is a positive integer.
In the embodiment of the invention, before acquiring the global model parameters of the (k-1)th iteration broadcast by the server, the client sends the local model parameters of the (k-1)th iteration of training to the server, so that the server updates the global model parameters of the (k-1)th iteration and then broadcasts them.
Step 202, the client side takes the global model parameters as local model parameters with regularization constraints, and performs kth iterative training by using local data to obtain local model parameters of the kth iterative training; the regularization constraints are determined according to global model parameters of the server and local model parameters of the client.
According to the embodiment of the invention, after the client acquires the global model parameters (when k is 1, the initial global model parameters), the acquired global model parameters are used as the local model parameters and the regularization constraint is set, so that before the model converges, the global model parameters and the local model parameters are bound by the regularization constraint. The final loss function of the local model parameters is thereby obtained, and the influence of extreme data on the client on the model training is reduced.
Further, the regularization constraint is determined according to the global model parameters of the server and the local model parameters of the client, which includes: performing an F-norm calculation on the difference between the global model parameters and the local model parameters to obtain the regularization constraint.
The regularization constraint is obtained by performing an F-norm calculation on the weights in the global model parameters sent by the server and the weights in the local model parameters of the client, where the F-norm (Frobenius norm) is a matrix norm equal to the square root of the sum of the squares of the matrix elements. Specifically, the regularization constraint is determined according to the following formula (4):

J_{T-S}(W^{(i)}) = \left\| W^{(0)} - W^{(i)} \right\|_F^2    (4)

where J_{T-S}(W^{(i)}) is the regularization constraint, W^{(0)} is the weight in the global model parameters, and W^{(i)} is the weight in the local model parameters of the ith client.
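Purely as an illustration, a minimal numpy sketch of this regularization constraint and its gradient might look as follows; the helper names are hypothetical, the squared Frobenius norm is an assumption of this sketch, and any constant scaling can be absorbed into the coefficient β:

```python
import numpy as np

def regularization_constraint(w_global: np.ndarray, w_local: np.ndarray) -> float:
    # J_{T-S}(W^(i)) = ||W^(0) - W^(i)||_F^2, i.e. formula (4): the sum of the
    # squared element-wise differences between the global and the local weights.
    diff = w_global - w_local
    return float(np.sum(diff ** 2))

def regularization_gradient(w_global: np.ndarray, w_local: np.ndarray) -> np.ndarray:
    # Gradient of J_{T-S} with respect to the local weights W^(i); it pulls the
    # local parameters back toward the broadcast global parameters. The factor 2
    # from differentiating the squared norm can be absorbed into beta.
    return 2.0 * (w_local - w_global)
```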
Further, the final loss function of the local model parameters is determined according to the following formula (1):

\tilde{J}(W^{(i)}) = J(W^{(i)}) + \beta J_{T-S}(W^{(i)})    (1)

where \tilde{J}(W^{(i)}) is the final loss function of the local model parameters, J(W^{(i)}) is the loss function of the local model parameters, J_{T-S}(W^{(i)}) is the regularization constraint, and \beta is the coefficient of the regularization constraint.
After the client obtains the global model parameters of the (k-1)th iteration, the regularization constraint is calculated, the global model parameters with the regularization constraint are used as the local model parameters, model training is then performed to obtain the final loss function of the local model parameters, and the local model parameters are further obtained.
With this algorithm, extreme data in the distribution would have a large influence on the original loss function J(W^{(i)}) of the local model parameters. However, according to the calculation of the regularization constraint, when W^{(0)} is unchanged and W^{(i)} increases, the regularization constraint J_{T-S}(W^{(i)}) restrains the change, so that the final loss function \tilde{J}(W^{(i)}) of the local model parameters approaches a balance and the influence of the extreme data on the final loss function is reduced.
The local model parameters of the kth iteration of training are determined according to the following formula (2):

W_k^{(i)} = W_{k-1}^{(i)} - \alpha_k \left( \nabla J(W_{k-1}^{(i)}) + \beta \nabla J_{T-S}(W_{k-1}^{(i)}) \right)    (2)

where W_k^{(i)} is the local model parameter of the kth iteration, W_{k-1}^{(i)} is the local model parameter of the (k-1)th iteration, \alpha_k is the learning rate of the kth iteration, \nabla J(W_{k-1}^{(i)}) is the gradient of the original loss function of the local model parameters, \nabla J_{T-S}(W_{k-1}^{(i)}) is the gradient of the regularization constraint, and W_{k-1}^{(0)} is the global model parameter of the (k-1)th iteration, which enters the regularization constraint.
The updated local model parameters of the kth iteration are obtained by taking the derivative (gradient) of the final loss function of the local model parameters and combining it, scaled by the learning rate, with the local model parameters of the (k-1)th iteration as in formula (2).
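A minimal sketch of this local update step of formula (2), under the assumption that the caller supplies a function computing the gradient of the original loss J on the client's local data; all names here are illustrative and not part of the patent:

```python
import numpy as np
from typing import Callable

def local_update(w_local_prev: np.ndarray,
                 w_global_prev: np.ndarray,
                 loss_gradient: Callable[[np.ndarray], np.ndarray],
                 learning_rate: float,
                 beta: float) -> np.ndarray:
    """One client-side step of formula (2).

    w_local_prev  -- W_{k-1}^{(i)}, the client's local parameters from the previous iteration
    w_global_prev -- W_{k-1}^{(0)}, the global parameters broadcast by the server
    loss_gradient -- returns the gradient of the original loss J on the local data
    """
    grad_loss = loss_gradient(w_local_prev)      # gradient of J(W_{k-1}^{(i)})
    # gradient of the regularization constraint J_{T-S}; constant factors are
    # assumed to be absorbed into the coefficient beta
    grad_reg = w_local_prev - w_global_prev
    return w_local_prev - learning_rate * (grad_loss + beta * grad_reg)
```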
In order to better explain the above technical solutions, the following description is made in specific examples.
Example 1
Non-independent and identically distributed (non-IID) data are initialized as the training set for model training, in either of the following two modes (a code sketch of both modes follows the list).
1. The data are sorted by their digital labels and then divided into multiple shares, for example ten. Each client holds data of only a few labels, for example two: after the data are divided into ten shares and sorted by labels 1-10, labels 1 and 8 are taken for model training on the 1st client, labels 9 and 7 for the 2nd client, and so on, so that the data of each client cannot serve as a representative of the global data distribution.
2. The reference data set is divided into ten parts of very different sizes, so that the amount of data on each client differs greatly and the data of each client cannot serve as a representative of the global data distribution.
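A sketch of the two partitioning modes above, purely for illustration; the helper names, the random shard assignment, and the use of a Dirichlet split for the unbalanced mode are assumptions of this sketch rather than part of the patent:

```python
import numpy as np

def partition_by_label(labels: np.ndarray, num_clients: int = 10,
                       shards_per_client: int = 2, seed: int = 0):
    """Mode 1: sort samples by digital label, cut them into shards, and give each
    client only a couple of label shards, so no client represents the global
    distribution."""
    rng = np.random.default_rng(seed)
    order = np.argsort(labels)                                    # indices sorted by label
    shards = np.array_split(order, num_clients * shards_per_client)
    shard_ids = rng.permutation(len(shards))                      # e.g. client 1 gets labels 1 and 8
    return [np.concatenate([shards[s] for s in
                            shard_ids[c * shards_per_client:(c + 1) * shards_per_client]])
            for c in range(num_clients)]

def partition_unbalanced(num_samples: int, num_clients: int = 10, seed: int = 0):
    """Mode 2: split the reference data set into parts of very different sizes."""
    rng = np.random.default_rng(seed)
    weights = rng.dirichlet(np.ones(num_clients) * 0.3)           # strongly skewed shares
    cuts = (np.cumsum(weights)[:-1] * num_samples).astype(int)
    return np.split(rng.permutation(num_samples), cuts)
```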
Ten clients are selected to perform model training with the above data. The 10 clients acquire the global model parameters W_{k-1}^{(0)} of the (k-1)th iteration broadcast by the server, and obtain the local model parameters of the kth iteration by minimizing the final loss function, i.e. by applying formula (2):

W_k^{(i)} = W_{k-1}^{(i)} - \alpha_k \left( \nabla J(W_{k-1}^{(i)}) + \beta \nabla J_{T-S}(W_{k-1}^{(i)}) \right), \quad i \in \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}

where W_k^{(i)} is the local model parameter of the kth iteration of the ith client and \alpha_k is the learning rate of the kth iteration. The local model parameters of the kth iteration of each client are thus obtained, and the local model parameters of any single client cannot serve as a representative of the global model parameters.
Step 203, the client sends the local model parameters of the kth iteration training to the server, so that the server updates the global model parameters of the kth iteration.
In the embodiment of the invention, the plurality of clients send the local model parameters of the kth iteration obtained after training to the server, so that the server obtains the local model parameters of the kth iteration corresponding to the whole data set, updates the global model parameters of the kth iteration, and broadcasts the global model parameters of the kth iteration in the next round.
In the embodiment of the invention, the regularization constraint is set on the obtained global model parameters of the (k-1)th iteration, which are used as the local model parameters, so that when the local model parameters are trained, the influence of extreme data in the local data on the training result of the local model parameters is reduced and the accuracy of the local model parameters on non-IID data is improved. The local model of the kth iteration of training is obtained, and the local model parameters of the kth iteration of training are then sent to the server, so that the accuracy of the global model parameters on non-IID data is improved.
Fig. 3 exemplarily shows a flow of a method for training a federated learning model according to an embodiment of the present invention.
As shown in fig. 3, the specific process includes:
step 301, a server obtains local model parameters of a kth iteration sent by a plurality of clients; the local model parameters of the kth iteration are obtained by training the global model parameters of the kth-1 th iteration by using the local model parameters with regularization constraints by the client; the regularization constraints are determined by the client based on global model parameters of the server and local model parameters of the client.
According to the embodiment of the invention, a server obtains local model parameters of kth iteration sent by a plurality of clients, wherein the local model parameters are provided with regularization constraints.
Step 302, the server determines the global model parameters of the kth iteration according to the local model parameters of the kth iteration and the global model parameters of the (k-1)th iteration.
In the embodiment of the invention, the server calculates the differences between the obtained local model parameters of the kth iteration of the plurality of clients and the global model parameters of the (k-1)th iteration, sums these differences, and obtains the global model parameters of the kth iteration according to the learning rate and the global model parameters of the (k-1)th iteration.
further, according to the following formula (3), determining global model parameters of the kth iteration;
Figure BDA0002547145650000101
wherein, Wk (0)The global model parameter, W, for the kth iterationk-1 (0)α for the global model parameters of the k-1 th iterationkLearning rate for the kth iteration, Wk (i)Local model parameters for the kth iteration for the ith client.
Compared with the traditional mean-value aggregation in federated learning, the embodiment of the present invention sums the differences between the local model parameters of the kth iteration of all clients and the global model parameters of the (k-1)th iteration, and adds this sum, as a step, to the global model parameters of the (k-1)th iteration to obtain the updated global model parameters of the kth iteration.
The above technical solution is described in the following specific example in conjunction with example 1 in the above fig. 2.
Example 2
The local model parameters of the kth iteration sent by the 10 clients are obtained, and the differences between the local model parameters of the kth iteration of all the clients and the global model parameters of the (k-1)th iteration are summed to obtain:

\sum_{i=1}^{10} \left( W_k^{(i)} - W_{k-1}^{(0)} \right)

The sum is then multiplied by the learning rate \alpha_k and added to the global model parameters of the (k-1)th iteration, so that the global model parameters of the kth iteration are obtained as:

W_k^{(0)} = W_{k-1}^{(0)} + \alpha_k \sum_{i=1}^{10} \left( W_k^{(i)} - W_{k-1}^{(0)} \right)
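A minimal sketch of this server-side aggregation of formula (3); the function name and signature are illustrative only:

```python
import numpy as np
from typing import List

def server_aggregate(w_global_prev: np.ndarray,
                     client_params: List[np.ndarray],
                     learning_rate: float) -> np.ndarray:
    """Formula (3): instead of a plain average, add the learning-rate-scaled sum
    of (W_k^{(i)} - W_{k-1}^{(0)}) over all clients to the previous global parameters."""
    step = sum(w_i - w_global_prev for w_i in client_params)
    return w_global_prev + learning_rate * step
```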
step 303, the server broadcasts the global model parameter of the kth iteration to the plurality of clients, so that the plurality of clients perform the (k + 1) th iteration training.
According to the embodiment of the invention, the server broadcasts the global model parameters of the kth iteration to the plurality of clients. The clients do not need to reset the regularization constraint and directly carry out the (k+1)th iteration of training; the server then obtains the local model parameters of the (k+1)th iteration of training sent by the clients, so that the global model parameters of the (k+1)th iteration are updated.
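Putting the pieces together, one communication round could be orchestrated as in the sketch below, reusing the illustrative local_update and server_aggregate helpers from the earlier sketches; this is an assumption-laden outline, not the patented implementation itself:

```python
def federated_round(w_global_prev, client_locals, client_loss_grads,
                    learning_rate, beta):
    """One round k: the server broadcasts W_{k-1}^{(0)}, every client runs the
    regularized local update of formula (2), and the server aggregates the
    results with formula (3) to produce W_k^{(0)}."""
    new_locals = [local_update(w_i, w_global_prev, grad_i, learning_rate, beta)
                  for w_i, grad_i in zip(client_locals, client_loss_grads)]
    w_global_new = server_aggregate(w_global_prev, new_locals, learning_rate)
    return w_global_new, new_locals
```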
Based on the same technical concept, fig. 4 exemplarily shows the structure of a training apparatus for a federated learning model provided in an embodiment of the present invention, and the apparatus may execute the flow of the training method for the federated learning model in fig. 2.
As shown in fig. 4, the apparatus specifically includes:
an obtaining module 401, configured to obtain global model parameters of a k-1 st iteration broadcasted by a server; k is a positive integer;
a processing module 402, configured to use the global model parameters as local model parameters with a regularization constraint, and perform the kth iteration of training using local data to obtain the local model parameters of the kth iteration of training; the regularization constraint is determined according to the global model parameters of the server and the local model parameters of the client;
and to send the local model parameters of the kth iteration of training to the server, so that the server updates the global model parameters of the kth iteration.
Optionally, the processing module 402 is specifically configured to:
and F norm calculation is carried out on the difference value of the global model parameter and the local model parameter to obtain the regularization constraint.
Optionally, the processing module 402 is specifically configured to:
determine the final loss function of the local model parameters according to the following formula (1):

\tilde{J}(W^{(i)}) = J(W^{(i)}) + \beta J_{T-S}(W^{(i)})    (1)

where \tilde{J}(W^{(i)}) is the final loss function of the local model parameters, J(W^{(i)}) is the loss function of the local model parameters, J_{T-S}(W^{(i)}) is the regularization constraint, and \beta is the coefficient of the regularization constraint.
Optionally, the processing module 402 is specifically configured to:
determine the local model parameters of the kth iteration of training according to the following formula (2):

W_k^{(i)} = W_{k-1}^{(i)} - \alpha_k \left( \nabla J(W_{k-1}^{(i)}) + \beta \nabla J_{T-S}(W_{k-1}^{(i)}) \right)    (2)

where W_k^{(i)} is the local model parameter of the kth iteration, W_{k-1}^{(i)} is the local model parameter of the (k-1)th iteration, \alpha_k is the learning rate of the kth iteration, \nabla J(W_{k-1}^{(i)}) is the gradient of the original loss function of the local model parameters, \nabla J_{T-S}(W_{k-1}^{(i)}) is the gradient of the regularization constraint, and W_{k-1}^{(0)} is the global model parameter of the (k-1)th iteration, which enters the regularization constraint.
Fig. 5 exemplarily shows a structure of a training apparatus for a federated learning model according to an embodiment of the present invention, which may execute the flow of the training method for the federated learning model in fig. 3. As shown in fig. 5, the apparatus specifically includes:
an obtaining unit 501, configured to obtain local model parameters of a kth iteration sent by multiple clients; the local model parameters of the kth iteration are obtained by training the global model parameters of the kth-1 th iteration by using the local model parameters with regularization constraints by the client; the regularization constraint is determined by the client according to global model parameters of the server and local model parameters of the client;
the processing unit 502 determines a global model parameter of the kth iteration according to the local model parameter of the kth iteration and the global model parameter of the k-1 iteration;
broadcasting the global model parameters of the kth iteration to the plurality of clients so that the plurality of clients perform the (k + 1) th iteration training.
Optionally, the processing unit 502 is specifically configured to:
determine the global model parameters of the kth iteration according to the following formula (3):

W_k^{(0)} = W_{k-1}^{(0)} + \alpha_k \sum_{i=1}^{N} \left( W_k^{(i)} - W_{k-1}^{(0)} \right)    (3)

where W_k^{(0)} is the global model parameter of the kth iteration, W_{k-1}^{(0)} is the global model parameter of the (k-1)th iteration, \alpha_k is the learning rate of the kth iteration, W_k^{(i)} is the local model parameter of the kth iteration of the ith client, and N is the number of clients.
Based on the same technical concept, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the training method of the federated learning model according to the obtained program.
Based on the same technical concept, the embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are configured to enable a computer to execute the above-mentioned method for training the federated learning model.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A method for training a federated learning model is characterized by comprising the following steps:
the client acquires global model parameters of the (k-1) th iteration broadcasted by the server; k is a positive integer;
the client uses the global model parameters as local model parameters with a regularization constraint, and performs the kth iteration of training using local data to obtain the local model parameters of the kth iteration of training; the regularization constraint is determined according to the global model parameters of the server and the local model parameters of the client;
and the client sends the local model parameters of the kth iteration of training to the server, so that the server updates the global model parameters of the kth iteration.
2. The method of claim 1, wherein determining the regularization constraint based on the global model parameters of the server and the local model parameters of the client comprises:
performing an F-norm calculation on the difference between the global model parameters and the local model parameters to obtain the regularization constraint.
3. The method of claim 1, wherein a final loss function of the local model parameters is determined according to the following formula (1):

\tilde{J}(W^{(i)}) = J(W^{(i)}) + \beta J_{T-S}(W^{(i)})    (1)

where \tilde{J}(W^{(i)}) is the final loss function of the local model parameters, J(W^{(i)}) is the loss function of the local model parameters, J_{T-S}(W^{(i)}) is the regularization constraint, and \beta is the coefficient of the regularization constraint.
4. The method of claim 1, wherein the local model parameters of the kth iteration of training are determined according to the following formula (2):

W_k^{(i)} = W_{k-1}^{(i)} - \alpha_k \left( \nabla J(W_{k-1}^{(i)}) + \beta \nabla J_{T-S}(W_{k-1}^{(i)}) \right)    (2)

where W_k^{(i)} is the local model parameter of the kth iteration, W_{k-1}^{(i)} is the local model parameter of the (k-1)th iteration, \alpha_k is the learning rate of the kth iteration, \nabla J(W_{k-1}^{(i)}) is the gradient of the original loss function of the local model parameters, \nabla J_{T-S}(W_{k-1}^{(i)}) is the gradient of the regularization constraint, and W_{k-1}^{(0)} is the global model parameter of the (k-1)th iteration, which enters the regularization constraint.
5. A method for training a federated learning model is characterized by comprising the following steps:
the server obtains the local model parameters of the kth iteration sent by a plurality of clients; the local model parameters of the kth iteration are obtained by each client training the global model parameters of the (k-1)th iteration, used as local model parameters with a regularization constraint; the regularization constraint is determined by the client according to the global model parameters of the server and the local model parameters of the client;
the server determines the global model parameters of the kth iteration according to the local model parameters of the kth iteration and the global model parameters of the (k-1)th iteration;
and the server broadcasts the global model parameters of the kth iteration to the plurality of clients so that the plurality of clients perform the (k + 1) th iteration training.
6. The method of claim 5, wherein the global model parameters of the kth iteration are determined according to the following formula (3):

W_k^{(0)} = W_{k-1}^{(0)} + \alpha_k \sum_{i=1}^{N} \left( W_k^{(i)} - W_{k-1}^{(0)} \right)    (3)

where W_k^{(0)} is the global model parameter of the kth iteration, W_{k-1}^{(0)} is the global model parameter of the (k-1)th iteration, \alpha_k is the learning rate of the kth iteration, W_k^{(i)} is the local model parameter of the kth iteration of the ith client, and N is the number of clients.
7. A training apparatus for a federated learning model, characterized by comprising:
the acquisition module is used for acquiring global model parameters of the (k-1) th iteration broadcasted by the server; k is a positive integer;
the processing module is used for taking the global model parameters as local model parameters with a regularization constraint, and performing the kth iteration of training using local data to obtain the local model parameters of the kth iteration of training; the regularization constraint is determined according to the global model parameters of the server and the local model parameters of the client;
and for sending the local model parameters of the kth iteration of training to the server, so that the server updates the global model parameters of the kth iteration.
8. A training apparatus for a federated learning model, characterized by comprising:
the acquisition unit is used for acquiring the local model parameters of the kth iteration sent by a plurality of clients; the local model parameters of the kth iteration are obtained by each client training the global model parameters of the (k-1)th iteration, used as local model parameters with a regularization constraint; the regularization constraint is determined by the client according to the global model parameters of the server and the local model parameters of the client;
the processing unit is used for determining the global model parameters of the kth iteration according to the local model parameters of the kth iteration and the global model parameters of the k-1 iteration;
broadcasting the global model parameters of the kth iteration to the plurality of clients so that the plurality of clients perform the (k + 1) th iteration training.
9. A computing device, comprising:
a memory for storing program instructions;
a processor for calling program instructions stored in said memory to perform the method of any one of claims 1 to 4 or 5 to 6 in accordance with the obtained program.
10. A computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform the method of any one of claims 1 to 4 or 5 to 6.
CN202010564409.1A 2020-06-19 2020-06-19 Method and device for training federated learning model Pending CN111723947A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010564409.1A CN111723947A (en) 2020-06-19 2020-06-19 Method and device for training federated learning model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010564409.1A CN111723947A (en) 2020-06-19 2020-06-19 Method and device for training federated learning model

Publications (1)

Publication Number Publication Date
CN111723947A true CN111723947A (en) 2020-09-29

Family

ID=72567617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010564409.1A Pending CN111723947A (en) 2020-06-19 2020-06-19 Method and device for training federated learning model

Country Status (1)

Country Link
CN (1) CN111723947A (en)


Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112348063A (en) * 2020-10-27 2021-02-09 广东电网有限责任公司电力调度控制中心 Model training method and device based on federal transfer learning in Internet of things
CN112348063B (en) * 2020-10-27 2024-06-11 广东电网有限责任公司电力调度控制中心 Model training method and device based on federal migration learning in Internet of things
CN112532451A (en) * 2020-11-30 2021-03-19 安徽工业大学 Layered federal learning method and device based on asynchronous communication, terminal equipment and storage medium
CN112668128A (en) * 2020-12-21 2021-04-16 国网辽宁省电力有限公司物资分公司 Method and device for selecting terminal equipment nodes in federated learning system
CN112668128B (en) * 2020-12-21 2024-05-28 国网辽宁省电力有限公司物资分公司 Method and device for selecting terminal equipment nodes in federal learning system
CN112906911A (en) * 2021-02-03 2021-06-04 厦门大学 Model training method for federal learning
CN112906911B (en) * 2021-02-03 2022-07-01 厦门大学 Model training method for federal learning
CN113139662B (en) * 2021-04-23 2023-07-14 深圳市大数据研究院 Global and local gradient processing method, device, equipment and medium for federal learning
CN113139662A (en) * 2021-04-23 2021-07-20 深圳市大数据研究院 Global and local gradient processing method, device, equipment and medium for federal learning
CN113095513A (en) * 2021-04-25 2021-07-09 中山大学 Double-layer fair federal learning method, device and storage medium
CN113378994A (en) * 2021-07-09 2021-09-10 浙江大学 Image identification method, device, equipment and computer readable storage medium
CN113837399B (en) * 2021-10-26 2023-05-30 医渡云(北京)技术有限公司 Training method, device, system, storage medium and equipment for federal learning model
CN113837399A (en) * 2021-10-26 2021-12-24 医渡云(北京)技术有限公司 Federal learning model training method, device, system, storage medium and equipment
WO2024022082A1 (en) * 2022-07-29 2024-02-01 脸萌有限公司 Information classification method and apparatus, device, and medium


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination