CN111723947A - Method and device for training federated learning model - Google Patents
- Publication number
- CN111723947A (application CN202010564409.1A)
- Authority
- CN
- China
- Prior art keywords
- model parameters
- iteration
- local
- kth
- global
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
Abstract
The invention discloses a method and a device for training a federated learning model. A client obtains the global model parameters of the (k-1)-th iteration broadcast by the server, where k is a positive integer, takes the global model parameters as local model parameters subject to a regularization constraint, and performs the k-th round of training on its local data to obtain the local model parameters of the k-th iteration. The regularization constraint is determined from the global model parameters of the server and the local model parameters of the client; it restrains the gradients during training, reducing the influence of extreme data on the training of the local model parameters and improving the accuracy of the local model parameters on non-independent and identically distributed (non-IID) data. The client then sends the local model parameters of the k-th iteration to the server, so that the server updates the global model parameters for the k-th iteration, likewise improving the accuracy of the global model parameters on non-IID data.
Description
Technical Field
The invention relates to the field of financial technology (Fintech), and in particular to a method and a device for training a federated learning model.
Background
With the development of computer technology, more and more technologies (such as blockchain, cloud computing and big data) are being applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology. Big data technology is no exception; however, the security and real-time requirements of the finance and payment industries place higher demands on it.
In the prior art, federated learning performs communication among nodes by transmitting parameters, and the information contributed by each node's data is integrated by parameter averaging during training. In each round, a node is selected at random, the global model is issued to that node, an iteration is performed on the node's data, and the resulting model parameters are sent back to the central server; the central server averages the model parameters trained by each node to determine the model for the next iteration.
However, in prior-art federated learning, a model trained on data that is not independent and identically distributed has low accuracy and performs poorly. A method is therefore needed to improve the accuracy of model parameters trained on non-independent and identically distributed (non-IID) data.
Disclosure of Invention
The embodiment of the invention provides a method and a device for training a federated learning model, which improve the accuracy of models trained on non-independent and identically distributed (non-IID) data.
In a first aspect, an embodiment of the present invention provides a method for training a federated learning model, including:
the client acquires the global model parameters of the (k-1)-th iteration broadcast by the server, where k is a positive integer;
the client takes the global model parameters as local model parameters subject to a regularization constraint and performs the k-th round of training on local data to obtain the local model parameters of the k-th iteration; the regularization constraint is determined from the global model parameters of the server and the local model parameters of the client;
and the client sends the local model parameters of the k-th iteration to the server, so that the server updates the global model parameters for the k-th iteration.
In this technical scheme, the client acquires the global model parameters of the (k-1)-th iteration (when k = 1, these are the initial global model parameters) and sets them, with a regularization constraint, as the local model parameters, so that the global and local model parameters remain linked by the constraint throughout the iterations. When the local model parameters are trained on local data, the constraint restrains the loss function and optimizes the gradient of the local model parameters, which reduces the influence of extreme data in the local data on the training result and improves the accuracy of the local model parameters on non-IID data. The local model parameters of the k-th iteration are then sent to the server so that the server can update the global model parameters of the k-th iteration, likewise improving the accuracy of the global model parameters on non-IID data.
Optionally, determining the regularization constraint from the global model parameters of the server and the local model parameters of the client includes:
computing the F-norm of the difference between the global model parameters and the local model parameters to obtain the regularization constraint.
In this technical scheme, the regularization constraint is obtained from the global model parameters and the local model parameters and is used to optimize the loss function of the local model parameters, improving their accuracy on non-IID data.
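As an illustrative sketch (not the patent's reference implementation), the F-norm constraint can be computed as follows. The use of the squared Frobenius norm is an assumption, chosen to be consistent with the gradient of the form (W^(i) - W^(0)) used in the later update step:

```python
import numpy as np

def regularization_constraint(w_global: np.ndarray, w_local: np.ndarray) -> float:
    # J_{T-S}(W^(i)) = ||W^(0) - W^(i)||_F^2 : squared Frobenius norm of the
    # difference between the global and local weight matrices (squaring is an
    # assumption consistent with the gradient used in the update formula).
    return float(np.linalg.norm(w_global - w_local, ord="fro") ** 2)
```

For identical global and local weights the constraint is zero, so it only penalizes the local model for drifting away from the broadcast global model.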
Optionally, the final loss function of the local model parameters is determined according to the following formula (1):

J~(W^(i)) = J(W^(i)) + β·J_{T-S}(W^(i))    (1)

where J~(W^(i)) is the final loss function of the local model parameters, J(W^(i)) is the original loss function of the local model parameters, J_{T-S}(W^(i)) is the regularization constraint, and β is the coefficient of the regularization constraint.
Optionally, the local model parameters of the k-th iteration are determined according to the following formula (2):

W_k^(i) = W_{k-1}^(i) - α_k·(∇J(W_{k-1}^(i)) + β·∇J_{T-S}(W_{k-1}^(i)))    (2)

where W_k^(i) is the local model parameter of the k-th iteration, W_{k-1}^(i) is the local model parameter of the (k-1)-th iteration, α_k is the learning rate of the k-th iteration, ∇J(W_{k-1}^(i)) is the gradient of the original loss function of the local model parameters, ∇J_{T-S}(W_{k-1}^(i)) is the gradient of the regularization constraint, and W_{k-1}^(0) is the global model parameter of the (k-1)-th iteration.
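A minimal sketch of the client-side update of formula (2), assuming the squared-F-norm constraint so that its gradient is proportional to (W^(i) - W^(0)); the function name and shapes are illustrative, not taken from the patent:

```python
import numpy as np

def local_update(w_local: np.ndarray, w_global: np.ndarray,
                 grad_loss: np.ndarray, lr: float, beta: float) -> np.ndarray:
    # Formula (2): one gradient step on the regularized (final) loss.
    # grad_loss is the gradient of the original local loss J(W^(i));
    # the constraint gradient assumes J_{T-S} = ||W^(0) - W^(i)||_F^2,
    # whose gradient w.r.t. W^(i) is 2*(W^(i) - W^(0)).
    grad_reg = 2.0 * (w_local - w_global)
    return w_local - lr * (grad_loss + beta * grad_reg)
```

When the local weights equal the global weights (as at the start of a round), the constraint contributes nothing; it only pulls the local model back as it drifts during training.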
In a second aspect, an embodiment of the present invention provides a method for training a federated learning model, including:
the server obtains the local model parameters of the k-th iteration sent by a plurality of clients; the local model parameters of the k-th iteration are obtained by each client training, on its local data, the global model parameters of the (k-1)-th iteration used as local model parameters subject to a regularization constraint; the regularization constraint is determined by the client from the global model parameters of the server and the local model parameters of the client;
the server determines the global model parameters of the k-th iteration from the local model parameters of the k-th iteration and the global model parameters of the (k-1)-th iteration;
and the server broadcasts the global model parameters of the k-th iteration to the plurality of clients, so that the plurality of clients perform the (k+1)-th round of training.
In this technical scheme, the server obtains the local model parameters of the k-th iteration sent by the plurality of clients and determines the global model parameters of the k-th iteration. Because training on local data improves the accuracy of the local model parameters on non-IID data, updating the global model parameters from those local model parameters likewise improves the accuracy of the global model parameters on non-IID data. The global model parameters of the k-th iteration are then broadcast to the plurality of clients for the next iteration.
Optionally, the global model parameters of the k-th iteration are determined according to the following formula (3):

W_k^(0) = W_{k-1}^(0) + α_k·Σ_i (W_k^(i) - W_{k-1}^(0))    (3)

where W_k^(0) is the global model parameter of the k-th iteration, W_{k-1}^(0) is the global model parameter of the (k-1)-th iteration, α_k is the learning rate of the k-th iteration, and W_k^(i) is the local model parameter of the k-th iteration of the i-th client.
In a third aspect, an embodiment of the present invention provides a training apparatus for a federated learning model, including:
an acquisition module, configured to acquire the global model parameters of the (k-1)-th iteration broadcast by the server, where k is a positive integer;
a processing module, configured to take the global model parameters as local model parameters subject to a regularization constraint and perform the k-th round of training on local data to obtain the local model parameters of the k-th iteration, the regularization constraint being determined from the global model parameters of the server and the local model parameters of the client;
and to send the local model parameters of the k-th iteration to the server, so that the server updates the global model parameters for the k-th iteration.
Optionally, the processing module is specifically configured to:
compute the F-norm of the difference between the global model parameters and the local model parameters to obtain the regularization constraint.
Optionally, the processing module is specifically configured to:
determine the final loss function of the local model parameters according to the following formula (1):

J~(W^(i)) = J(W^(i)) + β·J_{T-S}(W^(i))    (1)

where J~(W^(i)) is the final loss function of the local model parameters, J(W^(i)) is the original loss function of the local model parameters, J_{T-S}(W^(i)) is the regularization constraint, and β is the coefficient of the regularization constraint.
Optionally, the processing module is specifically configured to:
determine the local model parameters of the k-th iteration according to the following formula (2):

W_k^(i) = W_{k-1}^(i) - α_k·(∇J(W_{k-1}^(i)) + β·∇J_{T-S}(W_{k-1}^(i)))    (2)

where W_k^(i) is the local model parameter of the k-th iteration, W_{k-1}^(i) is the local model parameter of the (k-1)-th iteration, α_k is the learning rate of the k-th iteration, ∇J(W_{k-1}^(i)) is the gradient of the original loss function of the local model parameters, ∇J_{T-S}(W_{k-1}^(i)) is the gradient of the regularization constraint, and W_{k-1}^(0) is the global model parameter of the (k-1)-th iteration.
In a fourth aspect, an embodiment of the present invention provides a training apparatus for a federated learning model, including:
an acquisition unit, configured to acquire the local model parameters of the k-th iteration sent by a plurality of clients; the local model parameters of the k-th iteration are obtained by each client training, on its local data, the global model parameters of the (k-1)-th iteration used as local model parameters subject to a regularization constraint; the regularization constraint is determined by the client from the global model parameters of the server and the local model parameters of the client;
a processing unit, configured to determine the global model parameters of the k-th iteration from the local model parameters of the k-th iteration and the global model parameters of the (k-1)-th iteration;
and to broadcast the global model parameters of the k-th iteration to the plurality of clients, so that the plurality of clients perform the (k+1)-th round of training.
Optionally, the processing unit is specifically configured to:
determine the global model parameters of the k-th iteration according to the following formula (3):

W_k^(0) = W_{k-1}^(0) + α_k·Σ_i (W_k^(i) - W_{k-1}^(0))    (3)

where W_k^(0) is the global model parameter of the k-th iteration, W_{k-1}^(0) is the global model parameter of the (k-1)-th iteration, α_k is the learning rate of the k-th iteration, and W_k^(i) is the local model parameter of the k-th iteration of the i-th client.
In a fifth aspect, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the training method of the federal learning model according to the obtained program.
In a sixth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where computer-executable instructions are stored, and the computer-executable instructions are configured to cause a computer to execute the above method for training the federal learning model.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a system architecture diagram according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a method for training a federated learning model according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a method for training a federated learning model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a device for training a federated learning model according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a training device of a federated learning model according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 illustrates an exemplary system architecture to which an embodiment of the present invention is applicable, which includes a server 100 and a client 200.
The server 100 is configured to connect with the client 200 and send the global model parameters of the (k-1)-th iteration to the client 200. It should be noted that fig. 1 illustrates only one exemplary client 200; in practice there may be multiple clients 200, which is not limited herein.
The client 200 is configured to obtain the global model parameters of the (k-1)-th iteration sent by the server 100, take them as local model parameters subject to the regularization constraint, train on local data to obtain the local model parameters of the k-th iteration, and send the local model parameters of the k-th iteration to the server 100, so that the server 100 updates the global model parameters of the k-th iteration.
It should be noted that the structure shown in fig. 1 is only an example, and the embodiment of the present invention is not limited thereto.
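The round-trip between client 200 and server 100 can be sketched end to end as follows. The per-client least-squares loss, the single local step, and all names are illustrative assumptions for the sake of a runnable example; the patent does not prescribe a particular loss:

```python
import numpy as np

def federated_round(w_global, clients, lr=0.1, beta=0.1):
    # One round of the scheme in fig. 1: each client takes one regularized
    # local step (formula (2)) on a toy least-squares loss, then the server
    # sums the client deltas and steps along them (formula (3)).
    local_ws = []
    for X, y in clients:
        w = w_global.copy()
        grad_loss = 2.0 * X.T @ (X @ w - y) / len(y)  # d/dw mean ||Xw - y||^2
        grad_reg = 2.0 * beta * (w - w_global)        # F-norm constraint gradient
        local_ws.append(w - lr * (grad_loss + grad_reg))
    # Server side: W_k^(0) = W_{k-1}^(0) + lr * sum_i (W_k^(i) - W_{k-1}^(0))
    return w_global + lr * sum(w_i - w_global for w_i in local_ws)
```

Since each client starts the round from the broadcast global weights, the constraint gradient is zero on the first local step and only matters when clients take several local steps per round.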
Based on the above description, fig. 2 exemplarily shows a flow of a method for training a federal learning model according to an embodiment of the present invention, where the flow may be performed by a training apparatus of the federal learning model.
As shown in fig. 2, the process specifically includes:
In the embodiment of the invention, before acquiring the global model parameters of the (k-1)-th iteration broadcast by the server, the client sends the local model parameters of the (k-1)-th iteration to the server, so that the server can update the global model parameters of the (k-1)-th iteration and then broadcast them.
According to the embodiment of the invention, after the client acquires the global model parameters when k = 1, it takes the acquired global model parameters as the local model parameters and sets the regularization constraint, so that until the model converges the global and local model parameters remain linked by the constraint. The final loss function of the local model parameters is thereby obtained, and the influence of the client's extreme data on model training is reduced.
Further, determining the regularization constraint from the global model parameters of the server and the local model parameters of the client includes: computing the F-norm of the difference between the global model parameters and the local model parameters to obtain the regularization constraint.
The regularization constraint is computed from the weights in the global model parameters sent by the server and the weights in the local model parameters of the client. The F-norm (Frobenius norm) is a matrix norm equal to the square root of the sum of the squares of all elements of the matrix. Specifically, the regularization constraint is determined according to the following formula (4):

J_{T-S}(W^(i)) = ||W^(0) - W^(i)||_F^2    (4)

where J_{T-S}(W^(i)) is the regularization constraint, W^(0) is the weight in the global model parameters, and W^(i) is the weight in the local model parameters.
Further, the final loss function of the local model parameters is determined according to the following formula (1):

J~(W^(i)) = J(W^(i)) + β·J_{T-S}(W^(i))    (1)

where J~(W^(i)) is the final loss function of the local model, J(W^(i)) is the original loss function of the local model parameters, J_{T-S}(W^(i)) is the regularization constraint, and β is the coefficient of the regularization constraint.
After the client obtains the global model parameters of the (k-1)-th iteration, it computes the regularization constraint, takes the global model parameters, subject to the regularization constraint, as the local model parameters, and then performs model training to obtain the final loss function of the local model parameters and, from it, the updated local model parameters.
Under this algorithm, extreme data in the local distribution has a large effect on the original loss function J(W^(i)) of the local model parameters; however, by the regularization constraint, when W^(0) is unchanged and W^(i) drifts away from it, the constraint term J_{T-S}(W^(i)) grows and penalizes the drift, so the final loss function of the local model is pushed toward a balance and the influence of the extreme data on it is reduced.
The local model parameters of the k-th iteration are determined according to the following formula (2):

W_k^(i) = W_{k-1}^(i) - α_k·(∇J(W_{k-1}^(i)) + β·∇J_{T-S}(W_{k-1}^(i)))    (2)

where W_k^(i) is the local model parameter of the k-th iteration, W_{k-1}^(i) is the local model parameter of the (k-1)-th iteration, α_k is the learning rate of the k-th iteration, ∇J(W_{k-1}^(i)) is the gradient of the original loss function of the local model parameters, ∇J_{T-S}(W_{k-1}^(i)) is the gradient of the regularization constraint, and W_{k-1}^(0) is the global model parameter of the (k-1)-th iteration.
The updated local model parameters of the k-th iteration are obtained by differentiating the final loss function and taking a gradient step from the local model parameters of the (k-1)-th iteration.
To better explain the above technical solution, a specific example follows.
Example 1
Non-independent and identically distributed data is initialized as the training set for model training in either of the following two ways.
1. Sort the data by label and divide it into multiple shares (for example ten), each client holding data for only a few labels (for example two). With labels 1-10, client 1 might train on labels 1 and 8 and client 2 on labels 9 and 7, so that no client's data can serve as a representative of the global data distribution.
2. Divide a reference data set into ten unequal parts, so that the amount of data on each client differs greatly and no client's data can serve as a representative of the global data distribution.
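Mode 1 above (label-sorted shards) can be sketched as follows; the shard counts and function name are illustrative assumptions:

```python
import numpy as np

def label_sorted_partition(labels, n_clients=10, shards_per_client=2, seed=0):
    # Sort sample indices by label, cut them into equal shards, and hand each
    # client a couple of shards, so every client sees only a few labels and is
    # not representative of the global label distribution.
    order = np.argsort(labels, kind="stable")
    shards = np.array_split(order, n_clients * shards_per_client)
    rng = np.random.default_rng(seed)
    shard_ids = rng.permutation(n_clients * shards_per_client)
    return [
        np.concatenate([shards[s] for s in
                        shard_ids[c * shards_per_client:(c + 1) * shards_per_client]])
        for c in range(n_clients)
    ]
```

With 100 samples over 10 labels and two shards per client, each client ends up seeing at most two distinct labels, which is the non-IID setting the example describes.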
Select 10 clients to perform model training with the above data. Each of the 10 clients obtains the global model parameter W_{k-1}^(0) of the (k-1)-th iteration broadcast by the server and obtains the local model parameters of the k-th iteration by minimizing the final loss function:

W_k^(i) = W_{k-1}^(i) - α_k·(∇J(W_{k-1}^(i)) + β·∇J_{T-S}(W_{k-1}^(i)))

where W_k^(i) is the local model parameter of the k-th iteration of the i-th client, i ∈ {1, 2, ..., 10}, and α_k is the learning rate of the k-th iteration. The local model parameters of the k-th iteration of any single client cannot serve as a representative of the global model parameters.
In the embodiment of the invention, the plurality of clients send the local model parameters of the k-th iteration obtained after training to the server, so that the server obtains local model parameters of the k-th iteration corresponding to the whole data set, updates the global model parameters of the k-th iteration, and broadcasts them in the next round.
In the embodiment of the invention, the acquired global model parameters of the (k-1)-th iteration are set with a regularization constraint and used as the local model parameters, so that when the local model parameters are trained, the influence of extreme data in the local data on the training result is reduced and the accuracy of the local model parameters on non-IID data is improved. The local model parameters of the k-th iteration are then sent to the server, improving the accuracy of the global model parameters on non-IID data as well.
Fig. 3 exemplarily shows the flow of a method for training a federated learning model according to an embodiment of the present invention.
As shown in fig. 3, the specific process includes:
Step 301: the server obtains the local model parameters of the k-th iteration sent by a plurality of clients, where the local model parameters are subject to the regularization constraint.
Step 302: the server determines the global model parameters of the k-th iteration from the local model parameters of the k-th iteration and the global model parameters of the (k-1)-th iteration.
In the embodiment of the invention, the server computes the difference between each client's local model parameters of the k-th iteration and the global model parameters of the (k-1)-th iteration, sums these differences, and obtains the global model parameters of the k-th iteration from the learning rate and the global model parameters of the (k-1)-th iteration.
further, according to the following formula (3), determining global model parameters of the kth iteration;
wherein, Wk (0)The global model parameter, W, for the kth iterationk-1 (0)α for the global model parameters of the k-1 th iterationkLearning rate for the kth iteration, Wk (i)Local model parameters for the kth iteration for the ith client.
Compared with the traditional mean-value aggregation in federated learning, the embodiment of the invention sums the differences between the local model parameters of the k-th iteration of all clients and the global model parameters of the (k-1)-th iteration, and adds this sum, scaled by the learning rate, to the global model parameters of the (k-1)-th iteration as the updated global model parameters of the k-th iteration.
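The server-side step of formula (3) can be sketched as follows; the function name is illustrative:

```python
import numpy as np

def server_update(w_global_prev, local_ws, lr):
    # Formula (3): instead of averaging the local models (classic federated
    # averaging), sum each client's delta from the previous global model and
    # take that sum, scaled by the learning rate, as the global update step.
    delta = sum(w - w_global_prev for w in local_ws)
    return w_global_prev + lr * delta
```

With lr = 1 / n_clients this step coincides with plain averaging of the local models, so the learning rate controls how aggressively the summed deltas are applied.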
The above technical solution is described in the following specific example, continuing Example 1 of fig. 2 above.
Example 2
Obtain the local model parameters of the k-th iteration sent by the 10 clients, then sum the differences between each client's local model parameters of the k-th iteration and the global model parameters of the (k-1)-th iteration:

Δ_k = Σ_{i=1}^{10} (W_k^(i) - W_{k-1}^(0))

then multiply the sum by the learning rate and add it to the global model parameters of the (k-1)-th iteration, obtaining the global model parameters of the k-th iteration:

W_k^(0) = W_{k-1}^(0) + α_k·Δ_k
According to the embodiment of the invention, the server broadcasts the global model parameters of the k-th iteration to the plurality of clients; the clients do not need to reset the regularization constraint and directly perform the (k+1)-th round of training, and the server then obtains the local model parameters of the (k+1)-th iteration sent by the clients and updates the global model parameters of the (k+1)-th iteration.
Based on the same technical concept, fig. 4 exemplarily shows the structure of a training apparatus for a federated learning model provided in an embodiment of the present invention, and the apparatus may execute the flow of the training method for the federated learning model in fig. 2.
As shown in fig. 4, the apparatus specifically includes:
an obtaining module 401, configured to obtain global model parameters of a k-1 st iteration broadcasted by a server; k is a positive integer;
a processing module 402, configured to use the global model parameter as a parameter of a local model with regularization constraints, and perform a kth iterative training using local data to obtain a local model parameter of the kth iterative training; the regularization constraint is determined according to global model parameters of the server and local model parameters of the client;
and to send the local model parameters of the k-th iteration to the server, so that the server updates the global model parameters for the k-th iteration.
Optionally, the processing module 402 is specifically configured to:
and F norm calculation is carried out on the difference value of the global model parameter and the local model parameter to obtain the regularization constraint.
Optionally, the processing module 402 is specifically configured to:
determine the final loss function of the local model parameters according to the following formula (1):

J~(W^(i)) = J(W^(i)) + β·J_{T-S}(W^(i))    (1)

where J~(W^(i)) is the final loss function of the local model parameters, J(W^(i)) is the original loss function of the local model parameters, J_{T-S}(W^(i)) is the regularization constraint, and β is the coefficient of the regularization constraint.
Optionally, the processing module 402 is specifically configured to:
determine the local model parameters of the k-th iteration according to the following formula (2):

W_k^(i) = W_{k-1}^(i) - α_k·(∇J(W_{k-1}^(i)) + β·∇J_{T-S}(W_{k-1}^(i)))    (2)

where W_k^(i) is the local model parameter of the k-th iteration, W_{k-1}^(i) is the local model parameter of the (k-1)-th iteration, α_k is the learning rate of the k-th iteration, ∇J(W_{k-1}^(i)) is the gradient of the original loss function of the local model parameters, ∇J_{T-S}(W_{k-1}^(i)) is the gradient of the regularization constraint, and W_{k-1}^(0) is the global model parameter of the (k-1)-th iteration.
Fig. 5 exemplarily shows a structure of a training apparatus for a federal learning model according to an embodiment of the present invention, which may execute a flow of a training method for the federal learning model in fig. 3. As shown in fig. 5, the apparatus specifically includes:
an obtaining unit 501, configured to obtain the local model parameters of the k-th iteration sent by a plurality of clients; the local model parameters of the k-th iteration are obtained by each client training, on its local data, the global model parameters of the (k-1)-th iteration used as local model parameters subject to a regularization constraint; the regularization constraint is determined by the client from the global model parameters of the server and the local model parameters of the client;
a processing unit 502, configured to determine the global model parameters of the k-th iteration from the local model parameters of the k-th iteration and the global model parameters of the (k-1)-th iteration;
and to broadcast the global model parameters of the k-th iteration to the plurality of clients, so that the plurality of clients perform the (k+1)-th round of training.
Optionally, the processing unit 502 is specifically configured to:
determine the global model parameters of the k-th iteration according to the following formula (3):

W_k^(0) = W_{k-1}^(0) + α_k·Σ_i (W_k^(i) - W_{k-1}^(0))    (3)

where W_k^(0) is the global model parameter of the k-th iteration, W_{k-1}^(0) is the global model parameter of the (k-1)-th iteration, α_k is the learning rate of the k-th iteration, and W_k^(i) is the local model parameter of the k-th iteration of the i-th client.
Based on the same technical concept, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the training method of the federal learning model according to the obtained program.
Based on the same technical concept, an embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being configured to cause a computer to execute the above-mentioned training method of the federated learning model.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (10)
1. A method for training a federated learning model is characterized by comprising the following steps:
the client acquires global model parameters of the (k-1) th iteration broadcasted by the server; k is a positive integer;
the client initializes local model parameters with the global model parameters, applies the regularization constraint, and performs the kth iterative training using local data to obtain the local model parameters of the kth iterative training; the regularization constraint is determined according to the global model parameters of the server and the local model parameters of the client;
and the client sends the local model parameters of the kth iterative training to the server, so that the server updates the global model parameters of the kth iteration.
2. The method of claim 1, wherein the regularization constraint determined based on the global model parameters of the server and the local model parameters of the client comprises:
performing an F-norm (Frobenius norm) calculation on the difference between the global model parameters and the local model parameters to obtain the regularization constraint.
3. The method of claim 1, wherein a final loss function for the local model parameters is determined according to equation (1) below;
4. The method of claim 1, wherein the local model parameters for the kth iterative training are determined according to the following formula (2);
where W_k^(i) is the local model parameter of the kth iteration, W_{k-1}^(i) is the local model parameter of the (k-1)th iteration, α_k is the learning rate of the kth iteration, the two gradient terms are the gradient of the original loss function with respect to the local model parameters and the gradient of the regularization constraint, respectively, and W_{k-1}^(0) is the global model parameter of the (k-1)th iteration.
5. A method for training a federated learning model is characterized by comprising the following steps:
the server obtains the local model parameters of the kth iteration sent by a plurality of clients; the local model parameters of the kth iteration are obtained by each client training, with the regularization constraint, starting from the global model parameters of the (k-1)th iteration; the regularization constraint is determined by the client according to the global model parameters of the server and the local model parameters of the client;
the server determines the global model parameters of the kth iteration according to the local model parameters of the kth iteration and the global model parameters of the (k-1)th iteration;
and the server broadcasts the global model parameters of the kth iteration to the plurality of clients, so that the plurality of clients perform the (k+1)th iterative training.
6. The method of claim 5, wherein the global model parameters for the kth iteration are determined according to the following equation (3);
where W_k^(0) is the global model parameter of the kth iteration, W_{k-1}^(0) is the global model parameter of the (k-1)th iteration, α_k is the learning rate of the kth iteration, and W_k^(i) is the local model parameter of the kth iteration of the ith client.
7. A training apparatus for a federated learning model, characterized by comprising:
an acquisition module, configured to acquire the global model parameters of the (k-1)th iteration broadcasted by the server; k is a positive integer;
a processing module, configured to initialize local model parameters with the global model parameters, apply the regularization constraint, and perform the kth iterative training using local data to obtain the local model parameters of the kth iterative training; the regularization constraint is determined according to the global model parameters of the server and the local model parameters of the client;
and to send the local model parameters of the kth iterative training to the server, so that the server updates the global model parameters of the kth iteration.
8. A training apparatus for a federated learning model, characterized by comprising:
an obtaining unit, configured to obtain the local model parameters of the kth iteration sent by a plurality of clients; the local model parameters of the kth iteration are obtained by each client training, with the regularization constraint, starting from the global model parameters of the (k-1)th iteration; the regularization constraint is determined by the client according to the global model parameters of the server and the local model parameters of the client;
a processing unit, configured to determine the global model parameters of the kth iteration according to the local model parameters of the kth iteration and the global model parameters of the (k-1)th iteration;
and to broadcast the global model parameters of the kth iteration to the plurality of clients, so that the plurality of clients perform the (k+1)th iterative training.
9. A computing device, comprising:
a memory for storing program instructions;
a processor for calling program instructions stored in said memory to perform the method of any one of claims 1 to 4 or 5 to 6 in accordance with the obtained program.
10. A computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform the method of any one of claims 1 to 4 or 5 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010564409.1A CN111723947A (en) | 2020-06-19 | 2020-06-19 | Method and device for training federated learning model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111723947A true CN111723947A (en) | 2020-09-29 |
Family
ID=72567617
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010564409.1A Pending CN111723947A (en) | 2020-06-19 | 2020-06-19 | Method and device for training federated learning model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111723947A (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112348063A (en) * | 2020-10-27 | 2021-02-09 | 广东电网有限责任公司电力调度控制中心 | Model training method and device based on federal transfer learning in Internet of things |
CN112348063B (en) * | 2020-10-27 | 2024-06-11 | 广东电网有限责任公司电力调度控制中心 | Model training method and device based on federal migration learning in Internet of things |
CN112532451A (en) * | 2020-11-30 | 2021-03-19 | 安徽工业大学 | Layered federal learning method and device based on asynchronous communication, terminal equipment and storage medium |
CN112668128A (en) * | 2020-12-21 | 2021-04-16 | 国网辽宁省电力有限公司物资分公司 | Method and device for selecting terminal equipment nodes in federated learning system |
CN112668128B (en) * | 2020-12-21 | 2024-05-28 | 国网辽宁省电力有限公司物资分公司 | Method and device for selecting terminal equipment nodes in federal learning system |
CN112906911A (en) * | 2021-02-03 | 2021-06-04 | 厦门大学 | Model training method for federal learning |
CN112906911B (en) * | 2021-02-03 | 2022-07-01 | 厦门大学 | Model training method for federal learning |
CN113139662B (en) * | 2021-04-23 | 2023-07-14 | 深圳市大数据研究院 | Global and local gradient processing method, device, equipment and medium for federal learning |
CN113139662A (en) * | 2021-04-23 | 2021-07-20 | 深圳市大数据研究院 | Global and local gradient processing method, device, equipment and medium for federal learning |
CN113095513A (en) * | 2021-04-25 | 2021-07-09 | 中山大学 | Double-layer fair federal learning method, device and storage medium |
CN113378994A (en) * | 2021-07-09 | 2021-09-10 | 浙江大学 | Image identification method, device, equipment and computer readable storage medium |
CN113837399B (en) * | 2021-10-26 | 2023-05-30 | 医渡云(北京)技术有限公司 | Training method, device, system, storage medium and equipment for federal learning model |
CN113837399A (en) * | 2021-10-26 | 2021-12-24 | 医渡云(北京)技术有限公司 | Federal learning model training method, device, system, storage medium and equipment |
WO2024022082A1 (en) * | 2022-07-29 | 2024-02-01 | 脸萌有限公司 | Information classification method and apparatus, device, and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111723947A (en) | Method and device for training federated learning model | |
CN112287982A (en) | Data prediction method and device and terminal equipment | |
CN110659678B (en) | User behavior classification method, system and storage medium | |
CN109889397B (en) | Lottery method, block generation method, equipment and storage medium | |
CN106325756B (en) | Data storage method, data calculation method and equipment | |
CN110689136B (en) | Deep learning model obtaining method, device, equipment and storage medium | |
CN111461164B (en) | Sample data set capacity expansion method and model training method | |
CN111695696A (en) | Method and device for model training based on federal learning | |
CN104008420A (en) | Distributed outlier detection method and system based on automatic coding machine | |
CN112100450A (en) | Graph calculation data segmentation method, terminal device and storage medium | |
CN111831855A (en) | Method, apparatus, electronic device, and medium for matching videos | |
CN109344268A (en) | Method, electronic equipment and the computer readable storage medium of graphic data base write-in | |
CN114138231B (en) | Method, circuit and SOC for executing matrix multiplication operation | |
CN111626311B (en) | Heterogeneous graph data processing method and device | |
CN110851247A (en) | Cost optimization scheduling method for constrained cloud workflow | |
CN111667018B (en) | Object clustering method and device, computer readable medium and electronic equipment | |
CN113011911B (en) | Data prediction method and device based on artificial intelligence, medium and electronic equipment | |
CN114332550A (en) | Model training method, system, storage medium and terminal equipment | |
CN106844024A (en) | The GPU/CPU dispatching methods and system of a kind of self study run time forecast model | |
CN114972695B (en) | Point cloud generation method and device, electronic equipment and storage medium | |
CN114092162B (en) | Recommendation quality determination method, and training method and device of recommendation quality determination model | |
CN109981726A (en) | A kind of distribution method of memory node, server and system | |
CN111984842B (en) | Bank customer data processing method and device | |
CN113656187A (en) | Public security big data computing power service system based on 5G | |
CN114492844A (en) | Method and device for constructing machine learning workflow, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||