CN113191504A - Federated learning training acceleration method for computing resource heterogeneity - Google Patents

Federated learning training acceleration method for computing resource heterogeneity

Info

Publication number
CN113191504A
Authority
CN
China
Prior art keywords
global model
parameters
gradient
updated
updating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110556962.5A
Other languages
Chinese (zh)
Other versions
CN113191504B (en)
Inventor
何耶肖
李欢
章小宁
吴昊
范晨昱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202110556962.5A priority Critical patent/CN113191504B/en
Publication of CN113191504A publication Critical patent/CN113191504A/en
Application granted granted Critical
Publication of CN113191504B publication Critical patent/CN113191504B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a federated learning training acceleration method oriented to computing resource heterogeneity. The method judges whether the difference in iteration counts between the fastest and the slowest device has reached a threshold. If so, the fastest device does not wait: it directly updates its local model parameters with the gradient update value and then downloads the latest global model parameters to obtain a copy of them. The copy is updated with the additional gradient update parameters, and its loss function value is compared with that of the latest global model parameters; if the loss function value of the updated copy is smaller, the latest global model parameters are replaced by the updated copy. The invention adapts the traditional SSP synchronous parallel mechanism and improves the utilization of computing resources, thereby improving training efficiency and shortening the overall training time.

Description

Federated learning training acceleration method for computing resource heterogeneity
Technical Field
The invention relates to the technical field of federated learning, and in particular to a federated learning training acceleration method oriented to computing resource heterogeneity.
Background
In recent years, with the rapid development of machine learning, many artificial intelligence applications that were difficult to realize with conventional techniques have emerged in all areas of human life, such as data mining, image recognition, natural language processing, biometric recognition, search engines, medical diagnosis, credit card fraud detection, stock market analysis, speech and handwriting recognition, strategy games, and robotics. Machine learning is a data analysis method for automated analytical model construction that allows computers to learn autonomously without explicit programming. As a data-driven technique, machine learning requires a large amount of data for training in order to obtain a high-performance model. Today, with the spread of mobile phones, tablets, and various wearable devices, billions of edge devices generate massive amounts of user data; according to a report by International Data Corporation (IDC), the data generated by edge devices will grow to 79.4 ZB by 2025. This is a valuable data resource for machine learning, and the related techniques and applications of machine learning could advance further if the data stored on edge devices were utilized. However, the traditional approach to training a machine learning model is to upload all the raw data collected by the edge devices to a remote data center for centralized training. This approach requires large amounts of communication resources to transmit large amounts of data, resulting in unacceptably high costs. In addition, as privacy awareness increases, many users are reluctant to upload their data to a data center, and transmitting user data over a communication network also raises the problems of privacy disclosure and data security. Moreover, the centralized training mode cannot be applied to fields such as finance that are highly sensitive to data security.
In order to protect data privacy and reduce communication overhead, federated learning, a distributed training architecture, has been proposed in recent years as a replacement for centralized training. Today, many banks, securities companies, medical equipment manufacturers, and technology companies are actively developing federated learning, whose security and utility have been widely verified. The basic framework of federated learning is shown in fig. 1.
A set of edge devices (also referred to as clients), such as smartphones, laptops, and tablets, use their locally stored data to participate in the distributed training of the model. Each edge device keeps a copy of the model as its local model. A server connected to all edge devices maintains the global model and aggregates the gradient updates from the edge devices to update it. In each training iteration, every edge device uses its local data to compute a gradient update of the model parameters, uploads the gradient update to the server, and then downloads the updated global model parameters from the server as its new local model parameters. In this process, each edge device shares only the intermediate computation result (i.e., the gradient update of the model parameters) with the server and never uploads its locally stored raw data, so the privacy and security of the data are protected. In addition, the transmitted gradients and model can be protected with various encryption methods to further enhance security.
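To make this exchange concrete, the following is a minimal sketch of one such training iteration, written as an in-process Python simulation; the least-squares model, the synthetic data, and the Server class are illustrative assumptions introduced here and are not part of the invention.

```python
import numpy as np

def local_gradient(w, X, y):
    # gradient of the least-squares objective Q(w) = mean((Xw - y)^2) on local data
    return 2.0 * X.T @ (X @ w - y) / len(y)

class Server:
    """Toy stand-in for the central server (assumed interface, for illustration only)."""
    def __init__(self, dim, eta=0.1):
        self.w = np.zeros(dim)        # global model parameters
        self.eta = eta                # training step length
    def apply_update(self, grad):     # apply an uploaded gradient update
        self.w -= self.eta * grad
    def download(self):               # latest global model parameters
        return self.w.copy()

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # locally stored data (never uploaded)
y = X @ np.array([1.0, -2.0, 0.5])

server = Server(dim=3)
w_local = server.download()                   # local model copy
grad = local_gradient(w_local, X, y)          # compute gradient update on local data
server.apply_update(grad)                     # upload only the intermediate result
w_local = server.download()                   # download new global parameters
print("local parameters after one round:", w_local)
```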
In the edge environment, devices have heterogeneous computing resources because of differences in processor architecture, power-consumption limits, operating systems, and so on. Due to this heterogeneity, the gradient computation times of the devices differ greatly. To keep the model parameters of all devices consistent, federated learning requires a synchronous parallel mechanism. Under the traditional Bulk Synchronous Parallel (BSP) mechanism, a device with more computing resources that finishes its gradient update quickly must wait in every iteration for the devices with fewer computing resources in order to stay synchronized. The faster devices waste a large amount of computing resources in this waiting process, while the slowest device largely determines the duration of each iteration; this is known as the straggler problem. The straggler problem lowers the training efficiency of federated learning and slows the convergence of the learning model. To improve training efficiency, the Stale Synchronous Parallel (SSP) mechanism has been proposed. The strategy of SSP is to let each device proceed directly to the next iteration after finishing the current one, without waiting, while limiting the difference in iteration counts between the fastest and the slowest device to a threshold. Once this threshold is reached, the fastest device must wait until the slowest device catches up. Although SSP improves training efficiency to some extent, a large amount of computing resources is still inevitably wasted in the waiting process. Current machine learning algorithms require substantial computing resources, and on resource-limited edge devices, inefficient training leads to long training times and makes it difficult to deploy algorithm applications.
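The bounded-staleness rule of SSP can be stated in a few lines; the helper below is a hypothetical sketch in which iteration_counts maps each device to its current iteration number and m is the staleness threshold.

```python
def fastest_must_wait(iteration_counts, m):
    """SSP rule (sketch): the fastest device must wait once its lead over the
    slowest device reaches the staleness threshold m."""
    fastest = max(iteration_counts.values())
    slowest = min(iteration_counts.values())
    return fastest - slowest >= m

# Example: device A is three iterations ahead of device C, so with m = 3 it must wait.
print(fastest_must_wait({"A": 10, "B": 9, "C": 7}, m=3))   # True
```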
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a federated learning training acceleration method for computing resource heterogeneity.
In order to achieve the purpose of the invention, the invention adopts the following technical scheme:
A federated learning training acceleration method for computing resource heterogeneity comprises the following steps:
S1, initialize the local iteration count, the consecutive additional gradient computation count, the threshold on consecutive additional gradient computations, and the total iteration count, and download the initial global model parameters from the server;
S2, increment the local iteration count by 1 and judge whether it has reached the total iteration count; if so, end the training, otherwise go to step S3;
S3, store the latest global model parameters downloaded from the server as the local model parameters, perform a gradient update with the BP algorithm on the local data set to obtain the gradient update parameters, and upload the gradient update parameters to the server;
S4, judge whether the consecutive additional gradient computation count has reached its threshold; if so, go to step S10, otherwise go to step S5;
S5, update the local model parameters with the gradient update parameters obtained in step S3 to obtain the updated local model parameters, and perform one additional gradient update with the BP algorithm on the local data set to obtain the additional gradient update parameters;
S6, receive the signal issued by the server allowing the latest global model parameters to be downloaded, and judge whether the additional gradient computation has been completed; if so, go to step S7, otherwise go to step S9;
S7, download the latest global model parameters from the server and copy them to obtain a global model parameter copy, update the copy with the additional gradient update parameters obtained in step S5, and increment the consecutive additional gradient computation count by 1;
S8, re-determine the latest global model parameters according to the loss function values of the latest global model parameters and the updated global model parameter copy, and return to step S2;
S9, immediately stop the additional gradient computation, download the latest global model parameters from the server, reinitialize the consecutive additional gradient computation count, and return to step S2;
S10, reinitialize the consecutive additional gradient computation count, receive the signal issued by the server allowing the latest global model parameters to be downloaded, download them, and return to step S2.
The beneficial effects of this scheme are as follows:
After the threshold of the SSP synchronous parallel mechanism is reached, the device that would otherwise have to wait is allowed to perform one additional round of gradient computation. After the latest global model parameters are downloaded in the next round, they are updated with the gradient obtained from this additional round, on the condition that doing so reduces the loss function of the model. Computing resources are thereby further utilized, the waiting time of the devices that finish gradient computation quickly is reduced, the utilization of computing resources is improved, training efficiency is improved, and the overall training time is shortened.
Further, the gradient update parameter is calculated according to the formula:

$g_p = \nabla Q(w_p, z)$

where g_p is the gradient update parameter, ∇ denotes differentiation with respect to the parameters of the objective function, w_p is the local model parameter, z is a data sample of the local data set, and Q is the objective function.
The beneficial effects of the further scheme are as follows:
and calculating to obtain gradient updating parameters, and updating global model parameters by combining the gradient updating parameters.
Further, the additional gradient update parameter is calculated according to the formula:

$g_p' = \nabla Q(w_p', z)$

where g_p' is the additional gradient update parameter, ∇ denotes differentiation with respect to the parameters of the objective function, w_p' is the updated local model parameter, z is a data sample of the local data set, and Q is the objective function.
The beneficial effects of the further scheme are as follows:
and performing an additional round of gradient updating to obtain additional gradient updating parameters, and updating the global model parameter copy.
Further, the update formula for updating the local model parameters with the gradient update parameters in step S5 is:

$w_p' = w_p - \eta g_p$

where g_p is the gradient update parameter, η is the training step length, w_p is the local model parameter, and w_p' is the updated local model parameter.
The beneficial effects of the further scheme are as follows:
and the local model is updated, an additional round of gradient calculation is performed, and the utilization rate of calculation resources is improved.
Further, the update formula for updating the global model parameter copy with the additional gradient update parameters in step S7 is:

$w_s^* = w_s' - \eta g_p'$

where g_p' is the additional gradient update parameter, w_s' is the global model parameter copy, and w_s^* is the updated global model parameter copy.
The beneficial effects of the further scheme are as follows:
preparation is made for re-determining the latest global model parameters for the loss function values corresponding to the latest global model parameters and the updated copy of the global model parameters in step S8.
Further, step S8 is specifically:
comparing the loss function values loss(w_s) and loss(w_s^*) corresponding to the latest global model parameters w_s and the updated global model parameter copy w_s^*; if loss(w_s) is smaller than loss(w_s^*), the updated copy w_s^* is discarded; otherwise, the updated copy w_s^* is taken as the latest global model parameters w_s; the process then returns to step S2.
The beneficial effects of the further scheme are as follows:
and updating the gradient obtained by the extra gradient calculation, updating the copy after obtaining the global model parameter by next downloading, replacing the original global model parameter with the copy if the updated copy has a smaller loss function value than the original global model parameter, and otherwise discarding the copy to enable the training model to be converged more quickly, thereby improving the training efficiency and shortening the training time.
Further, the signal allowing the latest global model parameters to be downloaded is issued by the server according to the following steps:
s61, initializing global model parameters, an iteration number difference threshold value between the fastest device and the slowest device and a target loss function value;
s62, updating the initial global model parameters by using the gradient updating parameters uploaded to the server in the step S3 to obtain updated global model parameters;
s63, judging whether the loss function value corresponding to the updated global model parameter is smaller than the target loss function value, if so, stopping training, otherwise, entering the step S64;
s64, judging whether the iteration number difference between the fastest device and the slowest device meets an iteration number difference threshold value, if so, entering a step S65, otherwise, entering a step S66;
s65, sending a signal for allowing the latest global model parameter to be downloaded to other equipment except the fastest equipment, and returning to the step S62;
s66, a signal for allowing the latest global model parameter to be downloaded is sent to each device, and the process returns to step S62.
The beneficial effects of the further scheme are as follows:
after reaching the SSP threshold, the fastest devices need not wait, update the local model directly with the gradient obtained from the local computation, and then perform an additional round of gradient computation using the local model and the data. This reduces the latency of the device and improves the utilization of the computing resources.
Further, the update formula for updating the initial global model parameters with the gradient update parameters in step S62 is:

$w_s = w_s^0 - \eta g_p$

where g_p is the gradient update parameter, w_s^0 is the initial global model parameter, η is the training step length, and w_s is the updated global model parameter.
The beneficial effects of the further scheme are as follows:
the global model parameters are updated, the convergence of the global model is promoted, and the overall training progress is promoted.
Drawings
FIG. 1 is a prior art federated learning basic framework;
FIG. 2 is a schematic flow chart of a federated learning training acceleration method for computing resource heterogeneity provided in the present invention;
fig. 3 is a flowchart illustrating the substep of step S6.
Detailed Description
The following description of the embodiments of the present invention is provided to help those skilled in the art understand the invention, but it should be understood that the invention is not limited to the scope of these embodiments. It will be apparent to those skilled in the art that various changes may be made without departing from the spirit and scope of the invention as defined in the appended claims, and everything produced using the inventive concept is protected.
As shown in fig. 2, an embodiment of the present invention provides a federated learning training acceleration method for computing resource heterogeneity, including the following steps S1 to S10:
S1, initialize the local iteration count t_p, the consecutive additional gradient computation count α_p, the threshold c on consecutive additional gradient computations, and the total iteration count T, and download the initial global model parameters w_s^0 from the server.
In this embodiment, all edge devices are connected to a single server through a physical channel, over which they receive the download signal and obtain the initial global model parameters w_s^0 from the server.
S2, increment the local iteration count t_p by 1 to enter the iteration, and judge whether the local iteration count t_p has reached the total iteration count T; if so, finish the training, otherwise go to step S3.
S3, store the latest global model parameters downloaded from the server as the local model parameters w_p, compute the gradient update parameters with the BP algorithm on the local data set, and upload the gradient update parameters to the server.
The gradient update parameters are computed by the BP algorithm according to the formula:

$g_p = \nabla Q(w_p, z)$

where g_p is the gradient update parameter, ∇ denotes differentiation with respect to the parameters of the objective function, w_p is the local model parameter, z is a data sample of the local data set, and Q is the objective function.
In this embodiment, the local data set is the data set generated at the edge device.
In this embodiment, the work flow of the BP algorithm includes the following steps:
first, an input example is presented to the input-layer neurons and the signal is propagated forward layer by layer until the output layer produces a result; the error is then propagated backward to the hidden-layer neurons; the connection weights and thresholds are adjusted according to the errors of the hidden-layer neurons; finally, this process is iterated until a preset stopping condition is reached.
S4, judge whether the consecutive additional gradient computation count α_p has reached the threshold c; if so, go to step S10, otherwise go to step S5.
In this embodiment, the threshold c on consecutive additional gradient computations is a hyperparameter that is set manually and typically ranges from 1 to 10.
S5, update the local model parameters w_p with the gradient update parameter g_p obtained in step S3 to obtain the updated local model parameters w_p'. The update formula is:

$w_p' = w_p - \eta g_p$

where η is the training step length.
Then perform one additional round of gradient computation with the BP algorithm on the local data set to obtain the additional gradient update parameter g_p', calculated as:

$g_p' = \nabla Q(w_p', z)$

where g_p' is the additional gradient update parameter, ∇ denotes differentiation with respect to the parameters of the objective function, w_p' is the updated local model parameter, z is a data sample of the local data set, and Q is the objective function.
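As a concrete illustration of the BP-based gradient computations used in steps S3 and S5, the following sketch trains a single-hidden-layer network with a squared-error objective; the architecture, sizes, and data are arbitrary choices made for illustration and are not specified by the invention.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))                     # input examples
y = rng.normal(size=(32, 1))                     # targets
W1, b1 = 0.1 * rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = 0.1 * rng.normal(size=(8, 1)), np.zeros(1)
eta = 0.05                                       # training step length

for step in range(200):                          # iterate until a stop condition
    h = np.tanh(X @ W1 + b1)                     # forward pass, layer by layer
    out = h @ W2 + b2
    d_out = 2.0 * (out - y) / len(X)             # error at the output layer
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)        # error propagated back to the hidden layer
    W2 -= eta * h.T @ d_out;  b2 -= eta * d_out.sum(axis=0)   # adjust weights and biases
    W1 -= eta * X.T @ d_h;    b1 -= eta * d_h.sum(axis=0)

print("squared error:", float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2)))
```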
S6, receive the signal issued by the server allowing the latest global model parameters to be downloaded, and judge whether the additional gradient computation has been completed; if so, go to step S7, otherwise go to step S9.
In this embodiment, the signal sent by the server allowing the latest global model parameters to be downloaded is received through the physical channel.
As shown in fig. 3, the server issues the signal allowing the latest global model parameters to be downloaded according to the following steps S61 to S66:
S61, initialize the global model parameters w_s^0, the threshold m on the iteration-count difference between the fastest and the slowest device, and the target loss function value loss_inf.
In this embodiment, the threshold m on the iteration-count difference between the fastest and the slowest device is a hyperparameter that is set manually and typically ranges from 1 to 10.
S62, update the initial global model parameters w_s^0 with the gradient update parameter g_p uploaded to the server in step S3 to obtain the updated global model parameters w_s. The update formula is:

$w_s = w_s^0 - \eta g_p$

where η is the training step length.
S63, judge whether the loss function value loss(w_s) of the updated global model parameters w_s is smaller than the target loss function value loss_inf; if so, stop the training, otherwise go to step S64.
S64, judge whether the difference in iteration counts between the fastest device and the slowest device has reached the preset threshold m; if so, go to step S65, otherwise go to step S66.
S65, send the signal allowing the latest global model parameters to be downloaded to all devices except the fastest device, and return to step S62.
S66, send the signal allowing the latest global model parameters to be downloaded to every device, and return to step S62.
In this embodiment, the server is connected through a physical channel to all the edge devices participating in federated learning.
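A minimal sketch of one pass of this server-side procedure (steps S62 to S66) follows; the device bookkeeping, the loss evaluation, and the return values are assumptions introduced for illustration rather than the server's actual interface.

```python
import numpy as np

def server_step(w_s, grad, eta, iter_counts, sender, m, loss_fn, loss_inf):
    """Apply one uploaded gradient update (S62), check the target loss (S63),
    then decide which devices receive the download-allowed signal (S64-S66)."""
    w_s = w_s - eta * grad                        # S62: update the global parameters
    iter_counts[sender] += 1
    if loss_fn(w_s) < loss_inf:                   # S63: target loss reached, stop training
        return w_s, "stop", []
    gap = max(iter_counts.values()) - min(iter_counts.values())
    fastest = max(iter_counts, key=iter_counts.get)
    if gap >= m:                                  # S64 -> S65: exclude the fastest device
        allowed = [d for d in iter_counts if d != fastest]
    else:                                         # S66: signal every device
        allowed = list(iter_counts)
    return w_s, "continue", allowed

loss_fn = lambda w: float(np.sum(w ** 2))
w, status, allowed = server_step(np.ones(3), np.ones(3), 0.1, {"A": 5, "B": 2},
                                 sender="A", m=3, loss_fn=loss_fn, loss_inf=1e-3)
print(status, allowed)   # 'continue', and only device "B" may download
```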
S7, download the latest global model parameters w_s from the server and copy them to obtain the global model parameter copy w_s'; update the copy with the additional gradient update parameter g_p' obtained in step S5 to obtain the updated copy w_s^*, and increment the consecutive additional gradient computation count α_p by 1. The update formula is:

$w_s^* = w_s' - \eta g_p'$

where η is the training step length.
S8, re-determine the latest global model parameters w_s according to the loss function values of the latest global model parameters w_s and the updated global model parameter copy w_s^*, and return to step S2.
In this embodiment, the loss function values loss(w_s) and loss(w_s^*) of the latest global model parameters w_s and the updated copy w_s^* are compared. If loss(w_s) is smaller than loss(w_s^*), the updated copy w_s^* is discarded; otherwise, the updated copy w_s^* is taken as the latest global model parameters w_s. The process then returns to step S2.
S9, immediately stop the additional gradient computation, download the latest global model parameters w_s from the server, reinitialize the consecutive additional gradient computation count α_p, and return to step S2.
S10, reinitialize the consecutive additional gradient computation count α_p, receive the signal from the server allowing the latest global model parameters w_s to be downloaded, download them, and return to step S2.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principle and the implementation mode of the invention are explained by applying specific embodiments in the invention, and the description of the embodiments is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims (8)

1. A federated learning training acceleration method for computing resource heterogeneity is characterized by comprising the following steps:
S1, initialize the local iteration count, the consecutive additional gradient computation count, the threshold on consecutive additional gradient computations, and the total iteration count, and download the initial global model parameters from the server;
S2, increment the local iteration count by 1 and judge whether it has reached the total iteration count; if so, end the training, otherwise go to step S3;
S3, store the latest global model parameters downloaded from the server as the local model parameters, perform a gradient update with the BP algorithm on the local data set to obtain the gradient update parameters, and upload the gradient update parameters to the server;
S4, judge whether the consecutive additional gradient computation count has reached its threshold; if so, go to step S10, otherwise go to step S5;
S5, update the local model parameters with the gradient update parameters obtained in step S3 to obtain the updated local model parameters, and perform one additional gradient update with the BP algorithm on the local data set to obtain the additional gradient update parameters;
S6, receive the signal issued by the server allowing the latest global model parameters to be downloaded, and judge whether the additional gradient computation has been completed; if so, go to step S7, otherwise go to step S9;
S7, download the latest global model parameters from the server and copy them to obtain a global model parameter copy, update the copy with the additional gradient update parameters obtained in step S5, and increment the consecutive additional gradient computation count by 1;
S8, re-determine the latest global model parameters according to the loss function values of the latest global model parameters and the updated global model parameter copy, and return to step S2;
S9, immediately stop the additional gradient computation, download the latest global model parameters from the server, reinitialize the consecutive additional gradient computation count, and return to step S2;
S10, reinitialize the consecutive additional gradient computation count, receive the signal issued by the server allowing the latest global model parameters to be downloaded, download them, and return to step S2.
2. The federated learning training acceleration method oriented to computing resource heterogeneity according to claim 1, wherein the gradient update parameter is calculated according to the formula:

$g_p = \nabla Q(w_p, z)$

where g_p is the gradient update parameter, ∇ denotes differentiation with respect to the parameters of the objective function, w_p is the local model parameter, z is a data sample of the local data set, and Q is the objective function.
3. The federated learning training acceleration method oriented to computing resource heterogeneity according to claim 1, wherein the additional gradient update parameter is calculated according to the formula:

$g_p' = \nabla Q(w_p', z)$

where g_p' is the additional gradient update parameter, ∇ denotes differentiation with respect to the parameters of the objective function, w_p' is the updated local model parameter, z is a data sample of the local data set, and Q is the objective function.
4. The method as claimed in claim 1, wherein the update formula for updating the local model parameters with the gradient update parameters in step S5 is:

$w_p' = w_p - \eta g_p$

where g_p is the gradient update parameter, η is the training step length, w_p is the local model parameter, and w_p' is the updated local model parameter.
5. The method of claim 4, wherein in step S7 the update formula for updating the global model parameter copy with the additional gradient update parameter is:

$w_s^* = w_s' - \eta g_p'$

where g_p' is the additional gradient update parameter, w_s' is the global model parameter copy, and w_s^* is the updated global model parameter copy.
6. The federated learning training acceleration method oriented to computing resource heterogeneity according to claim 2, wherein step S8 is specifically:
comparing the loss function values loss(w_s) and loss(w_s^*) corresponding to the latest global model parameters w_s and the updated global model parameter copy w_s^*; if loss(w_s) is smaller than loss(w_s^*), discarding the updated copy w_s^*; otherwise, taking the updated copy w_s^* as the latest global model parameters w_s, and returning to step S2.
7. The federated learning training acceleration method oriented to computing resource heterogeneity according to claim 1, wherein the signal allowing the latest global model parameters to be downloaded is issued by the server according to the following steps:
s61, initializing global model parameters, an iteration number difference threshold value between the fastest device and the slowest device and a target loss function value;
s62, updating the initial global model parameters by using the gradient updating parameters uploaded to the server in the step S3 to obtain updated global model parameters;
s63, judging whether the loss function value corresponding to the updated global model parameter is smaller than the target loss function value, if so, stopping training, otherwise, entering the step S64;
s64, judging whether the iteration number difference between the fastest device and the slowest device meets an iteration number difference threshold value, if so, entering a step S65, otherwise, entering a step S66;
s65, sending a signal for allowing the latest global model parameter to be downloaded to other equipment except the fastest equipment, and returning to the step S62;
s66, a signal for allowing the latest global model parameter to be downloaded is sent to each device, and the process returns to step S62.
8. The method of claim 5, wherein the update formula for updating the initial global model parameters with the gradient update parameters in step S62 is:

$w_s = w_s^0 - \eta g_p$

where g_p is the gradient update parameter, w_s^0 is the initial global model parameter, η is the training step length, and w_s is the updated global model parameter.
CN202110556962.5A 2021-05-21 2021-05-21 Federated learning training acceleration method for computing resource isomerism Expired - Fee Related CN113191504B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110556962.5A CN113191504B (en) 2021-05-21 2021-05-21 Federated learning training acceleration method for computing resource isomerism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110556962.5A CN113191504B (en) 2021-05-21 2021-05-21 Federated learning training acceleration method for computing resource isomerism

Publications (2)

Publication Number Publication Date
CN113191504A true CN113191504A (en) 2021-07-30
CN113191504B CN113191504B (en) 2022-06-28

Family

ID=76984715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110556962.5A Expired - Fee Related CN113191504B (en) 2021-05-21 2021-05-21 Federated learning training acceleration method for computing resource isomerism

Country Status (1)

Country Link
CN (1) CN113191504B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113778966A (en) * 2021-09-15 2021-12-10 深圳技术大学 Cross-school information sharing method and related device for college teaching and course score
CN113902128A (en) * 2021-10-12 2022-01-07 中国人民解放军国防科技大学 Asynchronous federal learning method, device and medium for improving utilization efficiency of edge device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180285731A1 (en) * 2017-03-30 2018-10-04 Atomwise Inc. Systems and methods for correcting error in a first classifier by evaluating classifier output in parallel
CN111611610A (en) * 2020-04-12 2020-09-01 西安电子科技大学 Federal learning information processing method, system, storage medium, program, and terminal
CN112181971A (en) * 2020-10-27 2021-01-05 华侨大学 Edge-based federated learning model cleaning and equipment clustering method, system, equipment and readable storage medium
CN112288100A (en) * 2020-12-29 2021-01-29 支付宝(杭州)信息技术有限公司 Method, system and device for updating model parameters based on federal learning
US20210073678A1 (en) * 2019-09-09 2021-03-11 Huawei Technologies Co., Ltd. Method, apparatus and system for secure vertical federated learning
CN112817653A (en) * 2021-01-22 2021-05-18 西安交通大学 Cloud-side-based federated learning calculation unloading computing system and method
CN112818394A (en) * 2021-01-29 2021-05-18 西安交通大学 Self-adaptive asynchronous federal learning method with local privacy protection

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180285731A1 (en) * 2017-03-30 2018-10-04 Atomwise Inc. Systems and methods for correcting error in a first classifier by evaluating classifier output in parallel
US20210073678A1 (en) * 2019-09-09 2021-03-11 Huawei Technologies Co., Ltd. Method, apparatus and system for secure vertical federated learning
CN111611610A (en) * 2020-04-12 2020-09-01 西安电子科技大学 Federal learning information processing method, system, storage medium, program, and terminal
CN112181971A (en) * 2020-10-27 2021-01-05 华侨大学 Edge-based federated learning model cleaning and equipment clustering method, system, equipment and readable storage medium
CN112288100A (en) * 2020-12-29 2021-01-29 支付宝(杭州)信息技术有限公司 Method, system and device for updating model parameters based on federal learning
CN112817653A (en) * 2021-01-22 2021-05-18 西安交通大学 Cloud-side-based federated learning calculation unloading computing system and method
CN112818394A (en) * 2021-01-29 2021-05-18 西安交通大学 Self-adaptive asynchronous federal learning method with local privacy protection

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LAIZHONG CUI: "ClusterGrad: Adaptive Gradient Compression by Clustering in Federated Learning", 《GLOBECOM 2020 - 2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE》 *
廖钰盈: "Research on a Fused Federated Learning Mechanism for Heterogeneous Edge Nodes", China Excellent Master's Theses Electronic Journal *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113778966A (en) * 2021-09-15 2021-12-10 深圳技术大学 Cross-school information sharing method and related device for college teaching and course score
CN113778966B (en) * 2021-09-15 2024-03-26 深圳技术大学 Cross-school information sharing method and related device for university teaching and course score
CN113902128A (en) * 2021-10-12 2022-01-07 中国人民解放军国防科技大学 Asynchronous federal learning method, device and medium for improving utilization efficiency of edge device

Also Published As

Publication number Publication date
CN113191504B (en) 2022-06-28

Similar Documents

Publication Publication Date Title
CN110880036B (en) Neural network compression method, device, computer equipment and storage medium
CN111784002B (en) Distributed data processing method, device, computer equipment and storage medium
CN113191504B (en) Federated learning training acceleration method for computing resource isomerism
WO2020042658A1 (en) Data processing method, device, apparatus, and system
CN108416440A (en) A kind of training method of neural network, object identification method and device
CN110889509B (en) Gradient momentum acceleration-based joint learning method and device
WO2021238262A1 (en) Vehicle recognition method and apparatus, device, and storage medium
WO2021244354A1 (en) Training method for neural network model, and related product
CN114611705A (en) Data processing method, training method for machine learning, and related device and equipment
WO2023201963A1 (en) Image caption method and apparatus, and device and medium
EP3889846A1 (en) Deep learning model training method and system
WO2023103864A1 (en) Node model updating method for resisting bias transfer in federated learning
CN112163601A (en) Image classification method, system, computer device and storage medium
CN111598213A (en) Network training method, data identification method, device, equipment and medium
WO2021169366A1 (en) Data enhancement method and apparatus
CN112446462B (en) Method and device for generating target neural network model
CN115526307A (en) Network model compression method and device, electronic equipment and storage medium
CN115238909A (en) Data value evaluation method based on federal learning and related equipment thereof
CN114358250A (en) Data processing method, data processing apparatus, computer device, medium, and program product
CN113762503A (en) Data processing method, device, equipment and computer readable storage medium
CN115907041A (en) Model training method and device
CN114758130B (en) Image processing and model training method, device, equipment and storage medium
CN115795025A (en) Abstract generation method and related equipment thereof
CN114723069A (en) Parameter updating method and device and electronic equipment
CN113762304B (en) Image processing method, image processing device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220628

CF01 Termination of patent right due to non-payment of annual fee