CN113467952A - Distributed federated learning collaborative computing method and system - Google Patents
Distributed federated learning collaborative computing method and system
- Publication number: CN113467952A
- Application number: CN202110802910.1A (filed 2021-07-15)
- Authority: CN (China)
- Prior art keywords: model, participant, training, local, learning
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F9/50 — Allocation of resources, e.g. of the central processing unit [CPU] (G—Physics; G06—Computing; G06F—Electric digital data processing; G06F9/00—Arrangements for program control, e.g. control units; G06F9/06—using stored programs; G06F9/46—Multiprogramming arrangements)
- G06F9/5027 — Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
- G06N20/20 — Ensemble learning (G06N—Computing arrangements based on specific computational models; G06N20/00—Machine learning)
Abstract
The application discloses a distributed federated learning collaborative computing method and system. The method specifically comprises the following steps: training a deep reinforcement learning model; deploying the trained deep reinforcement learning model to each edge server for federated learning; and ending the federated learning process. For a distributed federated learning framework, the method and system break traditional federated learning's dependence on a central server and effectively guarantee privacy protection and security during the federated learning process.
Description
Technical Field
The application relates to the field of communication, in particular to a distributed federated learning collaborative computing method and system.
Background
Metal workpieces are important components of many machined products, and their quality directly affects the market competitiveness of an enterprise's products, so detecting surface defects on metal workpieces during machining is critical. For metal surface defect detection, deep learning can be used to collect workpiece images from the production line, extract defect information from the images, and build a detection and defect-identification model by learning the surface defect characteristics of the metal workpieces. Commonly used detection models include Fast R-CNN, Mask R-CNN, and the like. However, in an industrial park, some plants suffer from limited data volume and poor data quality. Moreover, owing to industry competition, privacy protection and similar concerns, data is difficult to share and integrate across enterprises, so individual factories struggle to train high-quality detection models.
Federated Learning (FL) is an artificial intelligence learning framework developed to address the data privacy problems faced when artificial intelligence is deployed in practice. Its core objective is to realize cooperative learning among multiple participants and establish a shared, globally effective artificial intelligence model without the participants having to exchange data directly. Under a federated learning framework, each participant first trains a local model on its own data, then encrypts the local model parameters and uploads them to a central server; the server securely aggregates the local models and sends the updated global model parameters back to the participants, and this iterative process repeats until the global model reaches the target accuracy. Throughout the process the participants upload and download only model parameters while the data always remains local, so the clients' data privacy is well protected.
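For illustration, one round of this iterative process can be sketched as follows; the participant interface, the evaluation callback, and the plain parameter averaging standing in for encrypted secure aggregation are assumptions made for the sketch, not a prescribed implementation:

```python
import numpy as np

def federated_learning(global_params, participants, evaluate_fn,
                       target_accuracy, max_rounds):
    """Sketch of the synchronous FL loop: local training, upload of model
    parameters only (raw data never leaves a participant), aggregation,
    and broadcast, repeated until the target accuracy is reached."""
    for _ in range(max_rounds):
        # Each participant trains on its own local data.
        local_models = [p.train_local(global_params) for p in participants]
        # Server-side aggregation; encryption/secure aggregation omitted here.
        global_params = np.mean(local_models, axis=0)
        if evaluate_fn(global_params) >= target_accuracy:
            break
    return global_params
```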
Nevertheless, the federated learning framework still presents some security issues. A centralized manager of model aggregation may be vulnerable to various threats (e.g., single point of failure and DDoS attacks), and its failure (e.g., skewing all local model updates) may cause the entire learning process to fail. Although federated learning largely solves each participant's problems of insufficient data volume and privacy disclosure, and a distributed federated learning framework can likewise address the security problems of the federated learning process, the delay problem of distributed federated learning has so far received little attention from the academic community. Since different participants have different training speeds within the same round, the participant that finishes computing first enters a passive waiting period, wasting resources. Meanwhile, the network connection between a participant and its edge server is unstable: network quality changes continuously with environmental factors, the time required to upload a model is highly uncertain, and the time required for model aggregation is therefore likely to be prolonged.
Therefore, how to improve the per-round accuracy of the global model and reduce the total delay for the global model to reach the target accuracy, while also reducing the time required by each round of federated learning, is a problem awaiting a solution.
Disclosure of Invention
Based on this, the application provides a distributed federated learning collaborative computing method for smart factories, which ensures the security of the federated learning process and uses Deep Reinforcement Learning (DRL) to solve the problems of association between edge servers and participants, bandwidth resource allocation, and the participants' computing resource allocation.
In order to achieve the above object, the present application provides a distributed federated learning collaborative computing method, which specifically comprises the following steps: training a deep reinforcement learning model; deploying the trained deep reinforcement learning model to each edge server for federated learning; and ending the federated learning process.
As above, the deep reinforcement learning model training specifically comprises the following sub-steps: initializing the network parameters and state information of the deep reinforcement learning model; each participant training a local model according to the network parameters and state information initialized for the deep reinforcement learning model; in response to completing the simulated training of the local model, generating a bandwidth allocation strategy and performing a single-step update of the AC network parameters at each time slot; in response to completing the simulated transmission of the local model, generating an association strategy and a computing resource allocation strategy and updating the DQN network parameters; detecting whether the deep reinforcement learning model has converged or the maximum number of iterations has been reached; and if it has not converged and the maximum number of iterations has not been reached, starting the next iteration and training the local model again.
As above, wherein a metal surface defect detection model is used as the local model.
As above, the initialized state information specifically includes: the parameters and convergence accuracy of the Actor network, the Critic network and the DQN network; the position coordinates $[x_k, y_k]$ of each participant, the initial mini-batch value $b_k^0$ and the CPU frequency $f_k$; the position coordinates $[x_m, y_m]$ and maximum bandwidth $B_m$ of each edge server; the slot length $\Delta t$; and the maximum number of iterations $I$.
As above, in the training process of the local model, participant k divides the local dataset $D_k$ into multiple mini-batches of size $b_k^i$ and updates the local weights over each mini-batch $b$ by the following formula to complete the training of the local model:

$$\omega_k^i \leftarrow \omega_k^i - \eta\, \nabla F(\omega_k^i; b)$$

where $\eta$ represents the learning rate, $\nabla F(\omega_k^i; b)$ represents the gradient of the loss function on each mini-batch $b$, and $\omega_k^i$ represents the local model of participant k in the i-th iteration.
As above, the simulated training of the local model further comprises determining the time $t_k^{cmp,i}$ required by participant k during the i-th round of local training, specifically expressed as:

$$t_k^{cmp,i} = \frac{\tau\, c_k\, b_k^i}{f_k}$$

where $c_k$ denotes the number of CPU cycles participant k needs to train a single data sample, $\tau$ denotes the number of iterations of the MBGD algorithm executed by the participant, $f_k$ represents the CPU cycle frequency at which participant k trains, and $b_k^i$ indicates the mini-batch value of participant k in the i-th round of local training.
As above, the current fast-scale state space is used as the input of the AC network so as to obtain the fast-scale action space, i.e. the bandwidth resource allocation policy. The fast-scale state space is represented as $s(t) = [\xi_k(t), r_{k,m}(t)]$, where $\xi_k(t)$ represents the size of the model that participant k has not yet finished transmitting, $r_{k,m}(t)$ represents the transmission rate at which each participant uploads its model in slot t, t represents a time slot, and $\Delta t$ represents the slot length. The fast-scale action space is $A(t) = [B_{k,m}(t)]$, the bandwidth resource allocation strategy, where $B_{k,m}(t)$ indicates the bandwidth allocated by edge server m to participant k in each slot.
In the above, in the process of uploading the parameters of the trained deep reinforcement learning model to the edge server according to the determined bandwidth resource allocation strategy, the available uplink data transmission rate $r_{k,m}^{i}$ between the i-th round participant k and edge server m is expressed as:

$$r_{k,m}^{i} = B_{k,m}\,\log_2\!\left(1 + \frac{P_k\, h_{k,m}}{N_0\, B_{k,m}}\right)$$

where $P_k$ represents the transmission power of participant k, $N_0$ represents the power spectral density of the additive white Gaussian noise, $h_{k,m}$ denotes the channel gain between participant k and edge server m, and $\psi_0$ denotes the channel power gain at the reference distance.
The method further comprises the time $t_{k,m}^{com,i}$ for the i-th round participant k to upload the deep reinforcement learning model parameters to edge server m, specifically expressed as:

$$t_{k,m}^{com,i} = \frac{\xi}{r_{k,m}^{i}}$$

where $\xi$ represents the size of the metal surface defect detection model and $r_{k,m}^{i}$ indicates the available uplink data transmission rate between the i-th round participant k and edge server m.
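Putting the formulas above together, a participant's per-round delay is its local computation time plus its model upload time. The following is a small numerical sketch of this timing model; all parameter values are illustrative assumptions:

```python
import math

def local_training_time(tau, c_k, batch_size, f_k):
    """t_cmp = tau * c_k * b / f_k: MBGD iterations times cycles-per-sample
    times mini-batch size, divided by the participant's CPU frequency (Hz)."""
    return tau * c_k * batch_size / f_k

def uplink_rate(bandwidth_hz, p_k, h_km, n0):
    """Shannon rate of the participant-to-edge-server link under AWGN."""
    return bandwidth_hz * math.log2(1.0 + p_k * h_km / (n0 * bandwidth_hz))

def upload_time(model_size_bits, rate_bps):
    """t_com = xi / r: model size divided by the achievable uplink rate."""
    return model_size_bits / rate_bps

# Illustrative values: 10 MBGD steps, 1e6 cycles/sample, batch 64, 2 GHz CPU,
# 1 MHz bandwidth, 0.1 W transmit power, flat channel gain, -174 dBm/Hz noise.
t_cmp = local_training_time(tau=10, c_k=1e6, batch_size=64, f_k=2e9)
r = uplink_rate(bandwidth_hz=1e6, p_k=0.1, h_km=1e-7, n0=10 ** (-17.4) / 1000)
t_com = upload_time(model_size_bits=8 * 25e6, rate_bps=r)  # ~25 MB model
print(f"per-round delay of participant k: {t_cmp + t_com:.2f} s")
```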
A distributed federated learning collaborative computing system comprises a deep reinforcement learning unit and a federated learning unit. The deep reinforcement learning unit is used for training the deep reinforcement learning model; the federated learning unit is used for performing federated learning according to the association strategy and the computing and bandwidth resource allocation strategies generated by the deep reinforcement learning model.
The application has the following beneficial effects:
(1) For a distributed federated learning framework, the distributed federated learning collaborative computing method and system provided by this embodiment break traditional federated learning's dependence on a central server and effectively guarantee privacy protection and security during the federated learning process.

(2) The distributed federated learning collaborative computing method and system provided by this embodiment pursue the design goal of minimizing the total delay of federated learning from two angles at once: reducing the total number of iteration rounds and reducing the time consumed by each round. The computing and communication resources of each participant and edge server are fully utilized, maximizing the utility of federated learning.

(3) The distributed federated learning collaborative computing method and system provided by this embodiment take into account the influence of each participant's computation amount on model accuracy, adjusting the weight each participant's local model carries in global aggregation, which ensures the fairness of the aggregation process and helps accelerate model convergence.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flow chart of a distributed federated learning collaborative computing method as presented herein;
FIG. 2 is a schematic diagram of a distributed federated learning collaborative computing system as presented herein.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The method and device of the present application solve the problem of minimizing total delay within a distributed federated learning system framework, i.e. minimizing the total delay for the global model to reach the target accuracy, with emphasis on the problems of association between the edge servers and the participants in the system, bandwidth resource allocation, and the participants' computing resource allocation.
Scenario assumptions: the set $\mathcal{K} = \{1, 2, \dots, K\}$ represents all participants of federated learning, and the dataset size of participant k is denoted $D_k$. For each sample $d_n = \{x_n, y_n\}$ in the dataset, $x_n$ represents the input vector and $y_n$ the output label corresponding to $x_n$; $[x_k, y_k]$ represents the position coordinates of participant k. All small base stations acting as edge servers are represented by the set $\mathcal{M} = \{1, 2, \dots, M\}$, and $[x_m, y_m]$ indicates the position coordinates of edge server m. In addition, the iteration rounds of federated learning are represented by $\mathcal{I} = \{1, 2, \dots, I\}$; $a_{k,m}^i = 1$ indicates that participant k establishes a communication connection with edge server m in the i-th iteration, and $a_{k,m}^i = 0$ otherwise; $b_k^i$ represents the mini-batch value of participant k in the i-th round of local training. All slots of each iteration are denoted by $\mathcal{T} = \{1, 2, \dots, T\}$, $\Delta t$ denotes the slot length, and $B_{k,m}(t)$ represents the bandwidth edge server m allocates to participant k in each slot. $\omega^i$ represents the global model of the i-th round, and $\omega_k^i$ represents the local model of participant k in the i-th iteration.
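For concreteness, the quantities above can be captured in simple data structures; the following Python sketch maps illustrative field names onto the symbols just defined:

```python
from dataclasses import dataclass, field

@dataclass
class Participant:
    """Participant k: dataset size D_k, CPU frequency f_k (Hz),
    position [x_k, y_k], and the current mini-batch value b_k^i."""
    dataset_size: int                   # D_k
    cpu_freq_hz: float                  # f_k
    position: tuple[float, float]       # [x_k, y_k]
    mini_batch: int                     # b_k^i, re-chosen each round by the DQN

@dataclass
class EdgeServer:
    """Edge server m: position [x_m, y_m] and maximum bandwidth B_m (Hz)."""
    position: tuple[float, float]       # [x_m, y_m]
    max_bandwidth_hz: float             # B_m
    associated: list[int] = field(default_factory=list)  # a^i: connected participants
```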
The technical problem to be solved by the present application is minimizing the total delay of collaborative computation in the federated learning process, which is specifically expressed as:

$$\min_{a,\,B,\,b}\ \sum_{i=1}^{I} \max_{k \in \mathcal{K}} \left( t_k^{cmp,i} + \sum_{m \in \mathcal{M}} a_{k,m}^{i}\, t_{k,m}^{com,i} \right)$$

$$\text{s.t.}\quad \text{C1: } \sum_{m \in \mathcal{M}} a_{k,m}^{i} = 1,\ \forall k; \qquad \text{C2: } \sum_{k \in \mathcal{K}} a_{k,m}^{i} \ge 1,\ \forall m;$$

$$\text{C3: } \sum_{k \in \mathcal{K}} B_{k,m}(t) \le B_m,\ \forall m,\, t; \qquad \text{C4: } b_k^i \le D_k,\ \forall k,\, i$$

where C1 indicates that each participant can connect to only one edge server; C2 indicates that each edge server is connected to at least one participant; C3 indicates that no edge server allocates more bandwidth than its maximum bandwidth capacity; and C4 indicates that each participant's per-round mini-batch value does not exceed its data size. $t_k^{cmp,i}$ represents the time required by participant k in the i-th round of local training, $B_{k,m}(t)$ represents the bandwidth allocated by edge server m to participant k in each slot, $a_{k,m}^{i} = 1$ indicates that participant k establishes a communication connection with edge server m in the i-th iteration and $a_{k,m}^{i} = 0$ otherwise, $D_k$ denotes the dataset size of participant k, $b_k^i$ indicates the mini-batch value of participant k in the i-th round of local training, and $B_m$ represents the maximum bandwidth of each edge server.
This problem has dynamic constraints and a long-term goal, and the current state of the system depends only on the state and the action taken in the previous iteration, so it satisfies the Markov property and can be expressed as a Markov Decision Process (MDP), i.e. MDP = {S, A, γ, R}, where S represents the state space, A the action space, γ the discount factor, and R the reward function. Solving the problem is thereby converted into determining, for each state, the optimal action selection corresponding to the current state.
Further, the above problem can be converted into solving the association and bandwidth resource allocation problems between the edge servers and the participants, and the computing resource allocation problem of the participants. In this problem there are three decision variables: $a_{k,m}^i$, $b_k^i$, and $B_{k,m}(t)$. Among them, $a_{k,m}^i$ and $b_k^i$ are discrete variables that change only between aggregation rounds, while $B_{k,m}(t)$ is a continuous variable that changes between time slots. Deep reinforcement learning with dual time scales can therefore be adopted: with the aggregation round i as the slow-time-scale interval, a DQN network generates the association strategy and computing resource allocation strategy for the current state on the slow time scale; and with the slot length $\Delta t$ as the fast-time-scale interval, an Actor-Critic (AC) network performs single-step updates on the fast time scale and generates the bandwidth resource allocation strategy for the current state.
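The dual-time-scale decision loop just described can be sketched as two nested loops, the DQN acting once per aggregation round and the AC network acting once per slot. The `env`, `dqn` and `actor_critic` interfaces below are assumptions made for the sketch, not a prescribed API:

```python
def two_timescale_control(dqn, actor_critic, env, num_rounds, slots_per_round):
    """Schematic dual-time-scale DRL loop: the DQN picks the association and
    mini-batch (computing-resource) actions once per aggregation round i;
    the Actor-Critic reallocates bandwidth with a single-step update per slot t."""
    slow_state = env.reset()
    for i in range(num_rounds):
        # Slow scale: association a^i and mini-batch values b^i for this round.
        association, mini_batches = dqn.act(slow_state)
        env.start_round(association, mini_batches)
        for t in range(slots_per_round):
            # Fast scale: per-slot bandwidth allocation B_{k,m}(t).
            fast_state = env.observe_fast()
            bandwidth = actor_critic.act(fast_state)
            reward_t = env.step_slot(bandwidth)
            actor_critic.update(fast_state, bandwidth, reward_t)  # single-step
            if env.all_models_uploaded():
                break
        # Aggregate the round, measure accuracy, and reward the DQN.
        slow_state, reward_i = env.finish_round()
        dqn.update(reward_i)
```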
Based on the above thought, the present application provides a flowchart of a distributed federated learning collaborative computing method as shown in fig. 1, which specifically includes the following steps.
Step S110: train the deep reinforcement learning model.
The deep reinforcement learning model is trained in advance, adopting offline training and online execution. Training the deep reinforcement learning model (DRL model) specifically means training the AC network and the DQN network. The DRL model training comprises the following sub-steps:
step S1101: and initializing the network parameters and the state information of the DRL model.
Specifically, the initialized state information includes: the parameters of the Actor network, the Critic network and the DQN network; an initial association strategy; the position coordinates $[x_k, y_k]$ of each participant, the initial mini-batch value $b_k^0$ and the CPU frequency $f_k$; the position coordinates $[x_m, y_m]$ and maximum bandwidth $B_m$ of each edge server; the slot length $\Delta t$ and the maximum number of iterations $I$; and the local model parameters used in simulating the federated learning process.
Step S1102: each participant performs training of its own local model.
The federated learning process is simulated according to the network parameters and state information initialized in step S1101; that is, each participant is simulated training its local model according to the mini-batch value output by the DQN network. The purpose of simulating the federated learning process is to train the DRL model.
Preferably, each participant trains its local model using the Mini-Batch Gradient Descent (MBGD) optimization method.
The local dataset $D_k$ is divided into multiple mini-batches of size $b_k^i$, and the local weights are updated over each mini-batch $b$ by the following formula to complete the training of the local model:

$$\omega_k^i \leftarrow \omega_k^i - \eta\, \nabla F(\omega_k^i; b)$$

where $\eta$ represents the learning rate, $\nabla F(\omega_k^i; b)$ represents the gradient of the loss function on each mini-batch $b$, and $\omega_k^i$ represents the local model of participant k in the i-th iteration.
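The update above can be sketched as follows; the gradient function `grad_loss(w, batch)` is a caller-supplied assumption, and the dataset is assumed to be a NumPy array of samples:

```python
import numpy as np

def mbgd_train(w, dataset, batch_size, eta, tau, grad_loss, rng=None):
    """Mini-Batch Gradient Descent: draw mini-batches of size b from D_k and
    run tau steps of w <- w - eta * grad F(w; batch), as in the formula above."""
    rng = rng or np.random.default_rng()
    for _ in range(tau):
        idx = rng.choice(len(dataset), size=batch_size, replace=False)
        w = w - eta * grad_loss(w, dataset[idx])
    return w
```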
After the simulated training of the local model, the method further comprises determining the time $t_k^{cmp,i}$ required by participant k during the i-th round of local training:

$$t_k^{cmp,i} = \frac{\tau\, c_k\, b_k^i}{f_k}$$

where $c_k$ denotes the number of CPU cycles participant k needs to train a single data sample, $\tau$ denotes the number of iterations of the MBGD algorithm executed by the participant, and $f_k$ represents the CPU cycle frequency at which participant k trains.
Step S1103: in response to completing the simulated local model training, a bandwidth allocation policy is generated and the local model transmission is simulated while updating the AC network parameters in a single step at each time slot.
Meanwhile, the AC network observes the fast-scale state s of the current time slot, outputs a fast-scale action A(t), and updates the AC network parameters using the Bellman equation.

Specifically, the fast-scale state is represented as $s(t) = [\xi_k(t), r_{k,m}(t)]$, where $\xi_k(t) = \xi - \sum_{t' \le t} r_{k,m}(t')\,\Delta t$ represents the size of the local model that participant k has not yet finished transmitting, $\xi$ denotes the local model size, and $r_{k,m}(t)$ represents the transmission rate at which each participant uploads the local model in slot t.
specifically, the available upstream data transmission rate between the ith round participant k and the edge server m is represented as:
wherein, PkWhich represents the transmission power of the participant k,representing the power spectral density of additive white gaussian noise,indicating the channel gain, ψ, of the participant k and the edge server m0Representing the channel power gain at the reference distance.
The fast-scale action is $A(t) = [B_{k,m}(t)]$, i.e. the bandwidth resource allocation policy, where $B_{k,m}(t)$ indicates the bandwidth allocated by edge server m to participant k in each slot.
The fast-scale reward function R(t) is defined per slot, where μ(t) is a parameter for adjusting the reward function.
Discount factor γ: to reduce the impact of future rewards on the present, more distant rewards have a smaller effect. The cumulative reward obtained by selecting fast-scale action A(t) in fast-scale state s can be defined as:

$$Q(s, A(t)) = \mathbb{E}\!\left[\sum_{j=0}^{\infty} \gamma^{j}\, R(t+j)\right]$$
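For completeness, this discounted cumulative reward takes the standard form and can be computed from a finite reward trace as follows (not specific to this application):

```python
def discounted_return(rewards, gamma):
    """Discounted return G = sum_j gamma^j * R_j, accumulated back to front."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```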
step S1104: and responding to the transmission of the simulated local model, simulating global model aggregation, generating a next round of association strategy and calculation resource allocation strategy, and updating the DQN network parameters.
The local model parameters of each participant are weighted by the following formula to obtain the global model parameters $\omega^i$, and the global model accuracy is detected:

$$\omega^i = \sum_{k=1}^{K} \left( \alpha\, \frac{D_k}{\sum_{j=1}^{K} D_j} + \beta\, \frac{b_k^i}{\sum_{j=1}^{K} b_j^i} \right) \omega_k^i$$

where α and β, with α + β = 1, are two parameters for adjusting the weight ratio.
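The aggregation step can be sketched as below. The exact weighting terms are an assumption reconstructed from the stated design goal of weighting each local model by its data and computation shares; the source states only that α + β = 1:

```python
import numpy as np

def weighted_aggregate(local_models, data_sizes, mini_batches, alpha, beta):
    """Global aggregation: weight each local model by a convex combination of
    its data share (alpha term) and its computation share (beta term).
    The specific share definitions are assumed; only alpha + beta = 1 is given."""
    assert abs(alpha + beta - 1.0) < 1e-9
    data_share = np.asarray(data_sizes, dtype=float) / sum(data_sizes)
    comp_share = np.asarray(mini_batches, dtype=float) / sum(mini_batches)
    weights = alpha * data_share + beta * comp_share
    return sum(w * m for w, m in zip(weights, local_models))
```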
Since the association policy in step S1103 was initialized in advance, the association policy needs to be updated. Specifically, the current slow-scale state S is taken as the input of the DQN network, which outputs the slow-scale action A, i.e. the association strategy and computing resource allocation strategy, and the DQN network parameters are updated using the Bellman equation.
The slow-scale state is represented as $S = [\,t_k^{cmp},\, t_{k,m}^{com}\,]$, where $t_k^{cmp}$ represents the vector of time consumed by each participant in local training and $t_{k,m}^{com}$ represents the vector of time consumed by each participant to upload its model, $t_{k,m}^{com}$ being the time it takes participant k to upload the model to edge server m.
The slow-scale action is denoted $A = [\mathbf{a}, \mathbf{b}]$, where $\mathbf{a} = [a_{k,m}^i]$ is the association vector, i.e. the updated association policy, and $\mathbf{b} = [b_k^i]$ is the mini-batch vector used when each participant performs local model training, i.e. the computing resource allocation policy.
The slow-scale reward function $R^i$ is defined in terms of the accuracy $acc^i$ of the i-th round global model, where μ is a parameter for adjusting the reward function.
The cumulative reward obtained by selecting slow-scale action A in slow-scale state S can be defined as:

$$Q(S, A) = \mathbb{E}\!\left[\sum_{j=0}^{\infty} \gamma^{j}\, R^{i+j}\right]$$
step S1105: and detecting whether the DRL model converges or reaches the maximum iteration number.
If the DRL model has not converged and the maximum number of iterations has not been reached, add 1 to the iteration count and repeat steps S1102-S1104, starting the next iteration and taking the global model as each participant's local model to re-simulate local model training.
In the next iteration, the association strategy generated in the previous iteration and the mini-batch vector required for the next round of local model training are used; then, within that iteration, a new bandwidth allocation strategy is generated from the fast-scale state space observed by the AC network in the current slot, and the DQN generates a new association strategy and computing resource allocation strategy from the slow-scale state space. In this way, the bandwidth resource allocation policy, the association policy and the computing resource allocation policy are continuously updated.
If convergence or the maximum iteration number is reached, training of the AC network and the DQN network is completed, that is, training of the DRL model is completed, and step S1106 is performed.
Step S1106: send the parameters of the trained DRL model to the edge servers.
The edge server loads a DRL model, namely the trained AC network and DQN network, and is used for generating an association strategy and a bandwidth and computing resource allocation strategy in the current state, and completing the deployment of the DRL model.
Step S120: in response to the trained DRL model being deployed to each edge server, perform federated learning.
Since the DRL model solves the problem of minimizing the federated learning delay, after the DRL model is trained in step S110 it is applied to the federated learning process in step S120.
Wherein step S120 specifically includes the following substeps:
step S1201: the local model is initialized.
A suitable metal surface defect detection model selected by a designated participant is used as the local model.
Specifically, the parameters of the metal surface defect detection model, its learning rate, initial mini-batch value and number of iterations are broadcast to the other participants through an edge server, and each participant uses the metal surface defect detection model as its local model, completing the initialization of the local model.
Step S1202: in response to completing the initialization of the local model, each participant performs local model training according to the computing resource allocation strategy in the current state.
In this step, the calculation resource allocation policy in the current state is the calculation resource allocation policy output by the trained DQN network after step S110 is executed.
The local model is trained according to existing methods, which are not described here.
Step S1203: each participant uploads its trained local model parameters to the edge server according to the association strategy and the bandwidth resource allocation strategy.
Specifically, the association policy and the bandwidth resource allocation policy at this time are the association policy and the bandwidth resource allocation policy output by the AC network and the DQN network after the step S110 is executed.
Step S1204: perform global model aggregation on the local models uploaded by the participants, and send the global model parameters and the computing resource allocation strategy to each participant.
Specifically, the local models uploaded by all the participants are aggregated into a global model.
In the aggregation process, an edge server to serve temporarily as the central server is selected according to the position information of the edge servers, specifically according to the following formula:

$$m^{*} = \arg\min_{m \in \mathcal{M}}\ \sum_{m' \in \mathcal{M}} \sqrt{(x_m - x_{m'})^2 + (y_m - y_{m'})^2}$$

where $[x_m, y_m]$ represents the position coordinates of each edge server and the set $\mathcal{M} = \{1, 2, \dots, M\}$ represents all the small base stations acting as edge servers.
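A sketch of this selection under the reconstruction above, where the geographically most central edge server (minimum total Euclidean distance to its peers) acts as the temporary aggregator; the distance criterion itself is an assumption, since the source states only that the selection depends on position information:

```python
import math

def pick_temporary_server(server_positions):
    """Return the index of the edge server minimizing the total Euclidean
    distance to all servers (the self-distance contributes zero); it serves
    as the temporary central aggregator for this round.
    The distance criterion is a reconstruction of the elided formula."""
    def total_distance(m):
        xm, ym = server_positions[m]
        return sum(math.hypot(xm - x, ym - y) for (x, y) in server_positions)
    return min(range(len(server_positions)), key=total_distance)
```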
Further, after the temporary central server is obtained according to the above formula, the temporary central server weights the local model parameters of each participant using the following formula to finally obtain the global model parameters $\omega^i$:

$$\omega^i = \sum_{k=1}^{K} \left( \alpha\, \frac{D_k}{\sum_{j=1}^{K} D_j} + \beta\, \frac{b_k^i}{\sum_{j=1}^{K} b_j^i} \right) \omega_k^i$$

where α and β, with α + β = 1, are two parameters for adjusting the weight ratio.
At this time, the computing resource allocation policy sent to each participant is the one required for the next iteration, generated after steps S1202 and S1203 are executed. In step S1202 the time vector $t_k^{cmp}$ consumed by each participant's local training changes, and in step S1203 the time vector $t_{k,m}^{com}$ consumed by each participant uploading its model also changes; the current state space $S = [\,t_k^{cmp},\, t_{k,m}^{com}\,]$ therefore changes, so the resulting slow-scale action $A = [\mathbf{a}, \mathbf{b}]$ changes as well. In particular, the mini-batch vector used in the next iteration changes, and this change brings a change in the computing resource allocation policy, i.e. the computing resource allocation policy used in the next iteration is updated.
Step S1205: judge whether the global model has reached the preset convergence accuracy or the maximum number of iterations.
If the global model has reached neither the preset convergence accuracy nor the maximum number of iterations, add 1 to the iteration count and re-execute step S1202, i.e. retrain the local model.

The local model is retrained according to the global model and the computing resource allocation strategy sent to each participant in step S1204.
Specifically, the global model received by each participant is used as the local model again, and the local model is retrained according to the computing resource allocation strategy sent to each participant in step S1204 for the next iteration; that is, steps S1202-S1204 are repeatedly executed.
If the global model has reached the preset convergence accuracy or the maximum number of iterations, the global model and the computing resource allocation strategy sent to each participant in step S1204 are ignored, local model training is not performed, and step S130 is executed.
Step S130: the federated learning process ends.
As shown in FIG. 2, the distributed federated learning collaborative computing system provided by the present application specifically includes a deep reinforcement learning model training unit 210 and a federated learning unit 220.
The deep reinforcement learning model training unit 210 is configured to perform deep reinforcement learning model training.
The federated learning unit 220 is connected to the deep reinforcement learning model training unit 210 and is configured to perform federated learning according to the association policy and the computing and bandwidth resource allocation policies generated by the deep reinforcement learning model.
The above-described embodiments are only specific embodiments of the present application, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions for some technical features, and that such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A distributed federated learning collaborative computing method is characterized by specifically comprising the following steps:
carrying out deep reinforcement learning model training;
in response to the trained deep reinforcement learning model being respectively deployed to each edge server, performing federated learning;

and ending the federated learning process.
2. The distributed federated learning collaborative computing method of claim 1, wherein deep reinforcement learning model training is performed, specifically comprising the sub-steps of:
initializing network parameters and state information of the deep reinforcement learning model;
each participant trains a local model according to the network parameters and state information initialized by the deep reinforcement learning model;
generating a bandwidth allocation strategy in response to the completion of the simulation training of the local model, and updating AC network parameters in a single step at each time slot;
generating an association strategy and a calculation resource allocation strategy in response to the completion of the simulation transmission of the local model, and updating DQN network parameters;
detecting whether the deep reinforcement learning model has converged or reached the maximum number of iterations;
and if the model has not converged and the maximum number of iterations has not been reached, starting the next iteration and training the local model again.
3. The distributed federated learning collaborative computing method of claim 2, wherein a metal surface defect detection model is used as a local model.
4. The distributed federated learning collaborative computing method of claim 2, wherein the initialized state information specifically includes: the parameters and convergence accuracy of the Actor network, the Critic network and the DQN network; the position coordinates $[x_k, y_k]$ of each participant, the initial mini-batch value $b_k^0$ and the CPU frequency $f_k$; the position coordinates $[x_m, y_m]$ and maximum bandwidth $B_m$ of each edge server; the slot length $\Delta t$; and the maximum number of iterations $I$.
5. The distributed federated learning collaborative computing method of claim 2, wherein in the training process of the local model the participant divides the local dataset $D_k$ into multiple mini-batches of size $b_k^i$ and updates the local weights over each mini-batch $b$ by the following formula to complete the training of the local model:

$$\omega_k^i \leftarrow \omega_k^i - \eta\, \nabla F(\omega_k^i; b)$$
6. The distributed federated learning collaborative computing method of claim 2, wherein the simulated training of the local model further comprises determining the time $t_k^{cmp,i}$ required by participant k in the i-th round of local training, specifically expressed as:

$$t_k^{cmp,i} = \frac{\tau\, c_k\, b_k^i}{f_k}$$

where $c_k$ represents the number of CPU cycles participant k needs to train a single data sample, $\tau$ represents the number of iterations of the MBGD algorithm executed by the participant, $f_k$ represents the CPU cycle frequency at which participant k trains, and $b_k^i$ indicates the mini-batch value of participant k in the i-th round of local training.
7. The distributed federated learning collaborative computing method of claim 2, wherein the current fast-scale state space is taken as the input of the AC network to obtain the fast-scale action space, i.e. the bandwidth resource allocation policy; the fast-scale state space is represented as $s(t) = [\xi_k(t), r_{k,m}(t)]$, where $\xi_k(t)$ represents the size of the model that participant k has not yet finished transmitting, $r_{k,m}(t)$ represents the transmission rate at which each participant uploads its model in slot t, t represents a time slot and $\Delta t$ the slot length; and the fast-scale action space is $A(t) = [B_{k,m}(t)]$, where $B_{k,m}(t)$ indicates the bandwidth allocated by edge server m to participant k in each slot.
8. The distributed federated learning collaborative computing method of claim 1, wherein in the process of uploading the parameters of the trained deep reinforcement learning model to the edge server according to the determined bandwidth resource allocation strategy, the available uplink data transmission rate $r_{k,m}^{i}$ between the i-th round participant k and edge server m is expressed as:

$$r_{k,m}^{i} = B_{k,m}\,\log_2\!\left(1 + \frac{P_k\, h_{k,m}}{N_0\, B_{k,m}}\right)$$
9. The distributed federated learning collaborative computing method of claim 1, further comprising the time $t_{k,m}^{com,i}$ for the i-th round participant k to upload the deep reinforcement learning model parameters to edge server m, specifically expressed as:

$$t_{k,m}^{com,i} = \frac{\xi}{r_{k,m}^{i}}$$
10. A distributed federated learning collaborative computing system, characterized by specifically comprising: a deep reinforcement learning unit and a federated learning unit;
the deep reinforcement learning unit is used for carrying out deep reinforcement learning model training;
and the federated learning unit is used for performing federated learning according to the association strategy and the computing and bandwidth resource allocation strategies generated by the deep reinforcement learning model.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110802910.1A (granted as CN113467952B) | 2021-07-15 | 2021-07-15 | Distributed federated learning collaborative computing method and system |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN113467952A | 2021-10-01 |
| CN113467952B | 2024-07-02 |
Family ID: 77880516

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date | Status |
|---|---|---|---|---|
| CN202110802910.1A (CN113467952B) | Distributed federated learning collaborative computing method and system | 2021-07-15 | 2021-07-15 | Active |

Country: CN
Patent Citations (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2021083276A1 | 2019-10-29 | 2021-05-06 | 深圳前海微众银行股份有限公司 | Method, device, and apparatus for combining horizontal federation and vertical federation, and medium |
| CN112163690A | 2020-08-19 | 2021-01-01 | 清华大学 | Multi-time scale multi-agent reinforcement learning method and device |

Non-Patent Citations (1)

| Title |
|---|
| Zhou Jun; Fang Guoying; Wu Nan: "A Survey of Federated Learning Security and Privacy Protection" (联邦学习安全与隐私保护研究综述), Journal of Xihua University (Natural Science Edition), no. 04, 10 July 2020 |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113902021A (en) * | 2021-10-13 | 2022-01-07 | 北京邮电大学 | High-energy-efficiency clustering federal edge learning strategy generation method and device |
CN114090239A (en) * | 2021-11-01 | 2022-02-25 | 国网江苏省电力有限公司信息通信分公司 | Model-based reinforcement learning edge resource scheduling method and device |
CN114065863B (en) * | 2021-11-18 | 2023-08-29 | 北京百度网讯科技有限公司 | Federal learning method, apparatus, system, electronic device and storage medium |
CN114065863A (en) * | 2021-11-18 | 2022-02-18 | 北京百度网讯科技有限公司 | Method, device and system for federal learning, electronic equipment and storage medium |
CN114328432A (en) * | 2021-12-02 | 2022-04-12 | 京信数据科技有限公司 | Big data federal learning processing method and system |
CN114168328A (en) * | 2021-12-06 | 2022-03-11 | 北京邮电大学 | Mobile edge node calculation task scheduling method and system based on federal learning |
CN114168328B (en) * | 2021-12-06 | 2024-09-10 | 北京邮电大学 | Mobile edge node calculation task scheduling method and system based on federal learning |
CN114363911B (en) * | 2021-12-31 | 2023-10-17 | 哈尔滨工业大学(深圳) | Wireless communication system for deploying hierarchical federal learning and resource optimization method |
CN114363911A (en) * | 2021-12-31 | 2022-04-15 | 哈尔滨工业大学(深圳) | Wireless communication system for deploying layered federated learning and resource optimization method |
CN114546608A (en) * | 2022-01-06 | 2022-05-27 | 上海交通大学 | Task scheduling method based on edge calculation |
CN114546608B (en) * | 2022-01-06 | 2024-06-07 | 上海交通大学 | Task scheduling method based on edge calculation |
CN114492746A (en) * | 2022-01-19 | 2022-05-13 | 中国石油大学(华东) | Federal learning acceleration method based on model segmentation |
CN114492746B (en) * | 2022-01-19 | 2024-10-29 | 中国石油大学(华东) | Federal learning acceleration method based on model segmentation |
CN114785608A (en) * | 2022-05-09 | 2022-07-22 | 中国石油大学(华东) | Industrial control network intrusion detection method based on decentralized federal learning |
CN114785608B (en) * | 2022-05-09 | 2023-08-15 | 中国石油大学(华东) | Industrial control network intrusion detection method based on decentralised federal learning |
CN115174412A (en) * | 2022-08-22 | 2022-10-11 | 深圳市人工智能与机器人研究院 | Dynamic bandwidth allocation method for heterogeneous federated learning system and related equipment |
CN115174412B (en) * | 2022-08-22 | 2024-04-12 | 深圳市人工智能与机器人研究院 | Dynamic bandwidth allocation method for heterogeneous federal learning system and related equipment |
CN115329990A (en) * | 2022-10-13 | 2022-11-11 | 合肥本源物联网科技有限公司 | Asynchronous federated learning acceleration method based on model segmentation under edge calculation scene |
CN115329990B (en) * | 2022-10-13 | 2023-01-20 | 合肥本源物联网科技有限公司 | Asynchronous federated learning acceleration method based on model segmentation under edge computing scene |
CN116341690A (en) * | 2023-04-28 | 2023-06-27 | 昆山杜克大学 | On-line parameter selection method for minimizing federal learning total cost and related equipment |
Also Published As
Publication number | Publication date |
---|---|
CN113467952B (en) | 2024-07-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113467952B (en) | Distributed federal learning collaborative computing method and system | |
Zhang et al. | Deep reinforcement learning based resource management for DNN inference in industrial IoT | |
CN111629380B (en) | Dynamic resource allocation method for high concurrency multi-service industrial 5G network | |
CN111800828B (en) | Mobile edge computing resource allocation method for ultra-dense network | |
Li et al. | NOMA-enabled cooperative computation offloading for blockchain-empowered Internet of Things: A learning approach | |
CN112181666A (en) | Method, system, equipment and readable storage medium for equipment evaluation and federal learning importance aggregation based on edge intelligence | |
Wang et al. | Distributed reinforcement learning for age of information minimization in real-time IoT systems | |
WO2021036414A1 (en) | Co-channel interference prediction method for satellite-to-ground downlink under low earth orbit satellite constellation | |
CN112287990B (en) | Model optimization method of edge cloud collaborative support vector machine based on online learning | |
CN113687875B (en) | Method and device for unloading vehicle tasks in Internet of vehicles | |
CN109067583A (en) | A kind of resource prediction method and system based on edge calculations | |
CN113392539A (en) | Robot communication control method, system and equipment based on federal reinforcement learning | |
EP4024212A1 (en) | Method for scheduling interference workloads on edge network resources | |
CN113887748B (en) | Online federal learning task allocation method and device, and federal learning method and system | |
US20230189075A1 (en) | Wireless communication network resource allocation method with dynamic adjustment on demand | |
CN114828018A (en) | Multi-user mobile edge computing unloading method based on depth certainty strategy gradient | |
CN112312299A (en) | Service unloading method, device and system | |
CN113543160A (en) | 5G slice resource allocation method and device, computing equipment and computer storage medium | |
CN111988787A (en) | Method and system for selecting network access and service placement positions of tasks | |
CN115086992A (en) | Distributed semantic communication system and bandwidth resource allocation method and device | |
CN113919483A (en) | Method and system for constructing and positioning radio map in wireless communication network | |
Liang et al. | Stochastic Stackelberg Game Based Edge Service Selection for Massive IoT Networks | |
CN117376355B (en) | B5G mass Internet of things resource allocation method and system based on hypergraph | |
Yan et al. | Service caching for meteorological emergency decision-making in cloud-edge computing | |
Sun et al. | Leveraging digital twin and drl for collaborative context offloading in c-v2x autonomous driving |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |