CN113467952A - Distributed federated learning collaborative computing method and system - Google Patents

Distributed federated learning collaborative computing method and system

Info

Publication number
CN113467952A
Authority
CN
China
Prior art keywords
model
participant
training
local
learning
Prior art date
Legal status
Granted
Application number
CN202110802910.1A
Other languages
Chinese (zh)
Other versions
CN113467952B (en)
Inventor
张天魁
刘天泽
陈泽仁
徐琪
章园
Current Assignee
Jiangxi Xinbingrui Technology Co ltd
Beijing University of Posts and Telecommunications
Original Assignee
Jiangxi Xinbingrui Technology Co ltd
Beijing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Jiangxi Xinbingrui Technology Co ltd, Beijing University of Posts and Telecommunications filed Critical Jiangxi Xinbingrui Technology Co ltd
Priority to CN202110802910.1A priority Critical patent/CN113467952B/en
Publication of CN113467952A publication Critical patent/CN113467952A/en
Application granted granted Critical
Publication of CN113467952B publication Critical patent/CN113467952B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 — Arrangements for program control, e.g. control units
    • G06F 9/06 — Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 — Multiprogramming arrangements
    • G06F 9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 — Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 — Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 — Machine learning
    • G06N 20/20 — Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application discloses a distributed federated learning collaborative computing method and system. The method specifically comprises the following steps: training a deep reinforcement learning model; deploying the trained deep reinforcement learning model to each edge server and performing federated learning; and ending the federated learning. Aimed at a distributed federated learning framework, the method and system break the dependence of traditional federated learning on a central server and effectively guarantee privacy protection and security in the federated learning process.

Description

Distributed federated learning collaborative computing method and system
Technical Field
The application relates to the field of communication, in particular to a distributed federated learning collaborative computing method and system.
Background
Metal workpieces are important components of many machined products, and their quality directly influences the market competitiveness of an enterprise's products, so detecting surface defects of metal workpieces during machining is very important. For metal surface defect detection, deep learning can be used to collect workpiece images from the production line, extract defect information from the images, and build a defect detection and recognition model by learning the surface defect characteristics of metal workpieces. Commonly used detection models include Fast R-CNN, Mask R-CNN, and the like. However, in an industrial park, some plants suffer from limited data volume and poor data quality. In addition, owing to industry competition, privacy protection and similar concerns, data are difficult to share and integrate across enterprises, so individual factories struggle to train high-quality detection models.
Federated Learning (FL) is an artificial intelligence learning framework developed to address the data privacy problems faced when artificial intelligence is deployed in practice. Its core objective is to realize cooperative learning among multiple participants and establish a shared, globally effective artificial intelligence model without the participants directly exchanging data. Under a federated learning framework, each participant first trains a local model, then encrypts its local model parameters and uploads them to a central server; the server securely aggregates the local models and sends the updated global model parameters back to the participants, and this iteration repeats until the global model reaches the target accuracy. Throughout this process only model parameters are uploaded and downloaded, and the data always remain local, so client data privacy is well protected.
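By way of illustration only, a minimal sketch of one such round-based loop (local training, upload, aggregation, broadcast) is given below; the function names, the toy gradient and the plain data-size-weighted average are assumptions for clarity and are not the specific scheme proposed in this application.

import numpy as np

def local_train(weights, lr=0.01, steps=10):
    # Stand-in for a participant's local training: a few noisy gradient steps.
    for _ in range(steps):
        grad = np.random.randn(*weights.shape) * 0.01  # placeholder gradient
        weights = weights - lr * grad
    return weights

def aggregate(local_models, data_sizes):
    # FedAvg-style aggregation: average local models weighted by data-set size.
    total = float(sum(data_sizes))
    return sum(w * (n / total) for w, n in zip(local_models, data_sizes))

global_model = np.zeros(100)
data_sizes = [500, 800, 300]                 # |D_k| for three illustrative participants
for round_i in range(5):                     # federated learning rounds
    local_models = [local_train(global_model.copy()) for _ in data_sizes]
    global_model = aggregate(local_models, data_sizes)  # only parameters are exchanged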
Nevertheless, the federated learning framework still presents some security issues. A centralized manager of model aggregation may be vulnerable to various threats (e.g., single point of failure and DDoS attacks), and its failure (e.g., skewing all local model updates) may cause the entire learning process to fail. Although federated learning alleviates the problems of insufficient data and privacy leakage for each participant, and a distributed federated learning framework can also address the security problem of the learning process, the delay problem of distributed federated learning has received little attention from the academic community. Because different participants train at different speeds within the same round, the participant that finishes its computation first enters a passive waiting period, which wastes resources. Meanwhile, the network connection between a participant and an edge server is unstable: network quality changes continually with environmental factors, the time required to upload a model is highly uncertain, and the time required for model aggregation is likely to be prolonged.
Therefore, how to improve the accuracy of each round's global model and reduce the total delay for the global model to reach the target accuracy, while also reducing the time required by each round of federated learning, is a problem that remains to be solved.
Disclosure of Invention
Based on this, the application provides a distributed federated learning collaborative computing method for an intelligent factory, which ensures the security of the federated learning process and uses Deep Reinforcement Learning (DRL) to solve the problems of association between edge servers and participants, bandwidth resource allocation, and computing resource allocation among the participants.
In order to achieve the above object, the present application provides a distributed federated learning collaborative computing method, which specifically includes the following steps: training a deep reinforcement learning model; deploying the trained deep reinforcement learning model to each edge server and performing federated learning; and ending the federated learning.
As above, the deep reinforcement learning model training specifically includes the following sub-steps: initializing the network parameters and state information of the deep reinforcement learning model; each participant training a local model according to the initialized network parameters and state information; in response to completing the simulated training of the local model, generating a bandwidth allocation strategy and updating the AC network parameters in a single step at each time slot; in response to completing the simulated transmission of the local model, generating an association strategy and a computing resource allocation strategy and updating the DQN network parameters; detecting whether the deep reinforcement learning model has converged or reached the maximum number of iterations; and if it has neither converged nor reached the maximum number of iterations, starting the next iteration and training the local model again.
As above, wherein a metal surface defect detection model is used as the local model.
As above, the initialized state information specifically includes: the parameters and convergence accuracy of the Actor network, the Critic network and the DQN network; the position coordinates [x_k, y_k], initial mini-batch value b_k^0 and CPU frequency f_k of each participant; the position coordinates [x_m, y_m] and maximum bandwidth B_m of each edge server; the slot length Δt; and the maximum number of iterations I.
As above, the training process of the participant for its local model divides the local data set D_k into a number of mini-batches b of size b_k^i and updates the local weights by the following formula to complete the training of the local model; the training process is expressed as:

ω_k^i ← ω_k^i − η ∇F_b(ω_k^i)

where η denotes the learning rate, ∇F_b(ω_k^i) denotes the gradient of the loss function over each mini-batch b, and ω_k^i denotes the local model of participant k in the i-th iteration.
As above, the simulated training of the local model further comprises determining the time required by participant k in the i-th round of local training, t_k^{cmp,i}, which is specifically expressed as:

t_k^{cmp,i} = τ c_k b_k^i / f_k

where c_k denotes the number of CPU cycles for participant k to train a single data sample, τ denotes the number of iterations of the MBGD algorithm executed by the participant, f_k denotes the CPU cycle frequency at which participant k trains, and b_k^i denotes the mini-batch value of participant k in the i-th round of local training.
As above, the current fast-scale state space is used as the input of the AC network to obtain the fast-scale action space, i.e. the bandwidth resource allocation strategy; the fast-scale state space is expressed as

s(t) = [ξ_k(t), r_k(t)]

where ξ_k(t) denotes the size of the model that each participant has not yet transmitted, r_k(t) denotes the transmission rate at which each participant uploads its model in each time slot, t denotes a time slot, and Δt denotes the slot length; the fast-scale action space A(t) = {B_{k,m}(t)} is the bandwidth resource allocation strategy, where B_{k,m}(t) denotes the bandwidth allocated by edge server m to participant k in each slot.
As above, in the process of uploading the parameters of the trained deep reinforcement learning model to the edge server according to the determined bandwidth resource allocation strategy, the available uplink data transmission rate between participant k and edge server m in the i-th round, r_{k,m}^i, is expressed as:

r_{k,m}^i = B_{k,m} log2(1 + P_k h_{k,m}^i / (N_0 B_{k,m}))

where P_k denotes the transmission power of participant k, N_0 denotes the power spectral density of the additive white Gaussian noise, h_{k,m}^i denotes the channel gain between participant k and edge server m, and ψ_0 denotes the channel power gain at the reference distance.
As above, the method also comprises the time for participant k to upload the parameters of the deep reinforcement learning model to edge server m in the i-th round, t_{k,m}^{com,i}, which is specifically expressed as:

t_{k,m}^{com,i} = ξ / r_{k,m}^i

where ξ denotes the size of the metal surface defect detection model and r_{k,m}^i denotes the available uplink data transmission rate between participant k and edge server m in the i-th round.
A distributed federated learning collaborative computing system comprises: a deep reinforcement learning unit and a federated learning unit; the deep reinforcement learning unit is used for training the deep reinforcement learning model; and the federated learning unit is used for performing federated learning according to the association strategy, the computing resource allocation strategy and the bandwidth resource allocation strategy generated by the deep reinforcement learning model.
The application has the following beneficial effects:
(1) Aimed at a distributed federated learning framework, the distributed federated learning collaborative computing method and system provided by the embodiments break the dependence of traditional federated learning on a central server and effectively guarantee privacy protection and security in the federated learning process.
(2) The distributed federated learning collaborative computing method and system provided by the embodiments pursue the design goal of minimizing the total delay of federated learning from two angles, simultaneously reducing the total number of iteration rounds and the time consumed by each round, making full use of the computing and communication resources of each participant and edge server, and maximizing the utility of federated learning.
(3) The distributed federated learning collaborative computing method and system provided by the embodiments take into account the influence of each participant's computation load on model accuracy, adjust the weight of each participant's local model in the global aggregation process, ensure the fairness of the aggregation process, and help accelerate model convergence.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
FIG. 1 is a flow chart of a distributed federated learning collaborative computing method as presented herein;
FIG. 2 is a schematic diagram of a distributed federated learning collaborative computing system as presented herein.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The method and system solve the problem of minimizing the total time delay in a distributed federated learning system framework, that is, minimizing the total time delay for the global model to reach the target accuracy, with emphasis on the problems of association between the edge servers and the participants, bandwidth resource allocation, and computing resource allocation among the participants.
Scene assumptions: the application uses the set K = {1, 2, …, K} to denote all participants in federated learning; the size of participant k's data set is denoted D_k; for each sample d_n = {x_n, y_n} in the data set, x_n denotes the input vector and y_n denotes the output label corresponding to x_n; [x_k, y_k] denotes the position coordinates of participant k. All small base stations serving as edge servers are denoted by the set M = {1, 2, …, M}, and [x_m, y_m] denotes the position coordinates of edge server m. In addition, the iteration rounds of federated learning are denoted by I = {1, 2, …, I}; a_{k,m}^i = 1 indicates that participant k establishes a communication connection with edge server m in the i-th iteration, and a_{k,m}^i = 0 otherwise; b_k^i denotes the mini-batch value of participant k in the i-th round of local training. All time slots of each iteration are denoted by T = {1, 2, …, T}, Δt denotes the slot length, and B_{k,m}(t) denotes the bandwidth allocated by edge server m to participant k in each slot. ω^i denotes the global model of the i-th round, and ω_k^i denotes the local model of participant k in the i-th iteration.
The technical problem to be solved by the present application is how to minimize the total time delay of collaborative computation in the federated learning process, which is specifically expressed as:

min_{a, b, B}  Σ_{i∈I} max_{k∈K} Σ_{m∈M} a_{k,m}^i (t_k^{cmp,i} + t_{k,m}^{com,i})

s.t.  C1: Σ_{m∈M} a_{k,m}^i = 1, ∀k ∈ K
      C2: Σ_{k∈K} a_{k,m}^i ≥ 1, ∀m ∈ M
      C3: Σ_{k∈K} B_{k,m}(t) ≤ B_m, ∀m ∈ M
      C4: b_k^i ≤ D_k, ∀k ∈ K

where C1 indicates that each participant can connect to only one edge server; C2 indicates that each edge server is connected to at least one participant; C3 indicates that each edge server does not allocate bandwidth beyond its maximum bandwidth capacity; and C4 indicates that the mini-batch value of each participant in each round does not exceed the size of its data set. t_k^{cmp,i} denotes the time required by participant k in the i-th round of local training, t_{k,m}^{com,i} denotes the time required by participant k to upload its model to edge server m in the i-th round, B_{k,m}(t) denotes the bandwidth allocated by edge server m to participant k in each slot, a_{k,m}^i = 1 indicates that participant k establishes a communication connection with edge server m in the i-th iteration (a_{k,m}^i = 0 otherwise), D_k denotes the size of participant k's data set, b_k^i denotes the mini-batch value of participant k in the i-th round of local training, and B_m denotes the maximum bandwidth of each edge server.
This problem has dynamic constraints and a long-term objective, and the current state of the system depends only on the state and the actions taken in the previous iteration, so it satisfies the Markov property and can be expressed as a Markov Decision Process (MDP), i.e. MDP = {S, A, γ, R}, where S denotes the state space, A the action space, γ the discount factor, and R the reward function. Solving the problem is thus converted into determining the optimal action to select for the current state in each of the different states.
Further, the above problem can be converted into solving the association and bandwidth resource allocation problem between the edge servers and the participants and the computing resource allocation problem of the participants. The problem contains three decision variables, namely a_{k,m}^i, b_k^i and B_{k,m}(t), where a_{k,m}^i and b_k^i are discrete variables that change only between aggregation rounds, while B_{k,m}(t) is a continuous variable that changes from slot to slot. Deep reinforcement learning with two time scales can therefore be adopted: the aggregation round i is taken as the time interval of the slow time scale, and a DQN network is used on the slow time scale to generate the association strategy and the computing resource allocation strategy for the current state; the slot length Δt is taken as the time interval of the fast time scale, and an Actor-Critic (AC) network is updated in single steps on the fast time scale to generate the bandwidth resource allocation strategy for the current state.
Based on the above thought, the present application provides a flowchart of a distributed federated learning collaborative computing method as shown in fig. 1, which specifically includes the following steps.
Step S110: and carrying out deep reinforcement learning model training.
The deep reinforcement learning model is trained in advance, in a mode of offline training and online execution. Training the deep reinforcement learning model (DRL model) specifically means training the AC network and the DQN network. The DRL model training comprises the following sub-steps:
step S1101: and initializing the network parameters and the state information of the DRL model.
Specifically, the initialized state information includes: the parameters of the Actor network, the Critic network and the DQN network; an initial association strategy; the position coordinates [x_k, y_k], initial mini-batch value b_k^0 and CPU frequency f_k of each participant; the position coordinates [x_m, y_m] and maximum bandwidth B_m of each edge server; the slot length Δt; the maximum number of iterations I; and the local model parameters used in simulating the federated learning process.
Step S1102: each participant performs training of its own local model.
A federated learning process is simulated according to the network parameters and state information initialized in step S1101, i.e. each participant is simulated training a local model according to the mini-batch value output by the DQN network. The purpose of simulating the federated learning process is to train the DRL model.
Preferably, each participant trains its local model using the mini-batch gradient descent (MBGD) optimization method.
The local data set D_k is divided into a number of mini-batches b of size b_k^i, and the local weights are updated by the following formula to complete the training of the local model; the training process is expressed as:

ω_k^i ← ω_k^i − η ∇F_b(ω_k^i)

where η denotes the learning rate, ∇F_b(ω_k^i) denotes the gradient of the loss function over each mini-batch b, and ω_k^i denotes the local model of participant k in the i-th iteration.
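By way of illustration, a minimal mini-batch gradient descent update of the form above, applied to a toy least-squares model standing in for the metal surface defect detection model (the loss, data and hyper-parameters are illustrative assumptions only):

import numpy as np

def mbgd_update(w, x, y, lr=0.01, batch_size=32, iters=5):
    # Mini-batch gradient descent on a least-squares loss; only the update rule
    # w <- w - eta * grad F_b(w) matters for this sketch.
    for _ in range(iters):
        idx = np.random.choice(len(x), size=batch_size, replace=False)
        xb, yb = x[idx], y[idx]
        grad = 2.0 * xb.T @ (xb @ w - yb) / batch_size   # gradient over mini-batch b
        w = w - lr * grad
    return w

x = np.random.randn(200, 8)
w_true = np.random.randn(8)
y = x @ w_true
w_local = mbgd_update(np.zeros(8), x, y)
print(np.linalg.norm(w_local - w_true))   # distance to the generating weights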
After the simulated training of the local model, the method further comprises determining the time required by participant k in the i-th round of local training, t_k^{cmp,i}, which is specifically expressed as:

t_k^{cmp,i} = τ c_k b_k^i / f_k

where c_k denotes the number of CPU cycles for participant k to train a single data sample, τ denotes the number of iterations of the MBGD algorithm executed by the participant, and f_k denotes the CPU cycle frequency at which participant k trains.
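By way of illustration, the local training time of the form above can be computed directly; the numeric values below are illustrative assumptions only:

def local_training_time(tau, c_k, b_k, f_k):
    # t_k^{cmp,i} = tau * c_k * b_k^i / f_k : total CPU cycles divided by CPU frequency.
    return tau * c_k * b_k / f_k

# Example: 5 MBGD iterations, 2e7 cycles per sample, mini-batch of 64, 2 GHz CPU.
print(local_training_time(tau=5, c_k=2e7, b_k=64, f_k=2e9))   # -> 3.2 seconds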
Step S1103: in response to completing the simulated local model training, a bandwidth allocation policy is generated and the local model transmission is simulated while updating the AC network parameters in a single step at each time slot.
Meanwhile, the AC network observes the fast-scale state s of the current time slot, outputs a fast-scale action A(t), and updates the AC network parameters using the Bellman equation.
Specifically, the fast-scale state is expressed as

s(t) = [ξ_k(t), r_k(t)]

where ξ_k(t) = ξ − Σ_{t′<t} r_k(t′)Δt denotes the size of the local model that each participant has not yet transmitted, ξ denotes the local model size, and r_k(t) denotes the transmission rate at which each participant uploads the local model in each time slot.
specifically, the available upstream data transmission rate between the ith round participant k and the edge server m is represented as:
Figure BDA0003165304780000095
wherein, PkWhich represents the transmission power of the participant k,
Figure BDA0003165304780000096
representing the power spectral density of additive white gaussian noise,
Figure BDA0003165304780000097
indicating the channel gain, ψ, of the participant k and the edge server m0Representing the channel power gain at the reference distance.
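By way of illustration, the following sketch evaluates a Shannon-type uplink rate of the form above and the resulting upload time ξ / r; the bandwidth, power, gain and noise values are illustrative assumptions:

import math

def uplink_rate(bandwidth_hz, p_k, h_km, n0):
    # Shannon-type rate: r = B * log2(1 + P_k * h_{k,m} / (N0 * B)), in bits per second.
    return bandwidth_hz * math.log2(1.0 + p_k * h_km / (n0 * bandwidth_hz))

def upload_time(model_bits, rate_bps):
    # t_{k,m}^{com,i} = xi / r_{k,m}^i
    return model_bits / rate_bps

r = uplink_rate(bandwidth_hz=1e6, p_k=0.1, h_km=1e-7, n0=1e-17)   # illustrative values
print(r, upload_time(model_bits=8e6, rate_bps=r))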
The fast-scale action A(t) = {B_{k,m}(t)} is the bandwidth resource allocation strategy, where B_{k,m}(t) denotes the bandwidth allocated by edge server m to participant k in each slot.
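By way of illustration, one simple way to map raw actor outputs to a per-slot bandwidth allocation that respects constraint C3 (the server's maximum bandwidth B_m) is a softmax share; this mapping is an assumption for the sketch, not the actual actor output layer:

import numpy as np

def allocate_bandwidth(actor_scores, b_max):
    # Softmax the raw actor outputs (one per connected participant) into shares
    # that sum to the server's maximum bandwidth B_m, satisfying constraint C3.
    shares = np.exp(actor_scores - np.max(actor_scores))
    shares = shares / shares.sum()
    return shares * b_max

print(allocate_bandwidth(np.array([0.2, 1.5, -0.3]), b_max=20e6))   # Hz per participant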
The fast-scale reward function R(t) is defined in terms of an adjustment parameter μ(t). The discount factor γ reduces the impact of future rewards on the current decision: the more distant a reward, the smaller its effect. The cumulative reward obtained by selecting the fast-scale action A(t) in the fast-scale state s can be defined as:

Q(s, A(t)) = E[Σ_{t′≥0} γ^{t′} R(t + t′)]
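By way of illustration, a cumulative discounted reward of this kind can be computed from a sequence of per-slot rewards as follows:

def discounted_return(rewards, gamma=0.9):
    # Cumulative discounted reward: sum_t gamma^t * R(t), computed backwards.
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

print(discounted_return([1.0, 0.5, 0.2]))   # 1.0 + 0.9*0.5 + 0.81*0.2 = 1.612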
step S1104: and responding to the transmission of the simulated local model, simulating global model aggregation, generating a next round of association strategy and calculation resource allocation strategy, and updating the DQN network parameters.
The local model parameters of each participant are weighted by the following formula to obtain the global model parameters ω^i, and the global model accuracy is checked:

ω^i = Σ_{k∈K} (α · D_k / Σ_{j∈K} D_j + β · b_k^i / Σ_{j∈K} b_j^i) ω_k^i

where α + β = 1, α and β being the two parameters used to adjust the weight ratio.
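By way of illustration, a weighted aggregation of this kind might be sketched as follows; the α/β weighting over data-set sizes and mini-batch values follows the form given above and is an assumed form for the example, not necessarily the exact expression of this application:

import numpy as np

def aggregate_global(local_models, data_sizes, batch_sizes, alpha=0.5, beta=0.5):
    # Weight each local model by its data-set share and its mini-batch (computation)
    # share this round; alpha + beta = 1 adjusts the ratio between the two.
    d = np.asarray(data_sizes, dtype=float)
    b = np.asarray(batch_sizes, dtype=float)
    weights = alpha * d / d.sum() + beta * b / b.sum()
    return sum(w * m for w, m in zip(weights, local_models))

local_models = [np.ones(4) * v for v in (1.0, 2.0, 3.0)]
print(aggregate_global(local_models, data_sizes=[500, 800, 300], batch_sizes=[32, 64, 16]))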
Since the association strategy used in step S1103 was initialized in advance, the association strategy needs to be updated. Specifically, the current slow-scale state S is used as the input of the DQN network, the slow-scale action A is output, namely the association strategy and the computing resource allocation strategy, and the parameters of the DQN network are updated using the Bellman equation.
The slow-scale state is expressed as S = [t_k, t_{k,m}], where t_k denotes the vector of times consumed by each participant in local training and t_{k,m} denotes the vector of times consumed by each participant in uploading its model, t_{k,m}^{com,i} being the time it takes participant k to upload the model to edge server m. The slow-scale action is expressed as A = [a, b], where a = {a_{k,m}^i} is the association vector, i.e. the updated association strategy, and b = {b_k^i} is the mini-batch vector used when each participant performs local model training, i.e. the computing resource allocation strategy.
The slow-scale reward function R^i is defined in terms of an adjustment parameter μ and the accuracy of the i-th round global model. The cumulative reward obtained by selecting the slow-scale action A in the slow-scale state S can be defined as:

Q(S, A) = E[Σ_i γ^i R^i]
step S1105: and detecting whether the DRL model converges or reaches the maximum iteration number.
If the model has neither converged nor reached the maximum number of iterations, the iteration count is increased by 1, steps S1102–S1104 are repeated to start the next iteration, and the global model is used as each participant's local model to re-simulate the local model training.
The next iteration uses the association strategy generated in the previous iteration and the mini-batch vector required for the next round of local model training; within it, a new bandwidth allocation strategy is generated from the fast-scale state space observed by the AC network in the current time slot, and a new association strategy and computing resource allocation strategy are generated by the DQN from the slow-scale state space. In this way, the bandwidth resource allocation strategy, the association strategy, and the computing resource allocation strategy are continuously updated.
If convergence or the maximum iteration number is reached, training of the AC network and the DQN network is completed, that is, training of the DRL model is completed, and step S1106 is performed.
Step S1106: and sending each parameter of the trained DRL model to an edge server.
The edge server loads a DRL model, namely the trained AC network and DQN network, and is used for generating an association strategy and a bandwidth and computing resource allocation strategy in the current state, and completing the deployment of the DRL model.
Step S120: and responding to the fact that the trained DRL model is respectively deployed to each edge server, and performing federal learning.
Since the DRL model is intended to minimize the federated learning delay, after the DRL model has been trained in step S110 it is applied to the federated learning process in step S120.
Wherein step S120 specifically includes the following substeps:
step S1201: the local model is initialized.
A suitable metal surface defect detection model selected by a designated participant is used as the local model.
Specifically, the parameters, learning rate, initial mini-batch value and number of iterations of the metal surface defect detection model are broadcast to the other participants through an edge server, and each participant uses this metal surface defect detection model as its local model to complete the initialization of the local model.
Step S1202: and responding to the completion of the initialization of the local model, and performing local model training by each participant according to the calculation resource allocation strategy in the current state.
In this step, the calculation resource allocation policy in the current state is the calculation resource allocation policy output by the trained DQN network after step S110 is executed.
The local model is trained according to existing methods, which are not described here.
Step S1203: and each participant uploads the local model parameters trained by the participant to the edge server respectively according to the association strategy and the bandwidth resource allocation strategy.
Specifically, the association policy and the bandwidth resource allocation policy at this time are the association policy and the bandwidth resource allocation policy output by the AC network and the DQN network after the step S110 is executed.
Step S1204: and carrying out global model aggregation on the local model uploaded by each participant, and sending the global model parameters and the calculation resource allocation strategy to each participant.
Specifically, the local models uploaded by all the participants are aggregated into a global model.
In the aggregation process, the edge server that temporarily serves as the central server is selected according to the position information of the edge servers, where [x_m, y_m] denotes the position coordinates of each edge server and the set M = {1, 2, …, M} denotes all the small base stations serving as edge servers.
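By way of illustration, one plausible reading of such a position-based selection is to pick the edge server with the smallest total distance to the other edge servers; this rule is an assumption made for the sketch rather than the exact selection criterion:

import math

def pick_temporary_server(server_coords):
    # Choose the edge server whose total Euclidean distance to all other edge
    # servers is smallest, i.e. the most "central" server by position.
    def total_distance(i):
        xi, yi = server_coords[i]
        return sum(math.hypot(xi - x, yi - y) for x, y in server_coords)
    return min(range(len(server_coords)), key=total_distance)

print(pick_temporary_server([(0, 0), (10, 0), (5, 1)]))   # -> 2 (the middle server)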
Further, after the temporary central server has been determined as above, the temporary central server weights the local model parameters of each participant by the following formula to finally obtain the global model parameters ω^i:

ω^i = Σ_{k∈K} (α · D_k / Σ_{j∈K} D_j + β · b_k^i / Σ_{j∈K} b_j^i) ω_k^i

where α + β = 1, α and β being the two parameters used to adjust the weight ratio.
At this time, the computing resource allocation strategy sent to each participant is the one required for the next iteration after steps S1202 and S1203 have been executed. The time vector t_k consumed by each participant's local training changes in step S1202, and the time vector t_{k,m} consumed by each participant's model upload changes in step S1203, so the current state space S = [t_k, t_{k,m}] also changes; accordingly, the resulting slow-scale action A = [a, b] changes, that is, the association vector and the mini-batch vector used in the next iteration change, and the change of the mini-batch vector in turn changes the computing resource allocation strategy used in the next iteration.
Step S1205: and judging whether the global model reaches the preset convergence precision or the maximum iteration number.
And if the global model does not reach the preset convergence precision or the maximum iteration number, adding 1 to the iteration number, and re-executing the step S1202, namely re-training the local model.
The local model is re-trained according to the global model and the computing resource allocation strategy sent to each participant in step S1204.
Specifically, the global model received by each participant is used as the local model again, and the local model is retrained again according to the calculation resource allocation strategy sent to each participant in step S1204 and required by the next iteration. I.e. steps S1202-1204 are repeatedly performed.
If the global model reaches the preset convergence accuracy or reaches the maximum iteration number, ignoring the global model and the calculation resource allocation strategy sent to each participant in step S1204, and performing step S130 without performing the training of the local model.
Step S130: the federal learning process is ended.
As shown in fig. 2, the distributed federated learning collaborative computing system provided for the present application specifically includes: deep reinforcement learning model training unit 210, federal learning unit 220.
The deep reinforcement learning model training unit 210 is configured to perform deep reinforcement learning model training.
The federated learning unit 220 is connected to the deep reinforcement learning model training unit 210 and is configured to perform federated learning according to the association strategy, the computing resource allocation strategy and the bandwidth resource allocation strategy generated by the deep reinforcement learning model.
The application has the following beneficial effects:
(1) Aimed at a distributed federated learning framework, the distributed federated learning collaborative computing method and system provided by the embodiments break the dependence of traditional federated learning on a central server and effectively guarantee privacy protection and security in the federated learning process.
(2) The distributed federated learning collaborative computing method and system provided by the embodiments pursue the design goal of minimizing the total delay of federated learning from two angles, simultaneously reducing the total number of iteration rounds and the time consumed by each round, making full use of the computing and communication resources of each participant and edge server, and maximizing the utility of federated learning.
(3) The distributed federated learning collaborative computing method and system provided by the embodiments take into account the influence of each participant's computation load on model accuracy, adjust the weight of each participant's local model in the global aggregation process, ensure the fairness of the aggregation process, and help accelerate model convergence.
The above-mentioned embodiments are only specific embodiments of the present application, used to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications or readily conceived changes to the technical solutions described in the foregoing embodiments, or equivalent replacements of some of their technical features, may still be made within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A distributed federated learning collaborative computing method is characterized by specifically comprising the following steps:
carrying out deep reinforcement learning model training;
in response to the trained deep reinforcement learning model being deployed to each edge server, performing federated learning;
and ending the federated learning.
2. The distributed federated learning collaborative computing method of claim 1, wherein deep reinforcement learning model training is performed, specifically comprising the sub-steps of:
initializing network parameters and state information of the deep reinforcement learning model;
each participant trains a local model according to the network parameters and state information initialized by the deep reinforcement learning model;
generating a bandwidth allocation strategy in response to the completion of the simulation training of the local model, and updating AC network parameters in a single step at each time slot;
generating an association strategy and a calculation resource allocation strategy in response to the completion of the simulation transmission of the local model, and updating DQN network parameters;
detecting whether the deep reinforcement learning model has converged or reached the maximum number of iterations;
and if it has neither converged nor reached the maximum number of iterations, starting the next iteration and training the local model again.
3. The distributed federated learning collaborative computing method of claim 2, wherein a metal surface defect detection model is used as a local model.
4. The distributed federated learning collaborative computing method of claim 2, wherein the initialized state information specifically includes: the parameters and convergence accuracy of the Actor network, the Critic network and the DQN network; the position coordinates [x_k, y_k], initial mini-batch value b_k^0 and CPU frequency f_k of each participant; the position coordinates [x_m, y_m] and maximum bandwidth B_m of each edge server; the slot length Δt; and the maximum number of iterations I.
5. The distributed federated learning collaborative computing method of claim 2, wherein the participant's local model training process divides the local data set D_k into a number of mini-batches b of size b_k^i and updates the local weights by the following formula to complete the training of the local model, the training process being expressed as:

ω_k^i ← ω_k^i − η ∇F_b(ω_k^i)

wherein η denotes the learning rate, ∇F_b(ω_k^i) denotes the gradient of the loss function over each mini-batch b, and ω_k^i denotes the local model of participant k in the i-th iteration.
6. The distributed federated learning collaborative computing method of claim 2, wherein performing the simulated training of the local model further comprises determining the time required by participant k in the i-th round of local training, t_k^{cmp,i}, which is specifically expressed as:

t_k^{cmp,i} = τ c_k b_k^i / f_k

wherein c_k denotes the number of CPU cycles for participant k to train a single data sample, τ denotes the number of iterations of the MBGD algorithm executed by the participant, f_k denotes the CPU cycle frequency at which participant k trains, and b_k^i denotes the mini-batch value of participant k in the i-th round of local training.
7. The distributed federated learning collaborative computing method of claim 2, wherein the current fast-scale state space is used as the input of the AC network to obtain the fast-scale action space, i.e. the bandwidth resource allocation strategy;
the fast-scale state space is expressed as s(t) = [ξ_k(t), r_k(t)], wherein ξ_k(t) denotes the size of the model that each participant has not yet transmitted, r_k(t) denotes the transmission rate at which each participant uploads its model in each time slot, t denotes a time slot, and Δt denotes the slot length;
the fast-scale action space A(t) = {B_{k,m}(t)} is the bandwidth resource allocation strategy, wherein B_{k,m}(t) denotes the bandwidth allocated by edge server m to participant k in each slot.
8. The distributed federated learning collaborative computing method of claim 1, wherein, in the process of uploading the parameters of the trained deep reinforcement learning model to the edge server according to the determined bandwidth resource allocation strategy, the available uplink data transmission rate between participant k and edge server m in the i-th round, r_{k,m}^i, is expressed as:

r_{k,m}^i = B_{k,m} log2(1 + P_k h_{k,m}^i / (N_0 B_{k,m}))

wherein P_k denotes the transmission power of participant k, N_0 denotes the power spectral density of the additive white Gaussian noise, h_{k,m}^i denotes the channel gain between participant k and edge server m, and ψ_0 denotes the channel power gain at the reference distance.
9. The distributed federated learning collaborative computing method of claim 1, further comprising the time for participant k to upload the deep reinforcement learning model parameters to edge server m in the i-th round, t_{k,m}^{com,i}, which is specifically expressed as:

t_{k,m}^{com,i} = ξ / r_{k,m}^i

wherein ξ denotes the size of the metal surface defect detection model and r_{k,m}^i denotes the available uplink data transmission rate between participant k and edge server m in the i-th round.
10. A distributed federated learning collaborative computing system is characterized by specifically comprising: a deep reinforcement learning unit and a federal learning unit;
the deep reinforcement learning unit is used for carrying out deep reinforcement learning model training;
and the federated learning unit is used for performing federated learning according to the association strategy, the computing resource allocation strategy and the bandwidth resource allocation strategy generated by the deep reinforcement learning model.
CN202110802910.1A 2021-07-15 2021-07-15 Distributed federal learning collaborative computing method and system Active CN113467952B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110802910.1A CN113467952B (en) 2021-07-15 2021-07-15 Distributed federal learning collaborative computing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110802910.1A CN113467952B (en) 2021-07-15 2021-07-15 Distributed federal learning collaborative computing method and system

Publications (2)

Publication Number Publication Date
CN113467952A true CN113467952A (en) 2021-10-01
CN113467952B CN113467952B (en) 2024-07-02

Family

ID=77880516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110802910.1A Active CN113467952B (en) 2021-07-15 2021-07-15 Distributed federal learning collaborative computing method and system

Country Status (1)

Country Link
CN (1) CN113467952B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021083276A1 (en) * 2019-10-29 2021-05-06 深圳前海微众银行股份有限公司 Method, device, and apparatus for combining horizontal federation and vertical federation, and medium
CN112163690A (en) * 2020-08-19 2021-01-01 清华大学 Multi-time scale multi-agent reinforcement learning method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周俊; 方国英; 吴楠: "A Survey of Federated Learning Security and Privacy Protection", Journal of Xihua University (Natural Science Edition), no. 04, 10 July 2020 (2020-07-10) *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113902021A (en) * 2021-10-13 2022-01-07 北京邮电大学 High-energy-efficiency clustering federal edge learning strategy generation method and device
CN114090239A (en) * 2021-11-01 2022-02-25 国网江苏省电力有限公司信息通信分公司 Model-based reinforcement learning edge resource scheduling method and device
CN114065863B (en) * 2021-11-18 2023-08-29 北京百度网讯科技有限公司 Federal learning method, apparatus, system, electronic device and storage medium
CN114065863A (en) * 2021-11-18 2022-02-18 北京百度网讯科技有限公司 Method, device and system for federal learning, electronic equipment and storage medium
CN114328432A (en) * 2021-12-02 2022-04-12 京信数据科技有限公司 Big data federal learning processing method and system
CN114168328A (en) * 2021-12-06 2022-03-11 北京邮电大学 Mobile edge node calculation task scheduling method and system based on federal learning
CN114168328B (en) * 2021-12-06 2024-09-10 北京邮电大学 Mobile edge node calculation task scheduling method and system based on federal learning
CN114363911B (en) * 2021-12-31 2023-10-17 哈尔滨工业大学(深圳) Wireless communication system for deploying hierarchical federal learning and resource optimization method
CN114363911A (en) * 2021-12-31 2022-04-15 哈尔滨工业大学(深圳) Wireless communication system for deploying layered federated learning and resource optimization method
CN114546608A (en) * 2022-01-06 2022-05-27 上海交通大学 Task scheduling method based on edge calculation
CN114546608B (en) * 2022-01-06 2024-06-07 上海交通大学 Task scheduling method based on edge calculation
CN114492746A (en) * 2022-01-19 2022-05-13 中国石油大学(华东) Federal learning acceleration method based on model segmentation
CN114492746B (en) * 2022-01-19 2024-10-29 中国石油大学(华东) Federal learning acceleration method based on model segmentation
CN114785608A (en) * 2022-05-09 2022-07-22 中国石油大学(华东) Industrial control network intrusion detection method based on decentralized federal learning
CN114785608B (en) * 2022-05-09 2023-08-15 中国石油大学(华东) Industrial control network intrusion detection method based on decentralised federal learning
CN115174412A (en) * 2022-08-22 2022-10-11 深圳市人工智能与机器人研究院 Dynamic bandwidth allocation method for heterogeneous federated learning system and related equipment
CN115174412B (en) * 2022-08-22 2024-04-12 深圳市人工智能与机器人研究院 Dynamic bandwidth allocation method for heterogeneous federal learning system and related equipment
CN115329990A (en) * 2022-10-13 2022-11-11 合肥本源物联网科技有限公司 Asynchronous federated learning acceleration method based on model segmentation under edge calculation scene
CN115329990B (en) * 2022-10-13 2023-01-20 合肥本源物联网科技有限公司 Asynchronous federated learning acceleration method based on model segmentation under edge computing scene
CN116341690A (en) * 2023-04-28 2023-06-27 昆山杜克大学 On-line parameter selection method for minimizing federal learning total cost and related equipment

Also Published As

Publication number Publication date
CN113467952B (en) 2024-07-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant