CN114828095A - Efficient data perception layered federated learning method based on task unloading - Google Patents
- Publication number
- CN114828095A (application number CN202210293352.5A)
- Authority
- CN
- China
- Prior art keywords
- server
- edge
- task
- formula
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W28/00—Network traffic management; Network resource management
- H04W28/02—Traffic management, e.g. flow control or congestion control
- H04W28/08—Load balancing or load distribution
- H04W28/09—Management thereof
- H04W28/0958—Management thereof based on metrics or performance parameters
- H04W28/0967—Quality of Service [QoS] parameters
- H04W28/0983—Quality of Service [QoS] parameters for optimizing bandwidth or throughput
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W28/00—Network traffic management; Network resource management
- H04W28/02—Traffic management, e.g. flow control or congestion control
- H04W28/08—Load balancing or load distribution
- H04W28/09—Management thereof
- H04W28/0925—Management thereof using policies
Abstract
The technical scheme of the invention provides an efficient data-aware hierarchical federated learning method based on task offloading. The invention considers the data distribution in the cost function for the first time, improving the quality of the edge data sets while reducing the system cost. In addition, the invention designs a task offloading (TO) and resource allocation (RA) method based on a multi-agent deep deterministic policy gradient model with a reduced action space. Extensive experiments show that the proposed algorithm effectively improves the accuracy of the aggregated model, reduces the offloading cost, improves the training accuracy of the lightweight data-aware HFEL algorithm, and reduces the system cost.
Description
Technical Field
The invention relates to the joint task offloading, resource allocation and participant selection problems under hierarchical federated edge learning (hereinafter referred to as HFEL), so as to reduce the system cost and improve the training accuracy of federated learning (hereinafter referred to as FL).
Background
In the era of data intelligence, billions of devices produce large amounts of data in edge scenarios. Uploading personal data to third-party cloud servers for computation causes a number of problems, including privacy disclosure. As an effective countermeasure, FL has become a promising machine learning paradigm. In FL, the trained gradients or weights are uploaded, and multiple weights are aggregated to finally obtain the global model. FL has been applied in multi-access edge computing (hereinafter referred to as MEC) scenarios for distributed model training to protect data privacy.
Traditional FL is dominated by a two-layer cloud federated learning (hereinafter referred to as C-FL) architecture, comprising a parameter server in the cloud and edge working nodes (hereinafter referred to as workers). In classical FL algorithms such as Federated Averaging (FedAvg), each worker performs several rounds of local updates and uploads its weights to the cloud for global aggregation. However, communication resources in the wide area network (hereinafter referred to as WAN) of the two-layer C-FL framework are limited and expensive. Network congestion is exacerbated when a large number of devices communicate with the cloud through the backbone network.
To alleviate the above problems, the HFEL framework has received attention. There are two layers of aggregation in this framework: Edge-Aggregation in local area networks (hereinafter abbreviated as LANs) and Cloud-Aggregation in the wide area network. In an edge scenario, a user device (hereinafter abbreviated UD) offloads data to an edge server (hereinafter abbreviated ES) for training. An edge parameter server (hereinafter EPS) serves as an intermediary between the ESs and the cloud. Cloud-Aggregation is performed in the cloud to aggregate the weights of the EPSs. HFEL can be applied in many industrial or Internet scenarios for machine-learning-based services. For example, one cell has multiple ESs and UDs that upload tasks to the ESs for computation, which involves task offloading. Several such areas (e.g., branches and government agencies) are separated by tens of kilometers, and FL is implemented to break data islands.
Disclosure of Invention
The purpose of the invention is: reducing system cost and improving FL training accuracy.
In order to achieve the above object, the technical solution of the present invention is to provide an efficient data-aware hierarchical federated learning method based on task offloading, characterized in that the method comprises the following steps: defining a FEL-MTMH scenario, in which the training model is located in an edge server, a plurality of user devices can offload data to a plurality of edge servers, and the edge computing scenario can be abstracted as a Multi-Task Multi-Helper scenario, wherein the user devices are only responsible for data collection and offloading, and the processes of model training and parameter aggregation are performed in the edge servers and an edge parameter server;
in a FEL-MTMH scenario, there are U user devices and S edge servers; the u-th user device offloads task u to an edge server through an uplink channel, and the s-th edge server is further defined as server s; then, for a task u offloaded to server s: h_us denotes the channel gain between task u and server s; b_us denotes the bandwidth allocated to task u; b_s = {b_us | u ∈ U_s} denotes the bandwidth allocation policy of server s, where U_s denotes the set of tasks offloaded to server s and B_s denotes the transmission bandwidth of server s; a binary offload policy M = {m_us} expresses the offload decisions, where m_us = 1 if task u is offloaded to server s and m_us = 0 otherwise; then:
offload delay for server sThe device consists of a transmission part and a calculation part, and is shown as the following formula:
in the formula, R u Uplink transmission rate R of task u u And e represents the calculated density, d u Data size, f, representing task u s Representing the computing power of the server s where the task u is located;
the offload energy consumption E_s of server s likewise consists of a transmission part and a computation part, where α weights the two parts and q is an energy parameter depending on the chip architecture of server s;
will cost function J 1 Defined to minimize the maximum cost among all edge servers, the following is defined:
in the formula (I), the compound is shown in the specification,andin order to be a weight parameter, the weight parameter,and is
information entropy is introduced into the system cost to represent the characteristics of the data distribution:

J_2 = Σ_{s∈S} Entropy(D_s),  Entropy(D_s) = -Σ_{c∈C} P_c(D_s)·log P_c(D_s)

where D_s represents the data set collected by server s, C represents the number of categories, and P_c(D_s) represents the proportion of class-c data in D_s;
maximization of J 2 Entropy of (2), then the joint problem is defined as follows, denoted as the continuous discrete mixture MINLP problem P0:
s.t.m us ∈0,1,
the hierarchical federated edge learning system is provided with a cloud server and K FEL-MTMH scenes, wherein each FEL-MTMH scene is provided with an edge parameter server and S edge servers; defining the data set, weight and loss function of the edge server as D s 、w s And F s (ws); defining the data set, weight and loss function of the edge parameter server in the k-th FEL-MTMH scene asAndrespectively defining the data set, the weight and the loss function of the cloud server as D global 、w global And F global (w global );
the aggregation policy is set to X = [x_1, x_2, ..., x_s, ..., x_S], where x_s = 1 denotes that the s-th edge server participates in edge aggregation and x_s = 0 denotes that the s-th edge server does not participate in edge aggregation; F_k^EPS(w_k^EPS) and F_global(w_global) are then represented as data-size-weighted averages over the participating servers;
let each edge server be at κ 1 Edge aggregation is performed after round of local training, with each edge parameter server at κ 2 The secondary aggregation is followed by cloud aggregation, and the process is repeated until sufficient accuracy or communication thresholds are reached and the HFEL system model parameters are updated as:
wherein l represents the number of local training rounds, w s (l) Represents the weight of the ES obtained in the first round of training, η represents the learning rate,is represented by F s (w s (l-1)), wherein,and D global Is a virtual data set;
the continuous discrete hybrid MINLP problem P0 is decomposed into two subproblems, one each by fixingMinimizing binary offload policiesProblem P1 resulting from the cost and problem P2 resulting from solving the bandwidth allocation policy, problem P1 is represented as:
in the formula (I), the compound is shown in the specification,is a binary offload strategyIs determined.
Problem P2 is expressed as:
in the formula (I), the compound is shown in the specification,is a bandwidth allocation policyIs determined.
For problem P1, a reduced-action-space multi-agent deep deterministic policy gradient is utilized to interact with the environment to obtain the binary offload policy M*; for problem P2, a convex optimization method is used to determine the bandwidth allocation policy b_s*.
The distribution P(D_s) of the ES data set D_s is defined as P(D_s) = [P_c(D_s) | c ∈ C], where P_c(D_s) represents the proportion of D_s belonging to class c and C is the number of classes; D_k^EPS is the virtual data set of the edge parameter server in the k-th FEL-MTMH scenario, whose distribution P(D_k^EPS) is defined analogously, with P_c(D_k^EPS) representing the proportion belonging to class c; the global virtual data set D_global is obtained by aggregating the D_k^EPS;
a KL divergence is introduced, defined as D_KL(P‖Q) = Σ_{c∈C} P_c·log(P_c / Q_c); problem P3, the KL divergence minimization problem, is to choose the aggregation decisions x_s that minimize the KL divergence between the aggregated edge distribution and the global distribution, where x_s represents the aggregation decision for server s and P_c(D_global) represents the proportion of D_global belonging to class c;
Preferably, the signal-to-interference-plus-noise ratio SINR_u of task u is represented as:

SINR_u = p_u·h_us / (I_u + σ²)

where p_u represents the communication power allocated to user u, σ² represents the background noise power, and I_u is the cumulative inter-cell interference from all tasks associated with ESs other than server s.
The uplink transmission rate R_u of task u is expressed as R_u = b_us·log2(1 + SINR_u).
In particular, the network in the reduced-action-space multi-agent deep deterministic policy gradient outputs only offload decisions; using it to interact with the environment to obtain the binary offload policy M* comprises the following steps:
a reward function and a reduced action space are introduced into the MADDPG model, and the evolution of the HFEL system is described using the following Markov decision process:
(1) State: the state is s_t = [c_t, d_t, f_t, h_t], where c_t and d_t respectively represent the sample classes and data sizes, f_t represents the computing resources available on the ESs, and h_t represents the ambient channel fading;
(2) Action: the offload policy generated by each agent is defined as the action, which represents the mapping relation between the user devices and the edge servers;
(3) Reward: the implication of the reward is the system weighted cost after the offload policy and bandwidth allocation policy are implemented according to the action; thus, the reward is defined as the negative of the cost function, i.e. r = -J_1·J_2, and maximizing the reward means minimizing the system cost;
the Actor and Critic network parameters are respectively θ = [θ_1, ..., θ_n] and ω = [ω_1, ..., ω_n]; the policy set of the agents is π = [π_1, ..., π_n]; assuming the deterministic policy set of the N agents is μ = [μ_1, ..., μ_N], the deterministic policy gradient is expressed as:

∇_{θ_i} J(μ_i) = E_{s,a}[ ∇_{θ_i} μ_i(a_i|o_i) · ∇_{a_i} Q_i^μ(s, a_1, ..., a_N) ]

where E_{s,a}[·] denotes expectation, a_i represents an action, o_i denotes an observation, μ_i(a_i|o_i) represents the deterministic policy, ∇_{a_i} denotes the gradient with respect to a_i, and Q_i^μ represents the centralized state-action function of the i-th agent;
in the centralized training stage of the MADDPG model, Actor and Critic are trained centrally; in the distributed execution phase of the MADDPG model, the Actor only needs to know local observations, and the centralized Critic update is expressed as:

L(θ_i) = E_{s,a,r,s'}[ (Q_i^μ(s, a_1, ..., a_N) - y)² ]

where L(θ_i) represents the loss function of the Critic, E_{s,a,r,s'}[·] denotes expectation, s' denotes the next state, a denotes the action, r denotes the reward, and y represents the sum of the reward and the discounted state-action function.
Problem P2 is rewritten as follows:

in the formula, M_1 and M_2 are constants associated with task u, with M_2 = -Σ_{s∈S} Entropy(D_s), and SINR denotes SINR_u;
The optimal policy b_us* is derived using Lagrange multipliers and the KKT conditions, where ε_s represents the Lagrange multiplier.
The invention considers the data distribution in the cost function for the first time, improving the quality of the edge data sets while reducing the system cost. In addition, the invention designs a TO and RA method based on a multi-agent deep deterministic policy gradient model with a reduced action space (hereinafter RAS-MADDPG). RAS-MADDPG improves on the multi-agent deep deterministic policy gradient (MADDPG) model: it gives the optimal action using only local observations, without requiring knowledge of the dynamic model of the environment or special communication requirements. To overcome the influence of non-independent and identically distributed (non-IID) data, the invention proposes a lightweight data-aware HFEL algorithm, in which KL divergence is adopted for participant selection (hereinafter referred to as PS) to improve the accuracy of the aggregated model. Extensive experiments show that the proposed algorithms effectively improve the accuracy of the aggregated model, reduce the offloading cost, improve the training accuracy of the lightweight data-aware HFEL algorithm, and reduce the system cost.
Drawings
FIG. 1 illustrates the TO- and RA-based HFEL architecture in an MEC scenario;
FIG. 2 illustrates the decomposition of problem P0;
FIG. 3 illustrates the detailed MADDPG algorithm;
FIG. 4 illustrates the detailed Data-aware HFEL algorithm;
FIG. 5 illustrates a comparison of HFEL-MADDPG with cloud-based FL (based on MNIST);
FIG. 6 illustrates a comparison of HFEL-MADDPG with cloud-based FL (based on CIFAR-10);
FIG. 7 illustrates the offload costs of different HFEL algorithms (κ_1 = 10, κ_2 = 12);
FIG. 8 illustrates the offload costs of different HFEL algorithms (κ_1 = 30, κ_2 = 4);
FIG. 9 illustrates the training performance of different HFEL algorithms (κ_1 = 10, κ_2 = 12);
FIG. 10 illustrates the training performance of different HFEL algorithms (κ_1 = 30, κ_2 = 4).
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
1 System model
1.1 application scenarios
FL is an exploration of distributed machine learning that can be trained using scattered data. This concept adapts well to the characteristics of data fragmentation in MEC. Therefore, introducing FL into MEC has great engineering utility value. Fig. 1 illustrates task offloading (hereinafter referred to as TO) and resource allocation (hereinafter referred to as RA) of HFEL in MEC. In the scenario shown in fig. 1, the training model is located in an ES, and multiple UDs may offload data to multiple ESs. The edge computing scenario may be abstracted as a Multi-Task Multi-Helper (MTMH) scenario. Since UDs are only responsible for the collection and offloading of data, the processes of model training and parameter aggregation are performed in the ESs and EPS. The present invention defines the above scenario as the FEL-MTMH scenario.
1.2 TO and RA based on information entropy
Assume that in a FEL-MTMH scenario there are U UDs and S ESs. The u-th UD offloads task u to an ES through the uplink channel. The multiple-access scheme is based on OFDMA, and the UDs communicate with the ES through orthogonal sub-bands in a single cell. Thus, the interference mainly comes from inter-cell communication. Defining the s-th ES further as server s, for a task u offloaded to server s, h_us represents the channel gain between task u and server s, b_us denotes the bandwidth allocated to task u, and b_s = {b_us | u ∈ U_s} represents the bandwidth allocation policy of server s, where U_s is the set of tasks offloaded to server s and B_s is the communication bandwidth of server s. A binary offload policy M = {m_us} expresses the offload decisions: m_us = 1 if task u is offloaded to server s, otherwise m_us = 0. The SINR of task u is expressed as:

SINR_u = p_u·h_us / (I_u + σ²)

where p_u represents the allocated power of task u, σ² represents the background noise power, and I_u is the cumulative inter-cell interference from all tasks associated with ESs other than server s. The uplink transmission rate R_u of task u is expressed as:

R_u = b_us·log2(1 + SINR_u)
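As a minimal sketch of the rate model described above (the numerical values below are illustrative, not taken from the patent), the SINR and Shannon-rate expressions can be written directly in Python:

```python
import math

def sinr(p_u: float, h_us: float, interference: float, noise_power: float) -> float:
    """SINR_u: received power p_u * h_us over inter-cell interference plus noise power."""
    return (p_u * h_us) / (interference + noise_power)

def uplink_rate(b_us: float, sinr_u: float) -> float:
    """Shannon-capacity uplink rate R_u = b_us * log2(1 + SINR_u), bit/s for b_us in Hz."""
    return b_us * math.log2(1.0 + sinr_u)

# Illustrative values: 100 mW transmit power, channel gain 1e-7,
# no inter-cell interference, -100 dBm (1e-13 W) noise, 1 MHz sub-band.
rate = uplink_rate(1e6, sinr(0.1, 1e-7, 0.0, 1e-13))
```

A higher allocated bandwidth b_us raises R_u linearly, while the SINR enters only logarithmically, which is why the bandwidth allocation policy dominates the transmission delay in P2.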
the present invention ignores the delay of the downstream transmission because the upstream rate is much larger than the downstream rate and the size of the data returned by the task is usually very small. Offload delay for server sThe device consists of a transmission part and a calculation part, and is shown as the following formula:
wherein e represents the calculated density, d u Data size (bits), f, representing task u s Representing the computing power of the server s on which the task u is located.
The offload energy consumption E_s of server s likewise consists of a transmission part and a computation part, where α weights the two parts and q is an energy parameter that depends on the chip architecture of server s. The system cost function can be expressed as a weighted sum of the offload delay and the offload energy consumption. In order to avoid the "straggler" phenomenon and reduce the influence of unbalanced data, the cost function J_1 is defined to minimize the maximum cost among all ESs:

J_1 = max_{s∈S} ( λ^t·T_s + λ^e·E_s )

where λ^t and λ^e are weight parameters with λ^t, λ^e ∈ [0,1] and λ^t + λ^e = 1. The weight parameters may be adjusted according to the task property, e.g., increasing λ^t to accommodate delay-sensitive tasks. In order to reduce the influence of non-independent and identically distributed data, the invention introduces information entropy into the system cost to represent the characteristics of the data distribution, as shown in the following formula:
in the formula: d s Represents a data set collected by server s; c represents the number of categories; p c (D s ) Representing categoriesc data in D s The ratio of (1). By maximizing J 2 The class of the data in the unloaded data set will be increased, and the FL model can be trained and simultaneously extract features from richer samples, so that the influence of Non-IID is reduced. Notably, J 1 Is the MinMax problem, J 2 Is the problem of maximizing entropy. Finally, the present invention defines the joint problem as the following, denoted as the continuous discrete mixture MINLP problem P0:
s.t.m us ∈0,1,
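The entropy term Entropy(D_s) is directly computable from a server's collected labels. The following minimal sketch (the function name `dataset_entropy` is ours, not from the patent) illustrates why offload decisions that diversify classes raise J_2:

```python
import math
from collections import Counter

def dataset_entropy(labels) -> float:
    """Entropy(D_s) = -sum_c P_c(D_s) * log2(P_c(D_s)) over the class proportions."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

# A balanced two-class edge data set carries 1 bit of entropy, while a
# single-class set carries none, so offloading that increases class
# diversity increases the J_2 term of the objective.
assert dataset_entropy([0, 1, 0, 1]) == 1.0
assert dataset_entropy([0, 0, 0, 0]) == 0.0
```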
1.3 HEFL model
The hierarchical federated edge learning system is provided with one cloud server and K FEL-MTMH scenarios. Each FEL-MTMH scenario has one EPS and S ESs. The data set, weights and loss function of an ES are defined as D_s, w_s and F_s(w_s); the data set, weights and loss function of the EPS in the k-th FEL-MTMH scenario are defined as D_k^EPS, w_k^EPS and F_k^EPS(w_k^EPS); and the data set, weights and loss function of the cloud server are defined as D_global, w_global and F_global(w_global). F_k^EPS is referred to as the aggregation loss and can be calculated as a weighted average of the loss functions F_s(w_s). In FL in particular, the sampling of participant weights affects the convergence and accuracy of the post-aggregation model, and an appropriate sampling strategy can improve the representativeness of the data and reduce the variance. The present invention sets the aggregation policy to X = [x_1, x_2, ..., x_s, ..., x_S], where x_s = 1 means that the s-th ES participates in Edge-Aggregation and x_s = 0 means that the s-th ES does not participate. F_k^EPS(w_k^EPS) and F_global(w_global) are then represented as data-size-weighted averages over the participating servers.
to reduce communication overhead, assume that each ES is at κ 1 Edge-Aggregation is performed after round of local training, with each EPS at κ 2 After the secondary polymerization, Cloud-Aggregation was performed. This process is repeated until sufficient accuracy or communication threshold is reached.
The HFEL system model parameters are updated as:
wherein l represents the number of local training rounds, w s (l) Watch (A)Showing the weight of the ES obtained in the first round of training, eta shows the learning rate,is represented by F s (w s (l-1)). To protect privacy, HFEL systems use parameter aggregation instead of data aggregation, soAnd D global Is a virtual data set.
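The κ_1/κ_2 two-level schedule can be sketched at pseudocode level. This is a hedged illustration assuming FedAvg-style data-size weighting and κ_2 ≥ 1; the helper names (`fedavg`, `hfel_round`, `local_step`) and the single-cloud-round structure are ours, not the patent's exact algorithm:

```python
def fedavg(weights, sizes):
    """Data-size-weighted average of model weight vectors (FedAvg-style aggregation)."""
    total = sum(sizes)
    dim = len(weights[0])
    return [sum(w[i] * n for w, n in zip(weights, sizes)) / total for i in range(dim)]

def hfel_round(edge_groups, kappa1, kappa2, local_step):
    """One cloud round: each ES runs kappa1 local training steps, its EPS performs
    Edge-Aggregation, and after kappa2 edge aggregations the cloud aggregates the
    EPS weights. edge_groups is one list of ES dicts ({"w": ..., "data": ...}) per EPS."""
    eps_weights, eps_sizes = [], []
    for group in edge_groups:                 # one group of ESs per EPS
        for _ in range(kappa2):               # kappa2 edge aggregations (kappa2 >= 1)
            for es in group:
                for _ in range(kappa1):       # kappa1 rounds of local training
                    es["w"] = local_step(es["w"], es["data"])
            agg = fedavg([es["w"] for es in group], [len(es["data"]) for es in group])
            for es in group:                  # Edge-Aggregation: push w_k^EPS back
                es["w"] = agg
        eps_weights.append(agg)
        eps_sizes.append(sum(len(es["data"]) for es in group))
    return fedavg(eps_weights, eps_sizes)     # Cloud-Aggregation
```

With an identity `local_step`, the cloud weight reduces to the data-size-weighted mean of the initial ES weights, which is a quick sanity check on the aggregation arithmetic.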
2 Joint problem decoupling strategy
Due to the continuous-discrete mixed MINLP character of problem P0, it is NP-hard. The present invention proposes a two-step solution: a reduced-action-space multi-agent deep deterministic policy gradient (hereinafter abbreviated RAS-MADDPG) and KL-divergence-based participant selection (hereinafter abbreviated PS-KL). The combined application of RAS-MADDPG and PS-KL handles unbalanced and non-IID data, improving model training accuracy while reducing system cost.
By fixing variables, the continuous-discrete mixed MINLP problem P0 can be decomposed into two sub-problems. As shown in FIG. 2, for problem P0, attention is first directed to fixing the bandwidth allocation policy and minimizing the cost over the binary offload policy M, which yields the following problem P1:
in the formula, M* is the optimum of the binary offload policy M.
Next, solving for the bandwidth allocation policy yields problem P2, in which b_s* is the optimum of the bandwidth allocation policy b_s.
For problem P1, RAS-MADDPG is used to interact with the environment to obtain the binary offload policy M*. For problem P2, a convex optimization method is applicable to the continuous RA problem. After the binary offload policy M* and the bandwidth allocation policy b_s* are determined, the training data set of each ES is also determined. To improve the convergence speed and accuracy of FL, the present invention performs an optimized edge aggregation process in problem P3.
Problem solving strategy based on reinforcement learning
3.1 information entropy-based computational offload policy
The present invention describes the evolution of HFEL systems using the following Markov Decision Process (MDP):
(1) State: the state is s_t = [c_t, d_t, f_t, h_t], where c_t and d_t respectively represent the sample classes and data sizes, f_t represents the computing resources available on the ESs, and h_t represents the ambient channel fading.
(2) Action: the offload policy generated by each agent is defined as the action, representing the mapping relation between UDs and ESs. Since the RA part is split off, the action space is greatly reduced here.
(3) Reward: in RAS-MADDPG, the implication of the reward is the system weighted cost after the action's TO and RA policies are executed. Thus, the reward is defined as the negative of the cost function, i.e. r = -J_1·J_2, and maximizing the reward means minimizing the system cost.
The TO framework based on RAS-MADDPG is shown in FIG. 1. The present invention introduces a novel reward function and a reduced action space into the MADDPG model. The reward function effectively incentivizes the agents to find the optimal policy. The RA problem P2 is solved in the RA method block without being placed in the action space. The network in RAS-MADDPG outputs only the offload decisions, so the action space and network complexity are greatly reduced.
The Actor and Critic network parameters are respectively θ = [θ_1, ..., θ_n] and ω = [ω_1, ..., ω_n]. The policy set of the agents is π = [π_1, ..., π_n]. Assuming the deterministic policy set of the N agents is μ = [μ_1, ..., μ_N], the deterministic policy gradient is expressed as:

∇_{θ_i} J(μ_i) = E_{s,a}[ ∇_{θ_i} μ_i(a_i|o_i) · ∇_{a_i} Q_i^μ(s, a_1, ..., a_N) ]

where E_{s,a}[·] denotes expectation, a_i represents an action, o_i denotes an observation, μ_i(a_i|o_i) represents the deterministic policy, ∇_{a_i} denotes the gradient with respect to a_i, and Q_i^μ represents the centralized state-action function of the i-th agent.
Centralized training and distributed execution mechanism of MADDPG: during the training phase, Actor and Critic are trained centrally; during the execution phase, the Actor only needs to know the local observations. The centralized Critic update leverages the temporal-difference (TD) idea and the target network from DQN, expressed as:

L(θ_i) = E_{s,a,r,s'}[ (Q_i^μ(s, a_1, ..., a_N) - y)² ]

where L(θ_i) represents the loss function of the Critic, s' denotes the next state, a the action and r the reward; y represents the sum of the reward and the discounted state-action function of the target network, where γ denotes the discount rate, Q'_i denotes the target network, and μ'_j(o_j) denotes the target policy with lazily updated parameters θ'_j.
The detailed algorithmic representation of MADDPG is shown in fig. 3.
3.2 resource Allocation Module
Problem P0 is difficult to solve due to its continuous-discrete mixture and non-convexity. Referring to the Tammer decoupling method, problem P2 can be rewritten as follows:

in the formula, M_1 and M_2 are constants associated with task u, with M_2 = -Σ_{s∈S} Entropy(D_s), and SINR denotes SINR_u. The right-hand side of the equation is a Min-Max problem. Taking the second derivative of the supremum shows that it is convex, and by the theorem on sums of convex functions the objective is a convex function. The optimal policy b_us* can therefore be derived using Lagrange multipliers and the KKT conditions, where ε_s represents the Lagrange multiplier.
3.3 data-aware HFEL
PS is applied for edge aggregation and reduces the impact of non-IID data. Due to the distributed nature of the FL environment, randomly selecting clients for aggregation can exacerbate the adverse effects of data heterogeneity. To facilitate the study of edge aggregation, the present invention converts the difference in aggregation weights into a difference in data set distributions. The distribution P(D_s) of the ES data set D_s is defined as P(D_s) = [P_c(D_s) | c ∈ C], where P_c(D_s) is the proportion of class-c data in D_s and C is the number of classes. D_k^EPS is the virtual data set of the EPS in the k-th FEL-MTMH scenario, whose distribution P(D_k^EPS) is defined analogously, with P_c(D_k^EPS) the proportion of class-c data in D_k^EPS. The global virtual data set D_global is obtained by aggregating the D_k^EPS. The invention is concerned with the similarity of data distributions after edge aggregation and cloud aggregation, and introduces the KL divergence, defined as D_KL(P‖Q) = Σ_{c∈C} P_c·log(P_c / Q_c). Based on the above, the present invention proposes problem P3, the KL divergence minimization problem, shown by the following equation:
in the formula, x_s represents the aggregation decision of server s, and P_c(D_global) represents the proportion of class-c data in D_global.
Taking the second derivative of the right-hand side shows that it is convex, which means the KKT conditions can be used to obtain the optimal strategy. In particular, the optimal relaxed strategy is obtained over a continuous domain and then rounded to a binary value. The specific data-aware HFEL algorithm is shown in FIG. 4.
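The data-aware selection idea can be sketched as follows (a brute-force variant assumed purely for illustration; the patent's exact procedure is the algorithm of FIG. 4): choose the subset of edge servers whose aggregated class distribution has the smallest KL divergence to the global distribution.

```python
from itertools import combinations
import math

def kl(p, q):
    """KL divergence between two class distributions (zero terms skipped)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def aggregate(dists, sizes):
    """Class distribution of the union of data sets, size-weighted."""
    tot = sum(sizes)
    return [sum(s * d[c] for d, s in zip(dists, sizes)) / tot
            for c in range(len(dists[0]))]

def select_servers(dists, sizes, global_dist, k):
    """Pick k servers minimizing KL(aggregated distribution || global)."""
    best = min(combinations(range(len(dists)), k),
               key=lambda idx: kl(aggregate([dists[i] for i in idx],
                                            [sizes[i] for i in idx]),
                                  global_dist))
    return list(best)
```

For example, given two skewed servers [0.9, 0.1] and [0.1, 0.9] of equal size, selecting both yields the balanced aggregate [0.5, 0.5] and zero divergence from a uniform global distribution.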
4 Experimental Environment Setup
The present invention employs two exemplary learning-based visual recognition tasks to construct an HFEL simulation system. These tasks are based on two data sets: the MNIST digit data set and the CIFAR-10 real-object data set. To study the effect of non-IID data, the present invention also specifies the way data is distributed at the edge. To simulate unbalanced data, the number of samples on each UD follows a Gaussian distribution X ~ N(100, 10). The invention shapes two data distributions: (1) IID: each data set is shuffled and then randomly partitioned among the UDs. (2) Non-IID: each data set is sorted into 10 classes by label, and each UD is then assigned samples randomly drawn from two of the classes. Furthermore, training is performed on two models: (1) a multi-layer perceptron with two hidden layers and sigmoid activation; (2) a CNN with 5×5 convolution kernels. The CNN, provided in the TensorFlow tutorial, consists of two convolutional layers and two fully connected layers.
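The Non-IID partitioning described above — each UD holding samples from only two label classes, with per-UD sample counts drawn from N(100, 10) — can be sketched as follows (function and variable names are illustrative):

```python
import random

def noniid_partition(labels, num_uds, classes_per_ud=2, mean=100, std=10):
    """Give each UD samples drawn (with replacement) from only
    `classes_per_ud` label classes, with sample counts ~ N(mean, std)."""
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    all_classes = sorted(by_class)
    parts = []
    for _ in range(num_uds):
        n = max(1, int(random.gauss(mean, std)))          # unbalanced size
        chosen = random.sample(all_classes, classes_per_ud)  # 2 classes
        pool = [i for c in chosen for i in by_class[c]]
        parts.append(random.choices(pool, k=n))
    return parts
```

Each returned part indexes into the original data set, so the same routine works for MNIST or CIFAR-10 label arrays.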
For each FEL-MTMH scenario, consider a multi-cell system in which each ES is located at the center of a hexagonal cell. Assuming 10 cells per FEL-MTMH scenario, the ES computing power is randomly selected from [6, 8, 10, 12] GHz. The background noise power σ and the bandwidth B are set to -100 dBm and 5 MHz, respectively. The channel gain h follows the free-space path-loss model. On the UD side, the maximum transmission power is randomly selected from [80, 100, 120] mW. Default energy consumption and delay parameters are adopted.
The proposed method is compared against three representative baselines; the four algorithms evaluated are:
(1) HFEL-based RAS-MADDPG algorithm: the method provided by the invention, which applies RAS-MADDPG and data-aware HFEL and jointly solves the TO, RA, and PS problems in the MEC scenario.
(2) Cloud-based FL algorithm (C-FL): C-FL trains all UD data on distributed computing nodes, and parameters are aggregated directly at the cloud server.
(3) HFEL-based DRL (Actor-Critic) offloading algorithm (hereinafter HFEL-DRL): HFEL-DRL applies a state-of-the-art DRL algorithm to obtain offloading decisions in HFEL. The algorithm adjusts the policy through a heuristic method to achieve near-optimal performance.
(4) HFEL-based independent offloading and joint RA (hereinafter HFEL-IOJR): HFEL-IOJR randomly assigns each task to an ES and employs joint RA.
5 Influence of Key Parameters
First, the influence of two key parameters of the HFEL-MADDPG algorithm (i.e., κ_1 and κ_2) is quantified. The training time and the energy consumption for transmission and computation are reported in Table 1.
TABLE 1 Influence of the key parameters
Table 1 shows that HFEL has lower training time and energy consumption than C-FL. In addition, the training time on both data sets decreases monotonically with κ_1. The system energy consumption first decreases and then increases, which shows that reducing κ_1 can reduce the consumption of edge computation; however, too-frequent edge aggregation increases communication consumption. Setting κ_1 and κ_2 reasonably therefore improves system efficiency.
6 Comparison with Cloud-based FL
The performance of C-FL and HFEL-MADDPG is compared through convergence analysis; the results are shown in FIGS. 5 and 6. When the data are IID, the accuracy and convergence speed of HFEL-MADDPG are better than those of C-FL. However, when the data are Non-IID, the convergence speed of HFEL-MADDPG drops significantly. Two conclusions can thus be drawn: (1) compared with C-FL, data-aware HFEL-MADDPG performs better in terms of training speed and accuracy; (2) Non-IID data can greatly reduce the accuracy and convergence speed of the model.
7 Comparison with HFEL Baseline Algorithms
Next, the performance of HFEL-MADDPG in the HFEL scenario is compared. FIGS. 7 and 8 show that, compared with HFEL-IOJR, HFEL-MADDPG and HFEL-DRL can effectively reduce the offloading cost. In particular, the cost of HFEL-MADDPG is the lowest, reflecting the effectiveness of the offloading strategy and the PS mechanism of the present invention.
The training process of HFEL-MADDPG is then compared with the other baseline algorithms under guaranteed communication performance. FIGS. 9 and 10 compare the training effects of the three HFEL algorithms in the Non-IID scenario. HFEL-MADDPG achieves the best accuracy and convergence speed. In addition, compared with the baselines, HFEL-MADDPG greatly reduces communication cost and improves training efficiency, demonstrating the effectiveness of the algorithm.
Claims (5)
1. An efficient data-aware hierarchical federated learning method based on task offloading, characterized by comprising the following steps:
defining an FEL-MTMH scenario, wherein the training model resides in the edge servers; a plurality of user equipments can offload data to a plurality of edge servers, so the edge computing scenario can be abstracted as a Multi-Task Multi-Helper scenario in which the user equipment is only responsible for collecting and offloading data, while model training and parameter aggregation are carried out in the edge servers and the edge parameter server;
in the FEL-MTMH scenario, there are U user equipments and S edge servers; the u-th user equipment offloads a task u to an edge server through an uplink channel, and the s-th edge server is further denoted as server s; then, for a task u offloaded to server s: h_us represents the channel gain between task u and server s; b_us represents the bandwidth allocation of task u; b_s = [b_us | u ∈ U_s] represents the bandwidth allocation policy of server s, where U_s is the set of tasks offloaded to server s and B_s is the communication bandwidth of server s; the offloading policy is expressed as a binary policy, where m_us = 1 if task u is offloaded to server s, and m_us = 0 otherwise; then:
the offloading delay of server s consists of a transmission part and a computation part, as shown in the following formula:
where R_u is the uplink transmission rate of task u, e denotes the computation density, d_u denotes the data size of task u, and f_s denotes the computing power of the server s on which task u resides;
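Assuming the per-task delay is the sum of the two parts just described — d_u/R_u for transmission plus e·d_u/f_s for computation, a standard MEC form taken here as an illustration — the delay of a server can be sketched as:

```python
def offload_delay(d_u, R_u, e, f_s):
    """Transmission delay (data size / uplink rate) plus computation
    delay (required cycles e*d_u / server capacity f_s) for one task."""
    return d_u / R_u + e * d_u / f_s

def server_delay(tasks, f_s):
    """Worst-case delay over the tasks offloaded to one server;
    each task is a (d_u, R_u, e) tuple."""
    return max(offload_delay(d, R, e, f_s) for d, R, e in tasks)
```

For instance, a 10-unit task at rate 5 with density 2 on a capacity-4 server incurs 2 units of transmission delay plus 5 units of computation delay.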
where α is a weighting parameter, and q is an energy parameter that depends on the chip architecture of server s;
the cost function J_1 is defined to minimize the maximum cost among all edge servers, as follows:
where the coefficients in the formula are weight parameters;
information entropy is introduced into the system cost to characterize the data distribution, as shown in the following formula:
in the formula: d s Represents a data set collected by server s; c represents the number of categories; p c (D s ) Data representing class c is in D s The ratio of (1);
to maximize the entropy in J_2, the joint problem is defined as follows and denoted as the mixed continuous-discrete MINLP problem P0:
s.t. m_us ∈ {0, 1},
the hierarchical federated edge learning system is provided with a cloud server and K FEL-MTMH scenarios, each FEL-MTMH scenario having one edge parameter server and S edge servers; the data set, weights, and loss function of an edge server are defined as D_s, w_s, and F_s(w_s); the data set, weights, and loss function of the edge parameter server in the k-th FEL-MTMH scenario are defined as D_k, w_k, and F_k(w_k); and the data set, weights, and loss function of the cloud server are defined as D_global, w_global, and F_global(w_global), respectively;
the aggregation policy is set to X = [x_1, x_2, ..., x_s, ..., x_S], where x_s = 1 denotes that the s-th edge server participates in edge aggregation and x_s = 0 denotes that it does not; then F_k(w_k) and F_global(w_global) are represented by the following formulas:
each edge server performs edge aggregation after κ_1 rounds of local training, and each edge parameter server performs cloud aggregation after κ_2 edge aggregations; the process is repeated until sufficient accuracy or the communication threshold is reached, and the HFEL system model parameters are updated as:
where l denotes the index of the local training round, w_s(l) denotes the weights of the ES obtained in the l-th round of training, η denotes the learning rate, and ∇F_s(w_s(l-1)) denotes the gradient of F_s(w_s(l-1)); here D_k and D_global are virtual data sets;
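The κ_1/κ_2 update schedule can be sketched as the following loop, with plain averaging for both aggregation steps and a placeholder local_step standing in for the κ_1 rounds of local gradient descent (an illustrative sketch, not the patent's exact weighting):

```python
import numpy as np

def hfel_round(ws, local_step, kappa1, kappa2, cloud_rounds):
    """ws: list over EPSs (FEL-MTMH scenarios) of lists over ESs of
    weight vectors. kappa1 local steps precede each edge aggregation;
    kappa2 edge aggregations precede each cloud aggregation."""
    for _ in range(cloud_rounds):
        for _ in range(kappa2):
            for k in range(len(ws)):
                for _ in range(kappa1):                     # local training
                    ws[k] = [local_step(w) for w in ws[k]]
                edge_avg = np.mean(ws[k], axis=0)           # edge aggregation
                ws[k] = [edge_avg.copy() for _ in ws[k]]
        w_global = np.mean([edge[0] for edge in ws], axis=0)  # cloud agg.
        ws = [[w_global.copy() for _ in edge] for edge in ws]
    return w_global
```

With an identity local_step and two edges holding weights {1, 3} and {5, 7}, the edge averages are 2 and 6 and the cloud average is 4, tracing the two-level averaging hierarchy.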
the mixed continuous-discrete MINLP problem P0 is decomposed into two subproblems: problem P1, obtained by fixing the bandwidth allocation policy and minimizing the cost over the binary offloading policy, and problem P2, obtained by solving for the bandwidth allocation policy; problem P1 is represented as:
in the formula, the left-hand side is the optimal function of the binary offloading policy.
Problem P2 is expressed as:
in the formula, the left-hand side is the optimal function of the bandwidth allocation policy.
for problem P1, a reduced-action-space multi-agent deep deterministic policy gradient is used to interact with the environment and obtain the binary offloading policy; for problem P2, a convex optimization method is used to determine the bandwidth allocation policy;
the distribution P(D_s) of the ES data set D_s is defined as P(D_s) = [P_c(D_s) | c ∈ C], where P_c(D_s) is the proportion of class-c data in D_s and C is the number of classes; D_k is the virtual data set of the edge parameter server in the k-th FEL-MTMH scenario, its distribution is P(D_k) = [P_c(D_k) | c ∈ C], and P_c(D_k) is the proportion of class-c data in D_k; the global virtual data set D_global is obtained by aggregating the D_k;
a KL divergence is introduced, defined as KL(P(D_k) ∥ P(D_global)) = ∑_{c∈C} P_c(D_k) log(P_c(D_k)/P_c(D_global)); problem P3, the KL divergence minimization problem, is shown by the following equation:
where x_s represents the aggregation decision of server s;
where P_c(D_global) is the proportion of class-c data in D_global.
2. The efficient data-aware hierarchical federated learning method based on task offloading of claim 1, wherein the signal-to-interference-plus-noise ratio SINR_u of task u is expressed as:
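A standard uplink SINR — transmit power times channel gain over noise plus interference, which is the form assumed here since the claim gives the expression only symbolically — and the resulting Shannon uplink rate can be sketched as:

```python
import math

def sinr(p_u, h_us, noise, interference):
    """Received signal power over noise-plus-interference power."""
    return p_u * h_us / (noise + interference)

def uplink_rate(bandwidth, p_u, h_us, noise, interference=0.0):
    """Shannon capacity of the allocated band: B * log2(1 + SINR)."""
    return bandwidth * math.log2(1.0 + sinr(p_u, h_us, noise, interference))
```

With transmit power 2, gain 0.5, and unit noise, the SINR is 1 and a 5 MHz band yields 5 Mbit/s, illustrating how the bandwidth allocation b_us scales R_u.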
4. The efficient data-aware hierarchical federated learning method based on task offloading of claim 1, wherein, if the network in the reduced-action-space multi-agent deep deterministic policy gradient outputs only offloading decisions, using the reduced-action-space multi-agent deep deterministic policy gradient to interact with the environment to obtain the binary offloading policy comprises the following steps:
a reward function and a reduced action space are introduced into the MADDPG model, which describes the evolution of the HFEL system using the following Markov decision process:
(1) State: the state is s_t = [c_t, d_t, f_t, h_t], where c_t and d_t represent the sample classes and data sizes respectively, f_t represents the computing resources available on the ESs, and h_t represents the ambient channel fading;
(2) Action: the offloading policy generated by each agent is defined as an action, representing the mapping between user equipments and edge servers;
(3) Reward: the reward reflects the weighted system cost after the offloading policy and bandwidth allocation policy are implemented according to the action; the reward is therefore defined as the negative of the cost function, i.e., r = -J_1·J_2, so that maximizing the reward means minimizing the system cost;
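The reward r = -J_1·J_2 can be sketched directly, taking J_1 as the maximum weighted cost over the edge servers (per the definition of J_1 in claim 1) and J_2 as the data-distribution term passed in precomputed:

```python
def reward(server_costs, j2):
    """r = -(J1 * J2): J1 is the maximum weighted delay-energy cost
    over all edge servers, j2 the data-distribution (entropy) term."""
    j1 = max(server_costs)
    return -(j1 * j2)
```

Because the worst server dominates J_1, an agent is pushed to balance load across servers rather than overload any single one.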
the parameters of the Actor and Critic networks are denoted θ = [θ_1, ..., θ_n] and ω = [ω_1, ..., ω_n], respectively; the policy set of the agents is π = [π_1, ..., π_n]; assuming the deterministic policy set of the N agents is μ = [μ_1, ..., μ_N], the deterministic policy gradient is expressed as follows:
where the left-hand side is the policy-gradient formula, E_{s,a}[·] denotes expectation, a_i denotes an action, o_i denotes an observation, μ_i(a_i|o_i) denotes a deterministic policy, ∇_{a_i} denotes the gradient with respect to a_i, and Q_i denotes the centralized state-action function of the i-th agent;
in the centralized training stage of the MADDPG model, Actor and Critic carry out centralized training;
in the distributed execution phase of the MADDPG model, the Actor only needs its local observation; the centralized Critic update is expressed as follows:
where L(θ_i) denotes the loss function of the Critic; E_{s,a,r,s'}[·] denotes expectation; s' denotes the next state, a the action, and r the reward; y denotes the sum of the reward and the discounted state-action function.
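The centralized Critic update — minimizing the squared error between Q_i(s, a) and the TD target y, which per the definition above is the reward plus the discounted state-action value of the target network — can be sketched with scalar stand-ins for the networks:

```python
def td_target(r, gamma, q_next):
    """y = reward + discounted target-network state-action value."""
    return r + gamma * q_next

def critic_loss(q_values, targets):
    """Mean squared error between Q(s, a) and the TD targets y."""
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / len(q_values)
```

In practice q_values would come from the centralized Critic over all agents' observations and actions, and q_next from a slowly-updated target network.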
5. The efficient data-aware hierarchical federated learning method based on task offloading of claim 1, wherein problem P2 is rewritten as follows:
in the formula: m 1 And M 2 Is a constant associated with the task u,M 2 =-∑ s∈S Entropy(D s ) (ii) a The SINR is the SINR u ;
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210293352.5A CN114828095A (en) | 2022-03-24 | 2022-03-24 | Efficient data perception layered federated learning method based on task unloading |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114828095A true CN114828095A (en) | 2022-07-29 |
Family
ID=82531707
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117591888A (en) * | 2024-01-17 | 2024-02-23 | 北京交通大学 | Cluster autonomous learning fault diagnosis method for key parts of train |
WO2024087573A1 (en) * | 2022-10-29 | 2024-05-02 | 华为技术有限公司 | Federated learning method and apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20220729 |