CN115099420A - Dynamic model aggregation weight allocation method for wireless federated learning - Google Patents


Info

Publication number
CN115099420A
Authority
CN
China
Prior art keywords
data center
wireless
model
edge devices
federated learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211032084.8A
Other languages
Chinese (zh)
Inventor
黄川
崔曙光
郭玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese University of Hong Kong Shenzhen
Original Assignee
Chinese University of Hong Kong Shenzhen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese University of Hong Kong Shenzhen filed Critical Chinese University of Hong Kong Shenzhen
Priority to CN202211032084.8A priority Critical patent/CN115099420A/en
Publication of CN115099420A publication Critical patent/CN115099420A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 28/00 Network traffic management; Network resource management
    • H04W 28/02 Traffic management, e.g. flow control or congestion control
    • H04W 28/08 Load balancing or load distribution
    • H04W 28/09 Management thereof
    • H04W 28/0958 Management thereof based on metrics or performance parameters

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a dynamic model aggregation weight allocation method for wireless federated learning, which comprises the following steps. S1: for a wireless federated learning system with one data center and K edge devices, the data center broadcasts the current global model parameters to all edge devices over the wireless channel, and each edge device estimates and updates from its received information to obtain updated model parameters. S2: all edge devices send the updated model parameters to the data center over the wireless uplink. S3: an objective function is constructed for the effect caused by uplink and downlink wireless channel fading and additive noise in each iteration; an optimization problem based on minimizing this objective under power constraints is obtained and solved to yield the optimal weight allocation scheme. The method determines the weight of each edge device in the data center's model aggregation and effectively guarantees the accuracy of model aggregation in wireless federated learning.

Description

Dynamic model aggregation weight allocation method for wireless federated learning
Technical Field
The invention relates to wireless federated learning, and in particular to a dynamic model aggregation weight allocation method for wireless federated learning.
Background
A large number of wireless edge devices with ever-increasing computing and communication capabilities, together with the massive data they generate, can enable intelligent applications in wireless networks by cooperatively training machine learning models. Federated learning, a recently proposed and promising distributed machine learning paradigm, allows all participating end devices to exchange only model parameters with the parameter server while keeping the raw data local, thus protecting data privacy and security.
However, when federated learning is deployed in a wireless communication scenario, serving the end devices consumes substantial communication resources, so a joint optimization design from the perspectives of both communication and learning efficiency is necessary.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a dynamic model aggregation weight allocation method for wireless federated learning that determines the weight of each edge device in the data center's model aggregation and effectively guarantees the accuracy of model aggregation in wireless federated learning.
The purpose of the invention is achieved by the following technical scheme: a dynamic model aggregation weight allocation method for wireless federated learning comprises the following steps.
S1. For a wireless federated learning system with one data center and K edge devices, the data center broadcasts the model parameters w_t to all edge devices over the wireless channel, and each edge device estimates and updates from its received information to obtain the model parameters w_{k,t}^E.
S2. All edge devices send the updated model parameters w_{k,t}^E to the data center over the wireless uplink.
S3. An objective function is constructed for the effect caused by uplink and downlink wireless channel fading and additive noise in each iteration, and an optimization problem based on minimizing the objective function under power constraints is obtained and solved to obtain the optimal weight allocation scheme.
Further, step S1 comprises the following sub-steps:
S101. The data center broadcasts the model parameters w_t to all edge devices over the wireless channel.
S102. Edge device k receives the signal y_k = sqrt(P0) g_k w_t + n_k, where g_k denotes the channel coefficient from the data center to edge device k, P0 denotes the transmit power of the data center, and n_k denotes a complex symmetric circular Gaussian noise vector.
S103. After receiving y_k, edge device k divides the signal by sqrt(P0) g_k to estimate the original signal sent by the data center; the result of the estimation is w_hat_{k,t} = w_t + n_k / (sqrt(P0) g_k). Edge device k takes the estimate w_hat_{k,t} as the starting point of its local training update; after E local updates, every edge device sends its updated model parameters w_{k,t}^E back to the data center.
The local update proceeds as w_{k,t}^{e+1} = w_{k,t}^{e} - eta * grad F_k(w_{k,t}^{e}; xi_{k,t}^{e}), where eta denotes the learning rate, e denotes the e-th local update, w_{k,t}^{e} denotes the model parameters at the e-th local update, xi_{k,t}^{e} denotes the mini-batch of data randomly selected at the e-th local update, and grad F_k(w_{k,t}^{e}; xi_{k,t}^{e}) denotes the mini-batch gradient at the e-th update. The result w_{k,t}^{E} obtained after the E-th update is the updated model parameter.
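The local update in S103 is ordinary mini-batch stochastic gradient descent. A minimal NumPy sketch, assuming a least-squares loss (the patent leaves the local loss function generic, so this loss and all parameter values are illustrative assumptions):

```python
import numpy as np

def local_update(w0, X, y, lr=0.05, E=50, batch_size=32, seed=0):
    """Run E mini-batch SGD steps starting from the broadcast model w0.

    The least-squares loss below is a stand-in for the generic loss F_k.
    """
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(E):
        idx = rng.choice(len(X), size=batch_size, replace=False)      # random mini-batch
        grad = 2.0 / batch_size * X[idx].T @ (X[idx] @ w - y[idx])    # mini-batch gradient
        w = w - lr * grad                                             # w^{e+1} = w^e - eta * grad
    return w

# Synthetic data: the local update should move w toward the generating weights.
rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0])
X = rng.normal(size=(200, 2))
y = X @ w_true
w_local = local_update(np.zeros(2), X, y)
```

On this noiseless synthetic problem the iterate contracts toward the generating weights, which is the behavior the broadcast-then-update step relies on.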
Further, step S2 comprises the following sub-steps:
S201. Edge device k precodes its local model parameters, i.e., multiplies them by the precoding factor b_k = sqrt(p_k) h_k^H / |h_k|^2, k = 1, 2, ..., K, where p_k denotes the transmit power of edge device k, h_k denotes the channel coefficient from edge device k to the data center, and (.)^H and |.| denote the conjugate transpose and the modulus of a complex number, respectively.
S202. All edge devices transmit their precoded local model parameters to the data center simultaneously; by over-the-air computation, the signal received by the data center is the superposition y = sum_k sqrt(p_k) w_{k,t}^E + z, where z denotes a complex symmetric circular Gaussian noise vector.
S203. The data center multiplies the received signal y by a scaling factor equal to the inverse of the summed transmit amplitudes, i.e., 1 / sum_k sqrt(p_k). The final received signal at the data center is then w_{t+1} = sum_k a_k w_{k,t}^E + z~, where a_k = sqrt(p_k) / sum_j sqrt(p_j) is the dynamic model aggregation weight; the weights satisfy sum_k a_k = 1 and depend directly on the uplink transmit power of the edge devices.
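The uplink steps S201-S203 can be sketched numerically. The sketch below assumes channel-inversion precoding b_k = sqrt(p_k) h_k^H / |h_k|^2 and weights a_k = sqrt(p_k) / sum_j sqrt(p_j); these explicit forms are a plausible reading of the description (the exact expressions sit in the original equation figures), and channel noise is omitted for clarity:

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 4, 3                        # K edge devices, model dimension d
W = rng.normal(size=(K, d))        # locally updated models w_{k,t}^E
h = (rng.normal(size=K) + 1j * rng.normal(size=K)) / np.sqrt(2)  # uplink channels
p = rng.uniform(0.5, 1.0, size=K)  # uplink transmit powers p_k

# Precoding: each device multiplies its model by sqrt(p_k) * conj(h_k) / |h_k|^2,
# so its channel is inverted and the contributions superimpose coherently.
precode = np.sqrt(p) * h.conj() / np.abs(h) ** 2
y = (h * precode) @ W              # over-the-air sum received at the data center

# Scaling by the inverse of the summed amplitudes yields a convex combination.
a = np.sqrt(p) / np.sqrt(p).sum()  # dynamic aggregation weights, sum to 1
w_hat = y.real / np.sqrt(p).sum()  # noise-free aggregated model
```

With the channel inverted, the scaled received signal equals exactly the weighted sum of the local models, which is what makes the weights a function of the transmit powers alone.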
Further, step S3 comprises the following sub-steps:
S301. The effect caused by uplink and downlink wireless channel fading and additive noise in each iteration is computed and taken as the objective function f(p), where p = (p_1, ..., p_K) denotes the vector of edge-device transmit powers; the objective also involves the model dimension, the power of the complex Gaussian noise, and the smoothness coefficient of the loss function. The first term of its numerator depends on the aggregation weights, which are expressed through the transmit powers as a_k = sqrt(p_k) / sum_j sqrt(p_j).
S302. The transmit powers of the edge devices are optimized by minimizing the objective function, with each edge device subject to an independent power constraint 0 <= p_k <= P_k^max, where P_k^max is the power upper limit of edge device k.
S303. Based on the minimized objective function and the power constraints, the original optimization problem is: minimize f(p) subject to 0 <= p_k <= P_k^max for k = 1, ..., K. Solving the original optimization problem yields the optimal power allocation vector p*, and thereby the optimal weight allocation scheme a_k* = sqrt(p_k*) / sum_j sqrt(p_j*).
The solving process of the original optimization problem comprises the following steps:
A1. An auxiliary variable is introduced and a new vector is defined from the transmit powers and the auxiliary variable, converting the original optimization problem into an equivalent problem in the augmented variable.
A2. A change of variables is performed, converting the problem of step A1 into a problem in the new variables, whose quantities are redefined accordingly.
A3. A further change of variables is performed; the problem of step A2 then becomes a standard semidefinite relaxation problem, which is solved to obtain the optimal solution X*.
A4. After the optimal solution X* is obtained, the optimal solution p* of the original optimization problem is recovered from it, where the recovery deletes a designated column and the corresponding row of the matrix X*, leaving a matrix of the remaining size.
After p* is obtained, the optimal weights for model aggregation at the data center in each iteration are a_k* = sqrt(p_k*) / sum_j sqrt(p_j*).
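The patent solves the power allocation by semidefinite relaxation. As a hedged stand-in (the true objective of S301 is given only in the original equation figures), the sketch below minimizes a placeholder surrogate objective under the same independent per-device power constraints 0 <= p_k <= P_k^max with SciPy; the surrogate, the gains g, and the caps are all illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

K = 4
rng = np.random.default_rng(0)
g = rng.uniform(0.2, 1.0, size=K)      # stand-in channel-quality gains (assumed)
p_max = np.ones(K)                     # per-device power caps P_k^max (assumed)

def objective(p):
    # Placeholder surrogate: penalize noise amplification (small received power)
    # plus weight imbalance; the patent's actual objective depends on fading
    # statistics, model dimension, and the loss-smoothness constant.
    a = np.sqrt(p) / np.sqrt(p).sum()
    return 1.0 / (g * p).sum() + np.var(a)

res = minimize(objective, x0=0.5 * p_max,
               bounds=[(1e-6, pm) for pm in p_max])  # box constraints 0 < p_k <= P_k^max
p_opt = res.x
a_opt = np.sqrt(p_opt) / np.sqrt(p_opt).sum()        # resulting aggregation weights
```

The structure (smooth objective, independent box constraints, weights recovered from the optimized powers) mirrors S302-S303 even though the numeric objective is a stand-in.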
Preferably, the dynamic weight allocation method further comprises: in each round t of iterative training, edge device k, after updating its local model, uses the computed transmit power p_k* to send its local model w_{k,t}^E to the data center; when the data center performs model aggregation, device k is assigned the weight a_k*, and the new global model obtained by the data center is w_{t+1} = sum_k a_k* w_{k,t}^E + z~, where z~ is an additive noise vector.
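Putting the steps together, each communication round consists of a broadcast, E local SGD steps per device, and weighted aggregation plus additive noise. A toy end-to-end loop; the least-squares loss, the fixed weights a_k = sqrt(p_k) / sum_j sqrt(p_j), and the noise level are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
K, d, T, E = 4, 2, 30, 5           # devices, model dimension, rounds, local steps
w_true = np.array([1.0, -2.0])     # generating model for the synthetic data
Xs = [rng.normal(size=(100, d)) for _ in range(K)]
ys = [X @ w_true for X in Xs]

p = rng.uniform(0.5, 1.0, size=K)  # uplink transmit powers (held fixed here)
a = np.sqrt(p) / np.sqrt(p).sum()  # aggregation weights, sum to 1

w = np.zeros(d)                    # global model at the data center
for t in range(T):                 # one communication round per iteration
    local_models = []
    for k in range(K):             # each device: E mini-batch SGD steps from w
        wk = w.copy()
        for _ in range(E):
            idx = rng.choice(100, size=20, replace=False)
            grad = 2 / 20 * Xs[k][idx].T @ (Xs[k][idx] @ wk - ys[k][idx])
            wk -= 0.05 * grad
        local_models.append(wk)
    # weighted aggregation plus effective additive channel noise
    w = a @ np.array(local_models) + 0.001 * rng.normal(size=d)
```

Because the weights form a convex combination and the per-round noise is small, the global model converges close to the generating weights on this synthetic problem.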
The beneficial effects of the invention are as follows: the method determines the weight of each edge device in the data center's model aggregation and effectively guarantees the accuracy of model aggregation in wireless federated learning. Moreover, because the dynamic weights are obtained by directly optimizing the devices' transmit powers, they balance both the learning efficiency and the communication efficiency of wireless federated learning, yielding a simple and efficient joint learning-communication optimization design.
Drawings
FIG. 1 is a schematic diagram of the wireless federated learning system;
FIG. 2 is a flow chart of the method of the present invention;
FIG. 3 shows how the test accuracy varies with the number of training rounds under an independent and identically distributed (i.i.d.) data distribution;
FIG. 4 shows how the test accuracy varies with the number of training rounds under a non-i.i.d. data distribution.
Detailed Description
The technical solutions of the present invention are described in further detail below with reference to the accompanying drawings, but the scope of the present invention is not limited to the following.
Aiming at a federated learning algorithm deployed in a wireless communication system, the invention designs a dynamic allocation scheme for model aggregation weights. It comprises: modeling the uplink and downlink signal transmission between the data center and the edge devices; designing the weights used when the data center performs model aggregation; and an optimal dynamic weight allocation scheme. As shown in FIG. 1, the federated learning algorithm is deployed in a wireless communication system comprising a data center and a plurality of edge devices. The edge devices transmit their locally updated models to the data center through the wireless uplink for model aggregation; the data center then redistributes the aggregated global model to the edge devices through the wireless downlink for further updates. Through multiple cooperative iterations between the data center and the edge devices, a globally optimal model is obtained by training.
Consider a wireless federated learning system with one data center and K edge devices, as shown in FIG. 1. To train a global machine learning model while protecting the local data privacy of all edge devices, only model parameters are exchanged between the data center and the edge devices over wireless channels. The invention models both wireless downlink and uplink transmission: the downlink is used for global model distribution, and the uplink, based on over-the-air computation, serves as the basis of the model aggregation weight design. Specifically:
As shown in FIG. 2, a dynamic model aggregation weight allocation method for wireless federated learning comprises the following steps:
S1. For a wireless federated learning system with one data center and K edge devices, the data center broadcasts the model parameters w_t (obtained after the previous round's aggregation update) to all edge devices over the wireless channel (i.e., wireless downlink transmission); each edge device estimates and updates from its received information to obtain the model parameters w_{k,t}^E.
S101. The data center broadcasts the model parameters w_t to all edge devices over the wireless channel.
S102. Edge device k receives the signal y_k = sqrt(P0) g_k w_t + n_k, where g_k denotes the channel coefficient from the data center to edge device k, P0 denotes the transmit power of the data center, and n_k denotes a complex symmetric circular Gaussian noise vector.
S103. After receiving y_k, edge device k divides the signal by sqrt(P0) g_k to estimate the original signal sent by the data center; the result of the estimation is w_hat_{k,t} = w_t + n_k / (sqrt(P0) g_k). Edge device k takes this estimate as the starting point of its local training update; after E local updates, every edge device sends its updated model parameters w_{k,t}^E back to the data center.
The local update proceeds as w_{k,t}^{e+1} = w_{k,t}^{e} - eta * grad F_k(w_{k,t}^{e}; xi_{k,t}^{e}), where eta denotes the learning rate, e denotes the e-th local update, w_{k,t}^{e} denotes the model parameters at the e-th local update, xi_{k,t}^{e} denotes the mini-batch of data randomly selected at the e-th local update, and grad F_k(w_{k,t}^{e}; xi_{k,t}^{e}) denotes the mini-batch gradient at the e-th update. The result w_{k,t}^{E} obtained after the E-th update is the updated model parameter.
The local update strategy adopted by the invention is the mini-batch stochastic gradient descent method.
S2. All edge devices send the updated model parameters w_{k,t}^E to the data center over the wireless uplink (i.e., wireless uplink transmission):
S201. Edge device k precodes its local model parameters, i.e., multiplies them by the precoding factor b_k = sqrt(p_k) h_k^H / |h_k|^2, k = 1, 2, ..., K, where p_k denotes the transmit power of edge device k, h_k denotes the channel coefficient from edge device k to the data center, and (.)^H and |.| denote the conjugate transpose and the modulus of a complex number, respectively.
S202. All edge devices transmit their precoded local model parameters to the data center simultaneously; by over-the-air computation, the signal received by the data center is the superposition y = sum_k sqrt(p_k) w_{k,t}^E + z, where z denotes a complex symmetric circular Gaussian noise vector.
S203. The data center multiplies the received signal y by a scaling factor equal to the inverse of the summed transmit amplitudes, i.e., 1 / sum_k sqrt(p_k). The final received signal at the data center is then w_{t+1} = sum_k a_k w_{k,t}^E + z~, where a_k = sqrt(p_k) / sum_j sqrt(p_j) is the dynamic model aggregation weight; the weights satisfy sum_k a_k = 1 and depend directly on the uplink transmit power of the edge devices.
S3. An objective function is constructed for the effect caused by uplink and downlink wireless channel fading and additive noise in each iteration; based on minimizing the objective function under the power constraints, an optimization problem is obtained and solved to yield the optimal weight allocation scheme:
S301. Owing to wireless channel fading and the presence of additive noise, the model parameters received by the edge devices and the data center are inaccurate during training of the wireless federated learning system. To reduce the influence of channel fading and additive noise on the training process, the invention designs model aggregation weights that are determined directly by the transmit powers of the edge devices. Based on these weights, an optimal model aggregation weight allocation scheme is obtained by further optimization. Because the wireless channel is dynamic, the weights must be re-optimized in every training round; the allocation scheme is therefore a dynamic model aggregation weight allocation scheme.
Through convergence analysis, an upper bound on the gap between the loss value after T iterations and the optimal loss value can be derived for the considered wireless federated learning system and used as an indicator of the training effectiveness after T iterations. We call this upper bound the optimality gap; the smaller the optimality gap, the better the trained model. The distribution of the training data, the variance of the stochastic gradients, and the channel fading and additive noise introduced by wireless communication all affect the value of the optimality gap. To reduce the influence of wireless communication on model training, the part of the optimality gap related to channel fading and additive noise must be minimized; this part can be expressed as a weighted sum, over iterations, of the effects caused by uplink and downlink channel fading and additive noise. It therefore suffices to minimize the effect caused by uplink and downlink channel fading and additive noise in each iteration in order to minimize the wireless-communication-related part of the optimality gap.
The effect caused by uplink and downlink wireless channel fading and additive noise in each iteration is computed and taken as the objective function f(p), where p = (p_1, ..., p_K) denotes the vector of edge-device transmit powers; the objective also involves the model dimension, the power of the complex Gaussian noise, and the smoothness coefficient of the loss function, and the first term of its numerator depends on the aggregation weights, which are expressed through the transmit powers as a_k = sqrt(p_k) / sum_j sqrt(p_j).
S302. The transmit powers of the edge devices are optimized by minimizing the objective function, with each edge device subject to an independent power constraint 0 <= p_k <= P_k^max, where P_k^max is the power upper limit of edge device k.
S303. Based on the minimized objective function and the power constraints, the original optimization problem is: minimize f(p) subject to 0 <= p_k <= P_k^max for k = 1, ..., K. Solving the original optimization problem yields the optimal power allocation vector p*, and thereby the optimal weight allocation scheme a_k* = sqrt(p_k*) / sum_j sqrt(p_j*).
The solving process of the original optimization problem comprises the following steps:
A1. An auxiliary variable is introduced and a new vector is defined from the transmit powers and the auxiliary variable, converting the original optimization problem into an equivalent problem in the augmented variable.
A2. A change of variables is performed, converting the problem of step A1 into a problem in the new variables, whose quantities are redefined accordingly.
A3. A further change of variables is performed; the problem of step A2 then becomes a standard semidefinite relaxation problem, which is solved to obtain the optimal solution X*.
A4. After the optimal solution X* is obtained, the optimal solution p* of the original optimization problem is recovered from it, where the recovery deletes a designated column and the corresponding row of the matrix X*, leaving a matrix of the remaining size.
After p* is obtained, the optimal weights for model aggregation at the data center in each iteration are a_k* = sqrt(p_k*) / sum_j sqrt(p_j*).
Preferably, the dynamic weight allocation method further comprises: in each round t of iterative training, edge device k, after updating its local model, uses the computed transmit power p_k* to send its local model w_{k,t}^E to the data center; when the data center performs model aggregation, device k is assigned the weight a_k*, and the new global model obtained by the data center is w_{t+1} = sum_k a_k* w_{k,t}^E + z~, where z~ is an additive noise vector.
In the embodiment of the application, simulation results are given to verify the model aggregation scheme of the invention. Besides the proposed model aggregation scheme, the federated learning algorithm under an ideal channel and the truncated channel inversion algorithm serve as comparison schemes. In the simulations, we train a convolutional neural network to recognize the MNIST data set; the evaluation criterion is the test accuracy. The simulation parameters are set as follows:
The uplink and downlink channels are modeled as independent and identically distributed Rayleigh fading channels, i.e., the channel coefficients are complex symmetric circular Gaussian variables with zero mean and unit variance, and each edge device holds 800 training samples.
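The Rayleigh fading model above corresponds to channel coefficients drawn as zero-mean, unit-variance circularly symmetric complex Gaussians; a short sketch of generating and sanity-checking such coefficients:

```python
import numpy as np

def rayleigh_channels(n, rng):
    """Zero-mean, unit-variance circularly symmetric complex Gaussian coefficients.

    Each of the real and imaginary parts has variance 1/2, so E[|h|^2] = 1 and
    |h| follows a Rayleigh distribution.
    """
    return (rng.normal(size=n) + 1j * rng.normal(size=n)) / np.sqrt(2)

rng = np.random.default_rng(0)
h = rayleigh_channels(100_000, rng)
mean_power = np.mean(np.abs(h) ** 2)  # should be close to 1
```

A fresh draw of such coefficients per round is what makes the weight allocation dynamic: the optimized powers, and hence the weights, change with the channel realizations.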
First, the performance of the proposed scheme with i.i.d. data is examined, as shown in FIG. 3. The results show that as the number of iterations increases, the test accuracy of the proposed scheme rises gradually and finally converges; its convergence curve almost coincides with that of the federated learning algorithm under an ideal channel, and the final test accuracy reaches 92.75%, demonstrating the effectiveness of the proposed dynamic aggregation weight allocation scheme. In addition, the proposed scheme achieves better test accuracy than the existing schemes. Next, the performance with non-i.i.d. data is examined, as shown in FIG. 4. The results again show that the test accuracy of the proposed scheme rises gradually and finally converges as the number of iterations increases; with non-i.i.d. data its convergence curve lies slightly below that of the ideal-channel federated learning algorithm, but remains clearly better than the other existing schemes.
The foregoing is a preferred embodiment of the present invention. It should be understood that the invention is not limited to the form disclosed herein; it may be used in other combinations, modifications, and environments, and changes may be made within the scope of the inventive concept described herein, in accordance with the above teachings or the skill and knowledge of the relevant art. Modifications and variations made by those skilled in the art without departing from the spirit and scope of the invention shall fall within the protection scope of the appended claims.

Claims (6)

1. A model aggregation weight dynamic distribution method for wireless federal learning, characterized in that the method comprises the following steps:
S1. for a wireless federal learning system with a data center and K edge devices, the data center broadcasts the model parameters Figure 748328DEST_PATH_IMAGE001 to all edge devices through a wireless channel, and each edge device estimates and updates according to the received information to obtain the model parameters Figure 233667DEST_PATH_IMAGE002 ;
S2. all edge devices send the updated model parameters Figure 528382DEST_PATH_IMAGE002 to the data center through the wireless uplink;
S3. in each iteration, an objective function is constructed for the influence caused by uplink and downlink wireless channel fading and additive noise; based on minimizing this objective function under the power constraints, an optimization problem is obtained and solved to yield the optimal weight distribution scheme.
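The three steps of claim 1 can be sketched as one training round. Everything concrete here is an illustrative assumption: the patent's actual formulas survive only as image placeholders, so the toy least-squares model, the noise model inside `broadcast`, and the equal placeholder weights stand in for the real quantities.

```python
import numpy as np

rng = np.random.default_rng(0)

def broadcast(w, snr_db=20.0):
    """S1 (sketch): the data center broadcasts w over a noisy downlink;
    each device's scaled received signal is modeled as w plus noise."""
    noise_std = np.linalg.norm(w) * 10 ** (-snr_db / 20)
    return w + rng.normal(0.0, noise_std, size=w.shape)

def local_update(w, X, y, lr=0.1, epochs=5):
    """Toy local training: a few gradient steps on a least-squares loss."""
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def aggregate(models, weights):
    """S3 (sketch): the data center combines uplink models with dynamic weights."""
    return sum(a * m for a, m in zip(weights, models))

# One round with K = 3 devices holding different local data.
K, d = 3, 4
w_global = np.zeros(d)
datasets = [(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(K)]
locals_ = [local_update(broadcast(w_global), X, y) for X, y in datasets]
weights = np.full(K, 1.0 / K)   # placeholder for the optimized weights of S3
w_global = aggregate(locals_, weights)
print(w_global.shape)
```

The placeholder uniform `weights` is exactly what steps S3 and claims 4-5 replace with a channel- and power-aware allocation.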
2. The model aggregation weight dynamic distribution method for wireless federal learning according to claim 1, characterized in that step S1 comprises the following sub-steps:
S101. the data center broadcasts the model parameters Figure 955821DEST_PATH_IMAGE001 to all edge devices through a wireless channel;
S102. the signal received by edge device k is Figure 109722DEST_PATH_IMAGE003 , wherein Figure 179178DEST_PATH_IMAGE004 represents the channel coefficient from the data center to edge device k, Figure 469345DEST_PATH_IMAGE005 represents the transmit power of the data center, and Figure 598844DEST_PATH_IMAGE006 represents a complex symmetric circular Gaussian noise vector;
S103. after edge device k receives the signal Figure 443303DEST_PATH_IMAGE007 , it divides the signal by Figure 660658DEST_PATH_IMAGE008 to scale it and estimate the original signal sent by the data center, with estimation result Figure 57528DEST_PATH_IMAGE009 ; edge device k then takes the estimated result Figure 577503DEST_PATH_IMAGE010 as the initial point of the local training update; after E local updates, each edge device sends the updated model parameters Figure 955263DEST_PATH_IMAGE011 back to the data center;
wherein the local update process is:
Figure 382833DEST_PATH_IMAGE012
wherein Figure 37806DEST_PATH_IMAGE013 denotes the learning rate, Figure 509107DEST_PATH_IMAGE014 denotes the Figure 452792DEST_PATH_IMAGE015 -th update, Figure 887316DEST_PATH_IMAGE016 denotes the model parameters at the Figure 276357DEST_PATH_IMAGE015 -th local update, Figure 403713DEST_PATH_IMAGE017 denotes the mini-batch of data randomly selected at the Figure 834694DEST_PATH_IMAGE015 -th local update, and Figure 587756DEST_PATH_IMAGE018 denotes the mini-batch gradient at the Figure 561528DEST_PATH_IMAGE015 -th update; the E-th update, i.e. Figure 109053DEST_PATH_IMAGE019 , yields the result Figure 168276DEST_PATH_IMAGE020 , which is the updated model parameter vector.
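The local update rule of claim 2 (E mini-batch SGD steps starting from the device's estimate of the broadcast model) can be sketched as follows. The least-squares loss, learning rate, and batch size are illustrative assumptions, since the actual loss and update equation are in the image placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def local_sgd(w0, X, y, lr=0.05, E=10, batch_size=8):
    """E mini-batch SGD steps, as in claim 2's local update:
    at each step, draw a random mini-batch and take a gradient step."""
    w = w0.copy()
    for _ in range(E):
        idx = rng.choice(len(y), size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        grad = Xb.T @ (Xb @ w - yb) / batch_size   # mini-batch gradient
        w = w - lr * grad
    return w

# Device k: the noisy estimate of the broadcast model is the starting point.
d = 5
X = rng.normal(size=(64, d))
y = X @ np.ones(d) + 0.1 * rng.normal(size=64)
w_est = rng.normal(scale=0.1, size=d)   # stand-in for the estimated broadcast model
w_new = local_sgd(w_est, X, y)

def mse(w):
    return float(np.mean((X @ w - y) ** 2))

print(mse(w_est), mse(w_new))   # the local loss should drop after E updates
```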
3. The model aggregation weight dynamic distribution method for wireless federal learning according to claim 1, characterized in that step S2 comprises the following sub-steps:
S201. edge device k precodes its local model parameters, i.e. multiplies them by the precoding factor Figure 69236DEST_PATH_IMAGE021 , wherein Figure 146782DEST_PATH_IMAGE022 represents the transmit power of edge device k, Figure 881520DEST_PATH_IMAGE023 represents the channel coefficient from edge device k to the data center, n represents a complex symmetric circular Gaussian noise vector, and Figure 414657DEST_PATH_IMAGE024 and Figure 260253DEST_PATH_IMAGE025 represent the conjugate transpose and the modulus of a complex number, respectively; k = 1, 2, …, K;
S202. all edge devices transmit the precoded local model parameters to the data center simultaneously, and the signals are summed over the air, so that the signal received by the data center is Figure 457885DEST_PATH_IMAGE026 ;
S203. the data center multiplies the received signal Figure 222579DEST_PATH_IMAGE027 by a scaling factor Figure 725235DEST_PATH_IMAGE028 , which is the inverse of the sum of all transmit powers, i.e. Figure 92632DEST_PATH_IMAGE029 ; the final received signal of the data center is Figure 161082DEST_PATH_IMAGE030 , wherein Figure 96677DEST_PATH_IMAGE031 are the dynamic model aggregation weights, which satisfy Figure 601476DEST_PATH_IMAGE032 and depend directly on the uplink transmit powers of the edge devices.
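The over-the-air aggregation of claim 3 can be sketched as below. The concrete precoding factor (phase-inverting the channel, scaled by the square root of the transmit power) and the weight normalization are plausible readings of the image-placeholder formulas, not the patent's exact expressions.

```python
import numpy as np

rng = np.random.default_rng(2)

K, d = 4, 6
models = [rng.normal(size=d) for _ in range(K)]    # local model updates w_k
h = rng.normal(size=K) + 1j * rng.normal(size=K)   # uplink channel coefficients
p = rng.uniform(0.5, 1.0, size=K)                  # uplink transmit powers

# S201 (assumed form): phase-align each device's signal against its channel
# so the contributions add coherently at the receiver.
precoders = np.sqrt(p) * np.conj(h) / np.abs(h)

# S202: all devices transmit simultaneously; the channel itself sums the
# precoded signals, and the receiver adds noise.
noise = 0.01 * (rng.normal(size=d) + 1j * rng.normal(size=d))
y_rx = sum(precoders[k] * h[k] * models[k] for k in range(K)) + noise

# S203: scale the received signal so the effective weights sum to one.
gains = np.sqrt(p) * np.abs(h)      # effective per-device gain sqrt(p_k)|h_k|
a = gains / gains.sum()             # dynamic aggregation weights
w_global = (y_rx / gains.sum()).real

print(np.round(a, 3))
```

Note that each device's weight grows with its transmit power and channel quality, which is exactly the lever the optimization of S3 turns.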
4. The model aggregation weight dynamic distribution method for wireless federal learning according to claim 1, characterized in that step S3 comprises the following sub-steps:
S301. the influence caused by uplink and downlink wireless channel fading and additive noise in each iteration is computed and used as the objective function, expressed as:
Figure 523296DEST_PATH_IMAGE033
wherein Figure 692590DEST_PATH_IMAGE034 denotes the vector composed of the edge devices' transmit powers, Figure 533507DEST_PATH_IMAGE035 denotes the model dimension, Figure 276336DEST_PATH_IMAGE036 denotes the complex Gaussian noise power, and Figure 251114DEST_PATH_IMAGE037 denotes the smoothness coefficient of the loss function; in the first term of the numerator, Figure 622052DEST_PATH_IMAGE038 is expressed as Figure 774816DEST_PATH_IMAGE039 , wherein Figure 988628DEST_PATH_IMAGE040 is expressed as Figure 517830DEST_PATH_IMAGE041 ;
S302. the transmit powers of the edge devices are optimized by minimizing the objective function, where each edge device has an independent power constraint, i.e. Figure 8854DEST_PATH_IMAGE042 , wherein Figure 581787DEST_PATH_IMAGE043 is the power upper limit of edge device Figure 33628DEST_PATH_IMAGE044 and Figure 959996DEST_PATH_IMAGE045 is expressed as Figure 433090DEST_PATH_IMAGE046 ;
S303. based on the minimized objective function and the power constraints, the following original optimization problem is obtained:
Figure 52290DEST_PATH_IMAGE047
Solving the original optimization problem yields the optimal power allocation vector Figure 991427DEST_PATH_IMAGE048 and thereby the optimal weight distribution scheme Figure 580540DEST_PATH_IMAGE049 .
5. The model aggregation weight dynamic distribution method for wireless federal learning according to claim 4, characterized in that the solving process of the original optimization problem comprises the following steps:
A1. introduce the auxiliary variable Figure 187102DEST_PATH_IMAGE050 and define the new vector Figure 977203DEST_PATH_IMAGE051 , thereby converting the original optimization problem into the following problem:
Figure 387325DEST_PATH_IMAGE052
A2. perform the variable substitution Figure 514550DEST_PATH_IMAGE053 ; the problem in step A1 is converted into the following problem:
Figure 975618DEST_PATH_IMAGE054
wherein Figure 58325DEST_PATH_IMAGE055 are respectively expressed as Figure 96688DEST_PATH_IMAGE056 ;
A3. perform a further variable substitution Figure 575074DEST_PATH_IMAGE057 ; the problem in step A2 is converted into the following problem:
Figure 139917DEST_PATH_IMAGE058
The problem obtained in step A3 is a standard semidefinite relaxation problem, and solving it yields the optimal solution Figure 412766DEST_PATH_IMAGE059 ;
A4. after obtaining the optimal solution Figure 63059DEST_PATH_IMAGE059 , the optimal solution of the original optimization problem is expressed as Figure 813977DEST_PATH_IMAGE060 , wherein Figure 108693DEST_PATH_IMAGE061 ; Figure 208236DEST_PATH_IMAGE062 denotes the matrix of size Figure 650926DEST_PATH_IMAGE065 that remains after deleting the Figure 27997DEST_PATH_IMAGE063 -th column and the Figure 52585DEST_PATH_IMAGE064 -th row of matrix Figure 348754DEST_PATH_IMAGE059 ;
after obtaining Figure 620019DEST_PATH_IMAGE066 , the optimal weight for model aggregation at the data center in each iteration is
Figure 978319DEST_PATH_IMAGE067
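The last step of claim 5 recovers a power vector from the matrix solution of the semidefinite relaxation. The patent's exact matrix construction is hidden in the image placeholders, but the recovery step for any rank-one SDR solution is standard: take the leading eigenpair. The sketch below demonstrates that generic step on a synthetic matrix.

```python
import numpy as np

def extract_from_sdr(X, tol=1e-8):
    """Rank-1 extraction from a PSD solution matrix X of a semidefinite
    relaxation: if X ~= v v^T, recover v from the leading eigenpair.
    (Generic SDR recovery step; not the patent's exact construction.)"""
    eigvals, eigvecs = np.linalg.eigh(X)   # ascending eigenvalues
    lam, v = eigvals[-1], eigvecs[:, -1]
    if lam < tol:
        raise ValueError("solution matrix is numerically zero")
    return np.sqrt(lam) * v

# Demo: build a rank-1 PSD matrix from a known power vector and recover it.
p_true = np.array([0.9, 0.4, 1.3, 0.7])
X_opt = np.outer(p_true, p_true)           # X* = p p^T (rank one)
p_rec = extract_from_sdr(X_opt)
p_rec *= np.sign(p_rec[0]) * np.sign(p_true[0])   # fix the global sign ambiguity
print(np.round(p_rec, 6))
```

When the relaxation is not tight (the solution matrix has rank above one), the same eigenpair gives the usual rank-one approximation, typically followed by projection back onto the power constraints.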
6. The model aggregation weight dynamic distribution method for wireless federal learning according to claim 1, characterized in that the dynamic weight distribution method further comprises:
in each iteration Figure 841102DEST_PATH_IMAGE068 of the training, after updating its local model, edge device Figure 485710DEST_PATH_IMAGE069 uses the calculated transmit power Figure 83044DEST_PATH_IMAGE070 to send the local model Figure 494303DEST_PATH_IMAGE071 to the data center for model aggregation; the weight assigned to device Figure 759062DEST_PATH_IMAGE072 in the data center's model aggregation is Figure 493013DEST_PATH_IMAGE073 ; the new global model obtained by the data center is then
Figure 312064DEST_PATH_IMAGE074
wherein Figure 261435DEST_PATH_IMAGE075 is an additive noise vector.
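Claim 6's per-round aggregation can be sketched as below. The weight formula (normalized sqrt(power) times channel magnitude) and the additive-noise model are assumed forms standing in for the image-placeholder expressions; with equal powers and channels they reduce to uniform averaging.

```python
import numpy as np

rng = np.random.default_rng(3)

def aggregate_round(local_models, p_opt, h, noise_std=0.01):
    """Claim 6 (sketch): per-round global aggregation. Device k's weight
    follows from its optimized transmit power p_opt[k] and channel h[k];
    the received global model carries an additive noise vector."""
    gains = np.sqrt(p_opt) * np.abs(h)
    a = gains / gains.sum()            # dynamic weights, summing to one
    noise = rng.normal(0.0, noise_std, size=local_models[0].shape)
    w_new = sum(a[k] * m for k, m in enumerate(local_models)) + noise
    return w_new, a

K, d = 3, 4
models = [np.full(d, float(k)) for k in range(K)]   # toy local models 0, 1, 2
p_opt = np.array([1.0, 1.0, 1.0])
h = np.array([1.0, 1.0, 1.0])
w_new, a = aggregate_round(models, p_opt, h)
print(np.round(a, 3))   # equal powers and channels give equal weights
```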
CN202211032084.8A 2022-08-26 2022-08-26 Model aggregation weight dynamic distribution method for wireless federal learning Pending CN115099420A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211032084.8A CN115099420A (en) 2022-08-26 2022-08-26 Model aggregation weight dynamic distribution method for wireless federal learning


Publications (1)

Publication Number Publication Date
CN115099420A 2022-09-23

Family

ID=83300986


Country Status (1)

Country Link
CN (1) CN115099420A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113947210A (en) * 2021-10-08 2022-01-18 东北大学 Cloud side end federal learning method in mobile edge computing
US20220182802A1 (en) * 2020-12-03 2022-06-09 Qualcomm Incorporated Wireless signaling in federated learning for machine learning components


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wei, Guo, et al.: "Joint Device Selection and Power Control for Wireless Federated Learning", IEEE Journal on Selected Areas in Communications *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424079A (en) * 2022-09-30 2022-12-02 深圳市大数据研究院 Image classification method based on federal edge learning and related equipment
CN115424079B (en) * 2022-09-30 2023-11-24 深圳市大数据研究院 Image classification method based on federal edge learning and related equipment
CN116827393A (en) * 2023-06-30 2023-09-29 南京邮电大学 Honeycomb-free large-scale MIMO uplink receiving method and system based on federal learning
CN116827393B (en) * 2023-06-30 2024-05-28 南京邮电大学 Honeycomb-free large-scale MIMO receiving method and system based on federal learning

Similar Documents

Publication Publication Date Title
CN113139662B (en) Global and local gradient processing method, device, equipment and medium for federal learning
Chen et al. Performance optimization of federated learning over wireless networks
CN111629380B (en) Dynamic resource allocation method for high concurrency multi-service industrial 5G network
Lin et al. Relay-assisted cooperative federated learning
Xu et al. Resource allocation based on quantum particle swarm optimization and RBF neural network for overlay cognitive OFDM System
CN110167176B (en) Wireless network resource allocation method based on distributed machine learning
CN105379412A (en) System and method for controlling multiple wireless access nodes
CN115099420A (en) Model aggregation weight dynamic distribution method for wireless federal learning
CN104135743A (en) Resource allocation method based on cache control in LTE-A (Long Term Evolution-Advanced) cellular network
Chen et al. Distributive network utility maximization over time-varying fading channels
Khan et al. MaReSPS for energy efficient spectral precoding technique in large scale MIMO-OFDM
Shi et al. Vertical federated learning over cloud-RAN: Convergence analysis and system optimization
Chai et al. Learning-based resource allocation for ultra-reliable V2X networks with partial CSI
CN112152766B (en) Pilot frequency distribution method
CN115099419B (en) User cooperative transmission method for wireless federal learning
CN111741478B (en) Service unloading method based on large-scale fading tracking
Jing et al. Distributed resource allocation based on game theory in multi-cell OFDMA systems
Alvi et al. Utility fairness for the differentially private federated-learning-based wireless IoT networks
Peng et al. Data-driven spectrum partition for multiplexing URLLC and eMBB
CN115643136B (en) Multi-domain cooperative spectrum interference method and system based on evaluation index
CN116542319A (en) Self-adaptive federation learning method and system based on digital twin in edge computing environment
CN115913844A (en) MIMO system digital predistortion compensation method, device, equipment and storage medium based on neural network
Liu et al. Game based robust power allocation strategy with QoS guarantee in D2D communication network
CN102480793B (en) Distributed resource allocation method and device
CN112134632B (en) Method and device for evaluating average capacity of unmanned aerial vehicle communication system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220923