CN115767758A - Equipment scheduling method based on combination of channel and local model update - Google Patents


Info

Publication number
CN115767758A
Authority
CN
China
Prior art keywords
local
training
round
edge mobile
gradient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211422803.7A
Other languages
Chinese (zh)
Inventor
张帆
王昆仑
万俊杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University filed Critical East China Normal University
Priority to CN202211422803.7A priority Critical patent/CN115767758A/en
Publication of CN115767758A publication Critical patent/CN115767758A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a device scheduling method based on the combination of channel and local model update, and relates to the fields of federated learning and user scheduling. Over-the-air computation is used during local model aggregation to improve communication efficiency, the aggregation error introduced by over-the-air computation is reduced by optimizing the receiver beamforming vector, and a device scheduling method that considers the channel and the local model update simultaneously is then provided.

Description

Equipment scheduling method based on combination of channel and local model update
Technical Field
The invention relates to the field of wireless communication, in particular to a device scheduling method based on combination of channel and local model updating.
Background
In recent years, with breakthroughs in machine learning, rising computing power and the explosive growth of data volume, artificial intelligence applications such as autonomous driving and virtual reality have increasingly become a reality. Conventional machine learning usually trains models centrally: the raw data generated by smart mobile devices is sent directly to a centralized cloud data center. Transmitting such large volumes of data causes network congestion and therefore higher latency, and transmitting raw data also risks privacy leakage. Meanwhile, the computing power of edge mobile devices such as base stations, mobile phones and tablet computers keeps improving, which makes computation at the network edge feasible; the federated learning training framework was proposed on this basis. In federated learning, only the trained model or the model gradient is transmitted, and privacy-related data stays on the edge mobile device, which both protects privacy and saves communication resources.
To further improve the utilization of spectrum resources during federated learning, over-the-air computation has been combined with federated learning. Over-the-air computation uses the superposition property of the channel to complete gradient aggregation during federated learning training; because part of the computation is completed during transmission, latency is reduced.
User scheduling in federated learning is an active research topic. A large number of edge devices are usually connected to the parameter server, but, given the communication load and device energy consumption, only a subset of them interacts with the parameter server in each training round. In federated learning based on over-the-air computation, two factors, the channel and the local model update, are generally considered when scheduling users. The traditional way of combining them is to first select devices according to channel gain, have the selected devices perform local training, and then schedule a smaller subset of devices according to the local training results. As a result, some devices perform local training but are not selected, which wastes their energy.
Disclosure of Invention
In view of this, the present invention provides a device scheduling method based on the combination of channel and local model update. The method targets federated learning systems based on over-the-air computation and improves model training by optimizing the scheduling of edge mobile devices.
In order to achieve the purpose, the invention adopts the following technical scheme:
Step 1: constructing a federated learning system
In an edge intelligence scenario, there are K single-antenna edge mobile devices, denoted as the set $\mathcal{K} = \{1, 2, \ldots, K\}$, and a parameter server with M antennas. Each edge mobile device $k \in \mathcal{K}$ holds a local data set $D_k$, and $|D_k|$ denotes the number of samples in $D_k$. In federated learning, the global model $\omega$ is trained jointly by the parameter server and the edge mobile devices. Training is an iterative process: each iteration is called a training round, and each training round produces a new global model. $\omega_t$ denotes the global model obtained in the t-th training round, $\omega_0$ denotes the initial global model before any federated learning training, and $\omega_{t-1}$ denotes the global model obtained in the previous training round. Together these form the federated learning system;
Step 2: the parameter server schedules the edge mobile devices that take part in training by combining two parameters, the channel and the local model update;
Step 3: the parameter server sends the global model $\omega_{t-1}$ obtained in the previous round to all scheduled edge mobile devices;
Step 4: the scheduled edge mobile devices perform local training with the stochastic gradient descent algorithm to obtain their respective local gradients;
Step 5: the scheduled edge mobile devices upload their local gradients to the parameter server, which updates the global model to obtain $\omega_t$; the local gradients are uploaded by over-the-air computation, and the over-the-air computation is optimized;
Steps 2 to 5 are executed in a loop until the global model $\omega_t$ converges.
Further, the step 2 specifically includes:
Channel parameter: in the t-th training round, the channel gain vector $h_{k,t}$ between the parameter server and each edge mobile device k is different. Each edge mobile device sends a pilot sequence to the parameter server, and the parameter server estimates the channel gain vector $h_{k,t}$ between itself and each edge mobile device from the received pilot sequence. The $l_2$ norm of the channel gain vector,
$$\|h_{k,t}\| = \sqrt{\sum_{i=1}^{M} |h_{k,t}^{(i)}|^{2}},$$
represents the channel parameter between the parameter server and edge mobile device k, where $|h_{k,t}^{(i)}|$ is the absolute value of the i-th component of $h_{k,t}$; since the parameter server has M antennas, $h_{k,t}$ has M components;
Local model update parameter: in the t-th training round, edge mobile device k performs local training to obtain a local gradient $g_{k,t}$, and the $l_2$ norm $\|g_{k,t}\|$ of the local gradient represents the local model update parameter of edge mobile device k. The larger $\|g_{k,t}\|$ is, the more the local training result of device k improves federated learning training;
An edge mobile device k does not perform local training before it is scheduled, so the local training result $g_{k,t}$ is not available for computing $\|g_{k,t}\|$ to guide scheduling; the $l_2$ norm $\|g_{k,t}\|$ of the local gradient that local training would produce therefore has to be estimated;
The channel and the local model update are combined to decide the scheduling priority of a device, which is defined as:
$$I_{k,t} = c\,\|h_{k,t}\| + (1-c)\,\|g_{k,t}\| \tag{1}$$
where $I_{k,t}$ is the scheduling priority of edge mobile device k in the t-th training round, and $c \in [0,1]$ is a hyperparameter that controls the relative weight of the two scheduling parameters;
The scheduling priorities $I_{k,t}$ of all edge mobile devices k are sorted in descending order. With the number of scheduled devices fixed to N, $0 < N < K$, the N devices with the largest $I_{k,t}$ take part in the t-th round of federated learning training. The set of edge mobile devices scheduled in the t-th training round is:
$$S_t = \{\, k \in \mathcal{K} : I_{k,t} \text{ is among the } N \text{ largest scheduling priorities} \,\}$$
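The priority rule of formula (1) and the top-N selection can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `schedule_devices`, the array layout (one row of channel gains per device) and the toy numbers are assumptions, not part of the original disclosure.

```python
import numpy as np

def schedule_devices(channel_gains, grad_norm_estimates, c, n_scheduled):
    """Pick the n_scheduled devices with the largest I_k = c*||h_k|| + (1-c)*||g_k||."""
    h_norms = np.linalg.norm(channel_gains, axis=1)   # ||h_{k,t}|| per device, shape (K,)
    g_norms = np.asarray(grad_norm_estimates, dtype=float)
    priority = c * h_norms + (1.0 - c) * g_norms      # formula (1)
    order = np.argsort(-priority, kind="stable")      # indices sorted by descending priority
    return order[:n_scheduled], priority

# Toy example: K = 4 devices, M = 2 antennas (all values are made up).
h = np.array([[3.0, 4.0], [1.0, 0.0], [0.0, 2.0], [6.0, 8.0]], dtype=complex)
g_est = [1.0, 9.0, 2.0, 0.5]
scheduled, priority = schedule_devices(h, g_est, c=0.5, n_scheduled=2)
```

In this toy case device 3 is chosen for its strong channel and device 1 for its large estimated gradient norm. The two norms can live on very different scales, so some normalization may be needed before weighting them; the text does not specify one.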
Further, the $l_2$ norm $\|g_{k,t}\|$ of the local gradient obtained by local training is estimated as follows:
Because the local gradients computed by an edge mobile device in successive training rounds are strongly correlated in time, the gradient $l_2$ norm from the last round in which the device was trained is used to estimate the gradient $l_2$ norm of the current round. Specifically, the estimated local gradient $l_2$ norm in the t-th training round is:
$$\|\hat{g}_{k,t}\| = \|g_{k,t_k}\|$$
where $t_k$ denotes the most recent federated learning training round in which device k was scheduled;
All edge mobile devices are required to take part in the first federated learning training round and upload their local gradients, so that the parameter server can estimate the local gradient $l_2$ norms for subsequent rounds. When edge mobile device k is scheduled in the t-th training round, the parameter server uses the local gradient $g_{k,t}$ obtained by local training to update the stored estimate $\|\hat{g}_{k,t}\|$.
Further, the step 4 specifically includes:
The scheduled edge mobile devices perform local training with the stochastic gradient descent algorithm: a loss function is constructed first, and the local gradient is then computed from it;
The global loss function of the system is
$$F(\omega_{t-1}) = \frac{1}{|D|} \sum_{k \in \mathcal{K}} |D_k|\, F_k(\omega_{t-1}),$$
where
$$|D| = \sum_{k \in \mathcal{K}} |D_k|$$
is the total number of samples in the federated learning system and $F_k(\omega_{t-1})$ is the local loss function of the global model $\omega_{t-1}$ from the previous round on edge mobile device k; the global loss function is the weighted average of the local loss functions of all edge mobile devices k. The local loss function on edge mobile device k is
$$F_k(\omega_{t-1}) = \frac{1}{|D_k|} \sum_{(x_k, y_k) \in D_k} f(x_k, y_k; \omega_{t-1}),$$
where $(x_k, y_k)$ is a data sample on device k and $f(x_k, y_k; \omega_{t-1})$ is the loss of a single sample, which measures how well the global model $\omega_{t-1}$ fits the sample $(x_k, y_k)$;
The local gradient obtained by edge mobile device k through local training in the t-th training round is given by:
$$g_{k,t} = \frac{1}{L_b} \sum_{(x_k, y_k) \in L_{k,t}} \nabla f(x_k, y_k; \omega_{t-1})$$
where $\nabla f(x_k, y_k; \omega_{t-1})$ is the gradient of the loss function, $L_{k,t} \subseteq D_k$ is a mini-batch of samples drawn at random from data set $D_k$ in the t-th training round, $L_b = |L_{k,t}|$ is the number of samples in the mini-batch $L_{k,t}$, and $g_{k,t}$ is the local gradient obtained by local training.
Further, the step 5 specifically includes:
The scheduled edge mobile devices upload their local gradients $g_{k,t}$ to the parameter server, which updates the global model. The formula by which the parameter server updates the global model is:
$$\omega_t = \omega_{t-1} - \frac{\eta_t}{|S_t|} \sum_{k \in S_t} g_{k,t} \tag{5}$$
where $\eta_t$ is the learning rate in the t-th training round and $|S_t|$ is the number of devices in the scheduled device set;
The gradients are uploaded by over-the-air computation, which uses the superposition property of the channel to aggregate the local gradients while the local gradient signals are being transmitted; that is, the
$$\sum_{k \in S_t} g_{k,t}$$
part of formula (5) is computed during transmission. Over-the-air computation is implemented as follows:
In the t-th training round, the scheduled edge mobile devices upload their computed local gradients $g_{k,t}$ simultaneously, all transmitted local gradients are aggregated over the air, and the aggregate signal received by the parameter server is:
$$y_t = \sum_{k \in S_t} h_{k,t}\, p_{k,t}\, g_{k,t} + n \tag{6}$$
where $p_{k,t}$ is the transmitter scalar of device k in the t-th training round, and $n$ is a white Gaussian noise vector with mean 0 and variance $\sigma^2$;
The parameter server applies a beamforming vector to the received aggregate signal, and the processed aggregate signal is:
$$\hat{y}_t = m^H y_t = m^H \sum_{k \in S_t} h_{k,t}\, p_{k,t}\, g_{k,t} + m^H n \tag{7}$$
where $m$ is the receiver beamforming vector, the superscript H denotes the conjugate transpose, and
$$m^H \sum_{k \in S_t} h_{k,t}\, p_{k,t}\, g_{k,t}$$
is the local gradient signal aggregation completed by over-the-air computation;
Aggregating local gradients over the air is affected by channel fading and noise, so the over-the-air computation is optimized;
Under an ideal channel without channel fading or noise interference, the ideal aggregate signal is:
$$y_t^{\mathrm{ideal}} = \sum_{k \in S_t} g_{k,t} \tag{8}$$
The error between the ideal aggregate signal and the actual aggregate signal is expressed as the mean square error:
$$\mathrm{MSE} = E\left[\, \|\hat{y}_t - y_t^{\mathrm{ideal}}\|^2 \,\right] \tag{9}$$
where $E$ denotes mathematical expectation;
To reduce the influence of channel fading and noise on over-the-air computation and improve its performance, the transceiver is designed according to the minimum mean square error criterion, so that the mean square error between the ideal aggregate signal and the actual aggregate signal is minimized.
Further, the designing the transceiver according to the mean square error minimization criterion to minimize the mean square error between the ideal aggregate signal and the actual aggregate signal specifically includes:
Designing the transceiver means determining the transmitter scalar $p_{k,t}$ and the receiver beamforming vector $m$;
The transmitter scalar is designed as formula (10):
[formula image BDA0003943222410000055]
where $\mu$ is the transmit power control factor, $|p_{k,t}|^2 \le P_0$, $P_0$ is the maximum transmit power, and $\|\cdot\|^2$ denotes the squared $l_2$ norm of the enclosed vector;
The transmit power control factor is designed as formula (11):
[formula image BDA0003943222410000056]
Substituting formula (10) and formula (11) into formula (7), the actual aggregate signal simplifies to:
[formula image BDA0003943222410000057]
The mean square error between the actual aggregate signal and the ideal aggregate signal is then expressed as:
[formula image BDA0003943222410000061]
An optimization problem, formula (14), is constructed with the aim of minimizing the mean square error:
[formula image BDA0003943222410000062]
Introducing an auxiliary variable
[formula image BDA0003943222410000063]
the optimization problem of formula (14) is converted into the following form:
[formula image BDA0003943222410000064]
Reassigning $m$ as
[formula image BDA0003943222410000065]
gives the equivalent optimization problem:
$$\min_{m} \|m\|^2 \quad \text{s.t.} \quad \|m^H h_{k,t}\|^2 \ge 1,\ \forall k \in S_t \tag{16}$$
Solving the optimization problem in formula (16) yields the receiver beamforming vector $m$.
Further, solving the optimization problem in the formula (16) specifically includes:
An initial solution of the problem is obtained with semidefinite relaxation (SDR) and is then refined with the successive convex approximation (SCA) algorithm. SDR obtains the initial solution as follows:
Let $A = mm^H$ and $A^* = \arg\min_A \mathrm{tr}(A)$, where $\mathrm{tr}(A)$ is the trace of matrix A, $\lambda_1$ is the largest eigenvalue of $A^*$, and $u_1$ is the eigenvector corresponding to $\lambda_1$;
If the rank of $A^*$ is 1, then
$$m = \sqrt{\lambda_1}\, u_1$$
is the optimal solution of the optimization problem;
If the rank of $A^*$ is not 1, the initial solution
$$m = \sqrt{\lambda_1}\, u_1$$
is refined by the SCA algorithm;
The SCA method refines the initial solution as follows:
In the optimization problem of formula (16), the non-convex constraint is $\|m^H h_{k,t}\|^2 \ge 1$, $\forall k \in S_t$. Introducing the auxiliary variable $c_{k,t} = [\mathrm{Re}(m^H h_{k,t}), \mathrm{Im}(m^H h_{k,t})]$, formula (16) becomes:
$$\min_{m} \|m\|^2 \quad \text{s.t.} \quad \|c_{k,t}\|^2 \ge 1,\ \forall k \in S_t \tag{17}$$
The non-convex constraint in formula (17) is $\|c_{k,t}\|^2 \ge 1$, $\forall k \in S_t$. Using $\|c_{k,t}\|^2 \ge \|c_{k,t}^{(l)}\|^2 + 2\,(c_{k,t}^{(l)})^T (c_{k,t} - c_{k,t}^{(l)}) \ge 1$, $\forall k \in S_t$, the non-convex constraint is approximated by a convex one through iteratively relaxed linear constraints, where $c_{k,t}^{(l)}$ is the solution after the l-th iteration;
Replacing the non-convex constraint in formula (17) with this convex constraint, formula (17) is rewritten as:
$$\min_{m} \|m\|^2 \quad \text{s.t.} \quad \|c_{k,t}^{(l)}\|^2 + 2\,(c_{k,t}^{(l)})^T (c_{k,t} - c_{k,t}^{(l)}) \ge 1,\ \forall k \in S_t \tag{18}$$
Let
[formula image BDA0003943222410000074]
and solve formula (18) iteratively until
[formula image BDA0003943222410000075]
where $\varepsilon$ is the solution accuracy; the optimal solution so obtained is the receiver beamforming vector $m$.
The beneficial effects of the invention are:
The invention provides a device scheduling method based on the combination of channel and local model update. When scheduling edge mobile devices in a federated learning system, the local model update parameter is estimated with a gradient estimation method before local training, the channel parameter and the local model update parameter are summed with different weights, and the edge mobile devices are scheduled by comparing the weighted sums. Compared with the traditional scheme, in which devices are first selected according to channel gain, the selected devices perform local training, and a smaller subset of devices is then scheduled according to the local training results, this scheme selects the edge mobile devices to be scheduled in a single scheduling step, which avoids unnecessary local training and saves energy on the edge mobile devices.
Drawings
FIG. 1 is a schematic diagram of the federated learning system model of the present invention;
FIG. 2 is a flow chart of an embodiment of the present invention;
FIG. 3 is a flow chart of the specific scheduling scheme of the present invention.
Detailed Description
In order to better understand the purpose, structure and function of the present invention, the following describes the device scheduling method based on the combination of channel and local model update in detail with reference to the accompanying drawings.
The invention specifically comprises the following steps:
Step 1: constructing a federated learning system
Referring to FIG. 1, in an edge intelligence scenario, there are K single-antenna edge mobile devices, denoted as the set $\mathcal{K} = \{1, 2, \ldots, K\}$, and a parameter server with M antennas. Each edge mobile device $k \in \mathcal{K}$ holds a local data set $D_k$, and $|D_k|$ denotes the number of samples in $D_k$. In federated learning, the global model $\omega$ is trained jointly by the parameter server and the edge mobile devices. Training is an iterative process: each iteration is called a training round, and each training round produces a new global model. $\omega_t$ denotes the global model obtained in the t-th training round, $\omega_0$ denotes the initial global model before any federated learning training, and $\omega_{t-1}$ denotes the global model obtained in the previous training round. Together these form the federated learning system;
Step 2: the parameter server schedules the edge mobile devices that take part in training by combining two parameters, the channel and the local model update
Channel parameter: in the t-th training round, the channel gain vector $h_{k,t}$ between the parameter server and each edge mobile device k is different. Each edge mobile device sends a pilot sequence to the parameter server, and the parameter server estimates the channel gain vector $h_{k,t}$ between itself and each edge mobile device from the received pilot sequence. The $l_2$ norm of the channel gain vector,
$$\|h_{k,t}\| = \sqrt{\sum_{i=1}^{M} |h_{k,t}^{(i)}|^{2}},$$
represents the channel parameter between the parameter server and edge mobile device k, where $|h_{k,t}^{(i)}|$ is the absolute value of the i-th component of $h_{k,t}$; since the parameter server has M antennas, $h_{k,t}$ has M components;
Local model update parameter: in the t-th training round, edge mobile device k performs local training to obtain a local gradient $g_{k,t}$, and the $l_2$ norm $\|g_{k,t}\|$ of the local gradient represents the local model update parameter of edge mobile device k. The larger $\|g_{k,t}\|$ is, the more the local training result of device k improves federated learning training;
An edge mobile device k does not perform local training before it is scheduled, so the local training result $g_{k,t}$ is not available for computing $\|g_{k,t}\|$ to guide scheduling; the $l_2$ norm $\|g_{k,t}\|$ of the local gradient that local training would produce therefore has to be estimated;
The channel and the local model update are combined to decide the scheduling priority of a device, which is defined as:
$$I_{k,t} = c\,\|h_{k,t}\| + (1-c)\,\|g_{k,t}\| \tag{1}$$
where $I_{k,t}$ is the scheduling priority of edge mobile device k in the t-th training round, and $c \in [0,1]$ is a hyperparameter that controls the relative weight of the two scheduling parameters;
The scheduling priorities $I_{k,t}$ of all edge mobile devices k are sorted in descending order. With the number of scheduled devices fixed to N, $0 < N < K$, the N devices with the largest $I_{k,t}$ take part in the t-th round of federated learning training. The set of edge mobile devices scheduled in the t-th training round is:
$$S_t = \{\, k \in \mathcal{K} : I_{k,t} \text{ is among the } N \text{ largest scheduling priorities} \,\}$$
The specific scheduling scheme of the edge mobile devices is shown in FIG. 3;
The $l_2$ norm $\|g_{k,t}\|$ of the local gradient obtained by local training is estimated as follows: because the local gradients computed by an edge mobile device in successive training rounds are strongly correlated in time, the gradient $l_2$ norm from the last round in which the device was trained is used to estimate the gradient $l_2$ norm of the current round. Specifically, the estimated local gradient $l_2$ norm in the t-th training round is:
$$\|\hat{g}_{k,t}\| = \|g_{k,t_k}\|$$
where $t_k$ denotes the most recent federated learning training round in which device k was scheduled;
All edge mobile devices are required to take part in the first federated learning training round and upload their first-round local gradients $g_{k,1}$, so that the parameter server can estimate the gradient $l_2$ norms for subsequent rounds. When edge mobile device k is scheduled in the t-th training round, the parameter server uses the local gradient $g_{k,t}$ obtained by local training to update the stored estimate $\|\hat{g}_{k,t}\|$.
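The bookkeeping behind this estimate is small: the server keeps one stored norm per device, initialised in the first round and overwritten whenever a device is scheduled again. A minimal sketch follows; the class name `GradNormTracker` and its methods are hypothetical names chosen for illustration.

```python
import numpy as np

class GradNormTracker:
    """Stores, per device, the l2 norm of its most recently uploaded local gradient."""

    def __init__(self, first_round_gradients):
        # Round 1: every device trains and uploads, so every estimate is initialised.
        self.estimates = np.array([np.linalg.norm(g) for g in first_round_gradients])

    def update(self, device, gradient):
        # Called only for devices that were scheduled (and hence trained) this round.
        self.estimates[device] = np.linalg.norm(gradient)

tracker = GradNormTracker([np.array([3.0, 4.0]), np.array([0.0, 1.0])])
tracker.update(1, np.array([6.0, 8.0]))   # device 1 was scheduled again
```

Devices that are not scheduled keep their old value, so `tracker.estimates` always holds the norm from round $t_k$ for each device k.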
Step 3: the parameter server sends the global model $\omega_{t-1}$ obtained in the previous round to all scheduled edge mobile devices
The parameter server sends the global model $\omega_{t-1}$ obtained in the previous round to all scheduled edge mobile devices over the wireless channel. Because the parameter server has ample energy and bandwidth compared with the edge mobile devices, the downlink transmission of the global model $\omega_{t-1}$ is assumed to be error-free.
Step 4: the scheduled edge mobile devices perform local training with the stochastic gradient descent algorithm to obtain their respective local gradients
With the stochastic gradient descent algorithm, a loss function is constructed first, and the local gradient is then computed from it;
After the scheduled edge mobile devices receive the global model $\omega_{t-1}$ obtained in the previous training round, the global loss function of the entire federated learning system is constructed as
$$F(\omega_{t-1}) = \frac{1}{|D|} \sum_{k \in \mathcal{K}} |D_k|\, F_k(\omega_{t-1}),$$
where
$$|D| = \sum_{k \in \mathcal{K}} |D_k|$$
is the total number of samples in the federated learning system and $F_k(\omega_{t-1})$ is the local loss function of the global model $\omega_{t-1}$ from the previous round on edge mobile device k; the global loss function is the weighted average of the local loss functions of all edge mobile devices k. The local loss function on edge mobile device k is
$$F_k(\omega_{t-1}) = \frac{1}{|D_k|} \sum_{(x_k, y_k) \in D_k} f(x_k, y_k; \omega_{t-1}),$$
where $(x_k, y_k)$ is a data sample on device k and $f(x_k, y_k; \omega_{t-1})$ is the loss of a single sample, which measures how well the global model $\omega_{t-1}$ fits the sample $(x_k, y_k)$;
The local gradient obtained by edge mobile device k through local training in the t-th training round is given by:
$$g_{k,t} = \frac{1}{L_b} \sum_{(x_k, y_k) \in L_{k,t}} \nabla f(x_k, y_k; \omega_{t-1})$$
where $\nabla f(x_k, y_k; \omega_{t-1})$ is the gradient of the loss function, $L_{k,t} \subseteq D_k$ is a mini-batch of samples drawn at random from data set $D_k$ in the t-th training round, $L_b = |L_{k,t}|$ is the number of samples in the mini-batch $L_{k,t}$, and $g_{k,t}$ is the local gradient obtained by local training.
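As an illustration of the mini-batch gradient, the sketch below averages per-sample gradients over a randomly drawn batch. The text does not fix a particular loss, so a squared-error loss on a linear model, $f = \tfrac{1}{2}(x^T\omega - y)^2$, is assumed here purely for the example; the function name `local_gradient` is likewise hypothetical.

```python
import numpy as np

def local_gradient(w, X, y, batch_size, rng):
    """Mini-batch gradient: average of per-sample gradients evaluated at w = omega_{t-1}."""
    idx = rng.choice(len(X), size=batch_size, replace=False)   # random mini-batch L_{k,t}
    Xb, yb = X[idx], y[idx]
    residual = Xb @ w - yb                                     # per-sample prediction error
    # Assumed per-sample loss f = 0.5 * (x.w - y)^2, whose gradient is (x.w - y) * x.
    return Xb.T @ residual / batch_size

rng = np.random.default_rng(0)
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # toy local data set D_k
y = np.array([1.0, 2.0, 3.0])
g = local_gradient(np.zeros(2), X, y, batch_size=3, rng=rng)
```

With `batch_size` equal to the data set size the batch is the whole set, which makes the toy result independent of the random draw.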
Step 5: the scheduled edge mobile devices upload their local gradients to the parameter server, which updates the global model to obtain $\omega_t$; the local gradients are uploaded by over-the-air computation, and the over-the-air computation is optimized;
The scheduled edge mobile devices upload their local gradients $g_{k,t}$ to the parameter server, which updates the global model. The formula by which the parameter server updates the global model is:
$$\omega_t = \omega_{t-1} - \frac{\eta_t}{|S_t|} \sum_{k \in S_t} g_{k,t} \tag{5}$$
where $\eta_t$ is the learning rate in the t-th training round and $|S_t|$ is the number of devices in the scheduled device set;
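The update step can be sketched as follows, assuming formula (5) takes the usual averaged-gradient form suggested by the roles of $\eta_t$ and $|S_t|$ (a step of size $\eta_t$ along the mean of the scheduled devices' gradients). The function name and the toy values are illustrative only.

```python
import numpy as np

def update_global_model(w_prev, local_gradients, learning_rate):
    """Assumed form of formula (5): w_t = w_{t-1} - eta_t * mean of gradients over S_t."""
    grads = np.asarray(local_gradients, dtype=float)        # shape (|S_t|, d)
    return w_prev - learning_rate * grads.sum(axis=0) / len(grads)

w_new = update_global_model(np.array([1.0, 1.0]),
                            [[2.0, 0.0], [0.0, 4.0]],       # gradients of two scheduled devices
                            learning_rate=0.5)
```

Only the sum over the scheduled devices depends on the uplink, which is why that part can be delegated to over-the-air computation.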
The gradients are uploaded by over-the-air computation, which uses the superposition property of the channel to aggregate the local gradients while the local gradient signals are being transmitted; that is, the
$$\sum_{k \in S_t} g_{k,t}$$
part of formula (5) is computed during transmission. Over-the-air computation is implemented as follows:
In the t-th training round, the scheduled edge mobile devices upload their computed local gradients $g_{k,t}$ simultaneously, all transmitted local gradients are aggregated over the air, and the aggregate signal received by the parameter server is:
$$y_t = \sum_{k \in S_t} h_{k,t}\, p_{k,t}\, g_{k,t} + n \tag{6}$$
where $p_{k,t}$ is the transmitter scalar of device k in the t-th training round, and $n$ is a white Gaussian noise vector with mean 0 and variance $\sigma^2$;
The parameter server applies a beamforming vector to the received aggregate signal, and the processed aggregate signal is:
$$\hat{y}_t = m^H y_t = m^H \sum_{k \in S_t} h_{k,t}\, p_{k,t}\, g_{k,t} + m^H n \tag{7}$$
where $m$ is the receiver beamforming vector, the superscript H denotes the conjugate transpose, and
$$m^H \sum_{k \in S_t} h_{k,t}\, p_{k,t}\, g_{k,t}$$
is the local gradient signal aggregation completed by over-the-air computation;
Aggregating local gradients over the air is affected by channel fading and noise, so the over-the-air computation is optimized;
Under an ideal channel without channel fading or noise interference, the ideal aggregate signal is:
$$y_t^{\mathrm{ideal}} = \sum_{k \in S_t} g_{k,t} \tag{8}$$
The error between the ideal aggregate signal and the actual aggregate signal is expressed as the mean square error:
$$\mathrm{MSE} = E\left[\, \|\hat{y}_t - y_t^{\mathrm{ideal}}\|^2 \,\right] \tag{9}$$
where $E$ denotes mathematical expectation;
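To make the signal model concrete, the following NumPy sketch simulates one over-the-air aggregation. The transmitter scalars and power control factor used here are an assumption: they follow the channel-inversion ("uniform-forcing") design that is common in the over-the-air computation literature, chosen so that every device's effective gain $m^H h_{k,t} p_{k,t}$ equals $\sqrt{\mu}$ while $|p_{k,t}|^2 \le P_0$. They are not claimed to be the exact formulas of this publication, and all names are illustrative.

```python
import numpy as np

def aircomp_aggregate(H, G, m, P0, sigma, rng):
    """Simulate one over-the-air aggregation; returns (estimate of sum_k g_k, mu).

    H: (K, M) complex, row k is h_k.  G: (K, d) real, row k is g_k.  m: (M,) complex.
    """
    eff = H @ m.conj()                                   # eff[k] = m^H h_k
    mu = P0 * np.min(np.abs(eff) ** 2)                   # assumed power control factor
    p = np.sqrt(mu) * eff.conj() / np.abs(eff) ** 2      # assumed scalars: m^H h_k p_k = sqrt(mu)
    M, d = H.shape[1], G.shape[1]
    noise = sigma * (rng.standard_normal((M, d))
                     + 1j * rng.standard_normal((M, d))) / np.sqrt(2)
    Y = H.T @ (p[:, None] * G) + noise                   # received: sum_k h_k p_k g_k + n
    y_hat = (m.conj() @ Y) / np.sqrt(mu)                 # beamform, then undo the common scaling
    return y_hat.real, mu

rng = np.random.default_rng(0)
K, M, d = 3, 4, 5
H = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))
G = rng.standard_normal((K, d))
m = np.ones(M, dtype=complex)
est, mu = aircomp_aggregate(H, G, m, P0=1.0, sigma=0.0, rng=rng)
noisy, _ = aircomp_aggregate(H, G, m, P0=1.0, sigma=0.1, rng=rng)
mse = np.mean((noisy - G.sum(axis=0)) ** 2)              # empirical aggregation error
```

With `sigma=0.0` the estimate equals the ideal aggregate $\sum_k g_{k,t}$. With noise, the residual error under this assumed design is $m^H n / \sqrt{\mu}$, so it shrinks as the weakest effective channel $\min_k |m^H h_{k,t}|^2$ grows, which is what motivates optimizing $m$.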
To reduce the influence of channel fading and noise on over-the-air computation and improve its performance, the transceiver is designed according to the minimum mean square error criterion, so that the mean square error between the ideal aggregate signal and the actual aggregate signal is minimized. Specifically:
Designing the transceiver means determining the transmitter scalar $p_{k,t}$ and the receiver beamforming vector $m$;
The transmitter scalar is designed as formula (10):
[formula image BDA0003943222410000115]
where $\mu$ is the transmit power control factor, $|p_{k,t}|^2 \le P_0$, $P_0$ is the maximum transmit power, and $\|\cdot\|^2$ denotes the squared $l_2$ norm of the enclosed vector;
The transmit power control factor is designed as formula (11):
[formula image BDA0003943222410000116]
Substituting formula (10) and formula (11) into formula (7), the actual aggregate signal simplifies to:
[formula image BDA0003943222410000117]
The mean square error between the actual aggregate signal and the ideal aggregate signal is then expressed as:
[formula image BDA0003943222410000118]
An optimization problem, formula (14), is constructed with the aim of minimizing the mean square error:
[formula image BDA0003943222410000121]
Introducing an auxiliary variable
[formula image BDA0003943222410000122]
the optimization problem of formula (14) is converted into the following form:
[formula image BDA0003943222410000123]
Reassigning $m$ as
[formula image BDA0003943222410000124]
gives the equivalent optimization problem:
$$\min_{m} \|m\|^2 \quad \text{s.t.} \quad \|m^H h_{k,t}\|^2 \ge 1,\ \forall k \in S_t \tag{16}$$
Solving the optimization problem in formula (16) yields the receiver beamforming vector $m$. Specifically:
An initial solution of the problem is obtained with semidefinite relaxation (SDR) and is then refined with the successive convex approximation (SCA) algorithm;
SDR obtains the initial solution as follows:
Let $A = mm^H$ and $A^* = \arg\min_A \mathrm{tr}(A)$, where $\mathrm{tr}(A)$ is the trace of matrix A, $\lambda_1$ is the largest eigenvalue of $A^*$, and $u_1$ is the eigenvector corresponding to $\lambda_1$;
If the rank of $A^*$ is 1, then
$$m = \sqrt{\lambda_1}\, u_1$$
is the optimal solution of the optimization problem;
If the rank of $A^*$ is not 1, the initial solution
$$m = \sqrt{\lambda_1}\, u_1$$
is refined by the SCA algorithm;
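The rank check and the extraction of a beamforming vector from the relaxed solution can be sketched with an eigendecomposition. The sketch assumes $A^*$ has already been obtained from a semidefinite programming solver (not shown) and that the candidate vector is $\sqrt{\lambda_1}\,u_1$; the function name and tolerance are hypothetical.

```python
import numpy as np

def extract_beamformer(A_star, tol=1e-8):
    """Return (m, is_rank_one) with m = sqrt(lambda_1) * u_1 for a Hermitian PSD A_star."""
    eigvals, eigvecs = np.linalg.eigh(A_star)        # eigenvalues in ascending order
    lam1, u1 = eigvals[-1], eigvecs[:, -1]           # largest eigenvalue and its eigenvector
    is_rank_one = np.sum(eigvals > tol * lam1) == 1  # numerical rank test
    return np.sqrt(lam1) * u1, bool(is_rank_one)

# Sanity check on an exactly rank-one matrix A = m m^H (toy values).
m_true = np.array([1.0 + 1.0j, 2.0, -1.0j])
m, rank_one = extract_beamformer(np.outer(m_true, m_true.conj()))
```

The recovered `m` may differ from `m_true` by a global phase, which leaves $mm^H$ and every $|m^H h_{k,t}|$ unchanged. When `rank_one` is false, `m` only serves as the starting point for SCA.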
the specific steps of refining the initial solution with the SCA method are as follows:
in the optimization problem of equation (16), the non-convex constraint is $\|m^H h_{k,t}\|^2 \ge 1$,
Figure BDA0003943222410000128
introducing an auxiliary variable $c_{k,t} = [\mathrm{Re}(m^H h_{k,t}), \mathrm{Im}(m^H h_{k,t})]$, equation (16) is converted into:
Figure BDA0003943222410000129
the non-convex constraint in equation (17) is $\|c_{k,t}\|^2 \ge 1$,
Figure BDA00039432224100001210
using $\|c_{k,t}\|^2 \ge \|c_{k,t}^{(l)}\|^2 + 2 (c_{k,t}^{(l)})^T (c_{k,t} - c_{k,t}^{(l)}) \ge 1$,
Figure BDA00039432224100001211
the non-convex constraint is approximated by a linear constraint that is relaxed iteratively, where $c_{k,t}^{(l)}$ is the solution obtained after the $l$-th iteration;
replacing the non-convex constraint in equation (17) with this convex constraint, equation (17) is rewritten as:
Figure BDA0003943222410000131
let
Figure BDA0003943222410000132
and iteratively solve equation (18) until
Figure BDA0003943222410000133
where $\epsilon$ denotes the solution accuracy; the optimal solution thus obtained is the receiver beamforming vector $m$.
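The linear constraint used in the SCA step can be checked numerically. The following is a minimal Python sketch, not the patent's implementation; the helper names `to_real_pair`, `sq_norm` and `linear_lower_bound` are hypothetical. It illustrates that the first-order expansion of $\|c\|^2$ around a previous iterate $c^{(l)}$ never exceeds $\|c\|^2$, so any point that satisfies the linear constraint also satisfies the original non-convex constraint.

```python
import random

def to_real_pair(m, h):
    """Return c = [Re(m^H h), Im(m^H h)] for complex vectors m and h."""
    inner = sum(mi.conjugate() * hi for mi, hi in zip(m, h))
    return [inner.real, inner.imag]

def sq_norm(c):
    """Squared l2 norm of a real vector."""
    return sum(x * x for x in c)

def linear_lower_bound(c, c_prev):
    """First-order expansion of ||c||^2 around the previous iterate c_prev."""
    return sq_norm(c_prev) + 2.0 * sum(p * (x - p) for p, x in zip(c_prev, c))

# The gap ||c||^2 - bound equals ||c - c_prev||^2, so it is never negative.
random.seed(0)
gaps = []
for _ in range(1000):
    c = [random.uniform(-3, 3), random.uniform(-3, 3)]
    c_prev = [random.uniform(-3, 3), random.uniform(-3, 3)]
    gaps.append(sq_norm(c) - linear_lower_bound(c, c_prev))
min_gap = min(gaps)
```

Because the bound is tight at $c = c^{(l)}$, each SCA iteration keeps the previous iterate feasible, which is why the iterates cannot leave the feasible set of the original problem.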
Steps 2 to 5 are executed in a loop until the global model $\omega_t$ converges.
The flow chart of the whole embodiment is shown in fig. 2.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the invention and is not intended to limit the invention, which has been described in detail with reference to the foregoing examples, but it will be apparent to those skilled in the art that modifications may be made to the above-described embodiment or that equivalents may be substituted for elements thereof. All modifications, equivalents and the like which come within the spirit and principle of the invention are intended to be included within the scope of the invention.

Claims (7)

1. A method for scheduling a device based on a combination of channel and local model update, the method comprising the steps of:
step 1: constructing a federated learning system
In an edge intelligence scenario, there are $K$ single-antenna edge mobile devices, denoted $\mathcal{K} = \{1, 2, \ldots, K\}$, and a parameter server with $M$ antennas; each edge mobile device $k \in \mathcal{K}$ has a local data set $D_k$, where $|D_k|$ denotes the size of data set $D_k$; in federated learning, the parameter server and the edge mobile devices jointly train the global model $\omega$; the federated learning training process is iterative, each iteration is called a training round, and each training round yields a new global model; $\omega_t$ denotes the global model obtained in the $t$-th training round, $\omega_0$ denotes the initial global model before any federated learning training, and $\omega_{t-1}$ denotes the global model obtained in the previous training round; together these form the federated learning system;
step 2: the parameter server schedules the edge mobile devices that participate in training by combining two parameters, the channel and the local model update;
step 3: the parameter server sends the global model $\omega_{t-1}$ obtained in the previous round to all scheduled edge mobile devices;
step 4: the scheduled edge mobile devices perform local training with the stochastic gradient descent algorithm to obtain their respective local gradients;
step 5: the scheduled edge mobile devices upload the obtained local gradients to the parameter server, which updates the global model to obtain $\omega_t$; the local gradients are uploaded using over-the-air computation, and the over-the-air computation is optimized;
steps 2 to 5 are executed in a loop until the global model $\omega_t$ converges.
2. The method for scheduling devices based on combination of channel and local model update according to claim 1, wherein the step 2 specifically comprises:
channel parameter: in the $t$-th training round, the channel gain vector $h_{k,t}$ between the parameter server and each edge mobile device $k$ is different; each edge mobile device sends a pilot sequence to the parameter server, and the parameter server estimates the channel gain vector $h_{k,t}$ between itself and each edge mobile device from the received pilot sequence; the $l_2$ norm of the channel gain vector $h_{k,t}$,
Figure FDA0003943222400000011
represents the channel parameter between the parameter server and edge mobile device $k$, where
Figure FDA0003943222400000012
denotes the absolute value of the $i$-th component of the vector $h_{k,t}$; since the parameter server has $M$ antennas, the channel gain vector $h_{k,t}$ has $M$ components in total;
local model update parameter: in the $t$-th training round, edge mobile device $k$ performs local training to obtain a local gradient $g_{k,t}$; the $l_2$ norm $\|g_{k,t}\|$ of the local gradient represents the local model update parameter of edge mobile device $k$; the larger $\|g_{k,t}\|$ is, the more the local training result of device $k$ improves the federated learning training;
edge mobile device $k$ does not perform local training before it is scheduled, yet scheduling is guided by the $l_2$ norm $\|g_{k,t}\|$ computed from the local training result $g_{k,t}$; the $l_2$ norm $\|g_{k,t}\|$ of the local gradient therefore has to be estimated;
the channel and the local model update are combined to decide the scheduling priority of each device, which is defined as:
$I_{k,t} = c\,\|h_{k,t}\| + (1-c)\,\|g_{k,t}\|$ (1)
where $I_{k,t}$ denotes the scheduling priority of edge mobile device $k$ in the $t$-th training round, and $c \in [0,1]$ is a hyperparameter that controls the relative weight of the two scheduling parameters;
the scheduling priorities $I_{k,t}$ of all edge mobile devices $k$ are sorted in descending order; with the number of scheduled devices fixed to $N$, $0 < N < K$, the $N$ devices with the largest $I_{k,t}$ participate in the $t$-th round of federated learning training; the set of edge mobile devices scheduled in the $t$-th training round is:
Figure FDA0003943222400000021
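The priority rule of formula (1) and the top-$N$ selection can be sketched in Python as follows. This is an illustrative sketch only: the function names `l2_norm` and `schedule_devices`, the dictionary layout, and the example values are assumptions, and the gradient norms are taken as already estimated.

```python
import math

def l2_norm(vec):
    """l2 norm of a real or complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in vec))

def schedule_devices(channels, grad_norms, c, n):
    """Pick the n devices with the largest priority c*||h|| + (1-c)*||g||.

    channels:   dict device id -> channel gain vector h_{k,t}
    grad_norms: dict device id -> estimated ||g_{k,t}||
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must lie in [0, 1]")
    if not 0 < n < len(channels):
        raise ValueError("n must satisfy 0 < N < K")
    priority = {k: c * l2_norm(h) + (1.0 - c) * grad_norms[k]
                for k, h in channels.items()}
    ranked = sorted(priority, key=priority.get, reverse=True)
    return set(ranked[:n]), priority

# Hypothetical example: K = 3 devices, one receive antenna each for brevity.
channels = {1: [3 + 4j], 2: [1 + 0j], 3: [0 + 2j]}
grad_norms = {1: 0.1, 2: 4.0, 3: 1.0}
scheduled, priority = schedule_devices(channels, grad_norms, c=0.5, n=2)
```

Setting `c=1.0` reduces the rule to pure channel-based scheduling and `c=0.0` to pure update-based scheduling, which makes the role of the hyperparameter easy to see in the example.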
3. The method according to claim 2, wherein the $l_2$ norm $\|g_{k,t}\|$ of the local gradient obtained by local training is estimated as follows:
since the local gradients computed by an edge mobile device in successive training rounds are strongly correlated in time, the gradient $l_2$ norm from the last training round is used to estimate the gradient $l_2$ norm of the current training round; specifically, the estimated local gradient $l_2$ norm in the $t$-th training round is:
Figure FDA0003943222400000022
where $t_k$ denotes the most recent federated learning training round in which device $k$ was scheduled;
all edge mobile devices are required to participate in the first federated learning training round and to upload their local gradients, so that the parameter server can estimate the gradient $l_2$ norm in subsequent training rounds; when edge mobile device $k$ is scheduled in the $t$-th training round, the parameter server uses the local gradient $g_{k,t}$ obtained by local training to update
Figure FDA0003943222400000023
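The bookkeeping described in this claim can be sketched in Python as below. The class name `GradientNormEstimator` and its methods are hypothetical; the sketch only assumes what the text states, namely that the estimate for a device is the $l_2$ norm of the gradient from the last round in which that device was scheduled.

```python
import math

class GradientNormEstimator:
    """Stores, per device, the gradient l2 norm from its last scheduled round."""

    def __init__(self):
        self._norm = {}    # device id -> last observed ||g||
        self._round = {}   # device id -> round t_k of that observation

    def update(self, device, rnd, gradient):
        """Record the gradient uploaded by a scheduled device in round rnd."""
        self._norm[device] = math.sqrt(sum(x * x for x in gradient))
        self._round[device] = rnd

    def estimate(self, device):
        """Estimated ||g_{k,t}||: the norm from the device's last scheduled round."""
        return self._norm[device]

    def last_round(self, device):
        return self._round[device]

est = GradientNormEstimator()
# Round 1: every device participates, so every device gets an initial estimate.
for device, grad in {1: [3.0, 4.0], 2: [6.0, 8.0]}.items():
    est.update(device, 1, grad)
# Round 2: only device 1 is scheduled; the estimate for device 2 is unchanged.
est.update(1, 2, [1.0, 0.0])
```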
4. The method for scheduling devices based on combination of channel and local model update according to claim 1, wherein the step 4 specifically includes:
the scheduled edge mobile devices perform local training with the stochastic gradient descent algorithm, which first constructs a loss function and then uses the loss function to compute the local gradient;
the scheduled edge mobile device receives the global model $\omega_{t-1}$ obtained in the previous training round, and the global loss function of the entire federated learning system is constructed as
Figure FDA0003943222400000031
where
Figure FDA0003943222400000032
denotes the sample size of the entire federated learning system, and $F_k(\omega_{t-1})$ is the local loss function on edge mobile device $k$ of the global model $\omega_{t-1}$ obtained in the previous round; the global loss function is the weighted average of the local loss functions of all edge mobile devices $k$; the local loss function on edge mobile device $k$ is
Figure FDA0003943222400000033
where $(x_k, y_k)$ denotes a data sample on device $k$, and $f(x_k, y_k; \omega_{t-1})$ is the loss function of the single sample $(x_k, y_k)$, which measures how well the global model $\omega_{t-1}$ fits the sample $(x_k, y_k)$;
the local gradient obtained by edge mobile device $k$ through local training in the $t$-th training round is given by:
Figure FDA0003943222400000034
where $\nabla f(x_k, y_k; \omega_{t-1})$ denotes the gradient of the loss function, $L_{k,t} \subseteq D_k$ is a mini-batch sample set randomly selected from data set $D_k$ in the $t$-th training round, $L_b = |L_{k,t}|$ is the number of samples in the mini-batch $L_{k,t}$, and $g_{k,t}$ is the local gradient obtained by local training.
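The mini-batch gradient described above, the average of per-sample gradients over a randomly drawn batch, can be sketched in Python as follows. The per-sample loss $0.5\,(\omega^T x - y)^2$ is an assumption chosen for illustration (the claim does not fix a particular loss), and the function names are hypothetical.

```python
import random

def sample_grad(w, x, y):
    """Gradient w.r.t. w of the assumed per-sample loss 0.5 * (w.x - y)^2."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def local_gradient(w, dataset, batch_size, rng):
    """Average the per-sample gradients over a random mini-batch of dataset."""
    batch = rng.sample(dataset, batch_size)
    grad = [0.0] * len(w)
    for x, y in batch:
        for i, gi in enumerate(sample_grad(w, x, y)):
            grad[i] += gi
    return [gi / batch_size for gi in grad]

# Hypothetical local data set D_k with two samples and a zero global model.
dataset = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]
w_prev = [0.0, 0.0]
g_full = local_gradient(w_prev, dataset, batch_size=2, rng=random.Random(0))
g_single = local_gradient(w_prev, dataset, batch_size=1, rng=random.Random(0))
```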
5. The method for scheduling devices based on combination of channel and local model update according to claim 1, wherein the step 5 specifically includes:
the scheduled edge mobile devices upload the obtained local gradients $g_{k,t}$ to the parameter server, which updates the global model; the global model update formula of the parameter server is:
Figure FDA0003943222400000035
where $\eta_t$ is the learning rate in the $t$-th training round, and $|S_t|$ is the number of devices in the scheduled device set;
the gradients are uploaded using over-the-air computation, which exploits the superposition property of the channel to aggregate the local gradients while the local gradient signals are being transmitted, that is, the
Figure FDA0003943222400000036
part of formula (5) is computed during transmission; over-the-air computation is implemented as follows:
in the $t$-th training round, the scheduled edge mobile devices upload their computed local gradients $g_{k,t}$ simultaneously, so all transmitted local gradients are aggregated over the air; the aggregate signal received by the parameter server is:
Figure FDA0003943222400000041
where $p_{k,t}$ is the transmitter scalar of device $k$ in the $t$-th training round, and $n$ is a white Gaussian noise vector with mean 0 and variance $\sigma^2$;
the parameter server applies a beamforming vector to the received aggregate signal; the aggregate signal processed by the parameter server is:
Figure FDA0003943222400000042
where $m$ is the receiver beamforming vector, the superscript $H$ denotes the conjugate transpose, and
Figure FDA0003943222400000043
is the local gradient aggregation completed by over-the-air computation;
the aggregation of local gradients by over-the-air computation is affected by channel fading and noise, so the over-the-air computation is optimized;
under an ideal channel without channel fading and noise interference, the ideal aggregate signal is:
Figure FDA0003943222400000044
the error between the ideal aggregate signal and the actual aggregate signal is expressed as a mean square error:
Figure FDA0003943222400000045
where $E$ denotes mathematical expectation;
in order to reduce the effect of channel gain and noise during the over-the-air computation and improve the over-the-air computation performance, it is necessary to design the transceiver according to the mean square error minimization criterion, minimizing the mean square error between the ideal aggregate signal and the actual aggregate signal.
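The superposition idea behind over-the-air aggregation can be illustrated with a small Python simulation. This is a sketch under stated assumptions: the received signal is taken to have the common form $y = \sum_k h_{k,t}\, p_{k,t}\, g_{k,t} + n$, each device sends a single scalar gradient entry, and the transmitter scalars use a simple channel inversion that ignores the power limit $P_0$; it is not the patent's transceiver design. All names and values are hypothetical.

```python
def inner(m, v):
    """Return m^H v for complex vectors m and v."""
    return sum(mi.conjugate() * vi for mi, vi in zip(m, v))

def received_signal(h, p, g, noise):
    """Superpose all transmissions: y = sum_k h_k * p_k * g_k + n (assumed form)."""
    y = list(noise)
    for k in h:
        for a in range(len(y)):
            y[a] += h[k][a] * p[k] * g[k]
    return y

# Hypothetical setup: 2 scheduled devices, M = 2 receive antennas,
# one scalar gradient entry per device.
h = {1: [1 + 1j, 0.5j], 2: [2 - 1j, 1 + 0j]}
g = {1: 0.3, 2: -1.2}
m = [1 + 0j, 1j]

# Illustrative channel-inversion scalars (an assumption, no power constraint).
p = {k: 1 / inner(m, h[k]) for k in h}

ideal = sum(g.values())                     # ideal aggregate signal
noise = [0.01 - 0.02j, -0.03 + 0.01j]       # one fixed noise realization
clean = inner(m, received_signal(h, p, g, [0j, 0j]))
noisy = inner(m, received_signal(h, p, g, noise))
squared_error = abs(noisy - ideal) ** 2     # one sample of the squared error
```

With these scalars the fading is cancelled exactly, so the remaining error is the beamformed noise $m^H n$; averaging `squared_error` over many noise realizations would approximate the mean square error that the transceiver design aims to minimize.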
6. The method as claimed in claim 5, wherein the designing the transceiver according to the mean square error minimization criterion to minimize the mean square error between the ideal aggregate signal and the actual aggregate signal comprises:
designing a transceiver refers to determining a transmitter scalar p k,t And a receiver beamforming vector m;
the transmitter scalar is designed as:
Figure FDA0003943222400000046
where $\mu$ is the transmit power control factor, $|p_{k,t}|^2 \le P_0$, $P_0$ is the maximum transmit power, and the symbol $\|\cdot\|^2$ denotes the squared $l_2$ norm of the enclosed vector;
the transmission power control factor is designed as follows:
Figure FDA0003943222400000051
substituting the formula (10) and the formula (11) into the formula (7), the actual aggregate signal is simplified as follows:
Figure FDA0003943222400000052
the mean square error between the actual aggregate signal and the ideal aggregate signal is further expressed as:
Figure FDA0003943222400000053
an optimization problem is constructed with the aim of minimizing the mean square error:
Figure FDA0003943222400000054
introducing an auxiliary variable
Figure FDA0003943222400000055
the optimization problem of formula (14) is converted into the following form:
Figure FDA0003943222400000056
re-assigning the value of $m$ as
Figure FDA0003943222400000057
gives an equivalent optimization problem:
Figure FDA0003943222400000058
solving the optimization problem in formula (16) yields the receiver beamforming vector $m$.
7. The method according to claim 6, wherein the solving of the optimization problem in equation (16) specifically includes:
obtaining an initial solution of the problem with the semidefinite relaxation (SDR) method, and refining the initial solution with the successive convex approximation (SCA) algorithm;
the SDR method obtains the initial solution as follows:
let $A = m m^H$ and $A^* = \arg\min_A \operatorname{tr}(A)$, where $\operatorname{tr}(A)$ denotes the trace of matrix $A$, $\lambda_1$ is the largest eigenvalue of $A^*$, and $u_1$ is the eigenvector corresponding to $\lambda_1$;
if the rank of $A^*$ is 1, then
Figure FDA0003943222400000059
is the optimal solution of the optimization problem;
if the rank of $A^*$ is not 1, the initial solution
Figure FDA0003943222400000061
is refined by the SCA algorithm;
the specific steps of refining the initial solution with the SCA method are as follows:
in the optimization problem of equation (16), the non-convex constraint is
Figure FDA0003943222400000062
introducing an auxiliary variable $c_{k,t} = [\mathrm{Re}(m^H h_{k,t}), \mathrm{Im}(m^H h_{k,t})]$, equation (16) is converted into:
Figure FDA0003943222400000063
the non-convex constraint in equation (17) is
Figure FDA0003943222400000064
using
Figure FDA0003943222400000065
the non-convex constraint is approximated by a linear constraint that is relaxed iteratively, where $c_{k,t}^{(l)}$ is the solution obtained after the $l$-th iteration;
replacing the non-convex constraint in equation (17) with this convex constraint, equation (17) is rewritten as:
Figure FDA0003943222400000066
let
Figure FDA0003943222400000067
and iteratively solve equation (18) until
Figure FDA0003943222400000068
where $\epsilon$ denotes the solution accuracy; the optimal solution thus obtained is the receiver beamforming vector $m$.
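The rank-one case of the SDR step can be illustrated numerically. The sketch below assumes the usual recovery rule $m = \sqrt{\lambda_1}\, u_1$ for a rank-one $A^* = m m^H$ (the claim gives this expression only as an equation image, so the rule is an assumption here) and uses a plain power iteration instead of a library eigensolver; the function names are hypothetical. The beamformer is recovered only up to a global phase, which leaves $m m^H$ unchanged.

```python
import math

def matvec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

def norm(v):
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def dominant_eigenpair(a, iters=200):
    """Largest eigenvalue and unit eigenvector of a Hermitian PSD matrix
    by power iteration (assumes the start vector is not orthogonal to u_1)."""
    v = [complex(1.0, 0.0)] * len(a)
    for _ in range(iters):
        w = matvec(a, v)
        nw = norm(w)
        v = [x / nw for x in w]
    av = matvec(a, v)
    lam = sum(vi.conjugate() * avi for vi, avi in zip(v, av)).real
    return lam, v

# Hypothetical rank-one SDR solution A* = m m^H built from a known beamformer.
m_true = [1 + 2j, -0.5j, 0.3 + 0j]
a = [[mi * mj.conjugate() for mj in m_true] for mi in m_true]

lam, u = dominant_eigenpair(a)
m_hat = [math.sqrt(lam) * x for x in u]   # assumed rule: m = sqrt(lambda_1) * u_1
```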
CN202211422803.7A 2022-11-15 2022-11-15 Equipment scheduling method based on combination of channel and local model update Pending CN115767758A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211422803.7A CN115767758A (en) 2022-11-15 2022-11-15 Equipment scheduling method based on combination of channel and local model update

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211422803.7A CN115767758A (en) 2022-11-15 2022-11-15 Equipment scheduling method based on combination of channel and local model update

Publications (1)

Publication Number Publication Date
CN115767758A true CN115767758A (en) 2023-03-07

Family

ID=85370596

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211422803.7A Pending CN115767758A (en) 2022-11-15 2022-11-15 Equipment scheduling method based on combination of channel and local model update

Country Status (1)

Country Link
CN (1) CN115767758A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116644802A (en) * 2023-07-19 2023-08-25 支付宝(杭州)信息技术有限公司 Model training method and device
CN116781518A (en) * 2023-08-23 2023-09-19 北京光函数科技有限公司 Federal multi-arm slot machine learning method and system
CN116781518B (en) * 2023-08-23 2023-10-24 北京光函数科技有限公司 Federal multi-arm slot machine learning method and system

Similar Documents

Publication Publication Date Title
Hamdi et al. Federated learning over energy harvesting wireless networks
CN110719239B (en) Data model dual-drive combined MIMO channel estimation and signal detection method
CN115767758A (en) Equipment scheduling method based on combination of channel and local model update
CN103763782B (en) Dispatching method for MU-MIMO down link based on fairness related to weighting users
CN105338609B (en) Multiaerial system high energy efficiency dynamic power allocation method
CN112911608B (en) Large-scale access method for edge-oriented intelligent network
JP2009141957A (en) Pre-coding transmission method of mimo system
KR102510513B1 (en) Deep learning based beamforming method and apparatus for the same
US20220103211A1 (en) Method and device for switching transmission methods in massive mimo system
US11742901B2 (en) Deep learning based beamforming method and apparatus
WO2022184010A1 (en) Information reporting method and apparatus, first device, and second device
CN114567358B (en) Large-scale MIMO robust WMMSE precoder and deep learning design method thereof
CN109818662A (en) Mixed-beam manufacturing process in full duplex cloud access number energy integrated network
CN112994770A (en) RIS (remote station identification) assisted multi-user downlink robust wireless transmission method based on partial CSI (channel state information)
CN117318774A (en) Channel matrix processing method, device, terminal and network side equipment
Liu et al. Scalable predictive beamforming for IRS-assisted multi-user communications: A deep learning approach
US20240137079A1 (en) User selection for mu-mimo
CN115329954A (en) Training data set acquisition method, wireless transmission method, device and communication equipment
CN110505604A (en) A kind of method of D2D communication system access frequency spectrum
CN115843045A (en) Data acquisition method and device
CN111988791B (en) Fog calculation-based wireless charging network node computing capacity improving method and system
CN115843021A (en) Data transmission method and device
Saxena et al. A learning approach for optimal codebook selection in spatial modulation systems
CN108834155B (en) Method for optimizing spectrum efficiency based on multiple parameters of large-scale antenna system
Bhattacharya et al. Intelligent channel learning exploiting practical energy harvesting for wireless MISO systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination