CN115618963A - Wireless federated learning asynchronous training method based on optimization direction guidance

Wireless federated learning asynchronous training method based on optimization direction guidance

Info

Publication number
CN115618963A
CN115618963A
Authority
CN
China
Prior art keywords
client
training
model
local
representing
Prior art date
Legal status
Granted
Application number
CN202211287329.1A
Other languages
Chinese (zh)
Other versions
CN115618963B (en)
Inventor
郭爽
吕云山
栗强强
Current Assignee
Chongqing Yitong College
Original Assignee
Chongqing Yitong College
Priority date
Filing date
Publication date
Application filed by Chongqing Yitong College
Priority to CN202211287329.1A
Publication of CN115618963A
Application granted
Publication of CN115618963B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00 Reducing energy consumption in communication networks
    • Y02D 30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention claims a wireless federated learning asynchronous training method based on optimization direction guidance, which belongs to the field of distributed learning in wireless networks and comprises the following steps: establishing an optimization problem for efficient training around the data heterogeneity and resource heterogeneity of clients in federated learning; extracting sample diversity characteristics using the sparsity obtained after image data processing; improving single-round aggregation effectiveness through the guidance of client models with higher sample diversity; and improving the timeliness of model updates and training fairness through a model increment asynchronous update mechanism based on training state and a training decision based on model difference. The method optimizes the training efficiency of wireless federated learning from the two directions of data heterogeneity and resource heterogeneity, and ensures training fairness while training quickly through the model increment asynchronous update mechanism and the client training decision.

Description

Wireless federated learning asynchronous training method based on optimization direction guidance
Technical Field
The invention relates to the field of distributed learning in wireless networks, in particular to a wireless federated learning asynchronous training method based on optimization direction guidance.
Background
Federated learning, as a new distributed learning method that satisfies both privacy protection and model training, has attracted extensive attention in academia and industry. As artificial intelligence develops, model training guided by practical applications requires data sets that are authentic and representative, so many companies collect customers' daily real data to meet the target requirements of model training. However, with the growing emphasis on user data privacy and security, numerous privacy protection laws have been introduced, users have become more vigilant about their privacy, and it is difficult for companies or organizations to acquire user data directly.
The concept of federated learning was first proposed by researchers at Google. Its core technical idea is to integrate client information through model interaction, thus completing artificial intelligence model training while protecting user privacy. The core idea is to map the information implied by the source data into the model, with the interactive aggregation step of the central server acting as a decryption mapping. Wireless federated learning refers to federated learning over a wireless network; because wireless networks accommodate clients with different distributions, characteristics, and resources, the performance of federated learning is greatly affected, for example by differences in training optimization direction and training time.
When clients transmit locally trained model parameters to the central server, aggregation may be less effective due to model differences caused by data heterogeneity. In addition, resource heterogeneity among clients leads to different training times, increasing client training costs. Therefore, reducing the performance loss caused by data and resource heterogeneity is key to achieving efficient federated learning. Some previous studies introduced regularization terms related to model variation into the model update algorithm, effectively controlling the influence of large local updates on the global model, but this does not solve the problem thoroughly. Recently, personalized models based on the global sample space and dominated by the local sample space have been proposed, but their application range is limited. In addition, because the training progress of clients in asynchronous federated learning is inconsistent, single-round aggregation effectiveness is low, the number of training rounds is far higher than in ordinary federated learning, and the energy consumption overhead of the training system is too high.
Current research addresses data heterogeneity or resource heterogeneity in isolation and lacks joint optimization of this complex training problem, leading to lengthy training times, high system energy consumption overhead, and inconsistent client performance for federated learning in practical applications.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art, and provides a wireless federated learning asynchronous training method based on optimization direction guidance. The technical scheme of the invention is as follows:
A wireless federated learning asynchronous training method based on optimization direction guidance comprises the following steps:
S1, establishing a training optimization problem according to the data heterogeneity and resource heterogeneity of clients in federated learning, and extracting sample characteristics of the clients by using the sparsity of the processed image data;
S2, actively clustering the client group according to the sample characteristics;
S3, all clients acquiring the global model, and a scheduling scheme being generated according to the clustering result;
S4, sending the scheduling scheme to the set of clients to be trained, calculating the model similarity of the clients, and making a training decision;
and S5, the set of clients participating in training performing training based on the global model, the clients uploading the model increments of their local training results, and the central server aggregating the client parameters to update the global model.
Further, in step S1, a feature extraction module for extracting sample diversity is formed by a rectified linear unit (ReLU), and the sparsity of effective pixels in an image sample is used to obtain a sparse value that reflects the image pixel distribution. For an image sample $x_i \in D_n$, the sparse value after feature extraction is represented as:

$$\rho_i = \frac{Z_i}{H \times W}$$

where $Z_i$ denotes the number of zero elements in the extracted output matrix, and H and W denote the height and width of the output feature map of the ReLU layer in the convolutional neural network model. The sparse difference between two image samples $x_i$ and $x_j$ belonging to the same class m is then:

$$\Delta\rho^{m}_{i,j} = \left| \rho_i - \rho_j \right|$$

The client computes the maximum sparse difference for each sample class and sums these values to obtain an accumulated sparse difference that approximately represents sample diversity; that is, the sample diversity of client n is represented as:

$$\delta_n = \sum_{m=1}^{M} \Delta\rho^{\max}_{n,m}$$

where $\Delta\rho^{\max}_{n,m}$ denotes the maximum sparse difference among the class-m samples of client n, i.e., sparse differences are computed only between image samples belonging to the same class.
Further, in step S2, the client group is actively clustered according to the sample characteristics, specifically including:

The central server obtains the sample diversity sequence $\delta = [\delta_1, \delta_2, \ldots, \delta_N]$ of all clients and then completes client subset clustering with a silhouette-coefficient-assisted K-means algorithm (the silhouette coefficient evaluates intra-class density and dispersion when selecting the K value), removing the clustering algorithm's prior dependence on K. Active clustering yields a sample diversity subset sequence in descending order:

$$\Delta = \{\Delta_1, \Delta_2, \ldots, \Delta_K\}$$

where the subsets are disjoint, together cover all clients, and are ordered so that every diversity value in $\Delta_i$ is no smaller than any value in $\Delta_{i+1}$. The client set corresponding to a sample diversity subset $\Delta_i$ is a training subset, so the whole client population can be divided into several training subsets according to the descending diversity subset sequence. Training subsets whose sample diversity exceeds a set value are guide subsets, and the rest are rendering subsets.
Further, in step S3, all clients obtain the global model and the scheduling scheme is generated according to the clustering result, specifically including:

Scheduling of the training subsets follows a "global first, local second" rule: the guide subsets are trained first, and the rendering subsets are then gradually added to training. A scheduling trigger condition is designed based on the performance improvement ratio. Taking round k as an example, the clients whose local accuracy was lower than the average accuracy $acc_{k-1}$ in the round-(k-1) evaluation form the trigger base set $C_k^{base}$ of round k. The performance improvement ratio of the t-th training in round k, $\eta_k^t$, is expressed as:

$$\eta_k^t = \frac{|C_k^t|}{|C_k^{base}|}$$

where $C_k^t$ denotes the set of clients in $C_k^{base}$ whose accuracy exceeds $acc_{k-1}$ at the t-th training of round k. When $\eta_k^t \geq \eta_{th}$, a new training subset is scheduled and round k+1 begins, where $\eta_{th}$ denotes the performance enhancement factor.
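Under the definitions above, a sketch of the trigger check; the dictionary-based accuracy bookkeeping is an assumption:

```python
def should_schedule_next(acc_now: dict, trigger_base: set,
                         acc_prev_avg: float, eta_th: float) -> bool:
    # eta_k^t: share of trigger-base clients whose accuracy now exceeds
    # the previous round's average accuracy acc_{k-1}.
    improved = [n for n in trigger_base if acc_now[n] > acc_prev_avg]
    return len(improved) / len(trigger_base) >= eta_th
```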
Further, in step S4, the scheduling scheme is sent to the set of clients to be trained, the model similarity of each client is calculated, and a training decision is made.

When a client follows the scheduling arrangement of the central server, it needs to make a training decision according to its own condition to avoid excessive local training. The client checks the model difference by calculating the cosine similarity $\theta_n^t$ between its first-round local model $M_n^1$ and the current global model $M_G^t$:

$$\theta_n^t = \frac{\langle M_n^1, M_G^t \rangle}{\|M_n^1\| \, \|M_G^t\|}$$

The closer $\theta_n^t$ is to 1, the higher the similarity between the local model and the current global model, the less new information the global model carries for this client, and the client should not participate in training. In this case, the client waits a random period of time until $\theta_n^t$ becomes low, and then participates in training scheduling.
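A sketch of the client-side decision, assuming the models are compared as flattened parameter vectors and using a hypothetical threshold theta_th (the patent does not fix a numeric value):

```python
import numpy as np

def participate(local_model: np.ndarray, global_model: np.ndarray,
                theta_th: float = 0.9) -> bool:
    # Cosine similarity near 1 means the global model carries little new
    # information for this client, so it should wait instead of training.
    cos = float(np.dot(local_model, global_model)
                / (np.linalg.norm(local_model) * np.linalg.norm(global_model)))
    return cos < theta_th
```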
Further, in step S5, the set of clients participating in training performs training based on the global model, each client uploads the model increment of its local training result, and the central server aggregates the client parameters to update the global model, specifically including:

The central server updates asynchronously with the local models submitted by clients. To improve the correlation between successive global models, each client submits the incremental part $\Delta M_{n,t}$ of its local model, and the central server adopts the following update algorithm:

$$M_G^t = M_G^{t-1} + \sum_{n=1}^{N_t} \frac{|D_n|}{E_t} \Delta M_{n,t}$$

where $\Delta M_{n,t} = M_{n,t} - M_{n,t-1}$, $M_{n,t}$ is the local model trained by client n at time t, $N_t$ denotes the number of clients participating in the aggregation at time t, and $E_t$ denotes the total number of samples of clients that have participated in training before time t:

$$E_t = \sum_{n=1}^{N} e_{n,t} |D_n|$$

where $e_{n,t}$ denotes the training state coefficient of client n at time t, with initial value 0 and value 1 once the client has participated in training. In this asynchronous update mode, the global model at time t depends only on the global model at time t-1 and the client model increments participating in the update at time t; the central server only needs to store the sample counts of all clients and the global model of the previous round.
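A sketch of one asynchronous aggregation step; the |D_n|/E_t weighting shown is one reading consistent with the definitions of E_t and e_{n,t} above, not necessarily the patent's exact update rule:

```python
import numpy as np

def async_aggregate(global_model: np.ndarray, increments: dict,
                    sample_counts: dict, e_state: dict) -> np.ndarray:
    # Mark newly participating clients (e_{n,t} = 1), recompute E_t, then
    # fold each increment Delta M_{n,t} into the global model weighted by
    # the client's sample share among past participants.
    for n in increments:
        e_state[n] = 1
    E_t = sum(sample_counts[n] for n, e in e_state.items() if e == 1)
    for n, delta_m in increments.items():
        global_model = global_model + (sample_counts[n] / E_t) * delta_m
    return global_model
```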
The invention has the following advantages and beneficial effects:
the optimization problem of efficient training is established around data isomerism and resource isomerism of a client in federated learning; sample diversity characteristics are extracted by using sparsity obtained after image data processing, single-round aggregation effectiveness is improved through guidance of a client model with high sample diversity, and model updating instantaneity and training fairness are improved by using a model increment asynchronous updating mechanism based on a training state and a training decision based on model difference.
The method optimizes the Federal learning algorithm under data isomerism and resource isomerism from three aspects of client difference training, asynchronous updating and training decision, realizes the high-efficiency Federal learning training under a complex wireless network through optimizing direction guidance, a model increment asynchronous updating mechanism and a client training decision based on model difference in steps S1, S2 and S3, and solves the problems of low training efficiency, large time consumption, large energy consumption, low fairness and the like.
Drawings
FIG. 1 is a diagram of the optimization direction guidance-based wireless federated learning model used in a preferred embodiment of the present invention;
FIG. 2 is a flow chart of the implementation of asynchronous training based on optimization direction guidance in wireless federated learning proposed by the present invention;
FIG. 3 is a block diagram of the optimization direction guidance, the model increment asynchronous update mechanism, and the model-difference-based client training decision provided by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
as shown in FIG. 1, the invention is based on a federated learning network composed of a plurality of clients and a central server, and the clients and the central server realize the aggregation of effective information of client groups through model parameter sharing, and finally realize the global model training facing the client groups. The effect of the local model of the client on the global optimization is different due to data heterogeneity, and if all the clients are directly subjected to aggregation processing, the optimization direction of the global model is random, and in addition, training time difference is caused by resource heterogeneity of the clients. Therefore, data heterogeneity and resource heterogeneity are reduced, i.e., federal learning training efficiency is improved. The process involves feature extraction, client scheduling, and aggregation algorithm updating, and model building is performed in order to quantify these resources for optimization.
The invention relates to a wireless federated learning network with multiple clients and a central server. The clients have data storage and computation capability and perform model training locally; the central server is responsible for computing the global model from the local models provided by the clients; and communication between the clients and the central server is established over an OFDMA wireless network for parameter transmission.
First, the central server broadcasts a training task to the clients within its communication coverage; after receiving the task, each client decides whether to participate in the training.
If not, skipping the training task directly;
if so, checking whether the local sample space meets the task requirement, and returning a relevant response message.
Based on the above requirements, the present invention provides an optimization direction guidance-based wireless federated learning asynchronous training method, as shown in FIG. 2, including:
S1, extracting sample diversity characteristics of the clients by using the sparsity of processed image data;
S2, actively clustering the client group according to the sample diversity characteristics;
S3, the clients participating in different training schedules according to the clustering results, to achieve optimization direction guidance;
S4, making a client training decision based on model difference, correcting the global optimization direction, and improving training fairness;
and S5, improving the timeliness of model updates through a model increment asynchronous update mechanism based on the clients' training state.
In this embodiment, an optimization problem for efficient training is established around the data heterogeneity and resource heterogeneity of clients in federated learning; sample diversity characteristics are extracted using the sparsity obtained after image data processing; single-round aggregation effectiveness is improved through the guidance of client models with higher sample diversity; and the model increment asynchronous update mechanism based on training state, together with the training decision based on model difference, improves the timeliness of model updates and training fairness.
In the efficient federated learning training method of this embodiment, data heterogeneity among clients lowers the aggregation effectiveness of the central server, and hardware resource differences among clients make it important to mitigate the impact of resource heterogeneity on training efficiency. In real scenarios, a client group often exhibits data heterogeneity and resource heterogeneity simultaneously, so reducing their influence on federated learning performance is of great significance for practical federated learning applications.
The optimization directions of client models trained on heterogeneous data sets differ, and a global model obtained by simply aggregating the client models often performs poorly, because the global optimization direction after a single aggregation round is random. For the resource heterogeneity problem, asynchronous federated learning is commonly adopted as a solution, but because clients update asynchronously, the client group aggregated in a single round is dynamic, so the correlation between successive global models is low and some effective information is lost. Therefore, controlling the training scheduling of clients with large optimization-direction deviations and improving the correlation between global models in asynchronous updating can effectively improve single-round aggregation effectiveness and reduce training time and system energy overhead.
The distances between the clients and the central server are set to obey a Gaussian distribution, the set of clients is expressed as $\mathcal{N} = \{1, 2, \ldots, N\}$, and the central server is denoted S. Each client has a locally executed training task, denoted $\mathrm{Task}_n = \{D_n, M_n\}$, where $D_n$ represents the local data set of client n and $M_n$ represents the local model of the client, i.e., the model obtained by training on $D_n$ after downloading the initialization model from the central server. $D_n$ is non-independently and identically distributed (non-IID) to simulate the data heterogeneity of the clients.
The time overhead for a client to complete a single training mainly includes the local computation time and the parameter transmission times on the uplink and downlink. The time for client n to complete a single training is expressed as:

$$T_n = T_n^{cmp} + T_n^{up} + T_n^{down}$$

where $T_n^{cmp}$ is the local computation time of client n, $T_n^{up}$ is the parameter uplink transmission time of client n, and $T_n^{down}$ is the parameter downlink transmission time of client n.
In one embodiment, the local computation time $T_n^{cmp}$ of client n is expressed as:

$$T_n^{cmp} = \frac{c_n \cdot iter_n \cdot epoch_n}{f_n}$$

where $c_n$ represents the CPU cycles required for a single iteration of client n, $iter_n$ represents the number of iterations in a single local epoch of client n, $epoch_n$ represents the number of local epochs of client n, and $f_n$ represents the CPU operating frequency of client n.
In one embodiment, the parameter uplink transmission time $T_n^{up}$ of client n is expressed as:

$$T_n^{up} = \frac{|M_n|}{r_n^{up}}$$

where $|M_n|$ represents the local model size of client n and $r_n^{up}$ represents the uplink transmission rate of the client, with the expression:

$$r_n^{up} = B_n \log_2\!\left(1 + \frac{p_n h_n}{N_0 B_n}\right)$$

where $B_n$ represents the wireless access bandwidth of client n, $h_n$ represents the wireless channel gain between client n and the central server, $p_n$ represents the wireless transmission power of client n, and $N_0$ represents the white noise power spectral density of the wireless channel.
In one embodiment, the parameter downlink transmission time $T_n^{down}$ of client n is expressed as:

$$T_n^{down} = \frac{|M_G|}{r_n^{down}}$$

where $|M_G|$ is the global model size and $r_n^{down}$ represents the downlink transmission rate of client n, with the expression:

$$r_n^{down} = B_s \log_2\!\left(1 + \frac{p_s h_n}{N_0 B_s}\right)$$

where $B_s$ represents the wireless access bandwidth of the central server and $p_s$ represents the wireless transmission power of the central server.
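Putting the three terms together, a minimal sketch of the single-training time model under the expressions above (parameter names are illustrative):

```python
import math

def round_time(c_n, iter_n, epoch_n, f_n, model_bits,
               B_n, B_s, p_n, p_s, h_n, N0):
    # T_n = local computation + uplink + downlink transmission, with
    # Shannon-capacity link rates in bits per second.
    t_cmp = c_n * iter_n * epoch_n / f_n
    r_up = B_n * math.log2(1 + p_n * h_n / (N0 * B_n))
    r_down = B_s * math.log2(1 + p_s * h_n / (N0 * B_s))
    return t_cmp + model_bits / r_up + model_bits / r_down
```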
The system energy consumption for a client to complete a single training mainly includes the local computation energy and the parameter transmission energy on the uplink and downlink. The system energy consumption for client n to complete a single training is expressed as:

$$P_n = P_n^{cmp} + P_n^{up} + P_n^{down}$$

where $P_n^{cmp}$ represents the computation energy of a single local training of client n, $P_n^{up}$ represents the uplink transmission energy of client n, and $P_n^{down}$ represents the downlink transmission energy of the central server with respect to client n.
In one embodiment, the computation energy $P_n^{cmp}$ of a single local training of client n is expressed as:

$$P_n^{cmp} = \xi \, c_n \, iter_n \, epoch_n \, f_n^2$$

where $\xi$ represents the CPU energy consumption coefficient.
In one embodiment, the uplink transmission energy $P_n^{up}$ of client n is expressed as:

$$P_n^{up} = p_n T_n^{up}$$
in one embodiment, the central server consumes energy for downstream transmission of the client n
Figure BDA0003899949170000102
The expression of (c) is:
Figure BDA0003899949170000103
the channel between the central server and the client adopts a path loss model of 128.1+37.6lg d and a small-scale Rayleigh fading model, and the system energy consumption cost P of the whole training process system Including the calculation energy consumption and transmission energy consumption of the client, and the transmission energy consumption of the central server, P system Expressed as:
Figure BDA0003899949170000104
wherein s is n Representing the total number of local training times, s, of the client n s,n Representing the number of wireless transmissions of the central server with respect to client n.
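A sketch of the whole-process energy bookkeeping, assuming per-client parameter dictionaries whose key names are illustrative:

```python
def system_energy(clients: dict, s_local: dict, s_server: dict) -> float:
    # P_system: per-client computation + uplink energy over s_local[n]
    # trainings, plus server downlink energy over s_server[n] transmissions.
    total = 0.0
    for n, c in clients.items():
        p_cmp = c["xi"] * c["c_n"] * c["iter_n"] * c["epoch_n"] * c["f_n"] ** 2
        p_up = c["p_n"] * c["t_up"]
        p_down = c["p_s"] * c["t_down"]
        total += s_local[n] * (p_cmp + p_up) + s_server[n] * p_down
    return total
```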
To improve the practical effect of federated learning, the influence of data heterogeneity and resource heterogeneity must be addressed simultaneously: a training scheduling algorithm is designed that effectively handles the spatial distribution differences of client samples, enhancing the correlation between global models while improving client training efficiency, and a model aggregation algorithm is designed that handles dynamic client groups and corrects the global optimization direction.
The heterogeneous data condition of the clients is described by sample diversity: a sample space formed by similar samples is considered concentrated and has low sample diversity, and the corresponding local model has a larger optimization-direction deviation. Client clustering is performed according to sample diversity, client training is scheduled in stages, clients with low sample diversity are restrained, and the guiding effect of client models with high sample diversity is enhanced.
The feature extraction module for extracting sample diversity is formed by a rectified linear unit (ReLU), and the sparsity of effective pixels in an image sample is used to obtain a sparse value that reflects the image pixel distribution. For an image sample $x_i \in D_n$, the sparse value after feature extraction is represented as:

$$\rho_i = \frac{Z_i}{H \times W}$$

where $Z_i$ denotes the number of zero elements in the extracted output matrix, and H and W denote the height and width of the output feature map of the ReLU layer in the convolutional neural network model. The sparse difference between two image samples $x_i$ and $x_j$ belonging to the same class m is then:

$$\Delta\rho^{m}_{i,j} = \left| \rho_i - \rho_j \right|$$

The client computes the maximum sparse difference for each sample class and sums these values to obtain an accumulated sparse difference that approximately represents sample diversity; that is, the sample diversity of client n is represented as:

$$\delta_n = \sum_{m=1}^{M} \Delta\rho^{\max}_{n,m}$$

where $\Delta\rho^{\max}_{n,m}$ denotes the maximum sparse difference among the class-m samples of client n, i.e., sparse differences are computed only between image samples belonging to the same class.
The central server obtains sample diversity through feature extraction, and then divides the clients into training subsets oriented to data heterogeneity through active clustering. Sample diversity among clients of the same training subset is relatively close, and these clients follow the same scheduling strategy.

The central server obtains the sample diversity sequence $\delta = [\delta_1, \delta_2, \ldots, \delta_N]$ of all clients and then completes client subset clustering with a silhouette-coefficient-assisted K-means algorithm (the silhouette coefficient evaluates intra-class density and dispersion when selecting the K value), removing the clustering algorithm's prior dependence on K. Active clustering yields a sample diversity subset sequence in descending order:

$$\Delta = \{\Delta_1, \Delta_2, \ldots, \Delta_K\}$$

where the subsets are disjoint, together cover all clients, and are ordered so that every diversity value in $\Delta_i$ is no smaller than any value in $\Delta_{i+1}$. The client set corresponding to a sample diversity subset $\Delta_i$ is a training subset, so the whole client population can be divided into several training subsets according to the descending diversity subset sequence. For convenience of analysis, training subsets with higher sample diversity are called guide subsets, and the rest are rendering subsets.
To achieve effective aggregation among training subsets with different sample diversity, scheduling of the training subsets follows a "global first, local second" rule: the guide subsets are trained first, and the rendering subsets are then gradually added to training. A scheduling trigger condition is designed based on the performance improvement ratio. Taking round k as an example, the clients whose local accuracy was lower than the average accuracy $acc_{k-1}$ in the round-(k-1) evaluation form the trigger base set $C_k^{base}$ of round k. The performance improvement ratio of the t-th training in round k, $\eta_k^t$, is expressed as:

$$\eta_k^t = \frac{|C_k^t|}{|C_k^{base}|}$$

where $C_k^t$ denotes the set of clients in $C_k^{base}$ whose accuracy exceeds $acc_{k-1}$ at the t-th training of round k. When $\eta_k^t \geq \eta_{th}$, a new training subset is scheduled and round k+1 begins, where $\eta_{th}$ denotes the performance enhancement factor.
When a client follows the scheduling arrangement of the central server, it needs to make a training decision according to its own condition to avoid excessive local training. The client checks the model difference by calculating the cosine similarity $\theta_n^t$ between its first-round local model $M_n^1$ and the current global model $M_G^t$:

$$\theta_n^t = \frac{\langle M_n^1, M_G^t \rangle}{\|M_n^1\| \, \|M_G^t\|}$$

The closer $\theta_n^t$ is to 1, the higher the similarity between the local model and the current global model, the less new information the global model carries for this client, and the client should not participate in training. In this case, the client waits a random period of time until $\theta_n^t$ becomes low, and then participates in training scheduling.
The central server updates asynchronously with the local models submitted by clients. To improve the correlation between successive global models, each client submits the incremental part $\Delta M_{n,t}$ of its local model, and the central server adopts the following update algorithm:

$$M_G^t = M_G^{t-1} + \sum_{n=1}^{N_t} \frac{|D_n|}{E_t} \Delta M_{n,t}$$

where $\Delta M_{n,t} = M_{n,t} - M_{n,t-1}$, $M_{n,t}$ is the local model trained by client n at time t, $N_t$ denotes the number of clients participating in the aggregation at time t, and $E_t$ denotes the total number of samples of clients that have participated in training before time t:

$$E_t = \sum_{n=1}^{N} e_{n,t} |D_n|$$

where $e_{n,t}$ denotes the training state coefficient of client n at time t, with initial value 0 and value 1 once the client has participated in training. In this asynchronous update mode, the global model at time t depends only on the global model at time t-1 and the client model increments participating in the update at time t. The central server only needs to store the sample counts of all clients and the global model of the previous round, so the storage requirement is low.
The above apparatus or units may be embodied by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises that element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims (10)

1. A wireless federated learning asynchronous training method based on optimization direction guidance, characterized by comprising the following steps:
S1, establishing a training optimization problem according to the data heterogeneity and resource heterogeneity of clients in federated learning, and extracting sample diversity characteristics of the clients by using the sparsity of the processed image data;
S2, actively clustering the client group according to the sample diversity characteristics;
S3, all clients acquiring the global model, and a scheduling scheme being generated according to the clustering result;
S4, sending the scheduling scheme to the set of clients to be trained, calculating the model similarity of the clients, and making a training decision;
and S5, the set of clients participating in training performing training based on the global model, the clients uploading the model increments of their local training results, and the central server aggregating the client parameters to update the global model.
2. The optimization direction guidance-based wireless federated learning asynchronous training method as claimed in claim 1, wherein in federated learning, the distances between the clients and the central server are set to obey a Gaussian distribution, the set of clients is expressed as $\mathcal{N} = \{1, 2, \ldots, N\}$, and the central server is denoted S; each client has a locally executed training task, denoted $\mathrm{Task}_n = \{D_n, M_n\}$, where $D_n$ represents the local data set of client n and $M_n$ represents the local model of the client, i.e., the model obtained by training on $D_n$ after downloading the initialization model from the central server; $D_n$ is non-IID to simulate the data heterogeneity of the clients.
3. The optimization direction guidance-based wireless federated learning asynchronous training method as claimed in claim 1, wherein the time overhead for a client to complete a single training, including the local computation time and the parameter transmission times on the uplink and downlink, is expressed as:

$$T_n = T_n^{cmp} + T_n^{up} + T_n^{down}$$

where $T_n^{cmp}$ is the local computation time of client n, $T_n^{up}$ is the parameter uplink transmission time of client n, and $T_n^{down}$ is the parameter downlink transmission time of client n;

the local computation time $T_n^{cmp}$ of client n is expressed as:

$$T_n^{cmp} = \frac{c_n \cdot iter_n \cdot epoch_n}{f_n}$$

where $c_n$ represents the CPU cycles required for a single iteration of client n, $iter_n$ represents the number of iterations in a single local epoch of client n, $epoch_n$ represents the number of local epochs of client n, and $f_n$ represents the CPU operating frequency of client n;

the parameter uplink transmission time $T_n^{up}$ of client n is expressed as:

$$T_n^{up} = \frac{|M_n|}{r_n^{up}}$$

where $|M_n|$ represents the local model size of client n and $r_n^{up}$ represents the uplink transmission rate of the client, with the expression:

$$r_n^{up} = B_n \log_2\!\left(1 + \frac{p_n h_n}{N_0 B_n}\right)$$

where $B_n$ represents the wireless access bandwidth of client n, $h_n$ represents the wireless channel gain between client n and the central server, $p_n$ represents the wireless transmission power of client n, and $N_0$ represents the white noise power spectral density of the wireless channel;

the parameter downlink transmission time $T_n^{down}$ of client n is expressed as:

$$T_n^{down} = \frac{|M_G|}{r_n^{down}}$$

where $|M_G|$ is the global model size and $r_n^{down}$ represents the downlink transmission rate of client n, with the expression:

$$r_n^{down} = B_s \log_2\!\left(1 + \frac{p_s h_n}{N_0 B_s}\right)$$

where $B_s$ represents the wireless access bandwidth of the central server and $p_s$ represents the wireless transmission power of the central server.
4. The optimization direction guidance-based wireless federated learning asynchronous training method as claimed in claim 3, wherein the system energy consumption for a client to complete a single training includes the local computation energy and the parameter transmission energy on the uplink and downlink, expressed as:

$$P_n = P_n^{cmp} + P_n^{up} + P_n^{down}$$

where $P_n^{cmp}$ represents the computation energy of a single local training of client n, $P_n^{up}$ represents the uplink transmission energy of client n, and $P_n^{down}$ represents the downlink transmission energy of the central server with respect to client n;

the computation energy $P_n^{cmp}$ of a single local training of client n is expressed as:

$$P_n^{cmp} = \xi \, c_n \, iter_n \, epoch_n \, f_n^2$$

where $\xi$ represents the CPU energy consumption coefficient;

the uplink transmission energy $P_n^{up}$ of client n is expressed as:

$$P_n^{up} = p_n T_n^{up}$$

the downlink transmission energy $P_n^{down}$ of the central server with respect to client n is expressed as:

$$P_n^{down} = p_s T_n^{down}$$
5. The optimization direction guidance-based wireless federated learning asynchronous training method as claimed in claim 4, wherein the channel between the central server and the clients adopts a path-loss model of 128.1 + 37.6 lg d and a small-scale Rayleigh fading model, and the system energy consumption overhead $P_{system}$ of the whole training process, including the computation and transmission energy of the clients and the transmission energy of the central server, is expressed as:

$$P_{system} = \sum_{n=1}^{N} \left[ s_n \left( P_n^{cmp} + P_n^{up} \right) + s_{s,n} P_n^{down} \right]$$

where $s_n$ represents the total number of local trainings of client n and $s_{s,n}$ represents the number of wireless transmissions of the central server with respect to client n.
6. The wireless federated learning asynchronous training method based on optimization direction guidance as claimed in claim 4, wherein in step S1, a feature extraction module for extracting sample diversity is formed by a rectified linear unit (ReLU), and the sparsity of effective pixels in an image sample is used to obtain a sparse value that reflects the image pixel distribution; for an image sample $x_i \in D_n$, the sparse value after feature extraction is represented as:

$$\rho_i = \frac{Z_i}{H \times W}$$

where $Z_i$ denotes the number of zero elements in the extracted output matrix, and H and W denote the height and width of the output feature map of the ReLU layer in the convolutional neural network model; the sparse difference between two image samples $x_i$ and $x_j$ belonging to the same class m is:

$$\Delta\rho^{m}_{i,j} = \left| \rho_i - \rho_j \right|$$

the client computes the maximum sparse difference for each sample class and sums these values to obtain an accumulated sparse difference that approximately represents sample diversity, i.e., the sample diversity of client n is represented as:

$$\delta_n = \sum_{m=1}^{M} \Delta\rho^{\max}_{n,m}$$

where $\Delta\rho^{\max}_{n,m}$ denotes the maximum sparse difference among the class-m samples of client n, i.e., sparse differences are computed only between image samples belonging to the same class.
7. The optimization direction guidance-based wireless federated learning asynchronous training method as claimed in claim 6, wherein in step S2 the client group is actively clustered according to the sample characteristics, specifically comprising:

the central server obtains the sample diversity sequence $\delta = [\delta_1, \delta_2, \ldots, \delta_N]$ of all clients and then completes client subset clustering with a silhouette-coefficient-assisted K-means algorithm (the silhouette coefficient evaluates intra-class density and dispersion when selecting the K value), removing the clustering algorithm's prior dependence on K; active clustering yields a sample diversity subset sequence in descending order:

$$\Delta = \{\Delta_1, \Delta_2, \ldots, \Delta_K\}$$

where the subsets are disjoint, together cover all clients, and are ordered so that every diversity value in $\Delta_i$ is no smaller than any value in $\Delta_{i+1}$; the client set corresponding to a sample diversity subset $\Delta_i$ is a training subset, so the whole client population can be divided into several training subsets according to the descending diversity subset sequence; training subsets whose sample diversity exceeds a set value are guide subsets, and the rest are rendering subsets.
8. The optimization direction guidance-based wireless federated learning asynchronous training method as claimed in claim 7, wherein in step S3 all clients obtain the global model and the scheduling scheme is generated according to the clustering result, specifically comprising:

scheduling of the training subsets follows a "global first, local second" rule: the guide subsets are trained first, and the rendering subsets are then gradually added to training; a scheduling trigger condition is designed based on the performance improvement ratio; taking round k as an example, the clients whose local accuracy was lower than the average accuracy $acc_{k-1}$ in the round-(k-1) evaluation form the trigger base set $C_k^{base}$ of round k; the performance improvement ratio of the t-th training in round k, $\eta_k^t$, is expressed as:

$$\eta_k^t = \frac{|C_k^t|}{|C_k^{base}|}$$

where $C_k^t$ denotes the set of clients in $C_k^{base}$ whose accuracy exceeds $acc_{k-1}$ at the t-th training of round k; when $\eta_k^t \geq \eta_{th}$, a new training subset is scheduled and round k+1 begins, where $\eta_{th}$ denotes the performance enhancement factor.
9. The wireless federated learning asynchronous training method based on optimization direction guidance as claimed in claim 8, wherein in step S4 the scheduling scheme is sent to the set of clients to be trained, the model similarity of each client is calculated, and a training decision is made;

when a client follows the scheduling arrangement of the central server, it needs to make a training decision according to its own condition to avoid excessive local training; the client checks the model difference by calculating the cosine similarity $\theta_n^t$ between its first-round local model $M_n^1$ and the current global model $M_G^t$:

$$\theta_n^t = \frac{\langle M_n^1, M_G^t \rangle}{\|M_n^1\| \, \|M_G^t\|}$$

the closer $\theta_n^t$ is to 1, the higher the similarity between the local model and the current global model, the less new information the global model carries for this client, and the client should not participate in training; in this case, the client waits a random period of time until $\theta_n^t$ becomes low, and then participates in training scheduling.
10. The optimization direction guidance-based wireless federated learning asynchronous training method as claimed in claim 9, wherein in step S5 the set of clients participating in training performs training based on the global model, each client uploads the model increment of its local training result, and the central server aggregates the client parameters to update the global model, specifically comprising:

the central server updates asynchronously with the local models submitted by clients; to improve the correlation between successive global models, each client submits the incremental part $\Delta M_{n,t}$ of its local model, and the central server adopts the following update algorithm:

$$M_G^t = M_G^{t-1} + \sum_{n=1}^{N_t} \frac{|D_n|}{E_t} \Delta M_{n,t}$$

where $\Delta M_{n,t} = M_{n,t} - M_{n,t-1}$, $M_{n,t}$ is the local model trained by client n at time t, $N_t$ denotes the number of clients participating in the aggregation at time t, and $E_t$ denotes the total number of samples of clients that have participated in training before time t:

$$E_t = \sum_{n=1}^{N} e_{n,t} |D_n|$$

where $e_{n,t}$ denotes the training state coefficient of client n at time t, with initial value 0 and value 1 once the client has participated in training; in this asynchronous update mode, the global model at time t depends only on the global model at time t-1 and the client model increments participating in the update at time t; the central server only needs to store the sample counts of all clients and the global model of the previous round.
CN202211287329.1A 2022-10-20 2022-10-20 Wireless federated learning asynchronous training method based on optimization direction guidance Active CN115618963B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211287329.1A CN115618963B (en) Wireless federated learning asynchronous training method based on optimization direction guidance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211287329.1A CN115618963B (en) Wireless federated learning asynchronous training method based on optimization direction guidance

Publications (2)

Publication Number Publication Date
CN115618963A true CN115618963A (en) 2023-01-17
CN115618963B CN115618963B (en) 2023-07-14

Family

ID=84864319

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211287329.1A Active CN115618963B (en) Wireless federated learning asynchronous training method based on optimization direction guidance

Country Status (1)

Country Link
CN (1) CN115618963B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508205A (en) * 2020-12-04 2021-03-16 中国科学院深圳先进技术研究院 Method, device and system for scheduling federated learning
CN113516249A (en) * 2021-06-18 2021-10-19 重庆大学 Federal learning method, system, server and medium based on semi-asynchronization
CN113988160A (en) * 2021-10-15 2022-01-28 武汉大学 Semi-asynchronous layered federal learning updating method based on timeliness
CN114219097A (en) * 2021-11-30 2022-03-22 华南理工大学 Federal learning training and prediction method and system based on heterogeneous resources
CN114980127A (en) * 2022-05-18 2022-08-30 东南大学 Calculation unloading method based on federal reinforcement learning in fog wireless access network
CN114912705A (en) * 2022-06-01 2022-08-16 南京理工大学 Optimization method for heterogeneous model fusion in federated learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHI GUOMEI ET AL.: "HySync: hybrid federated learning with effective synchronization", IEEE, pages 628-633 *
WANG ZHOUYU ET AL.: "Asynchronous federated learning over wireless communication networks", IEEE, pages 6961-6979 *

Also Published As

Publication number Publication date
CN115618963B (en) 2023-07-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant