CN115618963A

CN115618963A - Wireless federal learning asynchronous training method based on optimized direction guidance

Info

Publication number: CN115618963A
Application number: CN202211287329.1A
Authority: CN
Inventors: 郭爽; 吕云山; 栗强强
Original assignee: Chongqing Yitong College
Current assignee: Chongqing Yitong College
Priority date: 2022-10-20
Filing date: 2022-10-20
Publication date: 2023-01-17
Anticipated expiration: 2042-10-20
Also published as: CN115618963B

Abstract

The invention requests to protect an asynchronous federated learning asynchronous training method based on optimization direction guidance, which belongs to the field of distributed learning in a wireless network and comprises the following steps: establishing an optimization problem of efficient training around data heterogeneity and resource heterogeneity of a client in federated learning; extracting sample diversity characteristics by using sparsity after image data processing, improving single-round aggregation effectiveness by guiding a client model with higher sample diversity, and improving model updating instantaneity and training fairness by using a model increment asynchronous updating mechanism based on a training state and a training decision based on model difference; the method optimizes and improves the training efficiency of wireless federal learning from two directions of data isomerism and resource isomerism, and ensures the training fairness while training quickly through a model increment asynchronous updating mechanism and a client training decision.

Description

Wireless federal learning asynchronous training method based on optimized direction guidance

Technical Field

The invention relates to the field of distributed learning in a wireless network, in particular to an asynchronous federated learning asynchronous training method based on optimization direction guidance.

Background

Federal learning has attracted extensive attention in both academia and industry as a new distributed learning method that can satisfy both privacy protection and model training. Artificial intelligence is increasingly developed, and model training guided by practical application requires a data set to have authenticity and universality, so that a large number of companies collect daily real data of customers to meet the target requirement of model training. However, with the increasing emphasis on user data privacy security, a large number of privacy data protection laws have been introduced, so that the privacy security vigilance of users is also improved, and it is difficult for companies or organizations to directly acquire user data.

The concept of federal learning is firstly provided by researchers of Google company, and the core technical idea is that the integration processing of client information is realized in a model interaction mode, so that the model training of artificial intelligence is completed on the premise of protecting the privacy of users. The core idea is to map the information implied by the source data into the model, and the interactive aggregation link of the central server is equivalent to a decryption mapping relation. The wireless federal learning refers to federal learning in a wireless network, and due to compatibility of the wireless network, the client sides are distributed differently, have different characteristics and have different resources, so that performance of the federal learning is greatly influenced, such as optimization direction difference and training time difference of training.

When the client transmits locally trained model parameters to the central server, the aggregation may be less effective due to model differences caused by data heterogeneity. In addition, resource heterogeneity between clients makes training time different, increasing client training costs. Therefore, reducing performance loss due to data and resource heterogeneity is key to achieving efficient federal learning. In some previous researches, regular terms related to model variation are introduced into a model updating algorithm, so that the influence of local large-amplitude updating on a global model is effectively controlled, but the problem cannot be thoroughly solved. Recently, personalized models based on global sample space and dominated by local sample space are proposed, but the model application range is limited. In addition, due to the fact that the training progress of the client-side of the asynchronous federal learning is inconsistent, single-round aggregation effectiveness is low, the number of training rounds is far higher than that of common federal learning, and the energy consumption expense of a training system is too high.

The current research is dedicated to unilateral research of data isomerism and resource isomerism, and joint optimization of complex training problems is lacked. Leading to lengthy training time, high system energy consumption overhead, and inconsistent performance of clients for federal learning in practical applications.

Disclosure of Invention

The present invention is directed to solving the above problems of the prior art. A wireless federal learning asynchronous training method based on optimization direction guidance is provided. The technical scheme of the invention is as follows:

a wireless federal learning asynchronous training method based on optimized direction guidance comprises the following steps:

s1, establishing a training optimization problem according to data isomerism and resource isomerism of a client in federal learning; extracting the sample characteristics of the client by utilizing the sparsity of the processed image data;

s2, actively clustering the client group according to the sample characteristics;

s3, all the clients acquire a global model, and the clients generate a scheduling scheme according to the clustering result;

s4, sending the scheduling scheme to a client set to be trained, calculating the model similarity of the clients, and performing training decision;

and S5, performing training based on the global model by the client side set participating in training, uploading model increment of a local training result by the client side, and aggregating client side parameters of the central grain circle to update the global model.

Further, in the step S1, a feature extraction module for extracting sample diversity is formed by a rectification linear unit, and a sparse value capable of reflecting image pixel distribution is obtained by using sparsity of effective pixels in an image sample; image sample x _i ∈D _n The sparse value after feature extraction is represented as:

wherein the content of the first and second substances,

the number of zero elements in the extracted output matrix is represented, H and W respectively represent the height and width of the output characteristic diagram of the rectifying linear unit layer in the convolutional neural network model, namely the sparse difference value of two image samples belonging to the same genus m is as follows:

the client respectively calculates the maximum sparse difference values according to the sample types, and finally sums to obtain the accumulated sparse difference values for approximately representing the sample diversity, namely the sample diversity of the client n is represented as follows:

wherein, the first and the second end of the pipe are connected with each other,

the maximum sparse difference of the class samples m representing the client n, i.e. the image samples for which the same sparse difference is calculated, belong to the same class.

Further, the S2 actively clusters the client group according to the sample characteristics, specifically including:

the central server obtains a sample diversity sequence delta = [ delta ] of all clients ₁ ，δ ₂ ，...，δ _N ]And then, finishing the clustering of the client subset by utilizing a K-means algorithm (the contour coefficient is adopted to evaluate the class density and dispersion degree in the selection of the K value) based on the contour coefficient assistance so as to break through the prior dependence of the clustering algorithm on the K value, and obtaining a diversity subset sequence of the descending sample through active clustering:

wherein:

sample diversity subset delta ⁱ The corresponding client end set is the training subset, so the whole client ends can be divided into a plurality of training subsets according to the diversity subset sequence of the descending sample,

the training subset larger than the set value is a guide subset, and the rest are rendering subsets.

Further, in S3, all the clients obtain the global model, and the clients generate the scheduling scheme according to the clustering result, which specifically includes:

the scheduling of the training subsets follows the rule of 'global first and local second', the subsets are guided to be trained first, and then the rendering subsets are gradually added into the training; designing a scheduling triggering condition based on a performance improvement ratio, taking k rounds of training as an example, and in k-1 rounds of evaluation, the local accuracy is lower than the average accuracy acc _k-1 As a trigger base set of k rounds

Performance improvement ratio of the t-th training in the k-th training

Expressed as:

wherein the content of the first and second substances,

represents the t-th training in the k-th training roundIn the process of exercise,

the medium accuracy rate is higher than acc _k-1 When a set of clients is present

Then scheduling a new training subset, starting k +1 training rounds,

representing the performance enhancement factor.

Further, in the step S4, the scheduling scheme is sent to a set of clients to be trained, the model similarity of the clients is calculated, and a training decision is made;

when the client follows the scheduling arrangement of the central server, training decisions need to be made according to the self condition, so that excessive local training is avoided; the client needs to check the difference of the models and calculates the first round of local models

With the current global model

Cosine similarity between them

The expression is as follows:

when in use

The closer the value is to 1, the higher the similarity between the local model and the current global model is, the less the information amount in the global model is, and the training should not be participated in. At this time, the client randomly waits for a period of time until

And the numerical value becomes low and then participates in training and scheduling.

Further, in S5, the training is performed based on the global model expansion by the client set participating in the training, the client uploads the model increment of the local training result, and the central cereal circle aggregation client updates the global model by the client parameter, which specifically includes:

the central server adopts an asynchronous updating mode for the local model submitted by the client, and in order to improve the correlation between the global models, the client submits the incremental part delta M of the local model _n,t The central server adopts an updating algorithm as follows:

wherein, Δ M _n,t ＝M _n,t -M _n,t-1 ，M _n,t Local model trained for client N at time t, N _t Representing the number of clients participating in the aggregation at time t, E _t Represents the sum of the number of client samples that have participated in training before time t:

wherein e is _n,t The training state coefficient of the client n at the moment t is represented, the initial value is 0, and the value is 1 after the client participates in training; under the asynchronous updating mode, the global model at the moment t only depends on the global model at the moment t1 and the client model increment participating in updating at the moment t; the central server only needs to store the sample number of all the clients and the global model of the previous round.

The invention has the following advantages and beneficial effects:

the optimization problem of efficient training is established around data isomerism and resource isomerism of a client in federated learning; sample diversity characteristics are extracted by using sparsity obtained after image data processing, single-round aggregation effectiveness is improved through guidance of a client model with high sample diversity, and model updating instantaneity and training fairness are improved by using a model increment asynchronous updating mechanism based on a training state and a training decision based on model difference.

The method optimizes the Federal learning algorithm under data isomerism and resource isomerism from three aspects of client difference training, asynchronous updating and training decision, realizes the high-efficiency Federal learning training under a complex wireless network through optimizing direction guidance, a model increment asynchronous updating mechanism and a client training decision based on model difference in steps S1, S2 and S3, and solves the problems of low training efficiency, large time consumption, large energy consumption, low fairness and the like.

Drawings

FIG. 1 is a diagram of an optimized direction-based wireless federal learning model used in a preferred embodiment of the present invention;

FIG. 2 is a flow chart of an implementation of asynchronous training based on optimized direction guidance in wireless federal learning proposed by the present invention;

FIG. 3 is a block diagram of optimized direction guidance, model incremental asynchronous update mechanism, and model-diversity-based client training decisions as provided by the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.

The technical scheme for solving the technical problems is as follows:

as shown in FIG. 1, the invention is based on a federated learning network composed of a plurality of clients and a central server, and the clients and the central server realize the aggregation of effective information of client groups through model parameter sharing, and finally realize the global model training facing the client groups. The effect of the local model of the client on the global optimization is different due to data heterogeneity, and if all the clients are directly subjected to aggregation processing, the optimization direction of the global model is random, and in addition, training time difference is caused by resource heterogeneity of the clients. Therefore, data heterogeneity and resource heterogeneity are reduced, i.e., federal learning training efficiency is improved. The process involves feature extraction, client scheduling, and aggregation algorithm updating, and model building is performed in order to quantify these resources for optimization.

The invention relates to a wireless federal learning network based on multiple clients and a central server, wherein the clients have data storage and calculation capacity and carry out model training locally, and the central server is responsible for calculating a global model according to a local model provided by the clients and establishing communication between the clients and the central server by utilizing an OFDMA wireless network to realize parameter transmission.

Firstly, a central server broadcasts a training task to a client in a communication coverage range, and after the client receives the task, the client decides whether to participate in task training.

If not, skipping the training task directly;

if so, checking whether the local sample space meets the task requirement, and returning a relevant response message.

Based on the above requirements, the present invention provides an optimization direction guidance-based wireless federal learning asynchronous training method, as shown in fig. 2, including:

s1, extracting sample diversity characteristics of a client by using sparsity of processed image data;

s2, realizing active clustering of the client group according to the diversity characteristics of the samples;

s3, the client participates in different training schedules according to clustering results to achieve optimized direction guiding;

s4, a client training decision based on model difference is made, the global optimization direction is corrected, and the training fairness is improved;

and S5, improving the real-time performance of model updating by a model increment asynchronous updating mechanism based on the training state of the client.

In the embodiment, an optimization problem of efficient training is established around data heterogeneity and resource heterogeneity of a client in federated learning; the method has the advantages that the method utilizes sparsity obtained after image data processing to extract sample diversity characteristics, improves single-round aggregation effectiveness through the guidance of a client model with higher sample diversity, and utilizes a model increment asynchronous updating mechanism based on a training state and a training decision based on model difference to improve model updating instantaneity and training fairness.

In the federal learning efficient training method in this embodiment, the aggregation effectiveness of the central server is low due to data heterogeneity of the clients, and meanwhile, due to hardware resource differences among the clients, it is very important to alleviate the influence of resource heterogeneity on the training efficiency. In a real scene, a client group often has the characteristics of data isomerism and resource isomerism at the same time, so that the influence of the data isomerism and the resource isomerism on the Federal learning performance is reduced, and the method has great significance for the Federal learning practical application.

Optimization directions of client models trained based on heterogeneous data sets are different, and a global model obtained by simply aggregating the client models is often poor in performance, because the global optimization directions after single-round aggregation are random. For the problem of resource heterogeneity, asynchronous federal learning is generally adopted as a solution, but due to asynchronous updating among clients, a single-round aggregated client group is dynamic, so that the relevance of a front-round global model and a rear-round global model is low, and partial effective information is lost. Therefore, training scheduling of a client with a large deviation of the optimization direction is controlled, correlation among global models in asynchronous updating is improved, single-round aggregation effectiveness can be effectively improved, and training time and system energy consumption overhead are reduced.

The distance between the client and the central server is set to obey Gaussian distribution, and the set of the clients is expressed as

The central server is denoted S. Each client has a local training task, denoted as

D _n Local data set, M, representing a client _n Local models representing clients, i.e. after downloading the initialisation model from the central server, based on D _n The resulting model is trained. D _n The method has non-independent equal distribution so as to simulate data heterogeneity of the client.

For the time overhead of the client for completing the single training, the time overhead mainly includes the time calculated locally, the transmission time of the parameter in the uplink and the transmission time of the parameter in the downlink, and the time expression of the client for completing the single training is as follows:

wherein

The time is calculated locally for the client n,

the uplink transmission time is the parameter of the client n,

the parameter downlink transmission time of the client n.

In one embodiment, the local computation time of client n

The expression of (a) is:

wherein, c _n Represents the CPU operation period, iter, corresponding to the single iteration of the client n _n Represents the iteration number, epoch, corresponding to the single local iteration of the client n _n Representing the local iteration round number, f, of the client n _n Indicating the CPU operating frequency of the client n.

In one embodiment, the parameter of client n is uplink transmission time

The expression of (a) is:

wherein, | M _n L represents the local model size of client n,

the uplink transmission rate of the client is represented by the following expression:

representing the wireless access bandwidth, h, of the client n _n Representing the wireless channel gain between client n and the central server,

representing the wireless transmission power, N, of the client N ₀ Representing the white noise power spectral density in the radio channel.

In one embodiment, the parameter of the client n is downlink transmission time

The expression of (a) is:

wherein the content of the first and second substances,

the downlink transmission rate of the client n is represented by the following expression:

wherein, B ^s Indicating the radio access bandwidth, p, of the central server ^s Representing the wireless transmit power of the central server.

The system energy consumption for the client to finish the single training mainly comprises local calculation energy consumption, parameter transmission energy consumption in an uplink and parameter transmission energy consumption in a downlink, and the system energy consumption expression for the client to finish the single training is as follows:

wherein

Representing the computational energy consumption of a single local training of client n,

representing the upstream transmission energy consumption of the client n,

representing the downstream transmission energy consumption of the central server with respect to the client n.

In one embodiment, the computational energy consumption of a single local training of client n

The expression of (c) is:

where ξ represents the CPU power consumption coefficient.

In one embodiment of the present invention,uplink transmission energy consumption of client n

The expression of (a) is:

in one embodiment, the central server consumes energy for downstream transmission of the client n

The expression of (c) is:

the channel between the central server and the client adopts a path loss model of 128.1+37.6lg d and a small-scale Rayleigh fading model, and the system energy consumption cost P of the whole training process ^system Including the calculation energy consumption and transmission energy consumption of the client, and the transmission energy consumption of the central server, P ^system Expressed as:

wherein s is _n Representing the total number of local training times, s, of the client n _s,n Representing the number of wireless transmissions of the central server with respect to client n.

The method has the advantages that the practical application effect of the federated learning is improved, the influence of data isomerism and resource isomerism on the federated learning needs to be solved at the same time, a training scheduling algorithm capable of effectively dealing with the spatial distribution difference of client samples is designed, the correlation among global models is enhanced while the training efficiency of the client is improved, and a model aggregation algorithm capable of dealing with dynamic client groups and modifying the global optimization direction is designed.

The heterogeneous data condition of the client is described by using the sample diversity, the sample space formed by similar samples is considered to be concentrated, the sample diversity is low, the larger the deviation of the optimization direction of the corresponding local model is, the client clustering is realized according to the sample diversity, the client training is scheduled in stages, the client model with low sample diversity is controlled, and the guiding effect of the client model with high sample diversity is enhanced.

The characteristic extraction module for extracting the diversity of the samples is composed of a rectification linear unit, and the sparsity of effective pixels in the image samples is utilized to obtain a sparse value capable of reflecting the pixel distribution of the image. Image sample x _i ∈D _n The sparse value after feature extraction is represented as:

wherein the content of the first and second substances,

wherein the content of the first and second substances,

The central server obtains sample diversity based on feature extraction, and then realizes training subset division facing data isomerism through active clustering. The sample diversity among clients of the same training subset is relatively close, and the same scheduling strategy is followed.

The central server obtains a sample diversity sequence delta = [ delta ] of all clients ₁ ,δ ₂ ,...,δ _N ]Then, completing client subset clustering by using a K-means algorithm (adopting contour coefficients to evaluate the class density and dispersion degree in the selection of the K value) based on contour coefficient assistance so as to break through the prior dependence of the clustering algorithm on the K value, and obtaining a diversity subset sequence of the descending sample through active clustering:

wherein:

sample diversity subset delta ⁱ The corresponding client end set is the training subset, so that the whole client ends can be divided into a plurality of training subsets according to the diversity subset sequence of the descending sample. For convenience of analysis, scale

The higher training subset is the boot subset and the rest are the rendering subsets.

In order to realize effective aggregation among training subsets with different sample diversity, the scheduling of the training subsets follows the rule of 'global first and local second', the subsets are guided to be trained first, and then the subsets are rendered to be gradually added into training. Designing a scheduling trigger condition based on a performance improvement ratio, taking k rounds of training as an example, and in k-1 rounds of evaluation, the local accuracy is lower than an average standardRate of certainty acc _k-1 As a trigger base set of k rounds

Performance boost ratio for the t-th training in the kth training round

Expressed as:

wherein the content of the first and second substances,

it is shown that in the t-th training of the k-th training round,

the accuracy is higher than acc _k-1 When the client is a collection

Then, a new training subset is scheduled, k +1 training rounds are started,

representing the performance enhancement factor.

When the client follows the scheduling arrangement of the central server, training decisions need to be made according to the self condition, so that excessive local training is avoided. The client needs to check the difference of the models and calculate the first round of local models

With the current global model

Cosine similarity between them

The expression is as follows:

when in use

The value becomes lower and then participates in the training scheduling.

wherein e is _n,t The training state coefficient of the client n at the time t has an initial value of 0 and a value of 1 after the client participates in training. In the asynchronous update mode, the global model at the time t only depends on the global model at the time t1, and the client model increment participating in the update at the time t. The central server only needs to store the sample number of all the clients and the global model of the previous roundThe storage requirements are lower.

Or a unit, which may be embodied by a computer chip or an entity, or by a product having a certain functionality. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrases "comprising one of 8230; \8230;" 8230; "does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises that element.

The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims

1. A wireless federal learning asynchronous training method based on optimized direction guidance is characterized by comprising the following steps:

s1, establishing a training optimization problem according to data isomerism and resource isomerism of a client in federal learning; extracting sample diversity characteristics of the client by using the sparsity of the processed image data;

s2, actively clustering a client group according to the diversity characteristics of the samples;

s4, sending the scheduling scheme to a client set for training, calculating the model similarity of the client, and performing a training decision;

2. The optimization direction guidance-based wireless federal learning asynchronous training method as claimed in claim 1, wherein in federal learning, the distance between the client and the central server is set to be subject to gaussian distribution, and the client set is represented as

The central server is denoted S, each client has a locally performed training Task, denoted Task _n ＝D _n ,M _n ，D _n Local data set, M, representing a client _n Local model representing client, i.e. after downloading the initialization model from the central server, based on D _n Training the resulting model, D _n The method has non-independent equal distribution so as to simulate data heterogeneity of the client.

3. The optimization direction guidance-based wireless federal learning asynchronous training method as claimed in claim 1, wherein the time overhead of the client for completing a single training, including the locally calculated time, the transmission time of the parameter in the uplink and the transmission time of the parameter in the downlink, is expressed as:

wherein

The time is calculated for the local of the client n,

the uplink transmission time is the parameter of the client n,

the parameter downlink transmission time of the client n is obtained;

local computation time of client n

The expression of (c) is:

wherein, c _n Represents the CPU operation period, iter, corresponding to n single iterations of the client _n Represents the iteration number, epoch, corresponding to the single local iteration of the client n _n Representing the local iteration round number, f, of the client n _n Representing the CPU operation frequency of the client n;

parameter uplink transmission time of client n

The expression of (a) is:

wherein, | M _n L represents the local model size of client n,

wherein the content of the first and second substances,

representing the wireless transmission power of the client N, N ₀ Representing a white noise power spectral density in a wireless channel;

parameter downlink transmission time of client n

The expression of (a) is:

4. The method of claim 3, wherein the method further comprises the step of performing the optimized direction-oriented-based wireless federated learning asynchronous training,

the system energy consumption of the client for completing the single training comprises local calculation energy consumption, transmission energy consumption of parameters in an uplink and transmission energy consumption of parameters in a downlink, and the system energy consumption expression of the client for completing the single training is as follows:

wherein

representing the upstream transmission energy consumption of the client n,

representing the downlink transmission energy consumption of the central server relative to the client n;

computational energy consumption for a single local training of client n

The expression of (c) is:

where ξ represents the CPU energy consumption coefficient;

uplink transmission energy consumption of client n

The expression of (a) is:

downlink transmission energy consumption of central server relative to client n

The expression of (a) is:

5. the optimization direction guidance-based wireless federal learning asynchronous training method as claimed in claim 4, wherein the channel between the central server and the client uses a path loss model of 128.1+37.6lg d and a small-scale Rayleigh fading model, and the system energy consumption overhead P of the whole training process ^system Including the calculation and transmission energy consumption of the client, and the transmission energy consumption of the central server, P ^system Expressed as:

6. The wireless federal learning asynchronous training method based on optimization direction guidance as claimed in claim 4, wherein in step S1, the feature extraction module for extracting sample diversity is composed of a rectification linear unit, and a sparse value capable of reflecting image pixel distribution is obtained by using sparsity of effective pixels in an image sample; image sample x _i ∈D _n The sparse value after feature extraction is represented as:

the number of zero elements in the extracted output matrix is represented, H and W respectively represent the height and width of the output characteristic diagram of the rectification linear unit layer in the convolutional neural network model, namely the sparse difference value of the image samples of two sibling m is as follows:

the client respectively calculates the maximum sparse difference values according to the sample categories, and finally sums to obtain the accumulated sparse difference values for approximately representing the sample diversity, namely the sample diversity of the client n is represented as follows:

wherein the content of the first and second substances,

7. The optimization direction guidance-based wireless federal learning asynchronous training method as claimed in claim 6, wherein the S2 actively clusters the client population according to the sample characteristics, specifically comprising:

wherein:

8. The optimization direction guidance-based wireless federal learning asynchronous training method as claimed in claim 7, wherein the S3, all clients obtain a global model, and the clients generate a scheduling scheme according to the clustering result, specifically comprising:

the scheduling of the training subsets follows the rule of 'global first and local second', the subsets are guided to be trained first, and then the subsets are rendered to be gradually added into the training; designing a scheduling triggering condition based on a performance improvement ratio, taking k rounds of training as an example, and in k-1 rounds of evaluation, the local accuracy is lower than the average accuracy acc _k-1 As a trigger base set of k rounds

Performance improvement ratio of the t-th training in the k-th training

Expressed as:

wherein the content of the first and second substances,

it is shown that in the t-th training of the k-th training round,

the medium accuracy rate is higher than acc _k-1 When the client is a collection

Then scheduling a new training subset, starting k +1 training rounds,

representing the performance enhancement factor.

9. The wireless federal learning asynchronous training method based on the guidance of the optimization direction as claimed in claim 8, wherein, in the step S4, the scheduling scheme is sent to a set of clients to be trained, the model similarity of the clients is calculated, and a training decision is made;

when the client follows the scheduling arrangement of the central server, training decisions need to be made according to the self condition, so that excessive local training is avoided; the client needs to check the difference of the models and calculate the first round of local models

With the current global model

Cosine similarity between them

The expression is as follows:

when in use

10. The optimization direction guidance-based wireless federal learning asynchronous training method as claimed in claim 9, wherein the S5 training-participating client set is developed and trained based on a global model, the client uploads a model increment of a local training result, and the central cereal circle aggregation client parameter updates the global model, specifically comprising:

the central server adopts an asynchronous updating mode to the local model submitted by the client, and in order to improve the correlation between the global models, the client submits the incremental part delta M of the local model _n,t The central server adopts an updating algorithm as follows:

wherein e is _n,t The training state coefficient of the client n at the moment t is represented, the initial value is 0, and the value is 1 after the client participates in training; under the asynchronous updating mode, the global model at the moment t only depends on the global model at the moment t-1 and the client model increment participating in updating at the moment t; the central server only needs to store the sample number of all the clients and the global model of the previous round.