CN117409294A - Cloud edge end cooperative distributed learning method and system based on self-adaptive communication frequency - Google Patents


Info

Publication number
CN117409294A
CN117409294A
Authority
CN
China
Prior art keywords
communication frequency
edge
global
image classification
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311408911.3A
Other languages
Chinese (zh)
Inventor
罗龙
张弛
陈栖栖
虞红芳
孙罡
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202311408911.3A priority Critical patent/CN117409294A/en
Publication of CN117409294A publication Critical patent/CN117409294A/en
Pending legal-status Critical Current

Classifications

    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06N 3/098: Distributed learning, e.g. federated learning
    • G06V 10/764: Image or video recognition or understanding using classification, e.g. of video objects
    • G06V 10/95: Hardware or software architectures for image or video understanding structured as a network, e.g. client-server architectures
    • G06V 10/96: Management of image or video recognition tasks
    • H04W 28/09: Network traffic management; management of load balancing or load distribution
    • H04W 28/20: Negotiating bandwidth


Abstract

The invention discloses a cloud-edge-end collaborative distributed learning method and system based on adaptive communication frequency. The method comprises: receiving the local image classification models uploaded by all edge servers and performing a global aggregation update to obtain a global image classification model; collecting the Lipschitz constants and gradient estimation variances of all clients, uploaded by all edge servers during training, and estimating the optimal communication frequencies of the best-performing edge server and clients; adjusting the communication frequency of each edge server according to its computation and communication performance; sending the global image classification model, the communication frequency, and the optimal client communication frequency to each edge server i; and judging whether the communication resources exceed the limit: if so, ending the training of the global image classification model and issuing it to the clients and edge servers; otherwise, returning to the first step.

Description

Cloud edge end cooperative distributed learning method and system based on self-adaptive communication frequency
Technical Field
The invention relates to information sharing technology, and in particular to a cloud-edge-end collaborative distributed learning method and system based on adaptive communication frequency.
Background
In recent years, artificial intelligence and machine learning have developed rapidly, and technologies such as cloud computing and blockchain have advanced alongside the continuous growth of computing power; machine learning is one of the core technologies supporting the future intelligent society. Machine learning extracts useful information from massive training data through iterative training and outputs a model meeting the accuracy requirements of each specific application scenario. For example, a mobile device (client) may train an image classification/recognition model; because the samples in its local data set are limited, it can learn from the experience of models on other mobile devices through data sharing, improving the classification accuracy of its local image classification model.
As the number of mobile devices and Internet-of-Things devices connected to the Internet has proliferated, the network edge generates large amounts of data. Traditional cloud-centric model training on high-performance data center clusters faces extremely high communication costs: transferring massive amounts of training data from different mobile devices to a single cloud computing data center is slow and expensive. Existing cloud-edge-end collaborative distributed learning (CEC-CDL) shares model parameters rather than raw training data, protecting data privacy, and introduces two-stage synchronous aggregation to trade off training performance against communication efficiency.
To reduce data traffic while mobile devices improve model accuracy through training, several works propose introducing cloud-edge-end collaborative distributed learning with a two-stage aggregation mechanism of local and global aggregation, reducing communication time by reducing the number of communication rounds. In these schemes, all client nodes and edge server nodes are assigned the same fixed communication frequency, and local plus global aggregation is used to cut communication time and improve training efficiency.
Such communication frequency schemes simply assign the same fixed communication frequency to all client nodes and all edge server nodes; the assignment depends on experience or requires parameter tuning, so it is inefficient, its effectiveness cannot be guaranteed, and it cannot change as training progresses. These schemes also ignore the influence of system heterogeneity (in a cloud-edge-end collaborative distributed learning system, the computation and communication performance of the participating nodes, i.e. clients and edge servers, differ), so nodes with large performance gaps incur long synchronous waiting times, causing a serious straggler problem and slowing training.
To reduce the effects of system heterogeneity in cloud-edge-end collaborative distributed learning and mitigate the synchronization barrier, some works introduce adaptive communication frequency adjustment. The most advanced design of this kind sets the communication frequency of the slowest client and edge server to 1 and assigns the remaining clients and edge servers communication frequencies matched to their node performance, reducing the waiting time caused by the synchronization barrier and improving training efficiency in heterogeneous systems.
Compared with assigning the same fixed communication frequency to all clients and edge servers, this second prior-art scheme does not analyze the convergence of distributed collaborative learning under its communication frequency optimization: the communication frequencies of client and edge server nodes are adjusted empirically and adaptively, so convergence of model training cannot be guaranteed. The scheme is also sensitive to the communication bandwidth setting, and its performance varies greatly under different bandwidth settings, so performance cannot be guaranteed.
Disclosure of Invention
In view of the above deficiencies of the prior art, the cloud-edge-end collaborative distributed learning method and system based on adaptive communication frequency provided by the invention solve the problem of long training time in existing cloud-edge-end collaborative distributed learning caused by using fixed communication frequencies.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
In a first aspect, a cloud-edge-end collaborative distributed learning method based on adaptive communication frequency is provided. Applied to a cloud server, the method comprises the steps of:
S1, receiving the local image classification models uploaded by all edge servers and performing a global aggregation update to obtain a global image classification model;
S2, collecting the Lipschitz constants and gradient estimation variances of all clients, uploaded by all edge servers during training, and estimating the optimal communication frequencies of the best-performing edge server and clients;
S3, adjusting, according to the computation and communication performance of the edge servers, the communication frequency of each edge server other than the best-performing one:
wherein k_i^h is the communication frequency of edge server i in the h-th global training round; t_h is the completion time of the h-th global training round; t_i^{com,h} is the communication time for edge server i to transmit its local image classification model in the h-th global training round; t_{il}^{cmp,h} is the time for the best-performing client l under edge server i to complete its local iterative computation in the h-th global training round; and ⌊·⌋ denotes rounding down;
S4, sending the global image classification model, the communication frequency of the corresponding edge server, and the optimal client communication frequency to each edge server;
S5, judging whether the communication resources exceed the limit; if so, ending the training of the global image classification model and issuing it to the clients and edge servers; otherwise, returning to step S1.
The beneficial effects of the technical scheme are as follows: by using the method for cloud edge collaborative distributed learning training, communication frequency matched with calculation and communication performance can be distributed for each edge server, so that the completion time of each edge server is close, delay caused by a synchronous barrier is greatly relieved, and communication time is reduced.
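The adjustment rule in step S3 is given only in the patent's figures; assuming it assigns each edge server the largest frequency that fits its work into the round completion time (a plausible reading of the symbol definitions above, not the patent's verbatim formula), a minimal sketch:

```python
import math

def adjust_edge_frequency(t_h, t_comm_i, t_cmp_il):
    """Largest communication frequency for edge server i that still fits
    within the round completion time t_h, after subtracting its model
    transmission time t_comm_i; t_cmp_il is the local-iteration completion
    time of the best-performing client l under edge server i."""
    return max(1, math.floor((t_h - t_comm_i) / t_cmp_il))

# A slow edge server gets a low frequency and a fast one a high frequency,
# so all edge servers finish the round at roughly the same time.
fast = adjust_edge_frequency(t_h=100.0, t_comm_i=10.0, t_cmp_il=5.0)   # 18
slow = adjust_edge_frequency(t_h=100.0, t_comm_i=40.0, t_cmp_il=20.0)  # 3
```

The `max(1, ...)` guard reflects that every edge server must communicate at least once per round.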
Further, the method for estimating the optimal communication frequencies of the best-performing edge server and clients comprises the steps of:
S21, obtaining the search space formed by the ranges of the adopted communication frequency k1 and the intermediate variable u;
S22, at each search, randomly selecting the Lipschitz constants and gradient estimation variances of clients and averaging them, using the averaged variance as the gradient estimation variance σ²;
S23, searching the search space, according to the gradient estimation variance σ², for the k1 and u corresponding to the minimum of the approximate average gradient g(H, k1, u), whose calculation formula is:
wherein f(ω0) is the initial global loss function of the cloud server; ω0 is the initial model parameter; N is the total number of clients; σ² is the gradient estimation variance; L is the Lipschitz constant; and H is the total number of global training rounds;
S24, from the k1 corresponding to the minimum of g(H, k1, u), calculating the communication frequency k2 = u·k1;
S25, taking the k1 corresponding to the minimum of g(H, k1, u) as the optimal communication frequency of the best-performing edge server, and the communication frequency k2 as the optimal communication frequency of the best-performing client under each edge server.
The beneficial effects of the technical scheme are as follows: by using the method, the optimal communication frequency between the edge server with the best performance and the client can be estimated for the cloud edge collaborative distributed learning training, the model convergence can be ensured, the optimal communication frequency can be obtained by self-adaptively calculating according to the training process, and the training efficiency of the cloud edge collaborative distributed learning can be improved while the model quality is improved.
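Steps S21 to S25 amount to a grid search over (k1, u). A minimal sketch, with a hypothetical placeholder standing in for the bound g(H, k1, u) (the patent's exact expression appears only in its figures):

```python
def estimate_optimal_frequencies(g, H, k1_range, u_range):
    """Search (k1, u) minimizing g(H, k1, u); k1 is taken as the optimal
    frequency of the best-performing edge server and k2 = u * k1 as that of
    the best-performing client under each edge server (steps S23-S25)."""
    k1, u = min(((k1, u) for k1 in k1_range for u in u_range),
                key=lambda p: g(H, p[0], p[1]))
    return k1, u * k1

# Hypothetical placeholder bound: decreasing in k1*u at first (communication
# savings), then increasing (local models drift apart); minimized at k1*u = 10.
g = lambda H, k1, u: 1.0 / (k1 * u) + 0.01 * k1 * u
k1, k2 = estimate_optimal_frequencies(g, H=100,
                                      k1_range=range(1, 21),
                                      u_range=range(1, 11))
```

Because the search space is small (a grid of discrete frequencies), exhaustive search is cheap, matching the "extremely low computation cost" claim.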
Further, the calculation formulas of the completion time t_h and the time t_{il}^{cmp,h} are:
wherein η is the learning rate; c1 is a constant; k_{ij}^h is the communication frequency of client j under edge server i in the h-th global training round; A_i is the set of clients under edge server i; t_{ij}^{cmp} is the single-round local iteration computation time of each client; and u is the ratio of communication frequency k2 to communication frequency k1.
The beneficial effects of the technical scheme are as follows: with this time model, the round completion time and the completion time of the fastest client under edge server i can be obtained at extremely low computation cost, providing a reference for adaptively adjusting the communication frequencies of the edge servers and clients.
Further, the formula for performing the global aggregation update is:
wherein p_i^h is the global aggregation weight of the model of edge server i in the h-th global training round; M is the total number of edge servers; ω_i^h is the local image classification model of edge server i in the h-th global training round; and ω_h is the global image classification model of the cloud server in the h-th global training round.
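The aggregation formula itself appears only in the patent's figures; assuming the standard weighted sum ω_h = Σ_{i=1..M} p_i^h · ω_i^h that the symbol definitions suggest, a sketch:

```python
def global_aggregate(edge_models, weights):
    """omega_h = sum_i p_i^h * omega_i^h over the M edge servers, applied
    coordinate-wise to each model parameter; weights assumed to sum to 1."""
    n_params = len(edge_models[0])
    return [sum(p * model[k] for p, model in zip(weights, edge_models))
            for k in range(n_params)]

# Two edge servers, a 3-parameter model, weights e.g. proportional to the
# number of clients each edge server serves.
omega_h = global_aggregate([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]], [0.25, 0.75])
# → [2.5, 3.5, 4.5]
```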
Further, the global image classification model is a CNN model or a ResNet9 model.
In a second aspect, a cloud-edge-end collaborative distributed learning method based on adaptive communication frequency is provided. Applied to an edge server, the method comprises the steps of:
A1, receiving the global image classification model issued by the cloud server, the communication frequency corresponding to the edge server, and the optimal communication frequency of the clients under the edge server;
A2, calculating, according to the computation and communication performance of the clients, the communication frequency of each client other than the best-performing one:
wherein k_{ij}^h is the communication frequency of client j under edge server i in the h-th global training round; t_{il}^{cmp,h} is the time for the best-performing client l under edge server i to complete its local iterative computation in the h-th global training round; t_{ij}^{com,h} is the communication time for client j under edge server i to transmit its local image classification model in the h-th global training round; and t_{ij}^{cmp,h} is the single local iteration computation time of client j under edge server i in the h-th global training round;
A3, sending the global image classification model and the communication frequency of the corresponding client to each client;
a4, receiving local image classification models uploaded by all clients, and counting Lipschitz constants and gradient estimation variances of training processes uploaded by all clients;
a5, carrying out aggregation updating on the local image classification models uploaded by all the clients to obtain local image classification models;
a6, judging whether the local aggregation times reach the preset aggregation times, if so, entering a step A7, otherwise, sending the local image classification model to each client, and returning to the step A4;
and A7, uploading a local image classification model and Lipschitz constants and gradient estimation variances uploaded by all clients according to communication frequencies issued by the cloud server.
The beneficial effects of the technical scheme are as follows: by using the method for cloud edge collaborative distributed learning training, communication frequencies matched with calculation and communication performances of each client under each edge server can be distributed, so that the completion time of each client is close, delay caused by a synchronous barrier is greatly relieved, and communication time is reduced.
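Assuming the per-client rule in A2 mirrors the edge-server rule, fitting each client's computation and upload into the best client's completion time (again a plausible reading of the symbol definitions, not the verbatim formula), a sketch:

```python
import math

def adjust_client_frequency(t_cmp_il, t_comm_ij, t_cmp_ij):
    """Largest frequency (number of local iterations between uploads) for
    client j under edge server i that fits within the local-iteration
    completion time t_cmp_il of the best client l, after subtracting j's
    model transmission time t_comm_ij; t_cmp_ij is j's single local
    iteration computation time."""
    return max(1, math.floor((t_cmp_il - t_comm_ij) / t_cmp_ij))

# A client with slower computation or communication is assigned fewer local
# iterations between uploads, so all clients reach local aggregation together.
freq = adjust_client_frequency(t_cmp_il=60.0, t_comm_ij=12.0, t_cmp_ij=4.0)  # 12
```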
In a third aspect, a cloud-edge-end collaborative distributed learning method based on adaptive communication frequency is provided. Applied to a client, the method comprises the steps of:
C1, receiving the global image classification model or local aggregation model issued by the edge server and the corresponding communication frequency, and performing iterative computation and update of the local image classification model using the local data set;
C2, estimating the Lipschitz constant and gradient estimation variance of the local image classification model training process:
wherein f_{ij} is the loss function of client j under edge server i; ω_{h−1} and ω_{ij}^{h−1} are, respectively, the global image classification model of the cloud server and the local image classification model of client j under edge server i in the (h−1)-th global training round; L_{ij} and σ_{ij}² are, respectively, the Lipschitz constant and gradient estimation variance of client j under edge server i; and ξ_{ij}^{h−1} is the image classification sample data in the local data set of client j under edge server i in the (h−1)-th global training round;
C3, uploading the local image classification model, Lipschitz constant, and gradient estimation variance according to the communication frequency issued by the edge server.
The beneficial effects of the technical scheme are as follows: by using the method, lipschitz constant and gradient estimation variance of the training process of the local image classification model can be obtained with extremely low calculation cost, and a basis is provided for self-adaptively adjusting the communication frequency of the client and the edge server for the cloud server in the next global training round.
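The estimators in C2 appear only in the patent's figures; a sketch using the standard finite-difference estimates the symbol definitions suggest (the Lipschitz constant from the gradient difference between the global and local models, the variance from the scatter of mini-batch gradients around their mean), with hypothetical helper names:

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def estimate_lipschitz(grad_f, w_global, w_local):
    """L_ij ~ ||grad f(w_global) - grad f(w_local)|| / ||w_global - w_local||."""
    diff_g = [a - b for a, b in zip(grad_f(w_global), grad_f(w_local))]
    diff_w = [a - b for a, b in zip(w_global, w_local)]
    return norm(diff_g) / norm(diff_w) if norm(diff_w) > 0 else 0.0

def estimate_grad_variance(batch_grads):
    """sigma_ij^2 ~ mean squared deviation of mini-batch gradients from their mean."""
    n = len(batch_grads)
    mean = [sum(g[k] for g in batch_grads) / n for k in range(len(batch_grads[0]))]
    return sum(norm([a - b for a, b in zip(g, mean)]) ** 2
               for g in batch_grads) / n

# Sanity check: f(w) = ||w||^2 has gradient 2w, hence Lipschitz constant 2.
L = estimate_lipschitz(lambda w: [2 * x for x in w], [1.0, 0.0], [0.0, 1.0])
var = estimate_grad_variance([[1.0, 0.0], [0.0, 1.0]])
```

Both quantities are cheap by-products of the gradients the client already computes, consistent with the "extremely low calculation cost" claim.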
Further, the local image classification model update formula of the client is:
wherein ω_{ij}^{h−1} is the local image classification model of client j under edge server i in the (h−1)-th global training round.
The beneficial effects of the technical scheme are as follows: when updating the local image classification model, small batches of image classification samples are used to update the local model, which saves computation cost and reduces computation time, and the resulting local model is theoretically unbiased.
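Assuming the omitted update formula is standard mini-batch SGD, consistent with the "small batches" remark (the gradient function below is a hypothetical stand-in), a sketch:

```python
import random

def local_sgd_step(w, grad_on_batch, dataset, batch_size, lr):
    """One local update: w <- w - lr * g, where g is the gradient computed
    on a uniformly sampled mini-batch, an unbiased estimate of the
    full-batch gradient."""
    batch = random.sample(dataset, batch_size)
    g = grad_on_batch(w, batch)
    return [wk - lr * gk for wk, gk in zip(w, g)]

# Toy example: minimize the mean of (w - x)^2 over scalar samples x;
# the mini-batch gradient is the average of 2 * (w - x) over the batch.
data = [1.0, 2.0, 3.0]
grad = lambda w, batch: [sum(2 * (w[0] - x) for x in batch) / len(batch)]
w = local_sgd_step([10.0], grad, data, batch_size=2, lr=0.1)
```

Each step moves the parameter toward the data mean while touching only `batch_size` samples.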
In a fourth aspect, a cloud edge collaborative distributed learning system based on adaptive communication frequency optimization is provided, which includes a cloud server, a plurality of edge servers and a plurality of clients, wherein the cloud server communicates with the plurality of edge servers, and each edge server communicates with the plurality of clients.
The beneficial effects of the invention are as follows: the scheme explores and adjusts the communication frequencies of different clients to control local updates, and adjusts the communication frequencies of different edge servers to control local aggregation of the model. By quantifying the relation between communication frequency and training performance, the optimal communication frequency in each global training round is obtained by calculation, and communication frequencies are optimized for the performance of different clients and edge servers to mitigate the straggler problem, improving overall training performance.
The scheme makes full use of edge computation and local aggregation, adjusts the communication frequencies of client and edge server nodes on the premise of guaranteeing convergence, reduces communication overhead during training, and alleviates the communication bottleneck. Estimating the optimal communication frequencies of the best-performing edge server and clients helps the cloud-edge-end collaborative distributed learning system adaptively adjust communication frequencies as training progresses, accelerating model training iterations while improving model performance.
The scheme considers the influence of system heterogeneity on the cloud-edge-end collaborative distributed learning system, performs model aggregation updates in a weakly synchronous manner, and effectively mitigates the influence of the synchronization barrier on training efficiency. The effectiveness and efficiency of the scheme have been verified through experiments: compared with existing schemes, it improves model convergence accuracy by up to 16% and reduces training completion time by up to 4.7 times.
After a client adopts the distributed learning of this scheme, its local image classification model can learn faster from the strengths of the models on other clients, making up for the insufficient classification and prediction performance of a model constrained by a limited local data set and improving its prediction and classification accuracy.
Drawings
Fig. 1 is a flowchart of a cloud edge end collaborative distributed learning method based on adaptive communication frequency optimization applied to a cloud server.
Fig. 2 is a flowchart of a cloud edge end collaborative distributed learning method based on adaptive communication frequency optimization applied to an edge server.
Fig. 3 is a flowchart of the cloud-edge-end collaborative distributed learning method based on adaptive communication frequency optimization applied to a client.
Fig. 4 is an architecture diagram of a cloud-edge collaborative distributed learning system based on adaptive communication frequency optimization.
FIG. 5 shows the test accuracy of training a CNN model on the Fashion-MNIST dataset: (a) bandwidth b_ce = b_ec; (b) bandwidth b_ce = 10·b_ec.
FIG. 6 shows the test accuracy of training a ResNet9 model on the CIFAR-10 dataset: (a) bandwidth b_ce = b_ec; (b) bandwidth b_ce = 10·b_ec.
Detailed Description
The following description of embodiments of the invention is provided to facilitate understanding by those skilled in the art. It should be understood, however, that the invention is not limited to the scope of these embodiments; to those skilled in the art, all inventions making use of the inventive concept are protected within the spirit and scope of the invention as defined by the appended claims.
Referring to fig. 1, fig. 1 shows a flowchart of a cloud edge end collaborative distributed learning method based on adaptive communication frequency optimization applied to a cloud server, and the method S includes steps S1 to S5.
In step S1, the local image classification models uploaded by all edge servers are received, and a global aggregation update is performed to obtain the global image classification model; the formula for the global aggregation update is:
wherein p_i^h is the global aggregation weight of the model of edge server i in the h-th global training round; M is the total number of edge servers; ω_i^h is the local image classification model of edge server i in the h-th global training round; and ω_h is the global image classification model of the cloud server in the h-th global training round.
In step S2, the Lipschitz constants and gradient estimation variances of all clients, uploaded by all edge servers during training, are collected, and the optimal communication frequencies of the best-performing edge server and clients are estimated;
in one embodiment of the present invention, a method for estimating the best communication frequency of the best performing edge server and client comprises:
s21, obtaining the adopted communication frequency k 1 And a search space formed by the range of the intermediate variable u;
s22, randomly selecting Lipschitz constant and gradient estimation variance of a client to average and then using the average as gradient estimation variance sigma during each search 2
S23, estimating variance sigma according to gradient 2 Searching within the search space results in approximately equipartition of the gradient g (H, k 1 U) k corresponding to the smallest time 1 And u, approximately equally dividing the gradient g (H, k 1 The calculation formula of u) is:
wherein f (omega) 0 ) Global loss function for the initial cloud server; omega 0 Is an initial model parameter; n is the total number of clients; sigma (sigma) 2 Estimating variance for the gradient; l is Lipschitz constant; h is the global training total round;
s24, g (H, k) 1 U) k corresponding to the smallest time 1 Calculating the communication frequency k 2 =uk 1
S25, g (H, k) 1 U) k corresponding to the smallest time 1 As the optimal communication frequency of the edge server with optimal performance, the communication frequency k is adopted 2 As the best communication frequency for the best performing client under each edge server.
In step S3, the communication frequency of each edge server other than the best-performing one is adjusted according to the computation and communication performance of the edge servers:
wherein k_i^h is the communication frequency of edge server i in the h-th global training round; t_h is the completion time of the h-th global training round; t_i^{com,h} is the communication time for edge server i to transmit its local image classification model in the h-th global training round; t_{il}^{cmp,h} is the time for the best-performing client l under edge server i to complete its local iterative computation in the h-th global training round; and ⌊·⌋ denotes rounding down;
in step S4, the global image classification model, the communication frequency of the corresponding edge server and the optimal communication frequency of the client are sent to the edge server;
in step S5, it is determined whether the communication resource exceeds the limit, and if the global image classification model has converged, if any determination condition is satisfied, the global image classification model training is ended, and the global image classification model is issued to the client and the edge server, otherwise, step S1 is returned.
In practice, the scheme preferably calculates the completion time t_h and the time t_{il}^{cmp,h} by:
wherein η is the learning rate; c1 is a constant; k_{ij}^h is the communication frequency of client j under edge server i in the h-th global training round; A_i is the set of clients under edge server i; t_{ij}^{cmp} is the single-round local iteration computation time of each client; and u is the ratio of communication frequency k2 to communication frequency k1.
The global image classification model, the edge servers' local image classification models, and the clients' local image classification models of this scheme are CNN models or ResNet9 models.
Referring to fig. 2, fig. 2 shows a flowchart of a cloud edge collaborative distributed learning method based on adaptive communication frequency optimization applied to an edge server; as shown in fig. 2, the method a includes steps A1 to A7.
In step A1, the global image classification model issued by the cloud server, the communication frequency corresponding to the edge server, and the optimal communication frequency of the clients under the edge server are received;
In step A2, the communication frequency of each client other than the best-performing one is calculated according to the computation and communication performance of the clients:
wherein k_{ij}^h is the communication frequency of client j under edge server i in the h-th global training round; t_{il}^{cmp,h} is the time for the best-performing client l under edge server i to complete its local iterative computation in the h-th global training round; t_{ij}^{com,h} is the communication time for client j under edge server i to transmit its local image classification model in the h-th global training round; and t_{ij}^{cmp,h} is the single local iteration computation time of client j under edge server i in the h-th global training round;
in step A3, the global image classification model and the communication frequency of the corresponding client are sent to the client;
in step A4, receiving local image classification models uploaded by all clients, and counting Lipschitz constants and gradient estimation variances of training processes uploaded by all clients;
in step A5, the local image classification models uploaded by all clients are aggregated and updated to obtain the edge server's local image classification model, ω_i^h = Σ_{j=1}^{N_i} p_{ij}^h ω_{ij}^h, wherein ω_i^h is the local image classification model of edge server i in the h-th global training round; N_i is the number of clients under edge server i; p_{ij}^h is the local aggregation weight of the model of client j under edge server i in the h-th global training round; and ω_{ij}^h is the local image classification model of client j under edge server i in the h-th global training round.
In step A6, judging whether the local aggregation count reaches the preset aggregation count; if so, entering step A7, otherwise sending the local image classification model to each client and returning to step A4;
in step A7, uploading the local image classification model and Lipschitz constants and gradient estimation variances uploaded by all clients according to the communication frequency issued by the cloud server.
Referring to fig. 3, fig. 3 shows a flowchart of the cloud edge end collaborative distributed learning method based on adaptive communication frequency optimization applied to a client; as shown in fig. 3, method C includes steps C1 to C3.
In step C1, the global image classification model and local aggregation model issued by the edge server are received together with the corresponding communication frequency, and the local data set is used to iteratively compute and update the local image classification model;
in step C2, the Lipschitz constant and gradient estimation variance of the local image classification model training process are estimated:
wherein F_{ij} is the loss function of client j under edge server i; ω^{h-1} and ω_{ij}^{h-1} are, respectively, the cloud server's global image classification model and the local image classification model of client j under edge server i in the (h-1)-th global training round; L_{ij} and σ_{ij}² are, respectively, the Lipschitz constant and gradient estimation variance of client j under edge server i; and ξ_{ij}^{h-1} is the image classification sample data in the local data set of client j under edge server i in the (h-1)-th global training round;
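The estimation formulas appear as images in the source, so the sketch below uses common estimators consistent with the quantities above, not necessarily the patent's exact forms: a finite-difference Lipschitz estimate between the global and local models, and the mean squared deviation of per-sample gradients from the full gradient:

```python
import numpy as np

def estimate_lipschitz(grad_global, grad_local, w_global, w_local):
    # Common finite-difference estimate: L ≈ ||∇F(ω) − ∇F(ω')|| / ||ω − ω'||
    return (np.linalg.norm(grad_global - grad_local)
            / np.linalg.norm(w_global - w_local))

def estimate_grad_variance(sample_grads, full_grad):
    # σ² ≈ mean squared deviation of per-sample gradients from the full one
    return float(np.mean([np.linalg.norm(g - full_grad) ** 2
                          for g in sample_grads]))
```

Both quantities are cheap to compute during local training and are exactly what the client uploads alongside its model in step C3.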
in step C3, the local image classification model, lipschitz constant and gradient estimation variance are uploaded according to the communication frequency issued by the edge server.
In implementation, the preferred client-side local image classification model update formula of this scheme is as follows:
wherein ω_{ij}^{h-1} is the local image classification model of client j under edge server i in the (h-1)-th global training round.
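The update formula itself is an image in the source; a standard stochastic gradient descent step over the local data set is one plausible form, shown here purely for illustration:

```python
import numpy as np

def local_sgd_step(w, grad, eta=0.01):
    """One assumed local update: ω ← ω − η·∇F(ω; ξ),
    where eta is the learning rate and grad the minibatch gradient."""
    return w - eta * grad
```

A client would apply this step k_j^{i,h} times per local round before uploading its model.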
As shown in fig. 4, the scheme further provides a cloud edge end collaborative distributed learning system based on adaptive communication frequency optimization, comprising a cloud server, a plurality of edge servers and a plurality of clients, wherein the cloud server communicates with the plurality of edge servers, and each edge server communicates with a plurality of clients.
The image classification accuracy of the cloud edge end collaborative distributed learning method provided by this scheme is illustrated below with a specific example:
data set selection
The scheme performs performance tests with different models and real-world image classification data sets, specifically: (1) the Fashion-MNIST data set and a 2-layer CNN model with 0.58M parameters; (2) the CIFAR-10 data set and a ResNet9 model with 2.45M parameters.
Performance index
The performance test index is the image classification accuracy of the global model, i.e., the ratio of the number of test set images correctly classified by the global model to the total number of test set images.
Heterogeneity settings
To simulate computational heterogeneity, this scheme assumes that the single-iteration computation time of the CNN on Fashion-MNIST follows a uniform distribution U(0.5, 3), while that of ResNet9 on CIFAR-10 follows U(2, 6). To simulate network heterogeneity, the bandwidth b_ec between edge server and cloud server fluctuates between 0.5 Mbps and 5 Mbps, while the bandwidth b_ce between client and edge server uses two settings: b_ce = b_ec and b_ce = 10·b_ec.
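These experimental distributions can be sampled directly; the helper below is a sketch of the setup described above (the function name and interface are ours, not the patent's):

```python
import random

def sample_heterogeneity(model, b_ec_mbps=None, ratio=1):
    """Sample one client's compute time and the two link bandwidths.
    model: 'cnn' (Fashion-MNIST) or 'resnet9' (CIFAR-10)
    b_ec_mbps: edge-cloud bandwidth; drawn from U(0.5, 5) Mbps if None
    ratio: client-edge bandwidth multiplier (1 or 10 in the experiments)"""
    # per-iteration compute time: U(0.5, 3) s for the CNN, U(2, 6) s for ResNet9
    t_comp = random.uniform(0.5, 3) if model == 'cnn' else random.uniform(2, 6)
    b_ec = b_ec_mbps if b_ec_mbps is not None else random.uniform(0.5, 5)
    b_ce = ratio * b_ec  # b_ce = b_ec or b_ce = 10·b_ec
    return t_comp, b_ec, b_ce
```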
Comparison algorithms
The scheme compares the proposed method (CDlada) against three algorithms: HierFAVG, a widely used cloud edge end collaborative distributed learning method that assigns the same fixed communication frequency to every client and edge server; HFL, a classical cloud edge end collaborative distributed learning method that likewise assigns the same fixed communication frequency to every client and edge server; and RAF, the current state-of-the-art cloud edge end collaborative distributed learning method, which sets the communication frequency of the slowest client and edge server to 1 and adaptively adjusts the communication frequencies of the other clients and edge servers based on it. HierFAVG and HFL are fixed communication frequency algorithms, using the settings (k_1 = 6, k_2 = 10) and (k_1 = 5, k_2 = 50) respectively; RAF is a communication frequency adaptive adjustment algorithm for tree-based hierarchical training systems.
As shown in fig. 5 and 6, the figures respectively show the test accuracy of the different algorithms when training the CNN model on the Fashion-MNIST data set in the independent and identically distributed (IID) scenario and when training the ResNet9 model on the CIFAR-10 data set in the non-IID scenario.
The experimental results show that across multiple scenarios CDlada outperforms all comparison algorithms (HierFAVG, HFL and RAF), is not affected by the network bandwidth setting, improves model convergence accuracy by up to 16%, and reduces training completion time by up to 4.7×.
Under a model convergence guarantee, this scheme adaptively adjusts the communication frequencies of the clients and edge servers, allocating to each a communication frequency matched to its computation and communication capabilities. This reduces the delay caused by system heterogeneity and improves the training efficiency of cloud edge end collaborative distributed learning while guaranteeing model convergence.
Compared with HierFAVG and HFL, the method of this scheme considers the influence of system heterogeneity, allocates matched communication frequencies to clients and edge servers of different performance, alleviates the delay caused by the synchronization barrier, and improves training efficiency;
compared with RAF, the method of this scheme performs convergence analysis and adaptively adjusts the communication frequencies of the clients and edge servers under a convergence guarantee, thereby providing the method with a convergence guarantee.

Claims (9)

1. The cloud edge end collaborative distributed learning method based on the self-adaptive communication frequency is characterized by comprising the following steps of:
s1, receiving local image classification models uploaded by all edge servers, and performing global aggregation update to obtain global image classification models;
s2, calculating Lipschitz constants and gradient estimation variances of all clients uploaded by all edge servers in the training process, and estimating the optimal communication frequency of the edge server and the clients with optimal performance;
S3, adjusting the communication frequency of each non-best-performing edge server according to the edge servers' computation and communication performance:
wherein k_i^h is the communication frequency of edge server i in the h-th global training round; t^h is the completion time of the h-th global training round; t_i^{h,comm} is the communication time for edge server i to transmit its local image classification model in the h-th global training round; t_l^{i,h} is the time for the best-performing client l under edge server i to complete local iterative computation in the h-th global training round; and ⌊·⌋ denotes rounding down;
s4, transmitting the global image classification model, the communication frequency of the corresponding edge server and the optimal communication frequency of the client to the edge server;
s5, judging whether the communication resource exceeds the limit, if so, ending the training of the global image classification model, issuing the global image classification model to the client and the edge server, and otherwise, returning to the step S1.
2. The cloud edge end collaborative distributed learning method according to claim 1, characterized in that the method for estimating the optimal communication frequency of the best-performing edge server and client comprises the following steps:
S21, obtaining the search space formed by the ranges of the adopted communication frequency k_1 and the intermediate variable u;
S22, in each search, randomly selecting clients and averaging their Lipschitz constants and gradient estimation variances to obtain the gradient estimation variance σ²;
S23, according to the gradient estimation variance σ², searching within the search space for the k_1 and u corresponding to the smallest time of the approximate average gradient g(H, k_1, u), where the calculation formula of the approximate average gradient g(H, k_1, u) is:
wherein f(ω_0) is the initial global loss function of the cloud server; ω_0 is the initial model parameter; N is the total number of clients; σ² is the gradient estimation variance; L is the Lipschitz constant; and H is the total number of global training rounds;
S24, from the k_1 and u corresponding to the smallest g(H, k_1, u), calculating the communication frequency k_2 = u·k_1;
S25, taking the k_1 corresponding to the smallest g(H, k_1, u) as the optimal communication frequency of the best-performing edge server, and the communication frequency k_2 as the optimal communication frequency of the best-performing client under each edge server.
3. The cloud edge end collaborative distributed learning method according to claim 2, characterized in that the completion time t^h and the local computation completion time of the best-performing client are calculated from the following quantities: η, the learning rate; c_1, a constant; k_j^{i,h}, the communication frequency of client j under edge server i in the h-th global training round; A_i, the set of clients under edge server i; τ_j^{i,h}, the single-round local iteration computation time of each client; and u, the ratio of communication frequency k_2 to communication frequency k_1.
4. The cloud edge end collaborative distributed learning method according to claim 1, characterized in that the formula for the global aggregation update is ω^h = Σ_{i=1}^{M} p_i^h ω_i^h, wherein p_i^h is the global aggregation weight of edge server i's model in the h-th global training round; M is the total number of edge servers; ω_i^h is the local image classification model of edge server i in the h-th global training round; and ω^h is the cloud server's global image classification model in the h-th global training round.
5. The cloud edge end collaborative distributed learning method according to any one of claims 1-4, wherein: the global image classification model is a CNN model or a ResNet9 model.
6. The cloud edge end collaborative distributed learning method based on the self-adaptive communication frequency is characterized by comprising the following steps of:
A1, receiving the global image classification model issued by the cloud server, the communication frequency corresponding to the edge server, and the optimal client communication frequency issued to the edge server;
A2, calculating the communication frequency of each non-best-performing client according to the clients' computation and communication performance:
wherein k_j^{i,h} is the communication frequency of client j under edge server i in the h-th global training round; t_l^{i,h} is the time for the best-performing client l under edge server i to complete local iterative computation in the h-th global training round; t_j^{i,h,comm} is the communication time for client j under edge server i to transmit its local image classification model in the h-th global training round; and t_j^{i,h,comp} is the single local iteration computation time of client j under edge server i in the h-th global training round;
a3, transmitting the global image classification model and the communication frequency of the corresponding client to the client;
a4, receiving local image classification models uploaded by all clients, and counting Lipschitz constants and gradient estimation variances of training processes uploaded by all clients;
A5, aggregating and updating the local image classification models uploaded by all clients to obtain the edge server's local image classification model;
a6, judging whether the local aggregation times reach the preset aggregation times, if so, entering a step A7, otherwise, sending the local image classification model to each client, and returning to the step A4;
and A7, uploading a local image classification model and Lipschitz constants and gradient estimation variances uploaded by all clients according to communication frequencies issued by the cloud server.
7. The cloud edge end collaborative distributed learning method based on the self-adaptive communication frequency is characterized by comprising the following steps of:
C1, receiving the global image classification model and local aggregation model issued by the edge server together with the corresponding communication frequency, and using the local data set to iteratively compute and update the local image classification model;
and C2, estimating Lipschitz constant and gradient estimation variance of the local image classification model training process:
wherein F_{ij} is the loss function of client j under edge server i; ω^{h-1} and ω_{ij}^{h-1} are, respectively, the cloud server's global image classification model and the local image classification model of client j under edge server i in the (h-1)-th global training round; L_{ij} and σ_{ij}² are, respectively, the Lipschitz constant and gradient estimation variance of client j under edge server i; and ξ_{ij}^{h-1} is the image classification sample data in the local data set of client j under edge server i in the (h-1)-th global training round;
and C3, uploading a local image classification model, a Lipschitz constant and a gradient estimation variance according to the communication frequency issued by the edge server.
8. The cloud edge end collaborative distributed learning method based on adaptive communication frequency according to claim 7, characterized in that the client's local image classification model update formula is as follows:
wherein ω_{ij}^{h-1} is the local image classification model of client j under edge server i in the (h-1)-th global training round.
9. A cloud edge end collaborative distributed learning system based on adaptive communication frequency, characterized by comprising: a cloud server executing the cloud edge end collaborative distributed learning method according to any one of claims 1 to 5, a plurality of edge servers executing the cloud edge end collaborative distributed learning method according to claim 6, and a plurality of clients executing the cloud edge end collaborative distributed learning method according to claim 7 or 8; the cloud server communicates with the plurality of edge servers, and each edge server communicates with a plurality of clients.
CN202311408911.3A 2023-10-27 2023-10-27 Cloud edge end cooperative distributed learning method and system based on self-adaptive communication frequency Pending CN117409294A (en)

Publications (1)

Publication Number Publication Date
CN117409294A true CN117409294A (en) 2024-01-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination