CN114169543A - Federated learning algorithm based on model staleness and user participation awareness - Google Patents

Federated learning algorithm based on model staleness and user participation awareness

Info

Publication number
CN114169543A
CN114169543A (application number CN202111476376.6A)
Authority
CN
China
Prior art keywords
client
model
center server
clients
cloud center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111476376.6A
Other languages
Chinese (zh)
Other versions
CN114169543B (en)
Inventor
王爽
谢帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China
Priority to CN202111476376.6A
Publication of CN114169543A
Application granted
Publication of CN114169543B
Active legal-status (current)
Anticipated expiration legal-status

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G06N20/20 - Ensemble learning
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 - Traffic control in data switching networks
    • H04L47/10 - Flow control; Congestion control
    • H04L47/12 - Avoiding congestion; Recovering from congestion
    • H04L47/125 - Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/10 - Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computer And Data Communications (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a federated learning algorithm based on model staleness and user participation awareness, and relates to the technical field of federated learning. The algorithm divides the clients participating in a task into different performance levels and, at each selection, chooses clients of the same performance level, turning federated learning from random client selection into selective client selection and improving its communication efficiency. An objective function augmented with a global-model proximal term is used as each client's objective for training its local model, which mitigates the problem of the global model being biased toward the local model of a particular client. By setting an adaptive hyper-parameter and accounting for how far a client's local model lags behind the latest global model on the cloud center server, an update rule combining model staleness with user participation awareness is provided to dynamically adjust the global model, further alleviating the bias of the global model toward any single client's local model in federated learning.

Description

Federated learning algorithm based on model staleness and user participation awareness
Technical Field
The invention relates to the technical field of federated learning, and in particular to a federated learning algorithm based on model staleness and user participation awareness.
Background
Modern mobile and Internet-of-Things devices (e.g., smartphones, smart wearables, smart home devices) generate large amounts of data every day, which provides opportunities for building complex machine learning models to solve challenging artificial intelligence problems. However, such data contain personally sensitive information and are easily leaked. Information security and personal privacy protection receive growing attention internationally, and related laws have been issued successively, making the management, supervision and protection of private data more comprehensive, strict and intensive. Large companies increasingly treat their data as assets and are unwilling to share them, which leads to the phenomenon of data islands. Furthermore, given the limited wireless communication resources available in practice, transmitting large amounts of training data from edge devices to a cloud center server is a huge challenge.
Federated learning arose from these reasons, which make it increasingly attractive to keep data stored locally while pushing computation to the edge. Fig. 1 is a schematic diagram of the federated learning update procedure. Referring to Fig. 1, federated learning does not require clients to share their private data; instead, each client trains a model locally and uploads the trained model to the cloud center server. The cloud center server then aggregates the received models into a global model. Finally, each client downloads the aggregated global model from the cloud center server, and these steps are repeated continuously, thereby completing various artificial intelligence tasks.
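This update procedure can also be illustrated with a toy sketch. The following Python snippet is purely illustrative (the one-parameter model, the data, and the function names are invented for this example and are not part of the invention):

def local_train(model, data, lr=0.1):
    # One gradient step on a one-parameter least-squares model, standing in for real local training.
    w = model[0]
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return [w - lr * grad]

def aggregate(models):
    # The cloud center server averages the uploaded local models into a new global model.
    return [sum(m[0] for m in models) / len(models)]

global_model = [0.0]
client_data = [[(1.0, 2.0)], [(2.0, 4.0)], [(3.0, 6.0)]]   # each client's private data stays local
for _ in range(5):
    local_models = [local_train(global_model, d) for d in client_data]   # clients train locally
    global_model = aggregate(local_models)                               # server aggregates
print(global_model)   # the shared global model after five rounds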
The federated optimization algorithm is the core of federated learning. The most widely adopted federated optimization algorithm at present is federated averaging. Its core idea is to select a batch of clients in each round and let them run stochastic gradient descent locally for multiple steps, so as to reduce the communication frequency between the clients and the cloud center server and thereby reduce the communication cost. However, this algorithm can bias the global model toward the local model of a particular client, and its improvement in communication efficiency is limited. In addition, federated learning environments exhibit system heterogeneity: devices differ in hardware (CPU, memory), network connection (3G, 4G, 5G, WiFi) and power supply (battery level). They also exhibit data heterogeneity: clients hold different amounts of data, and the data are non-independent and non-identically distributed (Non-IID). As a result, each round of communication between the cloud center server and the clients is inevitably limited by the client that needs the longest time to communicate among the clients participating in that round, so the communication efficiency of federated learning is severely constrained and the approach is ill-suited to practical scenarios.
Therefore, some researchers have turned to asynchronous federated learning algorithms: a client that finishes local training first can send its local model directly to the cloud center server without waiting for clients that have not finished. The cloud center server aggregates as soon as it receives a client's local model and immediately sends the resulting global model back to that client, which then starts a new round of local training right away; iterating in this manner, the global model is continuously updated. Although this approach achieves wait-free updates between the clients and the cloud center server, it inevitably increases the communication frequency between them substantially and consumes more resources. Moreover, it fails to address the bias of the global model toward particular clients caused by communication differences among clients, and clients that need longer to complete communication affect the global model through their model staleness.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a federated learning algorithm based on model staleness and user participation awareness.
The technical scheme of the invention is as follows:
a federated learning algorithm based on model staleness and user engagement perception, the algorithm comprising the steps of:
Step 1: the cloud center server divides the clients participating in the task into different performance levels;
Step 2: the cloud center server initializes the global model and the required global variables; the global variables include the total number T of communication rounds between the cloud center server and the clients, the client sets u_1, u_2, ..., u_n of the n performance levels, the numbers of opportunities p_1, p_2, ..., p_n for the client set of each performance level to be selected by the cloud center server for local model training, and the initial sampling rates ρ_1, ρ_2, ..., ρ_n of the client sets of each performance level, determined from the opportunity numbers;
Step 3: the cloud center server and the clients start the first round of communication, i.e. the communication round number t = 0; in the current round of communication, the cloud center server randomly samples part of the clients from the client set of each performance level according to the corresponding initial sampling rate, obtaining the client sets used for the first round of local model training, and notifies the sampled clients to download the initialized global model from the cloud center server;
Step 4: after receiving the notification from the cloud center server, each sampled client downloads the current latest global model from the cloud center server and uses it as the initial model for its next round of local model training;
Step 5: the sampled clients of the same performance level perform local model training in parallel on their local data using stochastic gradient descent, obtaining their local models after multiple rounds of iterative training;
Step 6: each client that has finished local model training uploads its local model to the cloud center server;
Step 7: for the client set of the performance level to which the clients that have finished local model training belong, the cloud center server subtracts 1 from its number of opportunities to be selected for local model training, and updates the sampling rate of that client set according to the current opportunity number;
Step 8: the cloud center server aggregates the local models uploaded by the clients of the same performance level that have finished local model training, and updates the global model;
Step 9: for the clients that have finished local model training, the cloud center server randomly samples part of the clients from the client set of the corresponding performance level according to the updated sampling rate, and notifies them to download the latest global model from the cloud center server;
Step 10: determine whether the current number of communication rounds t between the cloud center server and the clients satisfies t ≥ T; if not, set t = t + 1 and go to step 4 for the next round of global model update; if so, the task ends.
Further, according to the federated learning algorithm based on model staleness and user participation awareness, the cloud center server divides the clients participating in the task into different performance levels according to the time each client takes to complete the same task.
Further, according to the federated learning algorithm based on model staleness and user engagement perception, the step 1 comprises the following steps:
Step 1-1: the cloud center server distributes the same machine learning task to each client participating in the task, and sets the number R of local model training rounds for the clients, the maximum time T_max allowed for each round of local model training of the clients, and the number n of client performance levels;
Step 1-2: after receiving the machine learning task sent by the cloud center server, each client executes R rounds of local model training, and after each round of local model training is finished sends the time T_i consumed by that round to the cloud center server;
Step 1-3: when a per-round local model training time T_i received by the cloud center server from a client is greater than or equal to T_max, that time is counted as T_max;
Step 1-4: after the R rounds of local model training are finished, the clients whose total time Total_T_i is greater than or equal to R × T_max are removed from all clients participating in the task, and the remaining clients are divided into n performance levels according to their total time Total_T_i over the R rounds of local model training.
Further, according to the federated learning algorithm based on model staleness and user participation awareness, the method for dividing the remaining clients into n performance levels according to their total time Total_T_i over the R rounds of local model training is as follows: the clients are sorted by Total_T_i, and then every ⌈N/n⌉ consecutive clients in the sorted order are grouped together and regarded as the same performance level, where N is the total number of clients and n is the preset total number of performance levels.
Further, according to the federated learning algorithm based on model staleness and user engagement perception, the relationship between the sampling rate corresponding to the client set of each performance level and the number of opportunities for the client set to be selected by the cloud center server for local model training is as follows:
[Formula (1): the relation between ρ_n and p_n; the original equation image is not reproduced here]
where ρ_n represents the sampling rate corresponding to the client set of the nth level, and p_n represents the number of opportunities for the client set of the nth level to be selected by the cloud center server for local model training.
Further, according to the federated learning algorithm based on model staleness and user participation awareness, each sampled client solves the regularized optimization problem of the following formula using a stochastic gradient descent algorithm, obtaining its local model W_t^{i,k} after multiple rounds of iterative training:
W_t^{i,k} = argmin_W h_{i,k}(W; W_t), where h_{i,k}(W; W_t) = F_{i,k}(W) + (μ/2)·||W - W_t||^2
In the above formula, F_{i,k}(W) represents the local loss function of the kth client in the client set of performance level i; μ is a regularization parameter; W_t is the latest global model downloaded by the client from the cloud center server; W is the local model of the client; (μ/2)·||W - W_t||^2 is the global-model proximal term; and h_{i,k}(W; W_t) is the objective function obtained by adding the global-model proximal term to the loss function of the client's local training.
Further, according to the federated learning algorithm based on model staleness and user engagement perception, the step 8 further includes the following steps:
step 8-1: the cloud center server aggregates the local models uploaded by the clients belonging to the same performance level and having completed the local model training according to the following formula:
\bar{W}_t^i = (1/|S_i|) · Σ_{k ∈ S_i} W_t^{i,k}   (3)
In the above formula, \bar{W}_t^i represents the model obtained by aggregating the local models uploaded by the clients of performance level i, and |S_i| is the size of the client set of performance level i;
step 8-2: the cloud center server updates the global model in a manner based on model staleness combined with user engagement awareness as follows:
[Formula (4): the updated global model W_{t+1} is obtained from the current global model W_t and the aggregated level-i model \bar{W}_t^i, weighted by the sampling rate ρ_i and a staleness-related coefficient; the original equation image is not reproduced here]
In the above formula, W_{t+1} is the latest global model obtained after the (t+1)th round of updating on the cloud center server; ρ_i is the sampling rate corresponding to the client set of performance level i; the staleness-related coefficient measures how far the local models uploaded by the clients lag behind the latest global model on the cloud center server; this coefficient is determined by α and s_τ, where α is an adaptive hyper-parameter taking values in (0, 1), s_τ represents the degree to which the local models uploaded in the current tth round lag behind the latest global model on the cloud center server, and τ is the time interval required by the clients to upload their local models to the cloud center server.
Compared with the prior art, the invention has the following beneficial effects:
(1) In the algorithm of the invention, the cloud center server distributes the same machine learning task to all clients participating in the task and divides the clients into different performance levels according to the total time each client takes to complete this task; whenever clients are selected, clients of the same performance level are chosen. This turns federated learning from random client selection into selective client selection, greatly reduces the time needed for each round of communication between the cloud center server and the clients, and greatly improves the communication efficiency of federated learning.
(2) Unlike traditional federated learning algorithms, which directly use each client's loss function as the objective for its local model training, the algorithm of the invention uses an objective function augmented with a global-model proximal term as each client's local training objective. This greatly mitigates the problem of the global model being biased toward the local model of a particular client, accelerates convergence, and at the same time substantially improves accuracy.
(3) By setting an adaptive hyper-parameter and taking into account how far a client's local model lags behind the latest global model on the cloud center server, the algorithm of the invention provides an update rule that combines model staleness with user participation awareness to dynamically adjust the global model. When the clients of a certain performance level communicate with the cloud center server frequently, this update rule limits the sampling rate of that performance level's client set, so that fewer clients are sampled from it next time and its influence on the global model is reduced. Conversely, when the clients of a certain performance level rarely communicate with the cloud center server, the update rule limits their weight in the global model, preventing the model staleness caused by that performance level not communicating with the cloud center server for a long time. This update rule effectively alleviates the problems of the global model being biased toward a particular client's local model and of poor communication efficiency, so that federated learning can be applied to more practical scenarios.
Drawings
FIG. 1 is a schematic illustration of a Federal learning update procedure;
FIG. 2 is a flow chart of a federated learning algorithm based on model staleness and user engagement perception according to the present embodiment;
fig. 3 is a flowchart of client performance layering according to this embodiment.
Detailed Description
To facilitate an understanding of the present application, the present application will now be described more fully with reference to the accompanying drawings. Preferred embodiments of the present application are given in the accompanying drawings. This application may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
FIG. 2 is a flow chart of the federated learning algorithm based on model staleness and user participation awareness of the present invention. As shown in FIG. 2, the algorithm includes the following steps:
step 1: according to the time consumed by the clients for completing the same task, the cloud center server divides the clients participating in the task into different performance levels.
In this embodiment, the cloud center server distributes the same machine learning task to all clients and divides them into different performance levels according to the total time each client takes to complete the task. The specific flow, shown in FIG. 3, includes the following sub-steps:
Step 1-1: the cloud center server distributes the same machine learning task to the corresponding clients, and sets the number R of local model training rounds for the clients, the maximum time T_max allowed for each round of local model training of the clients, and the number n of client performance levels;
Step 1-2: after receiving the machine learning task sent by the cloud center server, each client executes R rounds of local model training, and after each round of local model training is finished sends the time T_i consumed by that round to the cloud center server;
Step 1-3: when a per-round local model training time T_i received by the cloud center server from a client is greater than or equal to T_max, that time is counted as T_max;
Step 1-4: after the R rounds of local model training are finished, the clients whose total time Total_T_i is greater than or equal to R × T_max are removed from all clients participating in the machine learning task, and the remaining clients are divided into n performance levels according to their total time Total_T_i over the R rounds of local model training.
After the R rounds of local model training are finished, a client whose total time Total_T_i is greater than or equal to R × T_max is regarded as offline (a dropped participant with communication problems with the cloud center server) and is excluded from the set of candidate clients for sampling. The remaining clients are divided into n performance levels according to their total time Total_T_i, as follows: the remaining clients are sorted by Total_T_i, and then every ⌈N/n⌉ consecutive clients in the sorted order are grouped together and regarded as the same performance level, where N is the number of clients remaining after the offline ones are excluded and n is the number of performance levels.
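A minimal Python sketch of this grouping is given below; it is only an illustration under the reading given here (sort by total time, then every ⌈N/n⌉ clients form one level), and the client identifiers and helper names are invented:

import math

def divide_into_levels(total_times, R, T_max, n):
    # total_times: client id -> Total_T_i accumulated over R rounds (each round capped at T_max).
    # Clients whose total time reaches R * T_max are treated as offline and excluded.
    remaining = {c: t for c, t in total_times.items() if t < R * T_max}
    if not remaining:
        return []
    ordered = sorted(remaining, key=remaining.get)      # fastest clients first
    group_size = math.ceil(len(ordered) / n)            # assumed reading: ceil(N/n) clients per level
    return [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]

levels = divide_into_levels({"c1": 10.0, "c2": 12.0, "c3": 30.0, "c4": 31.0, "c5": 80.0},
                            R=4, T_max=20.0, n=2)
print(levels)   # [['c1', 'c2'], ['c3', 'c4']]; 'c5' is excluded as offline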
Step 2: the cloud center server initializes a global model and required global variables, and comprises the following sub-steps:
Step 2-1: the cloud center server initializes the neural network structure of the global model, including the numbers of neurons in the input layer, hidden layers and output layer of the global model's neural network; according to the initialized structure, the global model is initialized with a random initialization method based on the standard normal distribution, yielding the initialized global model W_0.
Step 2-2: the cloud center server initializes the global variables, which mainly include the total number T of communication rounds between the cloud center server and the clients (i.e. the total number of global model updates), the client sets u_1, u_2, ..., u_n obtained from the performance-level division in step 1 and their corresponding initial sampling rates ρ_1, ρ_2, ..., ρ_n, the numbers of opportunities p_1, p_2, ..., p_n for the client set of each performance level to be selected by the cloud center server for local model training, and the maximum number K of clients sampled each time. Here u_n denotes the client set of the nth level; p_n denotes the number of opportunities for the nth-level client set to be selected by the cloud center server for local model training; and ρ_n, the sampling rate corresponding to the nth-level client set, is computed from the opportunity number p_n by formula (1).
[Formula (1): the relation between ρ_n and p_n; the original equation image is not reproduced here]
Step 3: the cloud center server and the clients start the first round of communication, i.e. the communication round number t = 0. In the current round, the cloud center server randomly samples part of the clients from the client set of each performance level according to the corresponding initial sampling rate, obtaining the client sets used for the first round of local model training, and notifies the sampled clients to download the initialized global model from the cloud center server.
After the initialization in step 2, the first client sampling is performed: the cloud center server samples from the client set of each performance level separately, obtaining the client sets S_1, S_2, ..., S_n used for the first round of local model training, where the number of clients sampled from each performance level is |S_i| = ρ_i · K (i = 1, ..., n).
In this embodiment, the cloud center server randomly samples part of the clients from the client set of each performance level according to the sampling rate corresponding to that level, and notifies the sampled clients to download the initialized global model W_0 from the cloud center server. Taking the client set u_i of performance level i as an example, the cloud center server samples from u_i a client set S_i of size |S_i| = ρ_i · K, where ρ_i is the sampling rate corresponding to u_i and K is the maximum number of clients sampled this time.
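As an illustration of this sampling step (the client identifiers and helper names are invented; the size rule |S_i| = ρ_i · K follows the description above):

import random

def sample_level(u_i, rho_i, K, seed=None):
    # Sample |S_i| = rho_i * K clients from the level-i client set u_i.
    rng = random.Random(seed)
    size = min(len(u_i), max(1, int(rho_i * K)))   # clamp to the level's population
    return rng.sample(u_i, size)

S_i = sample_level(["c1", "c2", "c3", "c4", "c5", "c6"], rho_i=0.5, K=4, seed=0)
print(S_i)   # two clients drawn from performance level i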
Step 4: after receiving the notification from the cloud center server, each sampled client downloads the latest global model W_t, obtained after the current tth round of updating, from the cloud center server and uses it as the initial model for its next round of local model training.
Step 5: the sampled clients of the same performance level perform local model training in parallel on their local data using a stochastic gradient descent algorithm, obtaining their local models after multiple rounds of iterative training.
Taking the kth client sampled from the client set u_i of performance level i as an example, it solves the regularized optimization problem of formula (2) using a stochastic gradient descent algorithm, obtaining its local model W_t^{i,k} after multiple rounds of iterative training:
W_t^{i,k} = argmin_W h_{i,k}(W; W_t), where h_{i,k}(W; W_t) = F_{i,k}(W) + (μ/2)·||W - W_t||^2   (2)
In the above formula, F_{i,k}(W) represents the local loss function of the kth client in the client set of performance level i; μ is a regularization parameter; W_t is the latest global model, obtained after the tth round of updating, that the client downloads from the cloud center server; W is the local model of the client; (μ/2)·||W - W_t||^2 is the global-model proximal term; and h_{i,k}(W; W_t) is the objective function obtained by adding the global-model proximal term to the loss function of the client's local training. The local model is obtained by solving the optimization problem of minimizing this objective, which greatly mitigates the bias of the global model toward any single client's local model and improves accuracy while accelerating convergence.
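A minimal sketch of this local solver follows. Only the proximal structure of formula (2) is taken from the description; the one-parameter squared loss standing in for F_{i,k}, the learning rate and the epoch count are illustrative assumptions:

def local_train_prox(W_t, data, mu=0.1, lr=0.05, epochs=20):
    # Stochastic gradient descent on F_{i,k}(W) + (mu/2) * ||W - W_t||^2.
    W = list(W_t)                                 # start from the downloaded global model W_t
    for _ in range(epochs):
        for x, y in data:
            grad_loss = 2 * (W[0] * x - y) * x    # gradient of the stand-in local loss F_{i,k}
            grad_prox = mu * (W[0] - W_t[0])      # gradient of the global-model proximal term
            W[0] -= lr * (grad_loss + grad_prox)
    return W

W_local = local_train_prox(W_t=[0.0], data=[(1.0, 2.0), (2.0, 4.5)])
print(W_local)   # local model W_t^{i,k}, kept close to W_t by the proximal term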
Step 6: each client that has finished local model training uploads its local model to the cloud center server.
Step 7: for the client set of the performance level corresponding to each client that has finished local model training, the cloud center server subtracts 1 from its number of opportunities to be selected for local model training, and updates the corresponding sampling rate according to the current opportunity number and formula (1).
In this way, the opportunity number of a frequently selected client set keeps decreasing, so the clients of that performance level are assigned a lower sampling rate, which prevents the global model from being biased toward the local models of that performance level's clients through repeated local model training.
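The bookkeeping of step 7 can be sketched as below. Because formula (1) is only available as an equation image, the proportional rule used here to recompute the sampling rate is an assumption made purely for illustration:

def update_after_training(p, rho, level_i):
    # Decrement the opportunity count of the level that just trained (step 7) ...
    p[level_i] = max(0, p[level_i] - 1)
    # ... and recompute its sampling rate; this proportional rule is an assumed stand-in for formula (1).
    total = sum(p.values())
    rho[level_i] = p[level_i] / total if total > 0 else 0.0
    return p, rho

p = {1: 3, 2: 3, 3: 3}
rho = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
p, rho = update_after_training(p, rho, level_i=1)
print(p, rho)   # level 1 now has one opportunity fewer and a lower sampling rate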
Step 8: the cloud center server aggregates the local models uploaded by the clients of the same performance level that have finished local model training, and updates the global model.
Because clients belong to different performance levels, the time they need to complete local model training differs, whereas clients of the same performance level complete local model training in the same amount of time. The cloud center server aggregates the local models uploaded by clients of the same performance level that have finished local model training, and then updates the global model. Assuming that the client set of performance level i finishes local model training first, the update of the global model on the cloud center server includes the following sub-steps:
step 8-1: the cloud center server aggregates the local models uploaded by the clients belonging to the performance level i and having completed the local model training according to a formula (3):
\bar{W}_t^i = (1/|S_i|) · Σ_{k ∈ S_i} W_t^{i,k}   (3)
In the above formula, \bar{W}_t^i represents the model obtained by aggregating the local models uploaded by the clients of performance level i, and |S_i| = ρ_i · K is the size of the sampled client set of that performance level.
Step 8-2: the cloud center server updates the global model in a manner based on model staleness combined with user engagement awareness as shown in equation (4):
[Formula (4): the updated global model W_{t+1} is obtained from the current global model W_t and the aggregated level-i model \bar{W}_t^i, weighted by the sampling rate ρ_i and a staleness-related coefficient; the original equation image is not reproduced here]
In the above formula, W_{t+1} is the latest global model obtained after the (t+1)th round of updating on the cloud center server. ρ_i is the sampling rate corresponding to the client set of performance level i: when the clients of this performance level communicate frequently with the cloud center server, the decrease of the opportunity number (for the client set of this performance level to be selected by the cloud center server for local model training) limits this sampling rate, so that fewer clients are sampled from this performance level next time, i.e. user participation is reduced, and the influence of this performance level's local models on the global model is reduced. The staleness-related coefficient measures how far the local models uploaded by the clients lag behind the latest global model on the cloud center server (clients that need longer to complete communication with the cloud center server hold local models that lag behind the global model, which produces model staleness). This coefficient is determined by α and s_τ, where α is an adaptive hyper-parameter taking values in (0, 1), s_τ represents the degree to which the local models uploaded in the current tth round lag behind the latest global model on the cloud center server, and τ is the time interval required by the clients to upload their local models to the cloud center server.
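Steps 8-1 and 8-2 can be sketched together as follows. The plain averaging corresponds to formula (3); since formula (4) is only available as an equation image, the convex-combination update and the staleness function s(tau) = 1/(tau + 1) used below are illustrative assumptions rather than the patent's exact rule:

def aggregate_level(local_models):
    # Formula (3): average the local models uploaded by the clients of performance level i.
    dim = len(local_models[0])
    return [sum(m[j] for m in local_models) / len(local_models) for j in range(dim)]

def staleness_weight(alpha, tau):
    s_tau = 1.0 / (tau + 1.0)      # assumed lag measure: the longer the upload delay, the smaller the weight
    return alpha * s_tau           # coefficient determined jointly by alpha and s_tau

def update_global(W_t, W_bar_i, rho_i, alpha, tau):
    # Assumed reading of formula (4): participation (rho_i) and staleness jointly limit the
    # influence of the newly aggregated level-i model on the global model.
    a = rho_i * staleness_weight(alpha, tau)
    return [(1 - a) * wt + a * wb for wt, wb in zip(W_t, W_bar_i)]

W_bar = aggregate_level([[1.0, 2.0], [3.0, 2.0]])
W_next = update_global(W_t=[0.0, 0.0], W_bar_i=W_bar, rho_i=0.5, alpha=0.6, tau=2)
print(W_bar, W_next)   # aggregated level-i model and the updated global model W_{t+1}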
Step 9: for the clients that have finished local model training, the cloud center server randomly samples part of the clients from the client set of the corresponding performance level according to the updated sampling rate, and notifies them to download the latest global model from the cloud center server.
In this embodiment, for a client that has finished local model training, the cloud center server randomly samples part of the clients from the client set of the performance level that client belongs to, according to the sampling rate corresponding to that performance level, obtaining a client set for a new round of local model training, and notifies the sampled clients to download the latest global model from the cloud center server. If the performance level of the client that finished local model training is i, the cloud center server samples from the client set u_i, according to the corresponding sampling rate ρ_i, a client set S_i of size |S_i| = ρ_i · K.
Step 10: determine whether the current number of communication rounds t between the cloud center server and the clients satisfies t ≥ T; if not, set t = t + 1 and go to step 4 for the next round of global model update; if so, the whole learning process is finished.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions as defined in the appended claims.

Claims (7)

1. A federated learning algorithm based on model staleness and user engagement perception is characterized in that the algorithm comprises the following steps:
step 1: the cloud center server divides the clients participating in the task into different performance levels;
step 2: the cloud center server initializes a global model and required global variables; the global variables comprise the total number T of communication rounds between the cloud center server and the clients, the client sets u_1, u_2, ..., u_n of the n performance levels, the numbers of opportunities p_1, p_2, ..., p_n for the client set of each performance level to be selected by the cloud center server for local model training, and the initial sampling rates ρ_1, ρ_2, ..., ρ_n of the client sets of each performance level, determined according to the opportunity numbers;
step 3: the cloud center server and the clients start the first round of communication, i.e. the communication round number t = 0; in the current round of communication, the cloud center server randomly samples part of the clients from the client set of each performance level according to the corresponding initial sampling rate, obtaining the client sets used for the first round of local model training, and notifies the sampled clients to download the initialized global model from the cloud center server;
step 4: after receiving the notification from the cloud center server, each sampled client downloads the current latest global model from the cloud center server and uses it as the initial model for its next round of local model training;
step 5: the sampled clients of the same performance level perform local model training in parallel on their local data using a stochastic gradient descent algorithm, obtaining their local models after multiple rounds of iterative training;
step 6: each client that has finished local model training uploads its local model to the cloud center server;
step 7: for the client set of the performance level corresponding to each client that has finished local model training, the cloud center server subtracts 1 from its number of opportunities to be selected for local model training, and updates the sampling rate of that client set according to the current opportunity number;
step 8: the cloud center server aggregates the local models uploaded by the clients of the same performance level that have finished local model training, and updates the global model;
step 9: for the clients that have finished local model training, the cloud center server randomly samples part of the clients from the client set of the corresponding performance level according to the updated sampling rate, and notifies them to download the latest global model from the cloud center server;
step 10: determining whether the current number of communication rounds t between the cloud center server and the clients satisfies t ≥ T; if not, setting t = t + 1 and going to step 4 for the next round of global model update; if so, the task ends.
2. The federated learning algorithm based on model staleness and user engagement awareness as claimed in claim 1, wherein the cloud center server divides each client participating in the task into different performance levels according to the time it takes the client to complete the same task.
3. The federated learning algorithm based on model staleness and user engagement awareness as claimed in claim 2, wherein the step 1 further comprises the steps of:
step 1-1: the cloud center server distributes the same machine learning task to each client participating in the task, and sets the number R of local model training rounds for the clients, the maximum time T_max allowed for each round of local model training of the clients, and the number n of client performance levels;
step 1-2: after receiving the machine learning task sent by the cloud center server, each client executes R rounds of local model training, and after each round of local model training is finished sends the time T_i consumed by that round to the cloud center server;
step 1-3: when a per-round local model training time T_i received by the cloud center server from a client is greater than or equal to T_max, that time is counted as T_max;
step 1-4: after the R rounds of local model training are finished, the clients whose total time Total_T_i is greater than or equal to R × T_max are removed from all clients participating in the task, and the remaining clients are divided into n performance levels according to their total time Total_T_i over the R rounds of local model training.
4. The federated learning algorithm based on model staleness and user engagement awareness as claimed in claim 3, wherein the method for dividing the remaining clients into n performance levels according to their total time Total_T_i over the R rounds of local model training is as follows: the clients are sorted by Total_T_i, and then every ⌈N/n⌉ consecutive clients in the sorted order are grouped together and regarded as the same performance level, where N is the total number of clients and n is the preset total number of performance levels.
5. The federated learning algorithm based on model staleness and user engagement awareness as claimed in claim 1, wherein the relationship between the sampling rate corresponding to the set of clients at each performance level and the number of opportunities the set of clients has been selected by the cloud center server for local model training is as follows:
[Formula (1): the relation between ρ_n and p_n; the original equation image is not reproduced here]
where ρ_n represents the sampling rate corresponding to the client set of the nth level, and p_n represents the number of opportunities for the client set of the nth level to be selected by the cloud center server for local model training.
6. The federated learning algorithm based on model staleness and user engagement awareness as claimed in claim 1, wherein each sampled client uses a stochastic gradient descent algorithm to solve the regularized optimization problem of the following formula, obtaining the local model W_t^{i,k} of the client after multiple rounds of iterative training:
W_t^{i,k} = argmin_W h_{i,k}(W; W_t), where h_{i,k}(W; W_t) = F_{i,k}(W) + (μ/2)·||W - W_t||^2
In the above formula, F_{i,k}(W) represents the local loss function of the kth client in the client set of performance level i; μ is a regularization parameter; W_t is the latest global model downloaded by the client from the cloud center server; W is the local model of the client; (μ/2)·||W - W_t||^2 is the global-model proximal term; and h_{i,k}(W; W_t) is the objective function obtained by adding the global-model proximal term to the loss function of the client's local training.
7. The federated learning algorithm based on model staleness and user engagement awareness as claimed in claim 1, wherein the step 8 further comprises the steps of:
step 8-1: the cloud center server aggregates the local models uploaded by the clients belonging to the same performance level and having completed the local model training according to the following formula:
\bar{W}_t^i = (1/|S_i|) · Σ_{k ∈ S_i} W_t^{i,k}   (3)
In the above formula, \bar{W}_t^i represents the model obtained by aggregating the local models uploaded by the clients of performance level i, and |S_i| is the size of the client set of performance level i;
step 8-2: the cloud center server updates the global model in a manner based on model staleness combined with user engagement awareness as follows:
[Formula (4): the updated global model W_{t+1} is obtained from the current global model W_t and the aggregated level-i model \bar{W}_t^i, weighted by the sampling rate ρ_i and a staleness-related coefficient; the original equation image is not reproduced here]
In the above formula, W_{t+1} is the latest global model obtained after the (t+1)th round of updating on the cloud center server; ρ_i is the sampling rate corresponding to the client set of performance level i; the staleness-related coefficient measures how far the local models uploaded by the clients lag behind the latest global model on the cloud center server; this coefficient is determined by α and s_τ, where α is an adaptive hyper-parameter taking values in (0, 1), s_τ represents the degree to which the local models uploaded in the current tth round lag behind the latest global model on the cloud center server, and τ is the time interval required by the clients to upload their local models to the cloud center server.
CN202111476376.6A 2021-12-06 2021-12-06 Federal learning method based on model staleness and user participation perception Active CN114169543B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111476376.6A CN114169543B (en) 2021-12-06 2021-12-06 Federal learning method based on model staleness and user participation perception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111476376.6A CN114169543B (en) 2021-12-06 2021-12-06 Federal learning method based on model staleness and user participation perception

Publications (2)

Publication Number Publication Date
CN114169543A true CN114169543A (en) 2022-03-11
CN114169543B CN114169543B (en) 2024-08-09

Family

ID=80483310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111476376.6A Active CN114169543B (en) 2021-12-06 2021-12-06 Federal learning method based on model staleness and user participation perception

Country Status (1)

Country Link
CN (1) CN114169543B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115081626A (en) * 2022-07-21 2022-09-20 山东大学 Personalized federate sample-less learning system and method based on representation learning
CN115080801A (en) * 2022-07-22 2022-09-20 山东大学 Cross-modal retrieval method and system based on federal learning and data binary representation
CN115311692A (en) * 2022-10-12 2022-11-08 深圳大学 Federal pedestrian re-identification method, system, electronic device and storage medium
CN116506507A (en) * 2023-06-29 2023-07-28 天津市城市规划设计研究总院有限公司 Data processing method based on client characteristics

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113112027A (en) * 2021-04-06 2021-07-13 杭州电子科技大学 Federal learning method based on dynamic adjustment model aggregation weight
CN113221470A (en) * 2021-06-10 2021-08-06 南方电网科学研究院有限责任公司 Federal learning method for power grid edge computing system and related device thereof
WO2021155671A1 (en) * 2020-08-24 2021-08-12 平安科技(深圳)有限公司 High-latency network environment robust federated learning training method and apparatus, computer device, and storage medium
WO2021190638A1 (en) * 2020-11-24 2021-09-30 平安科技(深圳)有限公司 Federated modelling method based on non-uniformly distributed data, and related device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021155671A1 (en) * 2020-08-24 2021-08-12 平安科技(深圳)有限公司 High-latency network environment robust federated learning training method and apparatus, computer device, and storage medium
WO2021190638A1 (en) * 2020-11-24 2021-09-30 平安科技(深圳)有限公司 Federated modelling method based on non-uniformly distributed data, and related device
CN113112027A (en) * 2021-04-06 2021-07-13 杭州电子科技大学 Federal learning method based on dynamic adjustment model aggregation weight
CN113221470A (en) * 2021-06-10 2021-08-06 南方电网科学研究院有限责任公司 Federal learning method for power grid edge computing system and related device thereof

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115081626A (en) * 2022-07-21 2022-09-20 山东大学 Personalized federate sample-less learning system and method based on representation learning
CN115081626B (en) * 2022-07-21 2024-05-31 山东大学 Personalized federal few-sample learning system and method based on characterization learning
CN115080801A (en) * 2022-07-22 2022-09-20 山东大学 Cross-modal retrieval method and system based on federal learning and data binary representation
CN115080801B (en) * 2022-07-22 2022-11-11 山东大学 Cross-modal retrieval method and system based on federal learning and data binary representation
CN115311692A (en) * 2022-10-12 2022-11-08 深圳大学 Federal pedestrian re-identification method, system, electronic device and storage medium
CN115311692B (en) * 2022-10-12 2023-07-14 深圳大学 Federal pedestrian re-identification method, federal pedestrian re-identification system, electronic device and storage medium
CN116506507A (en) * 2023-06-29 2023-07-28 天津市城市规划设计研究总院有限公司 Data processing method based on client characteristics

Also Published As

Publication number Publication date
CN114169543B (en) 2024-08-09

Similar Documents

Publication Publication Date Title
CN114169543A (en) Federal learning algorithm based on model obsolescence and user participation perception
CN112882815B (en) Multi-user edge calculation optimization scheduling method based on deep reinforcement learning
CN113222179B (en) Federal learning model compression method based on model sparsification and weight quantification
Lee et al. Adaptive transmission scheduling in wireless networks for asynchronous federated learning
CN116523079A (en) Reinforced learning-based federal learning optimization method and system
CN115374853A (en) Asynchronous federal learning method and system based on T-Step polymerization algorithm
CN113378474B (en) Contribution-based federated learning client selection method, system and medium
CN113691594B (en) Method for solving data imbalance problem in federal learning based on second derivative
CN114912626A (en) Method for processing distributed data of federal learning mobile equipment based on summer pril value
CN116363449A (en) Image recognition method based on hierarchical federal learning
CN116187483A (en) Model training method, device, apparatus, medium and program product
CN114625506A (en) Edge cloud collaborative task unloading method based on adaptive covariance matrix evolution strategy
CN117994635B (en) Federal element learning image recognition method and system with enhanced noise robustness
CN116321255A (en) Compression and user scheduling method for high-timeliness model in wireless federal learning
CN116389270A (en) DRL (dynamic random link) joint optimization client selection and bandwidth allocation based method in federal learning
CN115695429A (en) Non-IID scene-oriented federal learning client selection method
Zhang et al. Optimizing federated edge learning on Non-IID data via neural architecture search
CN114723071A (en) Federal learning method and device based on client classification and information entropy
CN113391897B (en) Heterogeneous scene-oriented federal learning training acceleration method
CN113132482B (en) Distributed message system parameter adaptive optimization method based on reinforcement learning
CN115115064A (en) Semi-asynchronous federal learning method and system
CN117521781B (en) Differential privacy federal dynamic aggregation method and system based on important gradient protection
CN118586474A (en) Self-adaptive asynchronous federal learning method and system based on deep reinforcement learning
CN117692939B (en) Client scheduling method in dynamic communication environment
CN117557870B (en) Classification model training method and system based on federal learning client selection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant