CN114528304A - Federated learning method, system, and storage medium for adaptive client parameter updating - Google Patents

Federated learning method, system, and storage medium for adaptive client parameter updating

Info

Publication number
CN114528304A
CN114528304A
Authority
CN
China
Prior art keywords
client
local
model
central server
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210152598.0A
Other languages
Chinese (zh)
Inventor
潘紫柔 (Pan Zirou)
吴宣够 (Wu Xuangou)
卫琳娜 (Wei Linna)
张卫东 (Zhang Weidong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University of Technology AHUT
Original Assignee
Anhui University of Technology AHUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University of Technology AHUT filed Critical Anhui University of Technology AHUT
Priority to CN202210152598.0A
Publication of CN114528304A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/23 Updating
    • G06F 16/2365 Ensuring data consistency and integrity
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning
    • G06N 7/00 Computing arrangements based on specific mathematical models
    • G06N 7/01 Probabilistic graphical models, e.g. probabilistic networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a federated learning method, system, and storage medium for adaptive client parameter updating, relating to the technical field of wireless communication networks. In the method, a central server issues a global model to the clients; each client updates the model using its local data; before the next round of global model training, the client estimates energy consumption and transmission delay and uses reinforcement learning to select its number of local updates; when the client's number of local updates reaches the trained optimum, it uploads its model to the central server for global aggregation. The method executes the federated learning task efficiently, reduces the communication cost of exchanging federated learning model parameters, selects each client's optimal locally updated model, and improves the overall training efficiency of federated learning.

Description

Federated learning method, system, and storage medium for adaptive client parameter updating
Technical Field
The invention relates to the technical field of wireless communication networks, and in particular to a federated learning method, system, and storage medium for adaptive client parameter updating.
Background
Recent years have witnessed the rapid development of machine learning in artificial-intelligence applications. Every successful machine learning technique is built on large amounts of data; it is big data that enables artificial intelligence to perform tasks, in many fields, that are difficult for humans to complete.
However, as society develops, it has been found that in many applications such data volumes are difficult or even impossible to obtain. With the development of artificial intelligence, public attention to user privacy and data security keeps increasing; data owners are also reluctant to lose control of their data, which aggravates the problem of data islands and prevents the use of the big data needed to train artificial-intelligence models. Federated learning emerged in response: clients participating in training keep their data local and share only the parameters of the machine learning model trained on that local data; the model parameters can additionally be protected with techniques such as compression mechanisms, secure multi-party computation, and differential privacy, so user privacy and security are protected to a great extent.
However, as an emerging technology, federated learning still has problems. Through survey, analysis, and comparison, the problems and shortcomings of existing federated learning are found to be as follows:
federated learning suffers from data-quality problems: because each data set is stored locally, the server cannot inspect the data source, so it is difficult to guarantee that data labels are correct and that the data are not corrupted. Synchronous iteration in federated learning introduces a waiting-time problem: model parameters are exchanged between the federated server and the clients synchronously, and a new iteration can start only after all client models have finished updating; because of system heterogeneity, clients with strong computing capability and good network conditions sit idle for long periods. Finally, federated learning communication efficiency is low in some scenarios: most current federated learning is synchronous, and in each iteration the server must exchange data with many participants. If multiple defensive measures are adopted to protect the model and sensitive information, the server's communication burden grows further, and may even lead to denial of service or a single point of failure.
Disclosure of Invention
The object of the invention is to provide a federated learning method, system, and storage medium for adaptive client parameter updating that solve the synchronous-iteration and communication-efficiency problems of federated learning, better exploit the characteristics of federated learning, and make it applicable to more practical scenarios.
To achieve the above object, the invention provides the following technical solution: a federated learning method for adaptive client parameter updating, applied to a central server and comprising the following steps:
establishing a Q table at the central server using the Q-Learning algorithm, where the Q table is built as follows: starting from an arbitrary state s1 of the central server, select an arbitrary action a and issue it to all clients, obtain the feedback reward r, compute the Q value from the reward r and fill it into the Q table, then take action a as the next new state s2 entered by the central server; repeat this computation until the Q table no longer changes, or changes only within a set range;
the central server broadcasts the initialized global model parameters to all clients, so that each client trains on its own local data and updates its local resource information;
receiving the locally updated resource information uploaded by a client and, via a Markov decision process, selecting the action a with the maximum Q value in the Q table according to the state corresponding to that resource information and feeding it back to the client; taking the action a with the maximum Q value as the client's next new state s2 and iterating multiple times until the local model converges or reaches the set local model precision, obtaining the optimal locally updated model;
receiving the local model parameters uploaded by all clients, aggregating them with the federated averaging algorithm, and updating the global model parameters; the local model parameters uploaded by each client are the parameters of its optimal locally updated model;
issuing the updated global model parameters to each client, so that the clients repeat the process of determining the optimal locally updated model, iterating multiple times until the global models on all clients in the federated learning system converge or reach the set global model precision.
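The Q-table construction in the first step can be sketched as minimal tabular Q-Learning. The function name, the toy two-state environment, and its reward and transition rules below are illustrative assumptions, not the patent's actual state space or reward:

```python
import random

def q_learning(n_states, n_actions, reward_fn, transition_fn,
               alpha=0.5, gamma=0.9, epsilon=0.1, episodes=500, seed=0):
    """Tabular Q-Learning: repeatedly pick an action, observe the reward,
    and update the Q table until it stabilizes (here: a fixed step budget)."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(episodes):
        # epsilon-greedy: mostly exploit the current Q table, sometimes explore
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: Q[state][a])
        r = reward_fn(state, action)
        nxt = transition_fn(state, action)
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[state][action] += alpha * (r + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt
    return Q

# Toy 2-state chain: only action 1 taken in state 0 is rewarded.
Q = q_learning(
    2, 2,
    reward_fn=lambda s, a: 1.0 if (s == 0 and a == 1) else 0.0,
    transition_fn=lambda s, a: 1 if a == 1 else 0,
)
```

After training, the greedy action in state 0 is the rewarded action 1, which mirrors how the server later reads the maximum-Q action out of the table.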
Further, the Markov decision process is defined over the problem of selecting the strategy by which a client in the federated learning system determines its optimal locally updated model, and is denoted <S, A, P, R>, where S, A, P, and R are, respectively, the state space, action space, state transition probability, and reward function of the federated learning system;
the state space S is expressed as the resource information of all clients in the system and is defined as
Figure BDA0003511175980000031
Wherein II is Cartesian product, n is the number of clients in the Federal learning System, skIs the state of the client k, which is expressed as
sk={fk,ek,wk;fk≤F,ek≤E,wk≤W}
Wherein F, E, W is the limit of the periodic frequency, energy unit and wireless bandwidth of the central server, respectively, fkIs the period frequency of the client k, ekIs the energy unit of client k, wkA limit of wireless bandwidth for client k;
the operation space A is expressed as a combination of selection strategies of the central server for locally updating the local optimal model of all the clients contained in the system, and is defined as
Figure BDA0003511175980000032
Wherein, akIs an action of client k, and ak0 or 1;
when a isk0 denotes that client k does not upload updates of the local model, ak1 represents that the client k uploads the local model update of the current round;
the state transition probability P is expressed as the federate learning system from the current state s1Transition to the next state s2The state transition is determined according to the transition of all client states in the system;
the central server updates global model parameters according to the parameters of the local update local optimal model uploaded by the client, and evaluates the quality of the local update optimal model strategy of the client according to the accumulated reward, wherein the quality of the local update optimal model strategy of the client is searched by a Markov decision method, so that an optimal strategy is obtained; the optimal strategy represents that the client executes the strategy all the time in an initial state until the state of the client reaches local model convergence or set local model precision;
the accumulated reward is expressed by a reward function R, and the calculation method comprises the following steps:
Figure BDA0003511175980000041
wherein R issThe accumulated reward under the state s that the client k reaches the local model convergence or the set local model precision is shown, alpha and beta are discount factors, m is the local updating times of one round of training of the client, BkThe energy consumption required for each iteration of the client;
energy consumed by client k per iteration BkThe calculation is as follows:
BK=fk 2μG
where μ is the training data and G is the number of central server cycles required to process one local data.
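Under the stated model the per-iteration energy is a simple product; a hypothetical helper (the name `iteration_energy` and the sample values are not from the patent) illustrates that energy grows quadratically with the client's cycle frequency f_k:

```python
def iteration_energy(f_k, mu, G):
    """Energy consumed by client k in one local iteration, following the
    stated model B_k = f_k^2 * mu * G, where f_k is the client's cycle
    frequency, mu the amount of training data, and G the number of
    cycles needed to process one unit of data."""
    return (f_k ** 2) * mu * G

# Doubling the cycle frequency quadruples the per-iteration energy.
b = iteration_energy(f_k=2.0, mu=10, G=3)
```

This quadratic cost is why the reward function penalizes extra local updates: each one costs B_k.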
Further, the federated learning method for adaptive client parameter updating, applied to a client, comprises the following steps:
receiving the initialized global model parameters issued by the central server, training on the client's own local data, and updating the local resource information;
uploading the locally updated resource information to the central server, so that the central server, via a Markov decision process, selects the action a with the maximum Q value in the Q table according to the state corresponding to that resource information and feeds it back to the client; the Q table is established by the central server using the Q-Learning algorithm, built as follows: starting from an arbitrary state s1 of the central server, select an arbitrary action a and issue it to all clients, obtain the feedback reward r, compute the Q value from the reward r and fill it into the Q table, then take action a as the next new state s2 entered by the central server; repeat this computation until the Q table no longer changes, or changes only within a set range;
taking action a as the next new state s2 and iterating multiple times until the local model converges or reaches the set local model precision, obtaining the optimal locally updated model;
uploading the parameters of the optimal locally updated model to the central server, so that the central server aggregates them with the federated averaging algorithm and updates the global model parameters;
receiving the updated global model parameters issued by the central server, repeating the local-update process of determining the optimal locally updated model, and iterating multiple times until the global models on all clients in the federated learning system converge or reach the set global model precision.
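The client-side step of running several local updates before uploading can be sketched as follows; the function name, learning rate, and the scalar least-squares objective are illustrative assumptions standing in for the client's real local model:

```python
def local_update(w, data, lr=0.1, m=5):
    """One client's local training: m gradient-descent steps on its own
    data before anything is uploaded; m is the update count chosen by
    the server's policy. The quadratic objective is a toy stand-in."""
    for _ in range(m):
        # gradient of the mean squared error (w - y)^2 over the local data
        grad = sum(2.0 * (w - y) for y in data) / len(data)
        w -= lr * grad
    return w

# With enough local steps, w converges to the local data's mean (1.0 here).
w_local = local_update(w=0.0, data=[1.0, 1.0, 1.0], m=50)
```

Only the resulting parameters (here the scalar `w_local`) would be uploaded, never the data itself.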
Further, the federated learning system is defined to contain n clients, each storing local data; the local loss function and the global loss function of the system are then, respectively:
F_i(w) = (1/|D_i|) · Σ_{x ∈ D_i} f(w; x)
F(w) = Σ_{j=1}^{n} (|D_j|/|D|) · F_j(w)
|D| = Σ_{j=1}^{n} |D_j|
where i and j each denote a client in the federated learning system, w is the weight matrix of the global model, and D is the combined local data set stored by all clients.
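The global loss above is the data-size-weighted average of the clients' local losses, which a short hypothetical helper makes concrete (the name `global_loss` and the sample numbers are illustrative):

```python
def global_loss(local_losses, sizes):
    """Global loss F(w) = sum_j (|D_j|/|D|) * F_j(w): the data-size-weighted
    average of the clients' local losses F_j(w)."""
    total = sum(sizes)
    return sum((s / total) * loss for loss, s in zip(local_losses, sizes))

# A client holding 3x the data contributes 3x the weight to the global loss.
F = global_loss([0.2, 0.4], [30, 10])
```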
Further, the initialized global model parameters are obtained by initializing the weight matrix w of the global model in the global loss function to 0.
Further, the training that a client performs on its own local data consists of one or more gradient descent updates on that data.
Another technical solution disclosed by the invention is a federated learning system for adaptive client parameter updating, comprising a central server, several clients connected to the central server through a network, and the following modules:
an establishing module, used to establish a Q table at the central server using the Q-Learning algorithm, where the Q table is built as follows: starting from an arbitrary state s1 of the central server, select an arbitrary action a and issue it to all clients, obtain the feedback reward r, compute the Q value from the reward r and fill it into the Q table, then take action a as the next new state s2 entered by the central server; repeat this computation until the Q table no longer changes, or changes only within a set range;
a broadcast module, used by the central server to broadcast the initialized global model parameters to all clients, so that each client trains on its own local data and updates its local resource information;
a first receiving module, used to receive the locally updated resource information uploaded by a client and, via a Markov decision process, select the action a with the maximum Q value in the Q table according to the state corresponding to that resource information and feed it back to the client; the action a with the maximum Q value serves as the client's next new state s2, iterated multiple times until the local model converges or reaches the set local model precision, obtaining the optimal locally updated model;
a second receiving module, used to receive the local model parameters uploaded by all clients, aggregate them with the federated averaging algorithm, and update the global model parameters; the local model parameters uploaded by each client are the parameters of its optimal locally updated model;
an issuing module, used to issue the updated global model parameters to each client;
an iteration module, used to make each client repeat, according to the received updated global model parameters, the process of determining the optimal locally updated model, iterating multiple times until the global models on all clients in the federated learning system converge or reach the set global model precision.
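The aggregation performed by the second receiving module is federated averaging; a minimal sketch, assuming each client's parameters arrive as a flat list and each weight is the client's share |D_i|/|D| of the total data (the function name and sample values are illustrative):

```python
def fedavg(client_params, client_sizes):
    """Federated averaging: the server aggregates uploaded local model
    parameters, weighting each client by its share of the total data."""
    total = sum(client_sizes)
    agg = [0.0] * len(client_params[0])
    for params, size in zip(client_params, client_sizes):
        for i, p in enumerate(params):
            agg[i] += (size / total) * p
    return agg

# Two clients with equal data sizes: plain average of the parameter vectors.
new_global = fedavg([[1.0, 2.0], [3.0, 4.0]], [10, 10])
```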
The invention also discloses an electronic device, comprising a memory, a processor, and a control program of the federated learning system that is stored on the memory and runnable on the processor; when executed by the processor, the control program implements the above federated learning method for adaptive client parameter updating.
The invention also discloses a storage medium on which a control program of the federated learning system is stored; when executed by a processor, the control program implements the above federated learning method for adaptive client parameter updating.
From the above, the technical solution of the invention has the following beneficial effects:
The invention discloses a federated learning method, system, and storage medium for adaptive client parameter updating. The method first establishes a Q table at the central server using the Q-Learning algorithm. The central server then broadcasts the initialized global model parameters to all clients; each client trains on its own local data, updates its local resource information, and uploads it to the central server. Next, via a Markov decision process, the central server determines the state s1 corresponding to the resource information, selects the action a with the maximum Q value from the Q table, and feeds it back to the client; the action a with the maximum Q value serves as the client's next new state s2, and the computation is repeated over multiple iterations until the local model converges or reaches the set local model precision. The client then uploads the parameters of its optimal locally updated model, and the central server aggregates them with the federated averaging algorithm, updates the global model parameters, and issues them. Each client repeats the local-update process of determining the optimal number of updates until the global models on all clients in the federated learning system converge or reach the set global model precision.
The method searches for each client's optimal number of local updates by the Markov decision method, reduces the number of global aggregations at the central server, executes the federated learning task efficiently, reduces the communication cost of exchanging federated learning model parameters, dynamically selects the local-update optimum, and improves the overall training efficiency of federated learning.
It should be understood that all combinations of the foregoing concepts and additional concepts described in greater detail below can be considered as part of the inventive subject matter of this disclosure unless such concepts are mutually inconsistent.
The foregoing and other aspects, embodiments and features of the present teachings can be more fully understood from the following description taken in conjunction with the accompanying drawings. Additional aspects of the present invention, such as features and/or advantages of exemplary embodiments, will be apparent from the description which follows, or may be learned by practice of specific embodiments in accordance with the teachings of the present invention.
Drawings
The drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures may be represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. Embodiments of various aspects of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram of the federated learning system proposed by the invention;
FIG. 2 is a diagram of the Q-Learning process in reinforcement learning according to the invention;
FIG. 3 is a flowchart of the federated learning method for adaptive client parameter updating proposed by the invention.
In the figure, the specific meaning of each mark is:
1-central server, 2-client.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention without any inventive step, are within the scope of protection of the invention. Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs.
The use of "first," "second," and similar terms in the description and claims of the present application does not denote any order, quantity, or importance; the terms merely distinguish one element from another. Similarly, the singular forms "a," "an," and "the" do not denote a limitation of quantity but rather the presence of at least one, unless the context clearly dictates otherwise. The terms "comprises," "comprising," and the like mean that the element or item preceding the term encompasses the features, integers, steps, operations, elements, and/or components listed after the term, without precluding the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. "Upper," "lower," "left," "right," and the like indicate only relative positional relationships, which may change accordingly when the absolute position of the described object changes.
The invention provides a federated learning method, system, and storage medium for adaptive client parameter updating, solving the technical problems in the prior art that federated learning carries communication overhead, that the number of client local updates does not reach an optimal value, and that federated learning communication efficiency is low in some scenarios, failing to meet the requirements of industrial application scenarios.
The federated learning method, system, and storage medium for adaptive client parameter updating according to the invention are described in further detail below with reference to the embodiments shown in the drawings.
As shown in FIG. 3, a federated learning method for adaptive client parameter updating according to an embodiment of the invention is executed as follows: the central server 1 issues the global model to the clients 2; each client 2 updates the model using its local data; before the next round of global model training, the client 2 estimates energy consumption and transmission delay and uses reinforcement learning to select its optimal locally updated model; when the client's 2 number of local updates reaches the trained optimum, the client uploads the model to the central server for global aggregation; the local-update process is repeated until a preset condition is reached. By searching for the client's 2 selection strategy for the optimal locally updated model, the client 2 is prevented from uploading local model parameters after every local update, the number of global aggregations is reduced, and communication overhead is lowered.
Specifically, when the method of the invention is applied to the central server 1, it comprises: establishing a Q table at the central server 1 using the Q-Learning algorithm; broadcasting the initialized global model parameters to all clients 2 so that each client 2 trains on its own local data and updates its local resource information; receiving the locally updated resource information uploaded by a client 2 and, via a Markov decision process, selecting the action a with the maximum Q value in the Q table according to the state s corresponding to that resource information and feeding it back to the client 2, where the action a with the maximum Q value is the client's 2 next new state s2, and the computation is repeated over multiple iterations until the local model converges or reaches the set local model precision; receiving the parameters of the optimal locally updated models uploaded by all clients 2, aggregating them with the federated averaging algorithm, and updating the global model parameters; and issuing the updated global model parameters to each client 2, so that the clients 2 repeat the process of determining the optimal locally updated model, iterating multiple times until the global models on all clients 2 in the federated learning system converge or reach the set global model precision.
When the method of the invention is applied to the client 2, it comprises: receiving the initialized global model parameters issued by the central server 1, training on the client's own local data, and updating the local resource information; uploading the locally updated resource information to the central server 1, so that the central server 1, via a Markov decision process, selects the action a with the maximum Q value in the Q table according to the state corresponding to that resource information and feeds it back to the client, where the action a with the maximum Q value is the client's next new state s2, and the computation is repeated over multiple iterations until the local model converges or reaches the set local model precision, yielding the optimal locally updated model; the client 2 uploads the parameters of the optimal locally updated model to the central server 1, so that the central server 1 aggregates them with the federated averaging algorithm and updates the global model parameters; and receiving the updated global model parameters issued by the central server 1, repeating the process of determining the optimal locally updated model, and iterating multiple times until the global models on all clients 2 in the federated learning system converge or reach the set global model precision.
As shown in FIG. 2, when the method is applied, the Q table is built as follows: starting from an arbitrary state s of the central server 1, select an action a and issue it to all clients 2, obtain the feedback reward r, compute the Q value from the reward r and fill it into the Q table; take the selected action a as the next new state s of the client 2, and repeat the computation until the Q table no longer changes or changes only within a set range.
In addition, after a client 2 uploads its updated resource information to the central server 1, the central server 1 first observes the resource information of all clients 2, such as the wireless channel state and the real-time energy state, and then selects the optimal policy according to the clients' 2 resource information and the Markov decision process.
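Selecting the maximum-Q action for each observed state amounts to reading a greedy policy out of the Q table; a minimal sketch with a hypothetical two-state, two-action table (the function name and values are illustrative):

```python
def greedy_policy(q_table):
    """Derive the optimal strategy from a learned Q table: for each observed
    client-resource state, pick the action with the largest Q value, which
    is the feedback the central server sends back to the clients."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q_table]

# State 0 prefers action 1 (Q=0.9); state 1 prefers action 0 (Q=0.7).
policy = greedy_policy([[0.2, 0.9], [0.7, 0.1]])
```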
In the implementation of the method, the problem of selecting the strategy by which a client 2 in the federated learning system trains its optimal locally updated model is expressed as a Markov decision process, denoted <S, A, P, R>, where S, A, P, and R are, respectively, the state space, action space, state transition probability, and reward function of the federated learning system;
the state space S is expressed as resource information of all the clients 2 in the system, and is defined as
Figure BDA0003511175980000101
Wherein II is Cartesian product, n is the number of clients 2 in the Federal learning System, skIs the state of the client k, which is expressed as
sk={fk,ek,wk;fk≤F,ek≤E,wk≤W}
Wherein F, E, W is the limit of the periodic frequency, energy unit and wireless bandwidth of the central server, respectively, fkIs the period frequency of the client k, ekIs the energy unit of client k, wkA limit of wireless bandwidth for client k;
the operating space A is expressed as a combination of selection policies of the central server 1 for the number of local updates of all the clients 2 included in the system, defined as
Figure BDA0003511175980000102
Wherein, akIs an action of client k, and ak0 or 1;
when a isk0 denotes that client k does not upload updates of the local model, ak1 represents that the client k uploads the local model update of the current round;
the state transition probability P is expressed as the federate learning system from the current state s1Transition to the next state s2The state transition is determined according to the transition of all the client 2 states in the system;
the central server updates global model parameters according to the parameters of the local update local optimal model uploaded by the client, and evaluates the quality of the local update optimal model strategy of the client according to the accumulated reward, wherein the quality of the local update optimal model strategy of the client is searched by a Markov decision method, so that an optimal strategy is obtained; the optimal strategy represents that the client executes the strategy all the time in an initial state until the state of the client reaches local model convergence or set local model precision;
the accumulated reward is expressed by the reward function R and is calculated as follows:
[Equation image: the accumulated reward R_s, in terms of the discount factors α and β, the local update count m, and the per-iteration energy B_k]
where R_s is the accumulated reward in state s when client k reaches local model convergence or the set local model accuracy, α and β are discount factors, m is the number of local updates in one round of client training, and B_k is the energy consumed by the client per iteration;
the energy B_k consumed by the client 2 per iteration is calculated as:
B_k = f_k² μ G
where μ is the size of the training data and G is the number of processor cycles required to process one unit of local data.
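As a sketch of the energy model above, B_k = f_k² μ G can be computed directly; the function name and argument names are assumptions for illustration:

```python
def energy_per_iteration(cycle_freq: float, mu: float, cycles_per_datum: float) -> float:
    """Energy B_k consumed by client k per local iteration: B_k = f_k^2 * mu * G,
    where mu is the training data size and G the cycles needed per unit of data."""
    return cycle_freq ** 2 * mu * cycles_per_datum
```

Because the dependence on f_k is quadratic, doubling the cycle frequency quadruples the per-iteration energy at fixed μ and G.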
The federated learning method for adaptive client parameter updating is suitable for the federated learning system shown in Figure 1. Specifically, the federated learning system is defined to include n clients 2, each of which stores local data. The average loss function obtained by training on the local data of a single client 2, i.e. the local loss function F_i(w), and the loss function trained over the local data sets of all clients 2, i.e. the global loss function F(w), are:
[Equation images: the local loss function F_i(w), the global loss function F(w), and the corresponding training objective]
where i and j are each an arbitrary client 2 in the federated learning system, w is the weight matrix of the global model, and D is the local data set stored across all clients 2. The core of a machine learning problem is to iteratively update the parameter set that solves the loss function by feeding in a data set, so as to reduce the loss function to a set value; the training task of the federated learning system is likewise to solve for a weight matrix w that optimizes the global loss function.
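The exact equations are rendered as images in the original; the sketch below therefore assumes the standard FedAvg-style formulation, in which the global loss F(w) is the data-size-weighted average of the local losses F_i(w). All names are illustrative:

```python
def global_loss(local_losses, data_sizes):
    """F(w) = sum_i (|D_i| / |D|) * F_i(w): weight each client's local loss
    by its share of the total data set D (assumed FedAvg-style form)."""
    total = sum(data_sizes)
    return sum(n / total * loss for loss, n in zip(local_losses, data_sizes))
```

With equal data sizes this reduces to the plain average of the local losses.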
In this embodiment, the initialized global model parameters broadcast by the central server 1 are obtained by initializing the weight matrix w of the global model in the global loss function to 0, and the local training process performed by the client 2, after receiving the optimal number of local updates fed back by the central server 1, consists of one or more gradient descent updates on the local data.
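The local training step just described, one or more gradient descent updates on the local data, can be sketched as follows; the scalar weight and caller-supplied gradient function are simplifying assumptions of this sketch:

```python
def local_update(w, grad_fn, lr=0.1, num_steps=1):
    """Run num_steps gradient-descent updates w <- w - lr * dF_i/dw,
    where grad_fn(w) returns the gradient of the local loss F_i at w."""
    for _ in range(num_steps):
        w = w - lr * grad_fn(w)
    return w
```

For example, with F_i(w) = w² (so grad_fn is lambda w: 2 * w), one step from w = 1.0 at lr = 0.1 yields 0.8.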
In this embodiment, an electronic device is also provided. The electronic device includes a processor, a memory, and a control program of the federated learning system that is stored in the memory and executable on the processor; when the control program of the federated learning system is executed by the processor, the processor performs the method of the above embodiment.
The control program for the federated learning system described above may be run on a processor or may also be stored on a computer-readable storage medium that may implement information storage by any method or technique, including permanent and non-permanent, removable and non-removable media. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of memory storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a storage medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
These computer programs may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks, and corresponding steps may be implemented by different modules.
By way of example, the present embodiment provides a system, namely a federated learning system for adaptive client parameter updating, which comprises a central server 1, a plurality of clients 2 connected to the central server 1 through a network, and the following program modules. The establishing module is used for establishing a Q table in the central server by using the Q-Learning algorithm. The broadcast module is used by the central server to broadcast the initialized global model parameters to all the clients, so that each client trains on its own local data and updates its local resource information. The first receiving module is used for receiving the locally updated resource information uploaded by a client and, by a Markov decision process, selecting the action a corresponding to the maximum Q value in the Q table according to the state corresponding to the resource information and feeding it back to the client; the action a corresponding to the maximum Q value serves as the client's next new state s_2, and iteration proceeds until the local model converges or reaches the set local model accuracy, yielding a locally updated locally optimal model. The second receiving module is used for receiving the parameters of the local models uploaded by all the clients, aggregating them with the federated averaging algorithm, and updating the global model parameters; the parameters of the local model uploaded by a client are the parameters of its locally updated locally optimal model. The issuing module is used for issuing the updated global model parameters to each client. The repeated iteration module is used for repeatedly executing the process by which the client determines a locally updated locally optimal model from the received updated global model parameters, iterating multiple times until the global models in all the clients of the federated learning system converge or reach the set global model accuracy.
Optionally, the Q table in the establishing module is built as follows: in an arbitrary state s_1, the central server 1 selects an arbitrary action a and issues it to all the clients 2, obtains the feedback reward r, computes the Q value from the reward r and fills it into the Q table, and takes the selected action a as the next new state s_2 entered by the central server 1; this computation is repeated until the Q table no longer changes, or changes only within a set range. The main idea of the Q-Learning algorithm is to store Q values in a table indexed by state s and action a, and then to select the action that yields the maximum return according to the Q values.
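The Q-table construction and maximum-Q action selection described above can be sketched with a standard tabular Q-Learning backup; the patent does not give the exact update rule, so the learning rate alpha and discount gamma below are assumptions of this sketch:

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """One Q-Learning backup:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_b Q(s_next, b) - Q(s, a)).
    Q is a dict mapping (state, action) pairs to Q values; unseen pairs are 0."""
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]

def greedy_action(Q, s, actions):
    """Select the action with the maximum Q value in state s, as the
    central server does when feeding an action back to a client."""
    return max(actions, key=lambda b: Q.get((s, b), 0.0))
```

Repeating q_update until the table stops changing (within a set tolerance) corresponds to the stopping condition in the establishing module.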
At run time, the system locally updates the initialized global parameters through the clients 2 and uploads the resource information, i.e. a state s, to the central server 1. Using the Markov decision of reinforcement learning, the central server 1 selects the action a with the maximum Q value from the pre-established Q table and feeds it back to the client 2; the action a corresponding to the maximum Q value serves as the client 2's next new state s_2, and iteration proceeds until the local model converges or reaches the set local model accuracy, yielding a locally updated locally optimal model. The client 2 then uploads the parameters of this model, and the central server 1 performs global aggregation to update the global model parameters, thereby greatly reducing both local overhead and communication overhead.
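The central server 1's global aggregation step can be sketched as federated averaging of the uploaded parameter vectors; representing parameters as equal-length lists and weighting by local data size are assumptions of this sketch:

```python
def fedavg_aggregate(client_params, data_sizes):
    """New global parameters = data-size-weighted average of the clients'
    uploaded local model parameters (federated averaging)."""
    total = sum(data_sizes)
    dim = len(client_params[0])
    return [sum(n * p[i] for p, n in zip(client_params, data_sizes)) / total
            for i in range(dim)]
```

The aggregated vector is then broadcast back to the clients as the updated global model parameters.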
By obtaining a locally updated locally optimal model at the client 2 through the Markov decision method and uploading it to the central server for global aggregation, the method reduces the number of global aggregations performed by the central server, executes the federated learning task efficiently, reduces the communication cost of transmitting federated learning model parameters, dynamically selects the optimal number of local updates, and improves the overall training efficiency of federated learning.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims (10)

1. A federated learning method for adaptive client parameter updating, applied to a central server, comprising the following steps:
establishing a Q table at the central server by using a Q-Learning algorithm, wherein the Q table is established as follows: in an arbitrary state s_1, the central server selects an arbitrary action a and issues it to all the clients, obtains the feedback reward r, computes the Q value from the reward r and fills it into the Q table, and takes the action a as the next new state s_2 entered by the central server; the computation is repeated until the Q table no longer changes or changes only within a set range;
the central server broadcasts the initialized global model parameters to all the clients so that each client can train according to local data owned by the client and update local resource information;
receiving locally updated resource information uploaded by a client, and, using a Markov decision process, selecting the action a corresponding to the maximum Q value in the Q table according to the state corresponding to the resource information and feeding it back to the client; taking the action a corresponding to the maximum Q value as the client's next new state s_2 and iterating multiple times until the local model converges or reaches the set local model accuracy, thereby obtaining a locally updated locally optimal model;
receiving the parameters of the local models uploaded by all the clients, aggregating them with a federated averaging algorithm, and updating the global model parameters; wherein the parameters of the local model uploaded by a client are the parameters of its locally updated locally optimal model;
and issuing the updated global model parameters to each client so that the client repeatedly executes the process of determining the local updated local optimal model, and iterating for multiple times until the global models in all the clients in the federated learning system converge or reach the set global model precision.
2. The federated learning method for adaptive client parameter updating according to claim 1, wherein the Markov decision process is defined as the selection strategy problem of a client locally updating a locally optimal model in the federated learning system, denoted <S, A, P, R>, where S, A, P and R are respectively the state space, the action space, the state transition probability and the reward function of the federated learning system;
the state space S is expressed as the resource information of all clients in the system and is defined as
Figure FDA0003511175970000021
Wherein II is Cartesian product, n is the number of clients in the Federal learning System, skIs the state of the client k, which is expressed as
sk={fk,ek,wk;fk≤F,ek≤E,wk≤W}
Wherein F, E, W is the limit of the periodic frequency, energy unit and wireless bandwidth of the central server, respectively, fkIs the period frequency of the client k, ekIs the energy unit of client k, wkA limit of wireless bandwidth for client k;
the operation space A is expressed as a combination of selection strategies of the central server for locally updating the local optimal model of all the clients contained in the system, and is defined as
Figure FDA0003511175970000022
Wherein, akTo be guestAction of the client k, and ak0 or 1;
when a isk0 denotes that client k does not upload updates of the local model, ak1 represents that the client k uploads the local model update of the current round;
the state transition probability P is expressed as the probability that the federated learning system transitions from the current state s_1 to the next state s_2; the state transition is determined by the transitions of all client states in the system;
the central server updates the global model parameters according to the parameters of the locally updated locally optimal models uploaded by the clients, and evaluates the quality of each client's local update strategy according to the accumulated reward; the search over strategies is performed by the Markov decision method, yielding an optimal strategy; the optimal strategy means that, starting from the initial state, the client keeps executing the strategy until its local model converges or reaches the set local model accuracy;
the accumulated reward is expressed by the reward function R and is calculated as follows:
[Equation image: the accumulated reward R_s, in terms of the discount factors α and β, the local update count m, and the per-iteration energy B_k]
where R_s is the accumulated reward in state s when client k reaches local model convergence or the set local model accuracy, α and β are discount factors, m is the number of local updates in one round of client training, and B_k is the energy consumed by the client per iteration;
the energy B_k consumed by client k per iteration is calculated as:
B_k = f_k² μ G
where μ is the size of the training data and G is the number of processor cycles required to process one unit of local data.
3. The federated learning method for adaptive client parameter updating according to claim 1, applied to a client, comprising:
receiving the initialized global model parameters sent by the central server, training on the client's own local data, and updating the local resource information;
uploading the locally updated resource information to the central server, so that the central server, using a Markov decision process, selects the action a corresponding to the maximum Q value in a Q table according to the state corresponding to the resource information and feeds it back to the client; wherein the Q table is established by the central server using a Q-Learning algorithm as follows: in an arbitrary state s_1, the central server selects an arbitrary action a and issues it to all the clients, obtains the feedback reward r, computes the Q value from the reward r and fills it into the Q table, and takes the action a as the next new state s_2 entered by the central server; the computation is repeated until the Q table no longer changes or changes only within a set range;
taking the action a as the next new state s_2 and iterating multiple times until the local model converges or reaches the set local model accuracy, thereby obtaining a locally updated locally optimal model;
uploading the parameters of the locally updated locally optimal model to the central server, so that the central server aggregates them with a federated averaging algorithm and updates the global model parameters;
and receiving updated global model parameters sent by the central server, repeatedly executing the local updating process for determining the locally updated local optimal model, and iterating for multiple times until the global models in all the clients in the federal learning system converge or reach the set global model precision.
4. The federated learning method for adaptive client parameter updating according to claim 1, wherein the federated learning system is defined to include n clients, each of which stores local data, and the local loss function and the global loss function of the system are respectively:
[Equation images: the local loss function F_i(w), the global loss function F(w), and the corresponding training objective]
wherein i and j are each an arbitrary client in the federated learning system, w is the weight matrix of the global model, and D is the local data set stored across all clients.
5. The federated learning method for adaptive client parameter update according to claim 4, wherein the initialized global model parameters are obtained by initializing a weight matrix w of a global model included in a global loss function to 0.
6. The federated learning method for adaptive client parameter updating according to claim 1, wherein the central server is connected to each client through a network.
7. The federated learning method for adaptive client parameter updating according to claim 1, wherein the training process performed by the client on its own local data comprises one or more gradient descent updates on the local data.
8. A federated learning system for adaptive client parameter updating, characterized by comprising a central server, a plurality of clients connected to the central server through a network, and the following modules:
the establishing module is used for establishing a Q table at the central server by using a Q-Learning algorithm, wherein the Q table is established as follows: in an arbitrary state s_1, the central server selects an arbitrary action a and issues it to all the clients, obtains the feedback reward r, computes the Q value from the reward r and fills it into the Q table, and takes the action a as the next new state s_2 entered by the central server; the computation is repeated until the Q table no longer changes or changes only within a set range;
the broadcast module is used for broadcasting the initialized global model parameters to all the clients by the central server so that each client can train according to local data owned by the client and update local resource information;
the first receiving module is used for receiving locally updated resource information uploaded by a client and, by a Markov decision process, selecting the action a corresponding to the maximum Q value in the Q table according to the state corresponding to the resource information and feeding it back to the client; the action a corresponding to the maximum Q value is taken as the client's next new state s_2, and iteration is performed multiple times until the local model converges or reaches the set local model accuracy, thereby obtaining a locally updated locally optimal model;
the second receiving module is used for receiving the parameters of the local models uploaded by all the clients, aggregating them with a federated averaging algorithm, and updating the global model parameters; the parameters of the local model uploaded by a client are the parameters of its locally updated locally optimal model;
the issuing module is used for issuing the updated global model parameters to each client;
and the repeated iteration module is used for repeatedly executing the process of determining the locally updated local optimal model by the client according to the received updated global model parameters, and repeating for many times until the global models in all the clients in the federal learning system converge or reach the set global model precision.
9. An electronic device comprising a memory, a processor, and a control program of a federated learning system stored on the memory and operable on the processor, the control program of the federated learning system, when executed by the processor, implementing the federated learning method for adaptive client parameter updating as recited in any one of claims 1 to 7.
10. A storage medium having stored thereon a control program of a federal learning system, the control program of the federal learning system being executed by a processor to implement the federal learning method for adaptive client parameter update of any one of claims 1 to 7.
CN202210152598.0A 2022-02-18 2022-02-18 Federal learning method, system and storage medium for updating self-adaptive client parameters Pending CN114528304A (en)

Publications (1)

Publication Number Publication Date
CN114528304A true CN114528304A (en) 2022-05-24




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination