CN114528304A - Federated learning method, system, and storage medium for adaptive client parameter updating - Google Patents

Federated learning method, system, and storage medium for adaptive client parameter updating

Info

Publication number
CN114528304A
CN114528304A
Authority
CN
China
Prior art keywords
client
local
model
central server
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210152598.0A
Other languages
Chinese (zh)
Inventor
潘紫柔 (Pan Zirou)
吴宣够 (Wu Xuangou)
卫琳娜 (Wei Linna)
张卫东 (Zhang Weidong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University of Technology AHUT
Original Assignee
Anhui University of Technology AHUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University of Technology AHUT filed Critical Anhui University of Technology AHUT
Priority to CN202210152598.0A
Publication of CN114528304A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/23 Updating
    • G06F 16/2365 Ensuring data consistency and integrity
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning
    • G06N 7/00 Computing arrangements based on specific mathematical models
    • G06N 7/01 Probabilistic graphical models, e.g. probabilistic networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a federated learning method, system, and storage medium for adaptive client parameter updating, relating to the technical field of wireless communication networks. In the method, a central server issues a global model to the clients; each client updates the model using its local data; before the next round of global model training, the client estimates energy consumption and transmission delay and uses reinforcement learning to select its number of local updates; when the client's number of local updates reaches the trained optimum, it uploads its model to the central server for global aggregation. The method executes the federated learning task efficiently, reduces the communication cost of exchanging federated learning model parameters, selects each client's optimal locally updated model, and improves the overall training efficiency of federated learning.

Description

Federated learning method, system, and storage medium for adaptive client parameter updating
Technical Field
The invention relates to the technical field of wireless communication networks, and in particular to a federated learning method, system, and storage medium for adaptive client parameter updating.
Background
Recent years have witnessed the rapid development of machine learning in artificial-intelligence applications. Every successful machine learning technique is built on large amounts of data; it is big data that enables artificial intelligence to perform tasks, in many fields, that are difficult for humans to complete.
However, as society develops, it has been found that in many applications such data volumes are difficult or even impossible to obtain. With the development of artificial intelligence, public attention to user privacy and data security keeps increasing; data owners are also reluctant to lose control of their data, which aggravates the problem of data islands and prevents the use of the big data needed to train artificial-intelligence models. Federated learning emerged in response: clients participating in training keep their data local and share only the parameters of the machine learning model trained on that local data; the model parameters can additionally be protected with techniques such as compression mechanisms, secure multi-party computation, and differential privacy, so user privacy and security are protected to a great extent.
However, as an emerging technology, federated learning still has problems. Through survey, analysis, and comparison, the problems and shortcomings of existing federated learning are found to be as follows:
federated learning suffers from data-quality problems: because each data set is stored locally, the server cannot inspect the data source, so it is difficult to guarantee that data labels are correct and that the data are not corrupted. Synchronous iteration in federated learning introduces a waiting-time problem: model parameters are exchanged between the federated server and the clients synchronously, and a new iteration can start only after all client models have finished updating; because of system heterogeneity, clients with strong computing capability and good network conditions sit idle for long periods. Finally, federated learning communication efficiency is low in some scenarios: most current federated learning is synchronous, and in each iteration the server must exchange data with many participants. If multiple defensive measures are adopted to protect the model and sensitive information, the server's communication burden grows further, and may even lead to denial of service or a single point of failure.
Disclosure of Invention
The object of the invention is to provide a federated learning method, system, and storage medium for adaptive client parameter updating that solve the synchronous-iteration and communication-efficiency problems of federated learning, better exploit the characteristics of federated learning, and make it applicable to more practical scenarios.
To achieve the above object, the invention provides the following technical solution: a federated learning method for adaptive client parameter updating, applied to a central server and comprising the following steps:
establishing a Q table at the central server using the Q-Learning algorithm, where the Q table is built as follows: starting from an arbitrary state s1 of the central server, select an arbitrary action a and issue it to all clients, obtain the feedback reward r, compute the Q value from the reward r and fill it into the Q table, then take action a as the next new state s2 entered by the central server; repeat this computation until the Q table no longer changes, or changes only within a set range;
the central server broadcasts the initialized global model parameters to all clients, so that each client trains on its own local data and updates its local resource information;
receiving the locally updated resource information uploaded by a client and, via a Markov decision process, selecting the action a with the maximum Q value in the Q table according to the state corresponding to that resource information and feeding it back to the client; taking the action a with the maximum Q value as the client's next new state s2 and iterating multiple times until the local model converges or reaches the set local model precision, obtaining the optimal locally updated model;
receiving the local model parameters uploaded by all clients, aggregating them with the federated averaging algorithm, and updating the global model parameters; the local model parameters uploaded by each client are the parameters of its optimal locally updated model;
issuing the updated global model parameters to each client, so that the clients repeat the process of determining the optimal locally updated model, iterating multiple times until the global models on all clients in the federated learning system converge or reach the set global model precision.
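The Q-table construction in the first step can be sketched as minimal tabular Q-Learning. The function name, the toy two-state environment, and its reward and transition rules below are illustrative assumptions, not the patent's actual state space or reward:

```python
import random

def q_learning(n_states, n_actions, reward_fn, transition_fn,
               alpha=0.5, gamma=0.9, epsilon=0.1, episodes=500, seed=0):
    """Tabular Q-Learning: repeatedly pick an action, observe the reward,
    and update the Q table until it stabilizes (here: a fixed step budget)."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(episodes):
        # epsilon-greedy: mostly exploit the current Q table, sometimes explore
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: Q[state][a])
        r = reward_fn(state, action)
        nxt = transition_fn(state, action)
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[state][action] += alpha * (r + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt
    return Q

# Toy 2-state chain: only action 1 taken in state 0 is rewarded.
Q = q_learning(
    2, 2,
    reward_fn=lambda s, a: 1.0 if (s == 0 and a == 1) else 0.0,
    transition_fn=lambda s, a: 1 if a == 1 else 0,
)
```

After training, the greedy action in state 0 is the rewarded action 1, which mirrors how the server later reads the maximum-Q action out of the table.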
Further, the Markov decision process is defined over the problem of selecting the strategy by which a client in the federated learning system determines its optimal locally updated model, and is denoted <S, A, P, R>, where S, A, P, and R are, respectively, the state space, action space, state transition probability, and reward function of the federated learning system;
the state space S is expressed as the resource information of all clients in the system and is defined as
Figure BDA0003511175980000031
Wherein II is Cartesian product, n is the number of clients in the Federal learning System, skIs the state of the client k, which is expressed as
sk={fk,ek,wk;fk≤F,ek≤E,wk≤W}
Wherein F, E, W is the limit of the periodic frequency, energy unit and wireless bandwidth of the central server, respectively, fkIs the period frequency of the client k, ekIs the energy unit of client k, wkA limit of wireless bandwidth for client k;
the operation space A is expressed as a combination of selection strategies of the central server for locally updating the local optimal model of all the clients contained in the system, and is defined as
Figure BDA0003511175980000032
Wherein, akIs an action of client k, and ak0 or 1;
when a isk0 denotes that client k does not upload updates of the local model, ak1 represents that the client k uploads the local model update of the current round;
the state transition probability P is expressed as the federate learning system from the current state s1Transition to the next state s2The state transition is determined according to the transition of all client states in the system;
the central server updates global model parameters according to the parameters of the local update local optimal model uploaded by the client, and evaluates the quality of the local update optimal model strategy of the client according to the accumulated reward, wherein the quality of the local update optimal model strategy of the client is searched by a Markov decision method, so that an optimal strategy is obtained; the optimal strategy represents that the client executes the strategy all the time in an initial state until the state of the client reaches local model convergence or set local model precision;
the accumulated reward is expressed by a reward function R, and the calculation method comprises the following steps:
Figure BDA0003511175980000041
wherein R issThe accumulated reward under the state s that the client k reaches the local model convergence or the set local model precision is shown, alpha and beta are discount factors, m is the local updating times of one round of training of the client, BkThe energy consumption required for each iteration of the client;
energy consumed by client k per iteration BkThe calculation is as follows:
BK=fk 2μG
where μ is the training data and G is the number of central server cycles required to process one local data.
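Under the stated model the per-iteration energy is a simple product; a hypothetical helper (the name `iteration_energy` and the sample values are not from the patent) illustrates that energy grows quadratically with the client's cycle frequency f_k:

```python
def iteration_energy(f_k, mu, G):
    """Energy consumed by client k in one local iteration, following the
    stated model B_k = f_k^2 * mu * G, where f_k is the client's cycle
    frequency, mu the amount of training data, and G the number of
    cycles needed to process one unit of data."""
    return (f_k ** 2) * mu * G

# Doubling the cycle frequency quadruples the per-iteration energy.
b = iteration_energy(f_k=2.0, mu=10, G=3)
```

This quadratic cost is why the reward function penalizes extra local updates: each one costs B_k.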
Further, the federated learning method for adaptive client parameter updating, applied to a client, comprises the following steps:
receiving the initialized global model parameters issued by the central server, training on the client's own local data, and updating the local resource information;
uploading the locally updated resource information to the central server, so that the central server, via a Markov decision process, selects the action a with the maximum Q value in the Q table according to the state corresponding to that resource information and feeds it back to the client; the Q table is established by the central server using the Q-Learning algorithm, built as follows: starting from an arbitrary state s1 of the central server, select an arbitrary action a and issue it to all clients, obtain the feedback reward r, compute the Q value from the reward r and fill it into the Q table, then take action a as the next new state s2 entered by the central server; repeat this computation until the Q table no longer changes, or changes only within a set range;
taking action a as the next new state s2 and iterating multiple times until the local model converges or reaches the set local model precision, obtaining the optimal locally updated model;
uploading the parameters of the optimal locally updated model to the central server, so that the central server aggregates them with the federated averaging algorithm and updates the global model parameters;
receiving the updated global model parameters issued by the central server, repeating the local-update process of determining the optimal locally updated model, and iterating multiple times until the global models on all clients in the federated learning system converge or reach the set global model precision.
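The client-side step of running several local updates before uploading can be sketched as follows; the function name, learning rate, and the scalar least-squares objective are illustrative assumptions standing in for the client's real local model:

```python
def local_update(w, data, lr=0.1, m=5):
    """One client's local training: m gradient-descent steps on its own
    data before anything is uploaded; m is the update count chosen by
    the server's policy. The quadratic objective is a toy stand-in."""
    for _ in range(m):
        # gradient of the mean squared error (w - y)^2 over the local data
        grad = sum(2.0 * (w - y) for y in data) / len(data)
        w -= lr * grad
    return w

# With enough local steps, w converges to the local data's mean (1.0 here).
w_local = local_update(w=0.0, data=[1.0, 1.0, 1.0], m=50)
```

Only the resulting parameters (here the scalar `w_local`) would be uploaded, never the data itself.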
Further, the federated learning system is defined to contain n clients, each storing local data; the local loss function and the global loss function of the system are then, respectively:
F_i(w) = (1/|D_i|) · Σ_{x ∈ D_i} f(w; x)
F(w) = Σ_{j=1}^{n} (|D_j|/|D|) · F_j(w)
|D| = Σ_{j=1}^{n} |D_j|
where i and j each denote a client in the federated learning system, w is the weight matrix of the global model, and D is the combined local data set stored by all clients.
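The global loss above is the data-size-weighted average of the clients' local losses, which a short hypothetical helper makes concrete (the name `global_loss` and the sample numbers are illustrative):

```python
def global_loss(local_losses, sizes):
    """Global loss F(w) = sum_j (|D_j|/|D|) * F_j(w): the data-size-weighted
    average of the clients' local losses F_j(w)."""
    total = sum(sizes)
    return sum((s / total) * loss for loss, s in zip(local_losses, sizes))

# A client holding 3x the data contributes 3x the weight to the global loss.
F = global_loss([0.2, 0.4], [30, 10])
```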
Further, the initialized global model parameters are obtained by initializing the weight matrix w of the global model in the global loss function to 0.
Further, the training that a client performs on its own local data consists of one or more gradient descent updates on that data.
Another technical solution disclosed by the invention is a federated learning system for adaptive client parameter updating, comprising a central server, several clients connected to the central server through a network, and the following modules:
an establishing module, used to establish a Q table at the central server using the Q-Learning algorithm, where the Q table is built as follows: starting from an arbitrary state s1 of the central server, select an arbitrary action a and issue it to all clients, obtain the feedback reward r, compute the Q value from the reward r and fill it into the Q table, then take action a as the next new state s2 entered by the central server; repeat this computation until the Q table no longer changes, or changes only within a set range;
a broadcast module, used by the central server to broadcast the initialized global model parameters to all clients, so that each client trains on its own local data and updates its local resource information;
a first receiving module, used to receive the locally updated resource information uploaded by a client and, via a Markov decision process, select the action a with the maximum Q value in the Q table according to the state corresponding to that resource information and feed it back to the client; the action a with the maximum Q value serves as the client's next new state s2, iterated multiple times until the local model converges or reaches the set local model precision, obtaining the optimal locally updated model;
a second receiving module, used to receive the local model parameters uploaded by all clients, aggregate them with the federated averaging algorithm, and update the global model parameters; the local model parameters uploaded by each client are the parameters of its optimal locally updated model;
an issuing module, used to issue the updated global model parameters to each client;
an iteration module, used to make each client repeat, according to the received updated global model parameters, the process of determining the optimal locally updated model, iterating multiple times until the global models on all clients in the federated learning system converge or reach the set global model precision.
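The aggregation performed by the second receiving module is federated averaging; a minimal sketch, assuming each client's parameters arrive as a flat list and each weight is the client's share |D_i|/|D| of the total data (the function name and sample values are illustrative):

```python
def fedavg(client_params, client_sizes):
    """Federated averaging: the server aggregates uploaded local model
    parameters, weighting each client by its share of the total data."""
    total = sum(client_sizes)
    agg = [0.0] * len(client_params[0])
    for params, size in zip(client_params, client_sizes):
        for i, p in enumerate(params):
            agg[i] += (size / total) * p
    return agg

# Two clients with equal data sizes: plain average of the parameter vectors.
new_global = fedavg([[1.0, 2.0], [3.0, 4.0]], [10, 10])
```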
The invention also discloses an electronic device, comprising a memory, a processor, and a control program of the federated learning system that is stored on the memory and runnable on the processor; when executed by the processor, the control program implements the above federated learning method for adaptive client parameter updating.
The invention also discloses a storage medium on which a control program of the federated learning system is stored; when executed by a processor, the control program implements the above federated learning method for adaptive client parameter updating.
From the above, the technical solution of the invention has the following beneficial effects:
The invention discloses a federated learning method, system, and storage medium for adaptive client parameter updating. The method first establishes a Q table at the central server using the Q-Learning algorithm. The central server then broadcasts the initialized global model parameters to all clients; each client trains on its own local data, updates its local resource information, and uploads it to the central server. Next, via a Markov decision process, the central server determines the state s1 corresponding to the resource information, selects the action a with the maximum Q value from the Q table, and feeds it back to the client; the action a with the maximum Q value serves as the client's next new state s2, and the computation is repeated over multiple iterations until the local model converges or reaches the set local model precision. The client then uploads the parameters of its optimal locally updated model, and the central server aggregates them with the federated averaging algorithm, updates the global model parameters, and issues them. Each client repeats the local-update process of determining the optimal number of updates until the global models on all clients in the federated learning system converge or reach the set global model precision.
The method searches for each client's optimal number of local updates by the Markov decision method, reduces the number of global aggregations at the central server, executes the federated learning task efficiently, reduces the communication cost of exchanging federated learning model parameters, dynamically selects the local-update optimum, and improves the overall training efficiency of federated learning.
It should be understood that all combinations of the foregoing concepts and additional concepts described in greater detail below can be considered as part of the inventive subject matter of this disclosure unless such concepts are mutually inconsistent.
The foregoing and other aspects, embodiments and features of the present teachings can be more fully understood from the following description taken in conjunction with the accompanying drawings. Additional aspects of the present invention, such as features and/or advantages of exemplary embodiments, will be apparent from the description which follows, or may be learned by practice of specific embodiments in accordance with the teachings of the present invention.
Drawings
The drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures may be represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. Embodiments of various aspects of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram of the federated learning system proposed by the invention;
FIG. 2 is a diagram of the Q-Learning process in reinforcement learning according to the invention;
FIG. 3 is a flowchart of the federated learning method for adaptive client parameter updating proposed by the invention.
In the figure, the specific meaning of each mark is:
1-central server, 2-client.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention without any inventive step, are within the scope of protection of the invention. Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs.
The use of "first," "second," and similar terms in the description and claims of the present application does not denote any order, quantity, or importance; the terms merely distinguish one element from another. Similarly, the singular forms "a," "an," and "the" do not denote a limitation of quantity but rather the presence of at least one, unless the context clearly dictates otherwise. The terms "comprises," "comprising," and the like mean that the element or item preceding the term encompasses the features, integers, steps, operations, elements, and/or components listed after the term, without precluding the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. "Upper," "lower," "left," "right," and the like indicate only relative positional relationships, which may change accordingly when the absolute position of the described object changes.
The invention provides a federated learning method, system, and storage medium for adaptive client parameter updating, solving the technical problems in the prior art that federated learning carries communication overhead, that the number of client local updates does not reach an optimal value, and that federated learning communication efficiency is low in some scenarios, failing to meet the requirements of industrial application scenarios.
The federated learning method, system, and storage medium for adaptive client parameter updating according to the invention are described in further detail below with reference to the embodiments shown in the drawings.
As shown in FIG. 3, a federated learning method for adaptive client parameter updating according to an embodiment of the invention is executed as follows: the central server 1 issues the global model to the clients 2; each client 2 updates the model using its local data; before the next round of global model training, the client 2 estimates energy consumption and transmission delay and uses reinforcement learning to select its optimal locally updated model; when the client's 2 number of local updates reaches the trained optimum, the client uploads the model to the central server for global aggregation; the local-update process is repeated until a preset condition is reached. By searching for the client's 2 selection strategy for the optimal locally updated model, the client 2 is prevented from uploading local model parameters after every local update, the number of global aggregations is reduced, and communication overhead is lowered.
Specifically, when the method of the invention is applied to the central server 1, it comprises: establishing a Q table at the central server 1 using the Q-Learning algorithm; broadcasting the initialized global model parameters to all clients 2 so that each client 2 trains on its own local data and updates its local resource information; receiving the locally updated resource information uploaded by a client 2 and, via a Markov decision process, selecting the action a with the maximum Q value in the Q table according to the state s corresponding to that resource information and feeding it back to the client 2, where the action a with the maximum Q value is the client's 2 next new state s2, and the computation is repeated over multiple iterations until the local model converges or reaches the set local model precision; receiving the parameters of the optimal locally updated models uploaded by all clients 2, aggregating them with the federated averaging algorithm, and updating the global model parameters; and issuing the updated global model parameters to each client 2, so that the clients 2 repeat the process of determining the optimal locally updated model, iterating multiple times until the global models on all clients 2 in the federated learning system converge or reach the set global model precision.
When the method of the invention is applied to the client 2, it comprises: receiving the initialized global model parameters issued by the central server 1, training on the client's own local data, and updating the local resource information; uploading the locally updated resource information to the central server 1, so that the central server 1, via a Markov decision process, selects the action a with the maximum Q value in the Q table according to the state corresponding to that resource information and feeds it back to the client, where the action a with the maximum Q value is the client's next new state s2, and the computation is repeated over multiple iterations until the local model converges or reaches the set local model precision, yielding the optimal locally updated model; the client 2 uploads the parameters of the optimal locally updated model to the central server 1, so that the central server 1 aggregates them with the federated averaging algorithm and updates the global model parameters; and receiving the updated global model parameters issued by the central server 1, repeating the process of determining the optimal locally updated model, and iterating multiple times until the global models on all clients 2 in the federated learning system converge or reach the set global model precision.
As shown in FIG. 2, when the method is applied, the Q table is built as follows: starting from an arbitrary state s of the central server 1, select an action a and issue it to all clients 2, obtain the feedback reward r, compute the Q value from the reward r and fill it into the Q table; take the selected action a as the next new state s of the client 2, and repeat the computation until the Q table no longer changes or changes only within a set range.
In addition, after a client 2 uploads its updated resource information to the central server 1, the central server 1 first observes the resource information of all clients 2, such as the wireless channel state and the real-time energy state, and then selects the optimal policy according to the clients' 2 resource information and the Markov decision process.
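Selecting the maximum-Q action for each observed state amounts to reading a greedy policy out of the Q table; a minimal sketch with a hypothetical two-state, two-action table (the function name and values are illustrative):

```python
def greedy_policy(q_table):
    """Derive the optimal strategy from a learned Q table: for each observed
    client-resource state, pick the action with the largest Q value, which
    is the feedback the central server sends back to the clients."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q_table]

# State 0 prefers action 1 (Q=0.9); state 1 prefers action 0 (Q=0.7).
policy = greedy_policy([[0.2, 0.9], [0.7, 0.1]])
```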
In the implementation of the method, the problem of selecting the strategy by which a client 2 in the federated learning system trains its optimal locally updated model is expressed as a Markov decision process, denoted <S, A, P, R>, where S, A, P, and R are, respectively, the state space, action space, state transition probability, and reward function of the federated learning system;
the state space S is expressed as resource information of all the clients 2 in the system, and is defined as
Figure BDA0003511175980000101
Wherein II is Cartesian product, n is the number of clients 2 in the Federal learning System, skIs the state of the client k, which is expressed as
sk={fk,ek,wk;fk≤F,ek≤E,wk≤W}
Wherein F, E, W is the limit of the periodic frequency, energy unit and wireless bandwidth of the central server, respectively, fkIs the period frequency of the client k, ekIs the energy unit of client k, wkA limit of wireless bandwidth for client k;
the operating space A is expressed as a combination of selection policies of the central server 1 for the number of local updates of all the clients 2 included in the system, defined as
Figure BDA0003511175980000102
Wherein, akIs an action of client k, and ak0 or 1;
when a isk0 denotes that client k does not upload updates of the local model, ak1 represents that the client k uploads the local model update of the current round;
the state transition probability P is expressed as the federate learning system from the current state s1Transition to the next state s2The state transition is determined according to the transition of all the client 2 states in the system;
the central server updates global model parameters according to the parameters of the local update local optimal model uploaded by the client, and evaluates the quality of the local update optimal model strategy of the client according to the accumulated reward, wherein the quality of the local update optimal model strategy of the client is searched by a Markov decision method, so that an optimal strategy is obtained; the optimal strategy represents that the client executes the strategy all the time in an initial state until the state of the client reaches local model convergence or set local model precision;
the accumulated reward is expressed by the reward function R and is calculated as follows:
[Equation image: the accumulated reward R_s, in terms of the discount factors α and β, the local update count m, and the per-iteration energy B_k]
where R_s is the accumulated reward in state s when client k reaches local model convergence or the set local model accuracy, α and β are discount factors, m is the number of local updates in one round of client training, and B_k is the energy consumed by the client per iteration;
the energy B_k consumed by the client 2 per iteration is calculated as:
B_k = f_k² μ G
where μ is the size of the training data and G is the number of processor cycles required to process one unit of local data.
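As a sketch of the energy model above, B_k = f_k² μ G can be computed directly; the function name and argument names are assumptions for illustration:

```python
def energy_per_iteration(cycle_freq: float, mu: float, cycles_per_datum: float) -> float:
    """Energy B_k consumed by client k per local iteration: B_k = f_k^2 * mu * G,
    where mu is the training data size and G the cycles needed per unit of data."""
    return cycle_freq ** 2 * mu * cycles_per_datum
```

Because the dependence on f_k is quadratic, doubling the cycle frequency quadruples the per-iteration energy at fixed μ and G.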
The federated learning method for adaptive client parameter updating is suitable for the federated learning system shown in Figure 1. Specifically, the federated learning system is defined to include n clients 2, each of which stores local data. The average loss function obtained by training on the local data of a single client 2, i.e. the local loss function F_i(w), and the loss function trained over the local data sets of all clients 2, i.e. the global loss function F(w), are:
[Equation images: the local loss function F_i(w), the global loss function F(w), and the corresponding training objective]
where i and j are each an arbitrary client 2 in the federated learning system, w is the weight matrix of the global model, and D is the local data set stored across all clients 2. The core of a machine learning problem is to iteratively update the parameter set that solves the loss function by feeding in a data set, so as to reduce the loss function to a set value; the training task of the federated learning system is likewise to solve for a weight matrix w that optimizes the global loss function.
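The exact equations are rendered as images in the original; the sketch below therefore assumes the standard FedAvg-style formulation, in which the global loss F(w) is the data-size-weighted average of the local losses F_i(w). All names are illustrative:

```python
def global_loss(local_losses, data_sizes):
    """F(w) = sum_i (|D_i| / |D|) * F_i(w): weight each client's local loss
    by its share of the total data set D (assumed FedAvg-style form)."""
    total = sum(data_sizes)
    return sum(n / total * loss for loss, n in zip(local_losses, data_sizes))
```

With equal data sizes this reduces to the plain average of the local losses.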
In this embodiment, the initialized global model parameters broadcast by the central server 1 are obtained by initializing the weight matrix w of the global model in the global loss function to 0, and the local training process performed by the client 2, after receiving the optimal number of local updates fed back by the central server 1, consists of one or more gradient descent updates on the local data.
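The local training step just described, one or more gradient descent updates on the local data, can be sketched as follows; the scalar weight and caller-supplied gradient function are simplifying assumptions of this sketch:

```python
def local_update(w, grad_fn, lr=0.1, num_steps=1):
    """Run num_steps gradient-descent updates w <- w - lr * dF_i/dw,
    where grad_fn(w) returns the gradient of the local loss F_i at w."""
    for _ in range(num_steps):
        w = w - lr * grad_fn(w)
    return w
```

For example, with F_i(w) = w² (so grad_fn is lambda w: 2 * w), one step from w = 1.0 at lr = 0.1 yields 0.8.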
In this embodiment, an electronic device is also provided. The electronic device includes a processor, a memory, and a control program of the federated learning system that is stored in the memory and executable on the processor; when the control program of the federated learning system is executed by the processor, the processor performs the method of the above embodiment.
The control program for the federated learning system described above may be run on a processor or may also be stored on a computer-readable storage medium that may implement information storage by any method or technique, including permanent and non-permanent, removable and non-removable media. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of memory storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a storage medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
These computer programs may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks, and corresponding steps may be implemented by different modules.
By way of example, the present embodiment provides a system, namely a federated learning system for adaptive client parameter updating, which comprises a central server 1, a plurality of clients 2 connected to the central server 1 through a network, and the following program modules. The establishing module is used for establishing a Q table in the central server by using the Q-Learning algorithm. The broadcast module is used by the central server to broadcast the initialized global model parameters to all the clients, so that each client trains on its own local data and updates its local resource information. The first receiving module is used for receiving the locally updated resource information uploaded by a client and, by a Markov decision process, selecting the action a corresponding to the maximum Q value in the Q table according to the state corresponding to the resource information and feeding it back to the client; the action a corresponding to the maximum Q value serves as the client's next new state s_2, and iteration proceeds until the local model converges or reaches the set local model accuracy, yielding a locally updated locally optimal model. The second receiving module is used for receiving the parameters of the local models uploaded by all the clients, aggregating them with the federated averaging algorithm, and updating the global model parameters; the parameters of the local model uploaded by a client are the parameters of its locally updated locally optimal model. The issuing module is used for issuing the updated global model parameters to each client. The repeated iteration module is used for repeatedly executing the process by which the client determines a locally updated locally optimal model from the received updated global model parameters, iterating multiple times until the global models in all the clients of the federated learning system converge or reach the set global model accuracy.
Optionally, the Q table in the establishing module is built as follows: in an arbitrary state s_1, the central server 1 selects an arbitrary action a and issues it to all the clients 2, obtains the feedback reward r, computes the Q value from the reward r and fills it into the Q table, and takes the selected action a as the next new state s_2 entered by the central server 1; this computation is repeated until the Q table no longer changes, or changes only within a set range. The main idea of the Q-Learning algorithm is to store Q values in a table indexed by state s and action a, and then to select the action that yields the maximum return according to the Q values.
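The Q-table construction and maximum-Q action selection described above can be sketched with a standard tabular Q-Learning backup; the patent does not give the exact update rule, so the learning rate alpha and discount gamma below are assumptions of this sketch:

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """One Q-Learning backup:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_b Q(s_next, b) - Q(s, a)).
    Q is a dict mapping (state, action) pairs to Q values; unseen pairs are 0."""
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]

def greedy_action(Q, s, actions):
    """Select the action with the maximum Q value in state s, as the
    central server does when feeding an action back to a client."""
    return max(actions, key=lambda b: Q.get((s, b), 0.0))
```

Repeating q_update until the table stops changing (within a set tolerance) corresponds to the stopping condition in the establishing module.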
At run time, the system locally updates the initialized global parameters through the clients 2 and uploads the resource information, i.e. a state s, to the central server 1. Using the Markov decision of reinforcement learning, the central server 1 selects the action a with the maximum Q value from the pre-established Q table and feeds it back to the client 2; the action a corresponding to the maximum Q value serves as the client 2's next new state s_2, and iteration proceeds until the local model converges or reaches the set local model accuracy, yielding a locally updated locally optimal model. The client 2 then uploads the parameters of this model, and the central server 1 performs global aggregation to update the global model parameters, thereby greatly reducing both local overhead and communication overhead.
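The central server 1's global aggregation step can be sketched as federated averaging of the uploaded parameter vectors; representing parameters as equal-length lists and weighting by local data size are assumptions of this sketch:

```python
def fedavg_aggregate(client_params, data_sizes):
    """New global parameters = data-size-weighted average of the clients'
    uploaded local model parameters (federated averaging)."""
    total = sum(data_sizes)
    dim = len(client_params[0])
    return [sum(n * p[i] for p, n in zip(client_params, data_sizes)) / total
            for i in range(dim)]
```

The aggregated vector is then broadcast back to the clients as the updated global model parameters.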
By obtaining a locally updated locally optimal model at the client 2 through the Markov decision method and uploading it to the central server for global aggregation, the method reduces the number of global aggregations performed by the central server, executes the federated learning task efficiently, reduces the communication cost of transmitting federated learning model parameters, dynamically selects the optimal number of local updates, and improves the overall training efficiency of federated learning.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims (10)

1. A federated learning method for adaptive client parameter updating, applied to a central server, comprising the following steps:
establishing a Q table at the central server by using a Q-Learning algorithm, wherein the Q table is established as follows: in an arbitrary state s_1, the central server selects an arbitrary action a and issues it to all the clients, obtains the feedback reward r, computes the Q value from the reward r and fills it into the Q table, and takes the action a as the next new state s_2 entered by the central server; the computation is repeated until the Q table no longer changes or changes only within a set range;
the central server broadcasts the initialized global model parameters to all the clients so that each client can train according to local data owned by the client and update local resource information;
receiving locally updated resource information uploaded by a client, and, using a Markov decision process, selecting the action a corresponding to the maximum Q value in the Q table according to the state corresponding to the resource information and feeding it back to the client; taking the action a corresponding to the maximum Q value as the client's next new state s_2 and iterating multiple times until the local model converges or reaches the set local model accuracy, thereby obtaining a locally updated locally optimal model;
receiving the parameters of the local models uploaded by all the clients, aggregating them with a federated averaging algorithm, and updating the global model parameters; wherein the parameters of the local model uploaded by a client are the parameters of its locally updated locally optimal model;
and issuing the updated global model parameters to each client so that the client repeatedly executes the process of determining the local updated local optimal model, and iterating for multiple times until the global models in all the clients in the federated learning system converge or reach the set global model precision.
2. The federated learning method for adaptive client parameter updating according to claim 1, wherein the Markov decision process is defined as the selection strategy problem of a client locally updating a locally optimal model in the federated learning system, denoted <S, A, P, R>, where S, A, P and R are respectively the state space, the action space, the state transition probability and the reward function of the federated learning system;
the state space S is expressed as the resource information of all clients in the system and is defined as
Figure FDA0003511175970000021
Wherein II is Cartesian product, n is the number of clients in the Federal learning System, skIs the state of the client k, which is expressed as
sk={fk,ek,wk;fk≤F,ek≤E,wk≤W}
Wherein F, E, W is the limit of the periodic frequency, energy unit and wireless bandwidth of the central server, respectively, fkIs the period frequency of the client k, ekIs the energy unit of client k, wkA limit of wireless bandwidth for client k;
the operation space A is expressed as a combination of selection strategies of the central server for locally updating the local optimal model of all the clients contained in the system, and is defined as
Figure FDA0003511175970000022
Wherein, akTo be guestAction of the client k, and ak0 or 1;
when a isk0 denotes that client k does not upload updates of the local model, ak1 represents that the client k uploads the local model update of the current round;
the state transition probability P is expressed as the probability that the federated learning system transitions from the current state s_1 to the next state s_2; the state transition is determined by the transitions of all client states in the system;
the central server updates the global model parameters according to the parameters of the locally updated locally optimal models uploaded by the clients, and evaluates the quality of each client's local update strategy according to the accumulated reward; the search over strategies is performed by the Markov decision method, yielding an optimal strategy; the optimal strategy means that, starting from the initial state, the client keeps executing the strategy until its local model converges or reaches the set local model accuracy;
the accumulated reward is expressed by the reward function R and is calculated as follows:
[Equation image: the accumulated reward R_s, in terms of the discount factors α and β, the local update count m, and the per-iteration energy B_k]
where R_s is the accumulated reward in state s when client k reaches local model convergence or the set local model accuracy, α and β are discount factors, m is the number of local updates in one round of client training, and B_k is the energy consumed by the client per iteration;
the energy B_k consumed by client k per iteration is calculated as:
B_k = f_k² μ G
where μ is the size of the training data and G is the number of processor cycles required to process one unit of local data.
3. The federated learning method for adaptive client parameter updating according to claim 1, applied to a client, comprising:
receiving the initialized global model parameters sent by the central server, training on the client's own local data, and updating the local resource information;
uploading the locally updated resource information to the central server, so that the central server, using a Markov decision process, selects the action a corresponding to the maximum Q value in a Q table according to the state corresponding to the resource information and feeds it back to the client; wherein the Q table is established by the central server using a Q-Learning algorithm as follows: in an arbitrary state s_1, the central server selects an arbitrary action a and issues it to all the clients, obtains the feedback reward r, computes the Q value from the reward r and fills it into the Q table, and takes the action a as the next new state s_2 entered by the central server; the computation is repeated until the Q table no longer changes or changes only within a set range;
taking the action a as the next new state s_2 and iterating multiple times until the local model converges or reaches the set local model accuracy, thereby obtaining a locally updated locally optimal model;
uploading the parameters of the locally updated locally optimal model to the central server, so that the central server aggregates them with a federated averaging algorithm and updates the global model parameters;
and receiving updated global model parameters sent by the central server, repeatedly executing the local updating process for determining the locally updated local optimal model, and iterating for multiple times until the global models in all the clients in the federal learning system converge or reach the set global model precision.
4. The federated learning method for adaptive client parameter updating according to claim 1, wherein the federated learning system is defined to include n clients, each of which stores local data, and the local loss function and the global loss function of the system are respectively:
[Equation images: the local loss function F_i(w), the global loss function F(w), and the corresponding training objective]
wherein i and j are each an arbitrary client in the federated learning system, w is the weight matrix of the global model, and D is the local data set stored across all clients.
5. The federated learning method for adaptive client parameter update according to claim 4, wherein the initialized global model parameters are obtained by initializing a weight matrix w of a global model included in a global loss function to 0.
6. The federated learning method for adaptive client parameter updating according to claim 1, wherein the central server is connected to each client through a network.
7. The federated learning method for adaptive client parameter updating according to claim 1, wherein the training process performed by the client on its own local data comprises one or more gradient descent updates on the local data.
8. A federated learning system for adaptive client parameter updating, characterized by comprising a central server, a plurality of clients connected to the central server through a network, and the following modules:
the establishing module is used for establishing a Q table at the central server by using a Q-Learning algorithm, wherein the Q table is established as follows: in an arbitrary state s_1, the central server selects an arbitrary action a and issues it to all the clients, obtains the feedback reward r, computes the Q value from the reward r and fills it into the Q table, and takes the action a as the next new state s_2 entered by the central server; the computation is repeated until the Q table no longer changes or changes only within a set range;
the broadcast module is used for broadcasting the initialized global model parameters to all the clients by the central server so that each client can train according to local data owned by the client and update local resource information;
the first receiving module is used for receiving locally updated resource information uploaded by a client and, by a Markov decision process, selecting the action a corresponding to the maximum Q value in the Q table according to the state corresponding to the resource information and feeding it back to the client; the action a corresponding to the maximum Q value is taken as the client's next new state s_2, and iteration is performed multiple times until the local model converges or reaches the set local model accuracy, thereby obtaining a locally updated locally optimal model;
the second receiving module is used for receiving the parameters of the local models uploaded by all the clients, aggregating them with a federated averaging algorithm, and updating the global model parameters; the parameters of the local model uploaded by a client are the parameters of its locally updated locally optimal model;
the issuing module is used for issuing the updated global model parameters to each client;
and the repeated iteration module is used for repeatedly executing the process of determining the locally updated local optimal model by the client according to the received updated global model parameters, and repeating for many times until the global models in all the clients in the federal learning system converge or reach the set global model precision.
9. An electronic device comprising a memory, a processor, and a control program of a federated learning system stored on the memory and operable on the processor, the control program of the federated learning system, when executed by the processor, implementing the federated learning method for adaptive client parameter updating as recited in any one of claims 1 to 7.
10. A storage medium having stored thereon a control program of a federal learning system, the control program of the federal learning system being executed by a processor to implement the federal learning method for adaptive client parameter update of any one of claims 1 to 7.
CN202210152598.0A 2022-02-18 2022-02-18 Federal learning method, system and storage medium for updating self-adaptive client parameters Pending CN114528304A (en)

Publications (1)

Publication Number Publication Date
CN114528304A true CN114528304A (en) 2022-05-24




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination