CN116719607A - Model updating method and system based on federal learning - Google Patents
- Publication number: CN116719607A (application CN202310706337.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G06F2009/45562—Creating, deleting, cloning virtual machine instances (under the same G06F9/45558 hierarchy as above)
-
- G06F2009/45595—Network integration; Enabling network access in virtual machine instances (under the same G06F9/45558 hierarchy as above)
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a model updating method and system based on federal learning, wherein the method comprises the following steps: updating the clusters to which the target client belongs according to synthetic data of the target client participating in federal learning until the initialized K clusters reach a stable distribution state, and sending a target cluster model in each cluster to the target client, wherein the synthetic data is obtained according to the received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs no longer changes, and the target cluster model is determined according to the cluster model in each cluster when the initialized K clusters reach the stable distribution state; a first cyclic process is performed until the candidate models in the target client converge. The invention can fully utilize the data of all clients in a data-heterogeneous environment and provide an optimal personalized model for each client under the conditions of limited network communication resources and limited client computing and storage resources.
Description
Technical Field
The invention relates to the technical field of federal learning, in particular to a federal learning-based model updating method and a federal learning-based model updating system.
Background
As edge devices become more common in modern society, distributed private data is growing rapidly. This data provides a great opportunity for artificial intelligence applications. However, these data are stored as isolated islands on end users' own equipment, such as cell phones and personal computers, and are mostly highly sensitive. With the advent of data privacy protection laws, such as the General Data Protection Regulation (GDPR), the need to protect privacy in artificial intelligence has grown, so such data is not typically disclosed. In recent years, artificial intelligence (AI) has been evolving rapidly, with ever-larger AI models that demand ever more training data. In order to fuse information from the data of multiple institutions or individuals while protecting data privacy, federated learning has become an effective method for training AI models: it allows model training with large amounts of data while maintaining data confidentiality.
Federated learning (FL) is a machine learning paradigm proposed in recent years that adopts a "client-server" architecture and aims to solve the above problems; fig. 1 is a schematic diagram of the federated learning architecture provided in the prior art. The federated learning model is trained by iterating three steps until convergence: (1) the server sends the global model over the network to the clients participating in the current round of training; (2) after a client receives the global model, it overwrites its old local model and then trains with the client's private local data; (3) after training is completed, the client uploads the trained local model to the server over the network; once the server has received trained local models from a set number of clients, it performs model aggregation to generate a new global model. Federated learning thus enables clients holding multiple data silos to train collaboratively in a privacy-preserving manner: the clients' private data stays local and is never directly shared during training, so the clients can achieve better performance than working alone while protecting privacy.
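Under simplifying assumptions (linear least-squares client models and plain parameter averaging in the style of FedAvg; all names and sizes here are illustrative, not from the patent), the three-step loop can be sketched as:

```python
import numpy as np

def fed_avg_round(global_model, client_datasets, lr=0.1, local_steps=5):
    """One communication round of the three-step loop:
    (1) broadcast, (2) local training, (3) upload and aggregate."""
    local_models = []
    for X, y in client_datasets:
        w = global_model.copy()                # (1) client receives global model
        for _ in range(local_steps):           # (2) local training on private data
            grad = X.T @ (X @ w - y) / len(y)  # least-squares gradient
            w -= lr * grad
        local_models.append(w)                 # (3) upload trained local model
    return np.mean(local_models, axis=0)       # server aggregates by averaging

# toy setup: three clients whose data share one underlying linear model
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
clients = [(X, X @ w_true) for X in (rng.normal(size=(50, 2)) for _ in range(3))]
w = np.zeros(2)
for _ in range(30):
    w = fed_avg_round(w, clients)
```

With identically distributed client data, the averaged model recovers the shared optimum; the patent's point is that this breaks down when client distributions diverge.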
However, since the data sources and computing nodes are end users' personal devices, each client has its own independent data, and the training data are not independent and identically distributed (IID), i.e., the private data of the clients participating in training may differ in size and distribution. As a result, the single global model generated by conventional federated learning performs poorly for some clients, sometimes even worse than simply training a local model on each client's own local data (even if that data is small).
Personalized federated learning aims to use the federated learning process to compensate for the insufficient local data volume of clients, learning a model suited to each client's local dataset and improving the training effect of the personalized model.
In the personalized federated learning field, some federated learning methods attempt to solve this problem by performing local fine-tuning after global model training, but the personalized performance of this method still depends on the generalization performance of the global model and does not solve the essential problem. Other federated learning algorithms explore distributed clustering techniques to better model non-IID data using a multi-iteration clustering scheme, but this approach is very unfriendly to clients with limited communication bandwidth and computational resources.
Therefore, the data of all clients in the data heterogeneous environment can be fully utilized, and under the conditions of limited network communication resources, limited computing resources and limited storage resources of the clients, an optimal personalized model is learned for each client, so that higher performance is achieved, and the method is one of key problems of federal learning technology.
Disclosure of Invention
The model updating method and system based on federal learning, provided by the invention, are used for solving the problems in the prior art.
The invention provides a model updating method based on federal learning, which is applied to servers participating in federal learning and comprises the following steps:
updating clusters to which the target client belongs according to synthetic data of the target client participating in federal learning until the initialized K clusters reach a stable distribution state, and sending a target cluster model in each cluster to the target client, wherein the synthetic data is obtained according to the received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs is not changed any more, the target client comprises N clients, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
Executing the following first circulation process until the candidate model in the target client converges, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
the first cyclic process includes:
receiving a candidate model sent by a target client and updating the candidate model;
and sending the updated candidate model to the target client so that the target client updates the candidate model.
The model updating method based on federation learning provided by the invention is applied to servers participating in federation learning, and the updating of the cluster to which the target client belongs according to the synthetic data of the target client comprises the following steps:
initializing a cluster to which a target client belongs;
acquiring an empirical loss value of the cluster model of the synthesized data of each client in each cluster according to the empirical loss function of the cluster model in each cluster;
and updating the cluster to which the target client belongs according to the experience loss value of the cluster model of the synthesized data of each client in each cluster so as to determine the clients contained in each cluster when the K clusters reach a stable distribution state, wherein the clients contained in each cluster are the clients which meet the requirement that the experience loss value of the cluster model of the synthesized data in each cluster reaches the minimum value in the target client.
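The cluster-assignment rule above — each client joins the cluster whose model attains the minimum empirical loss on that client's synthetic data — can be sketched as follows. The scalar cluster models, tiny synthetic datasets, and squared-error loss are illustrative assumptions, not the patent's actual networks:

```python
import numpy as np

def assign_clusters(cluster_models, client_synth_data, loss_fn):
    """Assign each client to the cluster whose model has the minimum
    empirical loss on that client's synthetic data."""
    assignment = []
    for data in client_synth_data:
        losses = [loss_fn(w, data) for w in cluster_models]
        assignment.append(int(np.argmin(losses)))
    return assignment

def sq_loss(w, data):
    """Toy empirical loss: mean squared error of a scalar 'model' w."""
    return float(np.mean((data - w) ** 2))

models = [0.0, 5.0]                               # K = 2 cluster models
synth = [np.array([0.1, -0.2]),                   # client 0: data near 0
         np.array([4.8, 5.3]),                    # client 1: data near 5
         np.array([0.05])]                        # client 2: data near 0
print(assign_clusters(models, synth, sq_loss))    # → [0, 1, 0]
```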
The model updating method based on federation learning provided by the invention is applied to servers participating in federation learning, and updates the cluster to which the target client belongs according to the experience loss value of the cluster model of the synthesized data of each client in each cluster, and comprises the following steps:
performing a second cyclic process until the K clusters reach a stable distribution state;
the second cyclic process includes:
updating the cluster model in each cluster based on a gradient descent method;
updating experience loss values of updated cluster models of the clients contained in each cluster in the clusters based on the synthesized data of the clients contained in each cluster;
and updating the cluster to which the target client belongs according to the experience loss value of the updated cluster model of the client in the cluster.
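A minimal sketch of the second cyclic process — alternating gradient-descent updates of the cluster models with loss-based reassignment until no client changes cluster — assuming scalar cluster models, mean-squared-error as the empirical loss, and toy synthetic data (the patent's method would use trained network models instead):

```python
import numpy as np

def cluster_until_stable(synth, K=2, lr=0.5, max_iter=100, seed=0):
    """Alternate (a) gradient-descent updates of each cluster model on its
    members' synthetic data and (b) reassignment by minimum empirical loss,
    stopping when no client changes cluster (the stable distribution state)."""
    rng = np.random.default_rng(seed)
    models = rng.normal(size=K)                       # scalar cluster models
    assign = rng.integers(0, K, size=len(synth))      # initial cluster identities
    for _ in range(max_iter):
        for j in range(K):                            # update cluster models
            members = [synth[i] for i in range(len(synth)) if assign[i] == j]
            if members:
                data = np.concatenate(members)
                models[j] -= lr * 2 * np.mean(models[j] - data)  # d/dw of MSE
        new_assign = np.array([
            np.argmin([np.mean((d - w) ** 2) for w in models]) for d in synth
        ])
        if np.array_equal(new_assign, assign):        # stable: no change
            break
        assign = new_assign
    return models, assign

# four clients, two underlying distributions (near 0 and near 5)
synth = [np.array([0.0, 0.2]), np.array([0.1]),
         np.array([5.0, 4.9]), np.array([5.2])]
models, assign = cluster_until_stable(synth)
```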
According to the model updating method based on federation learning provided by the invention, the model updating method is applied to a server participating in federation learning, the candidate model sent by the target client is received, and the candidate model is updated, and the method comprises the following steps:
determining an optimization objective function of the candidate model according to the sum of a first optimization objective function and a second optimization objective function, wherein the first optimization objective function is the sum of loss functions of candidate models of the target clients, and the second optimization objective function is the sum of nonlinear functions of differences between a received target cluster model of any client and target cluster models received by other clients in the target clients;
Optimizing the first optimization objective function based on a gradient descent method;
optimizing the second optimization objective function based on an approximate point method;
and updating the candidate model according to the candidate model when the optimization objective function takes the minimum value.
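A hedged sketch of the server-side candidate update: the first objective (sum of local losses) is handled by a plain gradient step, and the second (a penalty on differences between clients' models) by a closed-form proximal shrinkage toward the consensus mean — a simplification standing in for the approximate point method, with `lam` and the quadratic penalty form being assumptions rather than the patent's actual functions:

```python
import numpy as np

def update_candidates(models, grads, lr=0.1, lam=0.5):
    """One server-side update of the stacked candidate models (one row per
    client): gradient step on objective 1, then the proximal operator of the
    quadratic penalty (lam/2)*||w_i - mean||^2, which shrinks each model
    toward the consensus mean by a factor 1/(1 + lr*lam)."""
    models = models - lr * grads              # gradient step, objective 1
    mean = models.mean(axis=0)
    alpha = 1.0 / (1.0 + lr * lam)            # prox shrinkage factor
    return mean + alpha * (models - mean)     # pull candidates together

# two one-dimensional candidate models with zero gradients: the prox step
# preserves their mean while reducing their spread
models = np.array([[0.0], [2.0]])
updated = update_candidates(models, np.zeros_like(models))
```

The design intuition matches the second objective's role in the claim: it couples the candidate models so that similar clients stay close without being forced to a single global model.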
The invention also provides a model updating method based on federal learning, which is applied to a target client participating in federal learning and comprises the following steps:
transmitting the data distribution information of the target client to a server participating in the federation learning so that the server updates a cluster to which the target client belongs according to the synthetic data of the target client, wherein the target client comprises N clients, and the synthetic data is obtained according to the received data distribution information of the target client;
receiving a target cluster model in each cluster transmitted by the server under the condition that the server determines that the initialized K clusters reach a stable distribution state, wherein the stable distribution state is that the cluster to which the target client belongs is not changed, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
Executing the following first circulation process until a candidate model is converged, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
the first cyclic process includes:
sending a candidate model to the server so that the server updates the received candidate model;
and receiving the updated candidate model sent by the server, and updating the updated candidate model.
The model updating method based on federation learning is applied to a target client participating in federation learning, and the data distribution information acquisition mode of the target client comprises the following steps:
pre-training a teacher network based on knowledge distillation technology;
taking the trained teacher network as the discriminator of a generative adversarial network, wherein the generative adversarial network is deployed at the target client;
training a generator in the generative adversarial network against the discriminator until the value of the loss function of the generative adversarial network is smaller than a preset value;
and determining the data distribution information of the target client according to the trained generator.
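The generator-training step can be illustrated with a deliberately tiny stand-in: the frozen "teacher" discriminator is reduced to a squared-distance score around the private data mean, and the "generator" to a single learnable location parameter — assumptions for illustration only, not the patent's GAN architecture:

```python
import numpy as np

def train_generator(teacher_mean, steps=200, lr=0.1, seed=0):
    """Sketch of the client-side step: a frozen teacher acts as the
    discriminator and the generator is trained against it until its loss
    is small. The 'teacher' here scores samples by squared distance to the
    private data mean; the 'generator' is a learnable offset added to noise."""
    rng = np.random.default_rng(seed)
    g = 0.0                                    # generator parameter
    for _ in range(steps):
        noise = rng.normal(size=32)            # noise input
        fake = g + noise                       # generated samples
        # discriminator loss gradient w.r.t. g: push fakes toward the
        # teacher's notion of the real distribution
        grad = 2 * np.mean(fake - teacher_mean)
        g -= lr * grad
    return g

g = train_generator(teacher_mean=3.0)          # learns to generate around 3
```

After training, `g` (standing in for the trained generator) is what the client uploads as its data distribution information, so raw private samples never leave the device.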
The invention also provides a model updating system based on federal learning, which is applied to a server participating in federal learning and comprises the following steps: the first sending module and the first updating module;
the first sending module is configured to update, according to synthetic data of a target client participating in federal learning, clusters to which the target client belongs, until initialized K clusters reach a stable distribution state, and send a target cluster model in each cluster to the target client, where the synthetic data is obtained according to received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs is no longer changed, the target client includes N clients, and the target cluster model is determined according to a cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
the first updating module is configured to execute a first cyclic process until a candidate model in the target client converges, where the candidate model is obtained after the target client updates a target cluster model in each received cluster;
The first cyclic process includes:
receiving a candidate model sent by a target client and updating the candidate model;
and the second sending module is used for sending the updated candidate model to the target client so that the target client can update the candidate model.
The invention also provides a model updating system based on federal learning, which is applied to a target client participating in federal learning and comprises the following steps: the device comprises a second sending module, a receiving module and a second updating module;
the second sending module is configured to send data distribution information of the target client to a server participating in the federal learning, so that the server updates a cluster to which the target client belongs according to composite data of the target client, where the target client includes N clients, and the composite data is obtained according to the received data distribution information of the target client;
the receiving module is configured to receive, when the server determines that the initialized K clusters reach a stable distribution state, a target cluster model in each cluster sent by the server, where the stable distribution state is that a cluster to which the target client belongs is no longer changed, and the target cluster model is determined according to a cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
The second updating module is configured to execute a first cyclic process until a candidate model converges, where the candidate model is obtained after the target client updates a target cluster model in each received cluster;
the first cyclic process includes:
sending a candidate model to the server so that the server updates the received candidate model;
the second receiving module is configured to receive the updated candidate model sent by the server, and update the updated candidate model.
The invention also provides an electronic device comprising a processor and a memory storing a computer program, the processor implementing the federal learning-based model updating method as described in any one of the above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a federal learning-based model updating method as described in any of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor implements a federal learning-based model updating method as described in any one of the above.
According to the model updating method and system based on federal learning, the target client participating in collaborative training uploads the data distribution information to the server, and the server performs clustering division on the target client according to the uploaded data distribution information to obtain the cluster to which the target client belongs, so that the computing resources and the storage resources of the target client are saved, the communication overhead between the target client and the server is reduced, and an optimal personalized model (i.e. a converged candidate model) is learned for each client.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a federal learning architecture provided by the prior art;
FIG. 2 is a schematic flow chart of a federal learning-based model update method according to the present invention;
FIG. 3 is a schematic flow chart of updating a cluster to which a target client belongs by a server provided by the invention;
FIG. 4 is a schematic diagram of generating a personalized model provided by the present invention;
FIG. 5 is a second flow chart of a federal learning-based model update method according to the present invention;
FIG. 6 is a schematic diagram of generating a data distribution with a generative adversarial network provided by the present invention;
FIG. 7 is a third flow chart of a federal learning-based model update method according to the present invention;
FIG. 8 is a schematic diagram of a federal learning-based model update system according to one embodiment of the present invention;
FIG. 9 is a second schematic diagram of a model update system based on federal learning according to the present invention;
fig. 10 is a schematic diagram of the physical structure of the electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
According to the model updating method based on federated learning provided by the invention, the clustering division of the clients is completed on the server side based on the clients' data distributions, which reduces the communication overhead between the clients and the server and saves the clients' computing and storage resources, and a high-performance personalized model is learned for each client within its cluster, thereby solving the problems in federated learning that the collaboratively learned global model performs poorly and model training converges slowly when the underlying clients' data distributions are heterogeneous. The method comprises the following steps: each client learns its underlying data distribution using generative adversarial network technology and sends the distribution to the server through a communication channel; the server clusters the participating clients by similarity using the data distribution information, so that clients with similar distributions are grouped together; within each cluster, the similarity among clients is fully exploited, and the clients collaboratively train a personalized model belonging to each client. The invention can complete the clustering work at the server using the clients' data distribution information, effectively reduce the communication load, improve communication efficiency, relieve communication bandwidth pressure, save the computing and storage resources of local clients, and significantly improve the effect of the personalized model. It is realized as follows:
Fig. 2 is a schematic flow chart of a federal learning-based model updating method according to the present invention, as shown in fig. 2, the method includes:
step 110, updating clusters to which the target client belongs according to synthetic data of the target client participating in federal learning until the initialized K clusters reach a stable distribution state, and sending a target cluster model in each cluster to the target client, wherein the synthetic data is obtained according to the received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs is not changed, the target client comprises N clients, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
step 120, executing the following first cyclic process until the candidate model in the target client converges, where the candidate model is obtained after the target client updates the received target cluster model in each cluster;
the first cyclic process includes:
receiving a candidate model sent by a target client and updating the candidate model;
And sending the updated candidate model to the target client so that the target client updates the candidate model.
It should be noted that the method may be executed by a computer device and, in particular, may be applied to a server participating in federated learning.
Optionally, the target client may specifically be N clients randomly selected by the server from among the clients participating in federated learning, where N is a positive integer.
The synthetic data of the N clients may specifically be obtained from the data distribution information of the N clients, i.e., the generator in each client, where the generator may specifically be obtained by training an initial local model (a generative adversarial network) deployed at the target client.
The server receives the data distribution information of the N clients, performs cluster division at the server side, and generates the synthetic data corresponding to each client according to the data distribution information (i.e., the trained generators) uploaded by the different clients; for example, the server inputs noise data into a trained generator to generate the synthetic data of each client.
The server receives the data distribution information G_1, G_2, ..., G_i, ..., G_N of the clients C_1, C_2, ..., C_i, ..., C_N numbered 1, 2, ..., i, ..., N, and generates the synthetic data corresponding to each client based on G_1, G_2, ..., G_i, ..., G_N; the synthetic data held at the server for the i-th client is denoted D'_i, and the size of this synthetic dataset is denoted |D'_i|.
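Concretely, the server-side synthesis of D'_i from each uploaded generator G_i might look like the following sketch, where each generator is reduced to an affine map over noise (an assumption for illustration; the patent's generators are trained networks):

```python
import numpy as np

def synthesize(generators, noise_dim=4, n_samples=100, seed=0):
    """Server side: rebuild a synthetic dataset D'_i for each client i by
    feeding noise through that client's uploaded generator G_i. Each
    'generator' here is an affine map (W, b) standing in for a network."""
    rng = np.random.default_rng(seed)
    synth = []
    for W, b in generators:
        z = rng.normal(size=(n_samples, noise_dim))  # noise input
        synth.append(z @ W + b)                      # D'_i = G_i(z)
    return synth

# two toy generators whose outputs are centered near 0 and near 5
gens = [(np.ones((4, 2)), np.array([0.0, 0.0])),
        (np.ones((4, 2)), np.array([5.0, 5.0]))]
data = synthesize(gens)
```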
And updating the cluster to which the target client belongs according to the synthetic data respectively corresponding to the N clients.
The cluster may be any one of the K clusters initialized by the server, each cluster corresponding to a cluster model whose parameters are denoted {ω_1, ω_2, ..., ω_K}, where ω_j denotes the parameters of the j-th cluster model and K is a positive integer.
The server iterates for a plurality of times on the basis of the synthesized data corresponding to the target client at the server, and estimates the cluster to which each client belongs according to the minimum loss calculated by the empirical loss function of the cluster model in the K clusters until the initialized K clusters reach a stable distribution state, wherein the stable distribution state specifically means that the cluster to which each client in the target client belongs is not changed, namely, the client contained in each cluster is not changed.
And under the condition that the initialized K clusters reach a stable distribution state, the server sends the target cluster model in each cluster to the clients in the cluster according to the cluster to which each client in the N clients belongs.
And after the target client receives the target clustering model sent by the server, updating the target clustering model by adopting a gradient descent method.
The following first cyclic process is performed until the candidate model in the target client converges.
The first cyclic process includes:
the target client sends the updated target cluster model (i.e., candidate model) to the server.
And the server updates the parameters in the received candidate models and sends the updated candidate models to the target client.
The target client updates the received updated candidate model sent by the server using the gradient descent method until the candidate model converges, and takes the converged candidate model as the personalized model of the target client.
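The first cyclic process above can be sketched as the following toy exchange, where both the client-side and server-side updates are gradient steps on a shared quadratic loss (an assumption purely for illustration; in the patent the two sides apply different updates):

```python
def first_loop(client_model, target=1.0, lr=0.3, tol=1e-6, max_rounds=200):
    """The first cyclic process: the client refines its candidate model,
    uploads it, the server applies its own update and sends it back, and
    the loop repeats until the candidate model converges."""
    w = client_model
    for _ in range(max_rounds):
        w_client = w - lr * 2 * (w - target)            # client-side step
        w_server = w_client - lr * 2 * (w_client - target)  # server-side step
        if abs(w_server - w) < tol:                     # convergence check
            return w_server
        w = w_server
    return w

w = first_loop(0.0)   # converges to the quadratic optimum at 1.0
```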
According to the model updating method based on federal learning, the target client participating in collaborative training uploads the data distribution information to the server, and the server performs clustering division on the target client according to the uploaded data distribution information to obtain the cluster to which the target client belongs, so that the computing resources and the storage resources of the target client are saved, the communication overhead between the target client and the server is reduced, and an optimal personalized model (namely a converged candidate model) is learned for each client.
Further, in one embodiment, the updating the cluster to which the target client belongs according to the composite data of the target client may specifically include:
initializing a cluster to which a target client belongs;
acquiring an empirical loss value of the cluster model of the synthesized data of each client in each cluster according to the empirical loss function of the cluster model in each cluster;
and updating the cluster to which the target client belongs according to the experience loss value of the cluster model of the synthesized data of each client in each cluster so as to determine the clients contained in each cluster when the K clusters reach a stable distribution state, wherein the clients contained in each cluster are the clients which meet the requirement that the experience loss value of the cluster model of the synthesized data in each cluster reaches the minimum value in the target client.
Further, in an embodiment, the updating the cluster to which the target client belongs according to the experience loss value of the cluster model of the composite data of each client in each cluster may specifically include:
performing a second cyclic process until the K clusters reach a stable distribution state;
The second cyclic process includes:
updating the cluster model in each cluster based on a gradient descent method;
updating experience loss values of updated cluster models of the clients contained in each cluster in the clusters based on the synthesized data of the clients contained in each cluster;
and updating the cluster to which the target client belongs according to the experience loss value of the updated cluster model of the client in the cluster.
Optionally, fig. 3 is a schematic flow chart of updating the cluster to which the target client belongs by the server provided in the present invention. As shown in fig. 3, the server allocates a cluster identity to each client in the target client, where the cluster identity is used to characterize the cluster to which the client belongs; the cluster identity of the i-th client C_i is denoted s_i.
Initialize the K cluster models at the server side: the k-th cluster model is denoted θ_k, the number of layers of the cluster model is denoted |θ_k|, and the weight parameters of the k-th cluster model are denoted {θ_k^(1), ..., θ_k^(|θ_k|)}, where p represents the penultimate p layers of the cluster model; the initialized k-th cluster model is denoted θ_k^(0); the empirical loss function of the cluster model is denoted F_k(θ_k).
Determining the empirical loss function F_k(θ_k) of the cluster model: F_k(θ_k) = argmin{F_k(θ_k,1), F_k(θ_k,2), ..., F_k(θ_k,n)}.
The server computes, for the synthetic data G_i of the client C_i, the empirical loss values under the K cluster models: F(θ_1), F(θ_2), ..., F(θ_K).
The model under which the synthetic data G_i attains the minimum loss among the K models is found, and the cluster identity of each client is updated accordingly, i.e., s_i = argmin_k F(θ_k), until the K clusters reach a stable distribution state.
And under the condition that the K clusters reach a stable distribution state, the server determines the clients contained in each cluster, wherein the clients contained in each cluster are the clients which meet the condition that the empirical loss value of the cluster model of the synthesized data in each cluster reaches the minimum value (namely the minimum loss) in the target clients.
Optionally, after confirming the cluster identity at each iteration, in each cluster, updating the empirical loss value of the kth cluster model in combination with the synthetic data of the clients in the cluster.
Specifically, the server performs the following second round robin procedure until the K clusters reach a stable distribution state:
after the cluster to which the target client belongs is updated once through the above process, the m clients contained in the k-th cluster are denoted C_1, C_2, ..., C_m.
And updating the cluster model in the cluster by the server in a gradient descent mode.
Within the k-th cluster, the empirical loss value F(θ_k) of the updated cluster model is updated based on the synthetic data G_1, G_2, ..., G_m of the clients C_1, C_2, ..., C_m in the cluster, where θ_k ← θ_k − γ∇F(θ_k) and γ is the learning rate of the cluster-model weight training.
And the server updates the cluster to which the target client belongs according to the experience loss value of the updated cluster model of the client in each cluster until the K clusters reach a stable distribution state.
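The second cyclic process is structurally similar to k-means: alternate a gradient step on each cluster model with a minimum-empirical-loss reassignment of cluster identities, stopping when no identity changes. A small sketch under toy assumptions (each client's synthetic data collapsed to a single vector, quadratic empirical loss — both illustrative, not the patent's actual models):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "synthetic data": client i is summarised by a vector g[i]; cluster model
# theta[k] fits it with empirical loss F(theta_k) = 0.5 * ||g_i - theta_k||^2.
g = np.vstack([rng.normal(0.0, 0.3, size=(5, 2)),    # clients with one distribution
               rng.normal(5.0, 0.3, size=(5, 2))])   # clients with another
K, gamma = 2, 0.5
theta = np.array([[1.0, 1.0], [4.0, 4.0]])           # K initialised cluster models

def assign(theta):
    # empirical loss of each client's synthetic data under each cluster model
    losses = 0.5 * ((g[:, None, :] - theta[None, :, :]) ** 2).sum(axis=2)
    return losses.argmin(axis=1)                     # minimum-loss cluster identity

identity = assign(theta)                             # initial cluster identities
for _ in range(100):                                 # second cyclic process
    for k in range(K):                               # 1) gradient step per cluster model
        members = g[identity == k]
        if len(members):
            theta[k] -= gamma * (theta[k] - members.mean(axis=0))
    new_identity = assign(theta)                     # 2)+3) reassign by minimum loss
    if np.array_equal(new_identity, identity):       # stable distribution state
        break
    identity = new_identity
```

With well-separated toy data the loop reaches the stable distribution state after a handful of iterations, splitting the clients into the two underlying groups.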
According to the model updating method based on federal learning, the clustering work of the clients is completed by the server: the target client participating in collaborative training uploads its data distribution information to the server, and the server divides the clients according to the uploaded data distribution information by using a clustering strategy that minimizes the empirical loss.
Further, in an embodiment, the receiving the candidate model sent by the target client and updating the candidate model may specifically include:
Determining an optimization objective function of the candidate model according to the sum of a first optimization objective function and a second optimization objective function, wherein the first optimization objective function is the sum of loss functions of candidate models of the target clients, and the second optimization objective function is the sum of nonlinear functions of differences between a received target cluster model of any client and target cluster models received by other clients in the target clients;
optimizing the first optimization objective function based on a gradient descent method;
optimizing the second optimization objective function based on a proximal point method;
and updating the candidate model according to the candidate model when the optimization objective function takes the minimum value.
Optionally, when each cluster reaches the stable distribution state, the cluster model of each cluster is obtained and the different cluster models θ_k are issued to the corresponding clients, specifically:
after clustering reaches the steady state, the clients C_1, C_2, ..., C_i, ..., C_n are divided into K clusters, and the k-th cluster model θ_k is issued to the clients belonging to that cluster.
Step 11, the client C_i updates the received target cluster model sent by the server in a gradient descent mode, and sends the updated target cluster model (namely a candidate model) to the server, specifically:
Step 111, the clients C_1, C_2, ..., C_i, ..., C_n receive from the server the target cluster model corresponding to the cluster identity of each client;
Step 112, the i-th client C_i denotes the received target cluster model by w_i, the number of layers of the target cluster model by |w_i|, and the weights of the target cluster model by {w_i^(1), ..., w_i^(|w_i|)}, where p represents that only local aggregation is applied on the penultimate p layers of the target cluster model; the loss function of the target cluster model is denoted F(w_i);
Step 113, determining the objective function F(w_i): F(w_i) = argmin{F(w_1), F(w_2), ..., F(w_n)};
Step 114, determining the update rule of the target cluster model: the received target cluster model overwrites the old local model, and then the target cluster model is updated by combining the local private data in a gradient descent mode: w_i ← w_i − η∇F(w_i), wherein η is the learning rate of the local update model, and the result is the candidate model obtained after client C_i updates the target cluster model;
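The local update rule of step 114 can be sketched as plain gradient descent on a stand-in local loss; the least-squares data below is hypothetical and only illustrates w_i ← w_i − η∇F(w_i):

```python
import numpy as np

# Step 114 sketch: the received target cluster model overwrites the old local
# model, then gradient descent on (hypothetical) local private data refines it.
w = np.array([0.0, 0.0])                  # w_i: received target cluster model

# Toy local private data: a least-squares problem standing in for F(w_i),
# F(w) = 0.5 * mean((X @ w - y)^2).
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
eta = 0.1                                 # learning rate of the local update

for _ in range(500):                      # w_i <- w_i - eta * grad F(w_i)
    w -= eta * (X.T @ (X @ w - y)) / len(y)

# w is now the candidate model that the client uploads back to the server.
```

For this toy problem the iterates converge to the exact least-squares solution; a real client would run a fixed number of epochs over its private data instead.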
and step 115, uploading the updated target cluster model (i.e. the candidate model) to a server through a network.
Step 21, the server generates a personalized model U_i for each client by using a message passing mechanism within the cluster according to the received candidate models transmitted by the clients belonging to different clusters, and issues it to the corresponding client C_i; specifically:
step 211, the server receives the candidate models transmitted by the clients C_1, C_2, ..., C_i, ..., C_n belonging to different clusters;
step 212, determining the personalized cloud model of each client: the personalized cloud model of the i-th client C_i is denoted U_i, the number of layers of the personalized cloud model is denoted |U_i|, and the weights of the personalized cloud model are denoted {U_i^(1), ..., U_i^(|U_i|)}, where p represents applying only local aggregation on the penultimate p layers of the local model; U_i^(k) denotes the parameters of the personalized cloud model belonging to the i-th client in the k-th cluster;
step 213, determining an overall optimization objective function:
min_{w_1, ..., w_n} Σ_{i=1}^{n} F(w_i) + λ Σ_{i=1}^{n} Σ_{j≠i} A(||w_i − w_j||²)
wherein λ is a coefficient balancing the two parts;
wherein the first half of the formula represents the first optimization objective function, i.e., the sum of the loss functions of the target cluster models of all clients within the cluster, Σ_{i=1}^{n} F(w_i); the latter half of the formula represents the second optimization objective function, i.e., the attention-mechanism part Σ_{i=1}^{n} Σ_{j≠i} A(||w_i − w_j||²), wherein A(||w_i − w_j||²) is a nonlinear function measuring the parameter difference between the target cluster model w_i received by client C_i and the target cluster model w_j received by each of the remaining clients C_j, and satisfies the conditions of increasing from 0 and being differentiable.
Based on the above analysis, the optimization objective function is reduced to:
step 214, determining an optimization mode based on the simplified formula. The optimization step is divided into two parts: first, the first optimization objective function is optimized by gradient descent to obtain the intermediate value U_k; then the second optimization objective function is further optimized by the proximal point method. The candidate model is updated according to the candidate model at which the optimized objective function takes the minimum value, obtaining the updated candidate model, namely the personalized model;
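The two-part optimization of step 214 — a gradient step on the loss term followed by a proximal step on the penalty term — can be illustrated on a toy problem where the proximal map has a closed form. Every quantity below is an illustrative stand-in (the nonlinear function A is taken as the identity, giving a quadratic penalty):

```python
import numpy as np

# Toy instance of the two-part optimisation in step 214:
#   minimise F(U) + G(U), with
#   F(U) = 0.5 * ||U - a||^2          (loss part, handled by gradient descent)
#   G(U) = (alpha/2) * ||U - w||^2    (penalty/attention part, handled by a
#                                      proximal step with a closed-form solution)
a = np.array([2.0, 0.0])      # stand-in for the local data-fitting target
w = np.array([0.0, 2.0])      # stand-in for the other clients' models
alpha, eta, tau = 1.0, 0.5, 0.5

U = np.zeros(2)
for _ in range(200):
    U = U - eta * (U - a)                           # 1) gradient step on F
    U = (U + tau * alpha * w) / (1 + tau * alpha)   # 2) proximal step on G
```

Here the alternating scheme converges to (a + w) / 2 = [1, 1], which is exactly the minimizer of F + G for this quadratic instance; for a general differentiable A the proximal step would be solved numerically.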
step 215, the server calculates the parameters of the personalized cloud model of the clients in each cluster as a linear combination of the corresponding parameter sets {w_1, ..., w_m}, as follows:
U_i = ξ_{i,1} w_1 + ξ_{i,2} w_2 + ... + ξ_{i,m} w_m
wherein w_j is the parameter set of the target cluster model uploaded by client C_j, calculated using the local private data of that client, and ξ_{i,1}, ..., ξ_{i,m} are the weights corresponding to w_1, ..., w_m (ξ_{i,1} + ... + ξ_{i,m} = 1). ξ_{i,j} represents the contribution weight of client C_j within the cluster to the personalized model of client C_i: the higher the similarity between w_i and w_j, the greater their contribution to each other.
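One hypothetical way to realize similarity-based weights ξ_{i,j} that sum to 1 per client is a softmax over negative pairwise parameter distances; the text does not prescribe this particular choice:

```python
import numpy as np

def personalise(W, temperature=1.0):
    """Build U_i = sum_j xi_{i,j} * W[j] with similarity-based weights.

    W: (m, d) array, row j = candidate-model parameters uploaded by client C_j.
    The softmax over negative pairwise distances is one hypothetical choice of
    xi_{i,j}; the description only requires the weights to sum to 1 and to be
    larger for more similar models.
    """
    d2 = ((W[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)  # ||w_i - w_j||^2
    logits = -d2 / temperature
    xi = np.exp(logits - logits.max(axis=1, keepdims=True))
    xi /= xi.sum(axis=1, keepdims=True)                      # rows sum to 1
    return xi @ W                                            # U[i] = sum_j xi[i,j] W[j]

# Two similar clients and one dissimilar client within a cluster:
W = np.array([[1.0, 0.0], [1.1, 0.1], [5.0, 5.0]])
U = personalise(W)
```

The two similar rows end up averaging mostly with each other, while the dissimilar third client's personalized model stays close to its own upload.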
Step 216, issuing the generated personalized cloud models to the clients C_1, C_2, ..., C_i, ..., C_n.
Step 31, repeating step 11 and step 21, and performing T rounds of training until the candidate model (i.e., the personalized model) of each client converges, where T may be flexibly set according to the actual situation, as shown in fig. 4.
According to the model updating method based on federal learning, for similar clients in the same cluster, the similarity between the clients is fully utilized, the personalized model belonging to each client is trained for the client in a cooperative mode, and better model performance can be obtained compared with other clustering personalized federal learning methods.
FIG. 5 is a second flow chart of a model updating method based on federal learning according to the present invention, as shown in FIG. 5, including:
step 210, sending the data distribution information of the target client to a server participating in the federation learning, so that the server updates a cluster to which the target client belongs according to the composite data of the target client, wherein the target client comprises N clients, and the composite data is obtained according to the received data distribution information of the target client;
step 220, receiving a target cluster model in each cluster transmitted by the server when the server determines that the initialized K clusters reach a stable distribution state, wherein the stable distribution state is that the cluster to which the target client belongs is not changed, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
step 230, executing the following first cyclic process until the candidate model converges, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
The first cyclic process includes:
sending a candidate model to the server so that the server updates the received candidate model;
and receiving the updated candidate model sent by the server, and updating the updated candidate model.
Optionally, it should be noted that the execution subject of the method may be a computer device; the method may also be applied to a target client that participates in federal learning.
Alternatively, the target client may specifically be N clients randomly selected by the server participating in federal learning from among the clients participating in federal learning, where N is a positive integer.
The composite data of the N clients may specifically be obtained from the data distribution information of the N clients, i.e., the generator in each client, where the generator may specifically be obtained by training an initial local model (a generative adversarial network) deployed at the target client.
The N clients send their corresponding data distribution information to the server participating in federal learning, and cluster division is carried out at the server.
The server generates composite data corresponding to each client according to the received data distribution information (i.e., trained generator) uploaded by different clients, for example, the server inputs noise data to the trained generator to generate composite data for each client.
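The noise-to-synthetic-data step can be sketched as follows; the linear `Generator` class is a toy stand-in for each client's uploaded trained generator:

```python
import numpy as np

rng = np.random.default_rng(42)

class Generator:
    """Toy linear generator: maps noise z to samples z @ A.T + b.

    The weights (A, b) play the role of the uploaded data distribution
    information G_i; a real implementation would be a trained neural network.
    """
    def __init__(self, A, b):
        self.A, self.b = np.asarray(A), np.asarray(b)

    def __call__(self, z):
        return z @ self.A.T + self.b

# Server side: one uploaded generator per client (weights are illustrative).
uploaded = [Generator(np.eye(2), [0.0, 0.0]),
            Generator(0.5 * np.eye(2), [5.0, 5.0])]

# Generate the synthetic data set D'_i for each client from input noise.
synthetic = []
for gen in uploaded:
    z = rng.standard_normal((256, 2))   # noise input
    synthetic.append(gen(z))            # D'_i, with |D'_i| = 256
```

Each synthetic set mirrors the distribution its generator encodes (here, samples centred at the generator's bias), which is what the server then clusters on.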
The server receives the data distribution information G_1, G_2, ..., G_i, ..., G_n of the clients C_1, C_2, ..., C_i, ..., C_n (numbered 1, 2, ..., n), and generates the synthetic data corresponding to each client based on the data distribution information G_1, G_2, ..., G_i, ..., G_n; the synthetic data of the i-th client at the server is denoted D'_i, and the size of the synthetic data set is denoted |D'_i|.
And the server updates the cluster to which the target client belongs according to the synthesized data respectively corresponding to the N clients.
The cluster may be any one of the K clusters initialized by the server, each cluster corresponding to a cluster model, where the parameters of the K cluster models are {θ_1, θ_2, ..., θ_K}, θ_j denotes the parameters of the j-th cluster model, and K is a positive integer.
The server iterates for a plurality of times on the basis of the synthesized data corresponding to the target client at the server, and estimates the cluster to which each client belongs according to the minimum loss calculated by the empirical loss function of the cluster model in the K clusters until the initialized K clusters reach a stable distribution state, wherein the stable distribution state specifically means that the cluster to which each client in the target client belongs is not changed, namely, the client contained in each cluster is not changed.
And under the condition that the server determines that the initialized K clusters reach the stable distribution state, N clients receive target cluster models sent by the server, wherein the target cluster models are determined by the server according to the cluster models in each cluster in which the initialized K clusters reach the stable distribution state.
And after the target client receives the target clustering model sent by the server, updating the target clustering model by adopting a gradient descent method.
The following first cyclic process is performed until the candidate model converges.
The first cyclic process includes:
the target client sends the updated target cluster model (i.e., candidate model) to the server.
And the server updates the parameters in the received candidate models and sends the updated candidate models to the target client.
And the target client updates the updated candidate model sent by the server by adopting the gradient descent method until the candidate model converges, and takes the converged candidate model as the personalized model of the target client.
According to the model updating method based on federal learning, the target client participating in collaborative training uploads the data distribution information to the server, and the server performs clustering division on the target client according to the uploaded data distribution information to obtain the cluster to which the target client belongs, so that the computing resources and the storage resources of the target client are saved, the communication overhead between the target client and the server is reduced, and an optimal personalized model (namely a converged candidate model) is learned for each client.
Further, in an embodiment, the method for obtaining the data distribution information of the target client may specifically include:
pre-training a teacher network based on knowledge distillation technology;
taking the trained teacher network as the discriminator of a generative adversarial network, wherein the generative adversarial network is deployed at the target client;
training the generator in the generative adversarial network based on the discriminator until the value of the loss function of the generative adversarial network is smaller than a preset value;
and determining the data distribution information of the target client according to the trained generator.
Optionally, the target client trains the generative adversarial network, learns the local data distribution information, and uploads the learned data distribution information to the server, specifically:
The n clients participating in federal learning are numbered C_1, C_2, ..., C_i, ..., C_n; the data set on the i-th client is denoted D_i; the number of samples comprised by the data set is denoted |D_i|; the local generative model (i.e., the generator) on the i-th client is denoted G_i; the number of layers of the local generative model is denoted |G_i|; the local discriminative model (i.e., the discriminator) on the i-th client is denoted Dis_i, and the number of layers of the local discriminative model is denoted |Dis_i|; the weights of the local generative model are denoted {G_i^(1), ..., G_i^(|G_i|)}, where p represents applying local aggregation only on the penultimate p layers of the local model (the generative adversarial network); the initialized local generative model on the i-th client is denoted G_i^(0); the loss function of the generative adversarial network is denoted L(G_i, Dis_i).
The knowledge distillation technology is introduced to pre-train the teacher network: the teacher network on the i-th client is denoted T_i, the number of layers of the teacher network is denoted |T_i|, and the weights of the teacher network are denoted {T_i^(1), ..., T_i^(|T_i|)}. The teacher network is trained on the real data, and the trained teacher network is then fixed as the discriminator of the generative adversarial network, i.e., the discriminator in the generative adversarial network is initialized to the trained teacher model T_i (Dis_i = T_i).
Determining the objective function of the generative adversarial network, wherein:
the cross-entropy loss is defined as follows: y_T is the probability that the teacher network assigns to each category for the generated image; t is the category with the highest probability, represented as a one-hot vector; the cross entropy H(y_T, t) is used to measure the similarity between y_T and t. The features extracted by the teacher network are denoted f_T; the L1 norm ||f_T||_1 is used to measure the number of activated neurons of the output before the fully connected layer. H_info(p') represents the amount of information possessed by p', where k represents a coefficient that can be set flexibly, p' represents a given probability vector, p' = {p'_1, p'_2, ..., p'_o}, and o represents the number of probabilities comprised by the probability vector p'.
The generator G_i is trained according to the objective function with the fixed discriminator, and the data distribution information is obtained according to the trained generator.
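Training a generator against a frozen teacher can be sketched with a toy differentiable stand-in for the teacher's confidence; everything below (the quadratic "teacher loss", the mean-shift generator, the stopping threshold) is illustrative, not the patent's actual networks:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed teacher stand-in: loss is lowest for samples near centre c, mimicking
# "the pre-trained teacher network, frozen as the discriminator".
c = np.array([3.0, -1.0])
def teacher_loss(x):
    return ((x - c) ** 2).sum(axis=1).mean()   # low loss = confident teacher

# Toy generator: g(z) = z + mu, with mu the only trainable parameter.
mu = np.zeros(2)
lr = 0.1
for step in range(200):
    z = rng.standard_normal((64, 2))           # noise batch
    samples = z + mu                           # generated "images"
    # gradient of teacher_loss w.r.t. mu (analytic for this toy case)
    grad = 2.0 * (samples - c).mean(axis=0)
    mu -= lr * grad
    # stop once the loss falls below a preset value, as in the patent's criterion
    if teacher_loss(z + mu) < 2.1:
        break
```

The generator parameter drifts toward the region the frozen teacher scores as most confident, which is the essence of learning the local data distribution without sharing raw data; a real implementation would backpropagate the combined cross-entropy, activation, and information losses through a neural generator.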
Client C i And uploading the learned data distribution information to a server.
The server receives the data distribution information G_i uploaded by each client, performs cluster division at the server side, and generates the synthetic data corresponding to each client according to the data distribution information uploaded by the different clients; a schematic diagram of the data distribution information generated through this process is shown in fig. 6.
FIG. 7 is a third flow chart of a model update method for federal learning according to the present invention, as shown in FIG. 7, including:
step 1, the target client trains a generative adversarial network, learns the local data distribution information and uploads the learned data distribution information to the server;
step 2, the server receives the data distribution information uploaded by the clients, performs cluster division on the server, and generates synthetic data corresponding to each client according to the data distribution information uploaded by different clients;
step 3, initializing K clusters by the server
And 4, iterating for a plurality of times on the server side based on the synthesized data of the target client side. The cluster identity of each client is estimated from the minimum loss calculated by the empirical loss function in the K cluster models.
And 5, after confirming the cluster identity in each iteration, updating parameters of a K-th cluster model in each cluster by combining the synthetic data of the clients in the cluster.
Step 6, repeating the step 4 and the step 5 until the K clusters reach a stable distribution state;
step 7, obtaining a clustering model of each cluster, namely a target clustering model, in each cluster reaching a stable distribution state, and transmitting different target clustering models to each corresponding client according to the cluster to which the client belongs;
step 8, the client updates the target clustering model sent by the server in a gradient descent mode, and sends the updated target clustering model (namely a candidate model) to the server;
step 9, the server generates a personalized model of each client by using a message transmission mechanism in the cluster according to the received candidate models of the clients belonging to different clusters, and sends the personalized model to the corresponding client;
and 10, repeating the step 8 and the step 9, and executing T-wheel training until the personalized model training of each client converges.
According to the model updating method based on federal learning, the data distribution of the local client is learned through the generative adversarial network and the knowledge distillation technology, so that the local data distribution information can be accurately learned, the learning and convergence of the generative model are accelerated, and the resources of the client are saved.
The model updating system based on federal learning provided by the invention is described below, and the model updating system based on federal learning described below and the model updating method based on federal learning described above can be correspondingly referred to each other.
Fig. 8 is a schematic structural diagram of a model update system based on federal learning according to the present invention, as shown in fig. 8, applied to a server participating in federal learning, including:
a first sending module 810 and a first updating module 811;
the first sending module 810 is configured to update, according to synthetic data of a target client participating in the federal learning, clusters to which the target client belongs, until initialized K clusters reach a stable distribution state, and send a target cluster model in each cluster to the target client, where the synthetic data is obtained according to received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs is no longer changed, the target client includes N clients, and the cluster model is determined according to a cluster model in each cluster in which the initialized K clusters reach a stable distribution state;
The first updating module 811 is configured to execute a first cyclic process until a candidate model in the target client converges, where the candidate model is obtained after the target client updates a target cluster model in each received cluster;
the first cyclic process includes:
receiving a candidate model sent by a target client and updating the candidate model;
and the second sending module is used for sending the updated candidate model to the target client so that the target client can update the candidate model.
According to the model updating system based on federal learning, the target client participating in collaborative training uploads the data distribution information to the server, and the server performs clustering division on the target client according to the uploaded data distribution information to obtain the cluster to which the target client belongs, so that the computing resources and the storage resources of the target client are saved, the communication overhead between the target client and the server is reduced, and an optimal personalized model (namely a converged candidate model) is learned for each client.
Fig. 9 is a second schematic structural diagram of a model update system based on federal learning according to the present invention, as shown in fig. 9, applied to a target client participating in federal learning, including:
A second sending module 910, a receiving module 911, and a second updating module 912;
the second sending module 910 is configured to send the data distribution information of the target client to a server participating in the federal learning, so that the server updates a cluster to which the target client belongs according to the composite data of the target client, where the target client includes N clients, and the composite data is obtained according to the received data distribution information of the target client;
the receiving module 911 is configured to receive, when the server determines that the initialized K clusters reach a stable distribution state, a target cluster model in each cluster sent by the server, where the stable distribution state is that a cluster to which the target client belongs is no longer changed, and the target cluster model is determined according to a cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
the second updating module 912 is configured to execute a first loop process until a candidate model converges, where the candidate model is obtained after the target client updates the received target cluster model in each cluster;
The first cyclic process includes:
sending a candidate model to the server so that the server updates the received candidate model;
the second receiving module is configured to receive the updated candidate model sent by the server, and update the updated candidate model.
According to the model updating system based on federal learning, the target client participating in collaborative training uploads the data distribution information to the server, and the server performs clustering division on the target client according to the uploaded data distribution information to obtain the cluster to which the target client belongs, so that the computing resources and the storage resources of the target client are saved, the communication overhead between the target client and the server is reduced, and an optimal personalized model (namely a converged candidate model) is learned for each client.
Fig. 10 is a schematic physical structure of an electronic device according to the present invention. As shown in fig. 10, the electronic device may include: a processor (processor) 1010, a communication interface (communication interface) 1011, a memory (memory) 1012 and a bus (bus) 1013, wherein the processor 1010, the communication interface 1011, and the memory 1012 communicate with each other via the bus 1013. The processor 1010 may call logic instructions in the memory 1012 to perform the following methods:
Updating clusters to which the target client belongs according to synthetic data of the target client participating in federal learning until the initialized K clusters reach a stable distribution state, and sending a target cluster model in each cluster to the target client, wherein the synthetic data is obtained according to the received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs is not changed any more, the target client comprises N clients, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
executing the following first circulation process until the candidate model in the target client converges, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
the first cyclic process includes:
receiving a candidate model sent by a target client and updating the candidate model;
and sending the updated candidate model to the target client so that the target client updates the candidate model.
Or alternatively, the first and second heat exchangers may be,
transmitting the data distribution information of the target client to a server participating in the federation learning so that the server updates a cluster to which the target client belongs according to the synthetic data of the target client, wherein the target client comprises N clients, and the synthetic data is obtained according to the received data distribution information of the target client;
receiving a target cluster model in each cluster transmitted by the server under the condition that the server determines that the initialized K clusters reach a stable distribution state, wherein the stable distribution state is that the cluster to which the target client belongs is not changed, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
executing the following first circulation process until a candidate model is converged, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
the first cyclic process includes:
sending a candidate model to the server so that the server updates the received candidate model;
And receiving the updated candidate model sent by the server, and updating the updated candidate model.
Further, the logic instructions in the memory described above may be implemented in the form of software functional units and stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied, essentially or in the part contributing to the prior art, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
Further, the present invention discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, are capable of performing the federal learning based model updating method provided by the above-described method embodiments, for example, comprising:
Updating clusters to which the target client belongs according to synthetic data of the target client participating in federal learning until the initialized K clusters reach a stable distribution state, and sending a target cluster model in each cluster to the target client, wherein the synthetic data is obtained according to the received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs is not changed any more, the target client comprises N clients, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
executing the following first cyclic process until the candidate model in the target client converges, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
the first cyclic process includes:
receiving a candidate model sent by a target client and updating the candidate model;
and sending the updated candidate model to the target client so that the target client updates the candidate model.
Or, alternatively:
transmitting the data distribution information of the target client to a server participating in the federal learning so that the server updates the cluster to which the target client belongs according to the synthetic data of the target client, wherein the target client comprises N clients, and the synthetic data is obtained according to the received data distribution information of the target client;
receiving a target cluster model in each cluster transmitted by the server under the condition that the server determines that the initialized K clusters reach a stable distribution state, wherein the stable distribution state is that the cluster to which the target client belongs is not changed, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
executing the following first cyclic process until the candidate model converges, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
the first cyclic process includes:
sending a candidate model to the server so that the server updates the received candidate model;
and receiving the updated candidate model sent by the server, and updating the local candidate model according to the updated candidate model.
In another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the federal learning-based model updating method provided in the above embodiments, for example, comprising:
updating clusters to which the target client belongs according to synthetic data of the target client participating in federal learning until the initialized K clusters reach a stable distribution state, and sending a target cluster model in each cluster to the target client, wherein the synthetic data is obtained according to the received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs is not changed any more, the target client comprises N clients, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
executing the following first cyclic process until the candidate model in the target client converges, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
The first cyclic process includes:
receiving a candidate model sent by a target client and updating the candidate model;
and sending the updated candidate model to the target client so that the target client updates the candidate model.
Or, alternatively:
transmitting the data distribution information of the target client to a server participating in the federal learning so that the server updates the cluster to which the target client belongs according to the synthetic data of the target client, wherein the target client comprises N clients, and the synthetic data is obtained according to the received data distribution information of the target client;
receiving a target cluster model in each cluster transmitted by the server under the condition that the server determines that the initialized K clusters reach a stable distribution state, wherein the stable distribution state is that the cluster to which the target client belongs is not changed, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
executing the following first cyclic process until the candidate model converges, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
The first cyclic process includes:
sending a candidate model to the server so that the server updates the received candidate model;
and receiving the updated candidate model sent by the server, and updating the local candidate model according to the updated candidate model.
The system embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence, or the part contributing to the prior art, may be embodied in the form of a software product, which may be stored in a computer readable storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the method described in the various embodiments or some parts of the embodiments.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A model updating method based on federal learning is applied to a server participating in federal learning, and is characterized by comprising the following steps:
updating clusters to which the target client belongs according to synthetic data of the target client participating in federal learning until the initialized K clusters reach a stable distribution state, and sending a target cluster model in each cluster to the target client, wherein the synthetic data is obtained according to the received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs is not changed any more, the target client comprises N clients, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
executing the following first cyclic process until the candidate model in the target client converges, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
the first cyclic process includes:
receiving a candidate model sent by a target client and updating the candidate model;
and sending the updated candidate model to the target client so that the target client updates the candidate model.
2. The federal learning-based model updating method according to claim 1, applied to a server participating in federal learning, wherein the updating the cluster to which the target client belongs according to the synthetic data of the target client includes:
initializing a cluster to which a target client belongs;
acquiring, according to the empirical loss function of the cluster model in each cluster, an empirical loss value of the cluster model in each cluster on the synthetic data of each client;
and updating the cluster to which the target client belongs according to the empirical loss value of the cluster model in each cluster on the synthetic data of each client, so as to determine the clients contained in each cluster when the K clusters reach the stable distribution state, wherein the clients contained in each cluster are those clients, among the target clients, for which the empirical loss value of the cluster model in that cluster on their synthetic data reaches the minimum value.
3. The federal learning-based model updating method according to claim 2, applied to a server participating in federal learning, wherein updating the cluster to which the target client belongs according to the empirical loss value of the cluster model in each cluster on the synthetic data of each client includes:
performing a second cyclic process until the K clusters reach a stable distribution state;
the second cyclic process includes:
updating the cluster model in each cluster based on a gradient descent method;
updating the empirical loss values of the updated cluster model in each cluster based on the synthetic data of the clients contained in that cluster;
and updating the cluster to which the target client belongs according to the empirical loss values of the updated cluster models of the clients in the clusters.
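The second cyclic process alternates gradient-descent updates of the cluster models with reassignment of each client to the cluster whose model gives the minimal empirical loss on its synthetic data, until the assignment no longer changes. A toy sketch of this pattern (the least-squares loss, learning rate, and round-robin initialization are illustrative assumptions, not the patented implementation):

```python
import numpy as np

def second_cyclic_process(synthetic_data, cluster_models, lr=0.05, max_rounds=100):
    """Alternate (1) gradient-descent updates of each cluster model on the
    synthetic data of its current members and (2) reassignment of every
    client to the cluster with minimal empirical loss, until the
    assignment no longer changes (the stable distribution state)."""
    loss = lambda w, X, y: float(np.mean((X @ w - y) ** 2))
    grad = lambda w, X, y: 2.0 * X.T @ (X @ w - y) / len(y)
    K = len(cluster_models)
    assign = [i % K for i in range(len(synthetic_data))]  # initialized clusters
    for _ in range(max_rounds):
        for k in range(K):                                # (1) update cluster models
            for i in (i for i, a in enumerate(assign) if a == k):
                X, y = synthetic_data[i]
                cluster_models[k] = cluster_models[k] - lr * grad(cluster_models[k], X, y)
        new_assign = [int(np.argmin([loss(w, X, y) for w in cluster_models]))
                      for (X, y) in synthetic_data]       # (2) reassign clients
        if new_assign == assign:                          # stable distribution state
            break
        assign = new_assign
    return assign, cluster_models
```

Clients whose synthetic data share the same underlying model end up grouped in the same cluster.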
4. The federal learning-based model updating method according to claim 1, applied to a server participating in federal learning, wherein the receiving the candidate model sent by the target client and updating the candidate model includes:
determining an optimization objective function of the candidate model according to the sum of a first optimization objective function and a second optimization objective function, wherein the first optimization objective function is the sum of the loss functions of the candidate models of the target clients, and the second optimization objective function is the sum of nonlinear functions of the differences between the target cluster model received by any one client and the target cluster models received by the other clients among the target clients;
optimizing the first optimization objective function based on a gradient descent method;
optimizing the second optimization objective function based on a proximal point method;
and updating the candidate model according to the candidate model when the optimization objective function takes the minimum value.
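Claim 4 splits the objective into a smooth sum of client losses, handled by gradient descent, and a penalty on the differences between the models received by different clients, handled by a proximal (approximate-point) step. A minimal two-client sketch, assuming quadratic losses and a squared-difference penalty so the proximal operator has a closed form (all of these concrete choices are illustrative assumptions):

```python
import numpy as np

def prox_pair_penalty(v1, v2, step, lam):
    # Closed-form proximal operator of g(w1, w2) = lam * ||w1 - w2||^2:
    # the sum of the two models is kept, their difference is shrunk.
    d = (v1 - v2) / (1.0 + 4.0 * step * lam)
    s = v1 + v2
    return (s + d) / 2.0, (s - d) / 2.0

def composite_round(w1, w2, grad1, grad2, step=0.1, lam=1.0):
    """One round of the composite scheme: a gradient-descent step on the
    smooth first objective (the clients' losses), then a proximal step on
    the second objective (the penalty on model differences)."""
    v1 = w1 - step * grad1(w1)       # gradient descent on the first objective
    v2 = w2 - step * grad2(w2)
    return prox_pair_penalty(v1, v2, step, lam)  # proximal step on the second
```

Iterating `composite_round` converges to the minimizer of the combined objective; for losses ||w - a||^2 and ||w - b||^2 with lam = 1, the fixed point is w1 = (2a + b)/3 and w2 = (a + 2b)/3.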
5. A model updating method based on federal learning, applied to a target client participating in federal learning, characterized by comprising the following steps:
transmitting the data distribution information of the target client to a server participating in the federal learning so that the server updates the cluster to which the target client belongs according to the synthetic data of the target client, wherein the target client comprises N clients, and the synthetic data is obtained according to the received data distribution information of the target client;
Receiving a target cluster model in each cluster transmitted by the server under the condition that the server determines that the initialized K clusters reach a stable distribution state, wherein the stable distribution state is that the cluster to which the target client belongs is not changed, and the target cluster model is determined according to the cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
executing the following first cyclic process until the candidate model converges, wherein the candidate model is obtained after the target client updates the received target cluster model in each cluster;
the first cyclic process includes:
sending a candidate model to the server so that the server updates the received candidate model;
and receiving the updated candidate model sent by the server, and updating the local candidate model according to the updated candidate model.
6. The model updating method based on federal learning according to claim 5, applied to a target client participating in federal learning, wherein the manner of obtaining the data distribution information of the target client includes:
pre-training a teacher network based on knowledge distillation technology;
taking the trained teacher network as the discriminator of a generative adversarial network, wherein the generative adversarial network is deployed at the target client;
training the generator in the generative adversarial network based on the discriminator until the value of the loss function of the generative adversarial network is smaller than a preset value;
and determining the data distribution information of the target client according to the trained generator.
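Claim 6 fixes a pre-trained teacher as the discriminator and trains only the generator until the adversarial loss falls below a preset value; the trained generator then characterizes the client's data distribution. A deliberately tiny 1-D sketch (the logistic teacher, the one-parameter generator, and the stopping threshold are all illustrative assumptions, not the patented networks):

```python
import numpy as np

def teacher(x):
    # Fixed, pre-trained teacher used as the discriminator; here an
    # assumed 1-D logistic scorer that is confident for points near x = 3.
    return 1.0 / (1.0 + np.exp(-(x - 3.0)))

def train_generator(steps=2000, lr=0.05, seed=0):
    """Train a one-parameter generator g(z) = mu + z so that the teacher,
    acting as the discriminator, scores its samples highly; the trained
    generator then describes the client's data distribution."""
    rng = np.random.default_rng(seed)
    mu = 0.0                      # generator parameter
    for _ in range(steps):
        x = mu + rng.normal()     # generated sample
        p = teacher(x)            # discriminator score
        mu += lr * (1.0 - p)      # ascent on log p; d(log p)/d(mu) = 1 - p
        if 1.0 - p < 1e-4:        # generator loss below the preset value
            break
    return mu
```

After training, the generator parameter has moved into the region the teacher scores highly, mirroring how the generator comes to encode the data distribution the teacher was trained on.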
7. A model updating system based on federal learning, applied to a server participating in federal learning, characterized by comprising: a first sending module, a first updating module, and a second sending module;
the first sending module is configured to update, according to synthetic data of a target client participating in federal learning, clusters to which the target client belongs, until initialized K clusters reach a stable distribution state, and send a target cluster model in each cluster to the target client, where the synthetic data is obtained according to received data distribution information of the target client, the stable distribution state is that the cluster to which the target client belongs is no longer changed, the target client includes N clients, and the target cluster model is determined according to a cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
The first updating module is configured to execute a first cyclic process until a candidate model in the target client converges, where the candidate model is obtained after the target client updates a target cluster model in each received cluster;
the first cyclic process includes:
receiving a candidate model sent by a target client and updating the candidate model;
and the second sending module is configured to send the updated candidate model to the target client so that the target client updates the candidate model.
8. A model updating system based on federal learning, applied to a target client participating in federal learning, characterized by comprising: a second sending module, a receiving module, a second updating module, and a second receiving module;
the second sending module is configured to send the data distribution information of the target client to a server participating in the federal learning, so that the server updates the cluster to which the target client belongs according to the synthetic data of the target client, where the target client includes N clients, and the synthetic data is obtained according to the received data distribution information of the target client;
The receiving module is configured to receive, when the server determines that the initialized K clusters reach a stable distribution state, a target cluster model in each cluster sent by the server, where the stable distribution state is that a cluster to which the target client belongs is no longer changed, and the target cluster model is determined according to a cluster model in each cluster in which the initialized K clusters reach the stable distribution state;
the second updating module is configured to execute a first cyclic process until a candidate model converges, where the candidate model is obtained after the target client updates a target cluster model in each received cluster;
the first cyclic process includes:
sending a candidate model to the server so that the server updates the received candidate model;
and the second receiving module is configured to receive the updated candidate model sent by the server and update the local candidate model according to the updated candidate model.
9. An electronic device comprising a processor and a memory storing a computer program, wherein the processor implements the federal learning-based model updating method of any of claims 1-4 or 5-6 when executing the computer program.
10. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements a federally learning based model updating method according to any of claims 1-4 or 5-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310706337.3A CN116719607A (en) | 2023-06-14 | 2023-06-14 | Model updating method and system based on federal learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310706337.3A CN116719607A (en) | 2023-06-14 | 2023-06-14 | Model updating method and system based on federal learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116719607A true CN116719607A (en) | 2023-09-08 |
Family
ID=87869338
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310706337.3A Pending CN116719607A (en) | 2023-06-14 | 2023-06-14 | Model updating method and system based on federal learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116719607A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117892805A (en) * | 2024-03-18 | 2024-04-16 | 清华大学 | Personalized federal learning method based on supernetwork and hierarchy collaborative graph aggregation |
CN117892805B (en) * | 2024-03-18 | 2024-05-28 | 清华大学 | Personalized federal learning method based on supernetwork and hierarchy collaborative graph aggregation |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lu et al. | Differentially private asynchronous federated learning for mobile edge computing in urban informatics | |
US20230039182A1 (en) | Method, apparatus, computer device, storage medium, and program product for processing data | |
CN112949837B (en) | Target recognition federal deep learning method based on trusted network | |
CN113191484A (en) | Federal learning client intelligent selection method and system based on deep reinforcement learning | |
CN116579417A (en) | Layered personalized federal learning method, device and medium in edge computing network | |
Zhang et al. | Towards data-independent knowledge transfer in model-heterogeneous federated learning | |
CN115344883A (en) | Personalized federal learning method and device for processing unbalanced data | |
CN116719607A (en) | Model updating method and system based on federal learning | |
Lu et al. | Heterogeneous model fusion federated learning mechanism based on model mapping | |
Yang et al. | Federated continual learning via knowledge fusion: A survey | |
CN117994635B (en) | Federal element learning image recognition method and system with enhanced noise robustness | |
CN115879542A (en) | Federal learning method oriented to non-independent same-distribution heterogeneous data | |
Yao et al. | F ed gkd: Towards heterogeneous federated learning via global knowledge distillation | |
Hao et al. | Waffle: Weight anonymized factorization for federated learning | |
Cheng et al. | GFL: Federated learning on non-IID data via privacy-preserving synthetic data | |
CN117371555A (en) | Federal learning model training method based on domain generalization technology and unsupervised clustering algorithm | |
Wang et al. | Eidls: An edge-intelligence-based distributed learning system over internet of things | |
Chen et al. | Resource-aware knowledge distillation for federated learning | |
Liu et al. | GDST: Global Distillation Self-Training for Semi-Supervised Federated Learning | |
Guo et al. | Dual class-aware contrastive federated semi-supervised learning | |
Yi et al. | pFedKT: Personalized federated learning with dual knowledge transfer | |
Tun et al. | Federated learning with intermediate representation regularization | |
Zou et al. | FedDCS: Federated learning framework based on dynamic client selection | |
Tian et al. | An Edge-Cloud Collaboration Framework for Generative AI Service Provision with Synergetic Big Cloud Model and Small Edge Models | |
Liu et al. | AdapterFL: Adaptive Heterogeneous Federated Learning for Resource-constrained Mobile Computing Systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||