CN116205311A - Federated learning method based on Shapley values - Google Patents
Federated learning method based on Shapley values
- Publication number: CN116205311A
- Application number: CN202310124072.6A
- Authority: CN
- Prior art keywords: model parameters; clients; federated learning; graph
- Prior art date: 2023-02-16
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a federated learning method based on Shapley values. The method accounts for differences in data distribution across clients in federated learning: when the global model parameters are computed, the clients' local model parameters are weighted and aggregated according to each local model's contribution to the global training objective. After each iteration of federated training, a weighted graph is constructed from the cosine similarities between the local model parameters of the clients, and the Shapley value of each client vertex in the graph is calculated. Based on these Shapley values, the server assigns a weighting coefficient to each client's model parameters and aggregates the parameters accordingly to obtain the global model parameters for the next round, until the training objective is reached.
Description
Technical Field
The invention belongs to the field of machine learning, and in particular relates to a federated machine learning method.
Background
With the rapid growth of intelligent terminals and Internet-of-Things devices, processing massive amounts of data has become an essential capability of the digital-transformation era, and federated machine learning has become one of the key technologies for large-scale application scenarios. As a novel distributed learning method, federated learning alleviates the heavy computational load a single server would incur when processing large-scale data: training is distributed across multiple clients so that the cost is shared. At the same time, because different clients model jointly without sharing their raw data, privacy is protected and data security is guaranteed.
Federated learning trains local models on multiple clients simultaneously and aggregates the local models from different clients into a global model. However, data heterogeneity arises because the data in federated learning is typically distributed unevenly across clients and is usually non-independent and identically distributed (non-IID). If the local models of different clients are aggregated indiscriminately, this heterogeneity degrades the overall training effect. A well-designed and effective method is therefore needed to cope with data heterogeneity in federated learning and improve the overall training effect.
Disclosure of Invention
Technical problem: in federated learning, the data on the participating clients is typically non-IID, and the contribution of each client's trained model to the overall training objective tends to differ across clients and across iteration rounds. When the local model parameters from different clients are aggregated, they must therefore be treated differently: the differences among the local models trained by different clients must be fully taken into account in order to mitigate the adverse effect of data heterogeneity on the overall federated learning performance.
Technical scheme: to solve the above technical problem, the invention provides a federated learning method based on Shapley values, characterized in that when the local model parameters of the clients are aggregated in federated learning, a weighting coefficient is set based on the Shapley value of each client, and the local model parameters are then weighted and aggregated according to these coefficients to obtain the global model parameters.
Further, a weighted graph is constructed at the end of each iteration of federated training. The vertices of the graph are the clients participating in that round; the clients are connected pairwise to form the edges of the graph, and the weight of each edge is the cosine similarity between the local model parameters of the two clients it connects.
The Shapley value of each client is calculated from the constructed graph. For a vertex (i.e., client) $i$ in the graph, its Shapley value, denoted $\phi_i$, is calculated as

$$\phi_i = \sum_{s \in S_i} \frac{(|s|-1)!\,(n-|s|)!}{n!} \sum_{j \in s,\, j \neq i} e_{ij}$$

where $S_i$ denotes the set of all vertex subsets (coalitions) that contain vertex $i$, $|s|$ denotes the number of elements in set $s$, $n$ denotes the number of vertices in the graph, $j$ is any vertex in $s$ other than $i$, and $e_{ij}$ is the weight of the edge connecting vertices $i$ and $j$.
The global model parameters after each iteration are the weighted sum of the local model parameters of all clients participating in the current round, where the weighting coefficient of client $i$'s local model parameters is a function of $\phi_i$. Specifically,

$$w^t = \sum_{i \in L_t} f(\phi_i)\, w_i^t$$

where $w^t$ denotes the global model parameters after round $t$ of training, $L_t$ denotes the set of all clients participating in round $t$, $f(\phi_i)$ is a weighting function of the Shapley value $\phi_i$, and $w_i^t$ denotes the local model parameters obtained by client $i$ in round $t$.
Beneficial effects: the method fully accounts for the data differences that may exist between clients in federated learning. When the local model parameters are aggregated, the contribution of each client's local model to the overall training objective in each iteration is computed via Shapley values, and different weighting coefficients are set according to these contributions, thereby reducing the adverse effect of data heterogeneity on federated learning.
Drawings
Fig. 1 is a flow chart of the federated learning method based on Shapley values according to the present invention.
Detailed Description
A federated learning method based on Shapley values, characterized in that when the local model parameters of the clients are aggregated in federated learning, a weighting coefficient is set based on the Shapley value of each client, and the local model parameters are then weighted and aggregated according to these coefficients to obtain the global model parameters.
The design of the scheme of the invention is further described below with reference to Fig. 1 and the related formulas.
When federated learning starts, the central server randomly selects the clients that will participate in the next round of iterative training and issues the initial global model parameters to all selected clients. Each client then performs local training on the basis of the global model parameters to obtain a new round of local model parameters.
Assume that in round $t$ of federated learning, $n$ clients participate in training, denoted as the set $L_t$. Client $i$ ($i \in L_t$) obtains local model parameters $w_i^t$ in this round. After the round ends, each client uploads its trained local model parameters to the server, and the server constructs a graph from these parameters: the vertices of the graph are the clients participating in the round, every pair of clients is connected by an edge, and the weight of each edge is the cosine similarity between the local model parameters of the two clients it connects. Specifically, the weight $e_{ij}$ of the edge between vertices $i$ and $j$ is

$$e_{ij} = \frac{w_i^t \cdot w_j^t}{\|w_i^t\|\,\|w_j^t\|}$$

where the numerator is the dot product of the vectors $w_i^t$ and $w_j^t$, and $\|w_i^t\|$ and $\|w_j^t\|$ denote their norms.
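To make this construction concrete, the following is a minimal sketch of the edge-weight computation, assuming each client's model parameters have been flattened into a single NumPy vector; the function name edge_weights and the matrix representation are illustrative choices, not taken from the patent.

```python
import numpy as np

def edge_weights(local_params):
    """Pairwise edge weights e_ij: cosine similarity between the flattened
    local model parameter vectors of every pair of participating clients.

    local_params: list of 1-D numpy arrays, one per client in this round.
    Returns an n x n symmetric matrix E with E[i, j] = e_ij.
    """
    n = len(local_params)
    E = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            wi, wj = local_params[i], local_params[j]
            sim = wi @ wj / (np.linalg.norm(wi) * np.linalg.norm(wj))
            E[i, j] = E[j, i] = sim
    return E
```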
The server calculates the Shapley value of each vertex in the constructed graph. Specifically, for a vertex (i.e., client) $i$ in the graph, its Shapley value, denoted $\phi_i$, is calculated as

$$\phi_i = \sum_{s \in S_i} \frac{(|s|-1)!\,(n-|s|)!}{n!} \sum_{j \in s,\, j \neq i} e_{ij}$$

where $S_i$ denotes the set of all vertex subsets (coalitions) that contain vertex $i$, $|s|$ denotes the number of elements in set $s$, $n$ denotes the number of vertices in the graph, $j$ is any vertex in $s$ other than $i$, and $e_{ij}$ is the weight of the edge connecting vertices $i$ and $j$.
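A direct, brute-force rendering of this formula might look as follows; it enumerates every coalition, so it is only practical for a small number of clients per round. The helper name shapley_values is an illustrative assumption.

```python
from itertools import combinations
from math import factorial

import numpy as np

def shapley_values(E):
    """Exact Shapley value of every vertex, per the formula above: the
    marginal contribution of vertex i to a coalition is the sum of its
    edge weights to the coalition's members."""
    n = E.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = number of other members joining i
            for coalition in combinations(others, k):
                size = k + 1  # |s|, counting i itself
                coeff = factorial(size - 1) * factorial(n - size) / factorial(n)
                phi[i] += coeff * sum(E[i, j] for j in coalition)
    return phi
```

Because the game here is additive in the edge weights, the enumeration can be shown to collapse to $\phi_i = \tfrac{1}{2}\sum_{j \neq i} e_{ij}$; that closed form is an observation about this particular formula rather than something stated in the patent, but it would make the computation linear in the number of edges when many clients participate.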
After the Shapley value of each client vertex has been calculated, the server computes a weighted sum of the local model parameters of all clients participating in the round to obtain the new global model parameters. The specific calculation is

$$w^t = \sum_{i \in L_t} f(\phi_i)\, w_i^t$$

where $w^t$ denotes the global model parameters after round $t$ of training and $f(\phi_i)$ is a weighting function of $\phi_i$.
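The patent leaves the weighting function $f(\phi_i)$ abstract. The sketch below assumes the simplest choice, normalising the Shapley values so the coefficients sum to one; if negative cosine similarities were to make some $\phi_i$ negative, the normalisation would need a shift or clipping, which is not addressed here.

```python
def aggregate(local_params, phi):
    """Shapley-weighted aggregation of the local model parameters.
    Assumes f is a plain normalisation (the patent leaves f abstract)."""
    alpha = phi / phi.sum()  # assumed form of f: coefficients sum to one
    return sum(a * w for a, w in zip(alpha, local_params))
```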
After each iteration, the server sends the aggregated global model parameters to the clients selected to participate in the next round, and those clients perform a new round of training on this basis, until the overall training convergence target is reached.
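Putting the pieces together, one round of the described procedure might be organised as follows, reusing the helpers sketched above. Here local_train stands in for whatever local optimiser each client runs and is an assumption of this sketch; client selection and the convergence test live outside the function.

```python
def federated_round(global_params, selected_clients, local_train):
    """One round: broadcast the global parameters, collect locally trained
    models, build the similarity graph, and Shapley-weight the aggregation."""
    local_params = [local_train(c, global_params) for c in selected_clients]
    E = edge_weights(local_params)
    phi = shapley_values(E)
    return aggregate(local_params, phi)
```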
The above description covers merely preferred embodiments of the present invention; the scope of the invention is not limited to these embodiments, and all equivalent modifications or variations according to the present disclosure fall within the scope of the claims.
Claims (5)
1. A federated learning method based on Shapley values, characterized in that when the local model parameters of the clients are aggregated in federated learning, a weighting coefficient is set based on the Shapley value of each client, and the local model parameters are then weighted and aggregated according to these coefficients to obtain the global model parameters.
2. The Shapley-value-based federated learning method according to claim 1, wherein a weighted graph is constructed at the end of each iteration of federated training, the vertices of the graph are all clients participating in that round, the clients are connected pairwise to form the edges of the graph, and the weight of each edge is the cosine similarity between the local model parameters of the two clients it connects.
4. The Shapley-value-based federated learning method according to claim 2, wherein the Shapley value of each client is calculated from the constructed graph, and for a vertex (i.e., client) $i$ in the graph, its Shapley value, denoted $\phi_i$, is calculated as

$$\phi_i = \sum_{s \in S_i} \frac{(|s|-1)!\,(n-|s|)!}{n!} \sum_{j \in s,\, j \neq i} e_{ij}$$

where $S_i$ denotes the set of all vertex subsets containing client $i$, $|s|$ denotes the number of elements in set $s$, $n$ denotes the number of vertices in the graph, $j$ is any vertex in $s$ other than $i$, and $e_{ij}$ is the weight of the edge connecting vertices $i$ and $j$.
5. The Shapley-value-based federated learning method according to claim 1, wherein the global model parameters after each iteration are the weighted sum of the local model parameters of all clients participating in the current round, and the weighting coefficient of client $i$'s local model parameters is a function $f(\phi_i)$ of its Shapley value. The specific calculation is

$$w^t = \sum_{i \in L_t} f(\phi_i)\, w_i^t$$

where $w^t$ denotes the global model parameters after round $t$, $L_t$ the set of clients participating in round $t$, and $w_i^t$ the local model parameters obtained by client $i$ in round $t$.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310124072.6A | 2023-02-16 | 2023-02-16 | Federated learning method based on Shapley values |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN116205311A | 2023-06-02 |
Family
ID=86508965
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310124072.6A | CN116205311A (pending) | 2023-02-16 | 2023-02-16 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN116205311A |
Cited By (3)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116522988A | 2023-07-03 | 2023-08-01 | 粤港澳大湾区数字经济研究院(福田) | Federated learning method, system, terminal and medium based on graph structure learning |
| CN116522988B | 2023-07-03 | 2023-10-31 | 粤港澳大湾区数字经济研究院(福田) | Federated learning method, system, terminal and medium based on graph structure learning |
| CN117057442A | 2023-10-09 | 2023-11-14 | 之江实验室 | Model training method, device and equipment based on federated multitask learning |
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |