CN116167452A - Cluster federation learning method based on model similarity - Google Patents

Cluster federation learning method based on model similarity Download PDF

Info

Publication number
CN116167452A
CN116167452A CN202211625268.5A CN202211625268A CN116167452A CN 116167452 A CN116167452 A CN 116167452A CN 202211625268 A CN202211625268 A CN 202211625268A CN 116167452 A CN116167452 A CN 116167452A
Authority
CN
China
Prior art keywords
client
local
clients
model
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211625268.5A
Other languages
Chinese (zh)
Inventor
胡敏
曾云川
黄宏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202211625268.5A priority Critical patent/CN116167452A/en
Publication of CN116167452A publication Critical patent/CN116167452A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a cluster federation learning method based on model similarity, and belongs to the field of federation learning. The method comprises the following steps: s1: designing a local training adjustment strategy, and releasing a federal learning task by a server; after acquiring the federation learning task, the client sends a federation learning joining request containing identity information and data resource information to the server node; s2: after the server node verifies the identity and the data resource information of the client, the server broadcasts a global model; s3: adjusting the local training period of each client according to the data volume of each client; the obtained local model is used for calculating a model weight distance matrix; s4: designing a client self-adaptive clustering strategy, revealing the clustering relation among clients from a model similarity matrix, and self-adaptively dividing the clients with similar data distribution into the same cluster under the condition that the clustering quantity is not specified; s5: the FedCluster obtains a stable client cluster in communication between the client and the server.

Description

Cluster federation learning method based on model similarity
Technical Field
The invention belongs to the field of federal learning, and relates to a cluster federal learning method based on model similarity.
Background
Federal learning is a distributed machine learning framework that, in this highly data-sensitive era, is capable of cooperatively training a machine learning model involving multiple data warehouses in a manner that protects data privacy. The most widely used current is the "server-client" architecture, federal learning allows multiple users (called clients) to cooperatively train a shared global model without sharing data in the local device. And the central server coordinates and completes multiple rounds of federal learning to obtain a final global model. Wherein at the beginning of each round, the central server sends the current global model to clients participating in federal learning. Each client trains the received global model according to the local data, and returns the updated model to the central server after training.
Since federal learning focuses on obtaining a high quality global model by distributively learning the local data of all participating clients, it cannot capture personal information of each device, resulting in degraded performance of reasoning or classification. Furthermore, the accuracy of FedAvg is greatly reduced when learning on non-independent co-distributed (non I.I.D.) data. In the case where there are significant differences in the data distribution of individual clients, it is difficult for a single global model to cope with local distribution situations that are distinct from global distribution. For practical applications that often face non-independent co-distributed data sets, it is often not sufficient to have only a single model. Taking the mobile keyboard development language model as an example, users from different populations may have different usage patterns due to different international, linguistic and cultural nuances, e.g., certain words or emoticons may be used by a particular population of users. In this case, a more targeted prediction needs to be made for each user to meet the needs of the user.
In order to solve the data heterogeneity in federal learning, many people have made related studies. The FedAVg algorithm is improved by zhao et al, and the fact that when the data are in independent and same distribution, higher precision loss exists when the FedAVg algorithm is applied is found. They propose a computational weight divergence that can improve the accuracy of federal learning in non-IID data and a federal learning strategy for data sharing that improves the training effect on non-IID data by creating a small portion of data that is globally shared between all client devices at a central server. Muhammad et al, in order to improve the training efficiency of federal learning, combine federal learning and recommendation systems, put forward FedFast algorithm, this algorithm is an improved version of FedAvg algorithm, its basic flow is similar to federal average algorithm, mainly aiming at two key steps of federal learning, have improved customer end selection and model aggregation. It is well known that training models can be personalized to reduce heterogeneity and to have each model obtain a high quality personalized model, i.e. personalized federal learning.
Disclosure of Invention
Accordingly, the present invention is directed to a method for cluster federal learning based on model similarity, which can eliminate the influence of Non-IID and unbalanced data at the same time. To handle unbalanced data, the local training adjustment strategy adaptively adjusts the number of local training periods for each client. To further improve the accuracy and adaptability of clustering, a client clustering strategy based on weighted voting automatically groups each client into the appropriate cluster.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a method of cluster federation learning based on model similarity, the method comprising the steps of:
s1: designing a local training adjustment strategy, and releasing a federal learning task by a server; after acquiring the federation learning task, the client sends a federation learning joining request containing identity information and data resource information to the server node;
s2: after the server node verifies the identity and the data resource information of the client, the server broadcasts a global model;
s3: after the client acquires the global model, the local training period of the client is adjusted according to the data volume of each client; different clients experience different numbers of local training stages, and the resulting local model is used to calculate a model weight distance matrix, i.e., a model similarity matrix;
s4: designing a client self-adaptive clustering strategy, revealing the clustering relation among clients from a model similarity matrix, and self-adaptively dividing the clients with similar data distribution into the same cluster under the condition that the clustering quantity is not specified;
s5: the FedCluster obtains a stable client cluster in communication between the client and the server.
Optionally, in the step S1, the local training adjustment policy specifically is:
the cumulative loss of the client reflects the change of the local loss in the multi-round communication, and after the t-round communication, the cumulative loss of the client m is as follows:
Figure BDA0003996024110000021
Figure BDA0003996024110000022
representing a local experience loss of client m in the ith round of communication;
based on accumulated losses
Figure BDA0003996024110000023
FedCluster calculates the local iteration number of client m in the t-th round of communication loop +.>
Figure BDA0003996024110000024
The method comprises the following steps:
Figure BDA0003996024110000025
s * indicating that the client has the most local samples,
Figure BDA0003996024110000028
and |D m I represents the local number of samples s of the client, respectively * And m; the parameter alpha represents the iteration number increase rate controlling the iteration number increase rate, and the parameter ρ is definedThe meaning is as follows:
Figure BDA0003996024110000026
in equation (1), fedcmaster uses the number of local samples and the cumulative loss to calculate the number of local iterations per client; the number of local samples determines the maximum step size
Figure BDA0003996024110000027
And the local experience loss determines the actual increase step ρ based on the maximum step; resulting in a ratio->
Figure BDA0003996024110000031
I.e. the difference between the local experience loss of the customer and the reference will gradually decrease;
after the first round of communication, the cumulative loss of clients with fewer local samples is compared to the baseline client s * The gap between them becomes smaller and smaller; less than
Figure BDA0003996024110000032
They are->
Figure BDA0003996024110000033
The gap between the client ends is gradually increased, and the iteration times of the client ends are not increased; once the variance of the customer's cumulative loss is minimized, the FedCluster will stop the local training adjustment process and each customer will complete its local model training in this round.
Optionally, in S4, the client adaptive clustering policy is:
the minimum inter-cluster distance is not less than the maximum intra-cluster distance, i.e
min dist(G i ·,G j ·)≥max dist(G ,G ) (4)
dist(G ,G ) Representing the model distance between any two clients from two different clusters, dist (G ,G ) Watch (watch)Showing the model distance between any two clients within the same cluster;
selecting the weight closest to the output layer as a representative weight set of all model weights to calculate the similarity of the two models; the total weight of the model is represented by ω, and the selected partial weight is represented by ω';
the FedCluster groups the clients by using a model similarity matrix M, wherein the matrix is obtained by carrying out local training calculation on the regulated local models of all the clients; each element M [ M, n ]]=dist(ω′ m ,ω′ n ) Using the formula
Figure BDA0003996024110000034
To measure the model distance between any two clients; the ω is the weight number of the two models; taking M as input, fedCluster performs client clustering according to three steps, namely cluster demarcation detection, weighted voting and voting-based clustering.
Optionally, the group demarcation detection is:
for client M, fedCluster will first send M [ M, ]]The model distance values in (1) are ordered according to ascending order to obtain M' [ M, ]]Wherein M' is [ M, n ]]≥M′[m,z]N is greater than z; fedCluster calculations were then stored at M' [ M, ]]The difference between the distance values of any two adjacent models is obtained, and the maximum distance difference t is obtained m The method comprises the steps of carrying out a first treatment on the surface of the Let t m Clustering boundaries of all clients taking the client m as a reference; based on t m M' [ M, ]]All clients indexed are divided into two groups P m,1 And P m,2 ,P m,1 The difference between any two adjacent values is less than t m ,P m,2 The difference between any two adjacent values is not less than t m
P m,1 The clients in (a) can be allocated to the same cluster, cannot be assigned to the group belonging to P m,2 Clustering the clients of (1); adding weighted voting to adaptively determine a final client clustering result;
the weighted voting is as follows:
for each client M and its corresponding M' [ M, ]]At P m,1 The guests having the largest number of local samples among the clients:
client mmax =arg max{||D n |,n∈P m,1 } (5)
then FedCluster let P m,1 Each client n in (2) is a client according to equation (5) mmax Voting and updating client n pairs of clients mmax Total score of votes of (2)
Figure BDA0003996024110000041
Each customer maintains a list of accumulated vote scores for other customers who have been selected as the customer with the most samples;
for a particular customer, more local samples are given greater voting weights;
the voting-based clustering is as follows:
after cluster demarcation detection and weighted voting are operated on all clients by traversing all rows, a final voting score list of each client is obtained; for a given client m, it is assigned to the representative client with the highest cumulative score in the voting score list for m * In the same cluster G * In, i.e
Figure BDA0003996024110000042
By scanning all clients and their list of voting scores, the enhanced CFL automatically selects some representative clients as cluster heads and assigns other clients to those clusters.
The invention has the beneficial effects that:
the proposed framework was evaluated on four open datasets and showed the advantages of fedclumer compared to FedAvg, fedProx and FeSEM, which improved the stability and accuracy of federal learning training compared to FeSEM.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in the following preferred detail with reference to the accompanying drawings, in which:
FIG. 1 is a workflow diagram of FedCluster.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present invention with reference to specific examples. The invention may be practiced or carried out in other embodiments that depart from the specific details, and the details of the present description may be modified or varied from the spirit and scope of the present invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the present invention by way of illustration, and the following embodiments and features in the embodiments may be combined with each other without conflict.
Wherein the drawings are for illustrative purposes only and are shown in schematic, non-physical, and not intended to limit the invention; for the purpose of better illustrating embodiments of the invention, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the size of the actual product; it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The same or similar reference numbers in the drawings of embodiments of the invention correspond to the same or similar components; in the description of the present invention, it should be understood that, if there are terms such as "upper", "lower", "left", "right", "front", "rear", etc., that indicate an azimuth or a positional relationship based on the azimuth or the positional relationship shown in the drawings, it is only for convenience of describing the present invention and simplifying the description, but not for indicating or suggesting that the referred device or element must have a specific azimuth, be constructed and operated in a specific azimuth, so that the terms describing the positional relationship in the drawings are merely for exemplary illustration and should not be construed as limiting the present invention, and that the specific meaning of the above terms may be understood by those of ordinary skill in the art according to the specific circumstances.
Similar to FedAvg, three main steps are included: (1) the server broadcasting the global model to the clients; (2) each client trains a model using its local data; (3) The server aggregates these updated local models from the client; based on typical FL training, fedCluster incorporates two key strategies to address the shortcomings of the existing CFL methods. A local training adjustment strategy is first designed that adjusts the number of local training iterations per client given the amount of data. Different clients may experience different numbers of local training times and their resulting local models are used to calculate a model weight distance matrix (i.e., model similarity matrix). Secondly, an adaptive client clustering strategy is proposed to reveal the clustering relation between clients in a model similarity matrix and adaptively divide clients with similar data distribution into the same cluster. A weighted voting mechanism is designed to further eliminate the effects of unbalanced data. In combination with the two novel strategies, fedCluster can derive stable client clusters in the recorded communication between the client and the server in an efficient and adaptive manner.
1. Local training adjustment strategy
Typically, the local empirical loss reflects the state of the local model and is affected by the number of iterations during FL training. In particular, more iterations tend to make experience loss of local model training smaller before model convergence. In general, one epoch consists of multiple iterations through all samples in the machine learning model training process. Thus, unlike FedAvg setting the same number of local iterations for all clients, fedCluster dynamically adjusts the number of iterations per client during local model training to control the experience loss of the client. In contrast, more local iterations are set for fewer clients of the sample, while for more clients of the sample, fewer local iterations are set. Essentially, experience loss for clients with similar data distribution should remain consistent after such local training adjustments are made.
However, it is difficult to keep local experience loss consistent for customers of different sample numbers in a round of communication. This is because increasing the number of local iterations of a client may result in a significant reduction in local experience loss or overfitting when the number of samples stored by the client is small. Thus, in turn, the cumulative loss of customers is made more consistent rather than the local experience loss of a round of communication. The cumulative loss of a customer reflects its variation in local loss in multiple rounds of communication, which is easier to control than the local experience loss of a single round of communication. After t rounds of communication, the accumulated loss of the client m is as follows:
Figure BDA0003996024110000061
Figure BDA0003996024110000062
representing the loss of experience local to client m in the ith round of communication.
Based on accumulated losses
Figure BDA0003996024110000063
FedCluster calculates the local iteration number of client m in the t-th round of communication loop +.>
Figure BDA0003996024110000064
The method comprises the following steps:
Figure BDA0003996024110000065
s * indicating that the client has the most local samples,
Figure BDA00039960241100000611
and |D m I represents the local number of samples s of the client, respectively * And m. In addition, the parameter α represents the iteration number increase rate that controls the iteration number increase rate, and the parameter ρ is defined as:
Figure BDA0003996024110000066
on the one hand, the cumulative loss of customers with more local samples tends to be more reliable. On the other hand, increasing the number of iterations of the client would incur excessive local computational overhead. Thus, the local iteration number should be set for all clients within a reasonable range. In practice, the cumulative loss of the clients takes most local samples as a standard with the minimum loss in general, and the cumulative loss among the clients is more consistent by controlling the cumulative loss of other clients to approach the standard. In equation (1), fedcmaster uses the number of local samples and the cumulative loss to calculate the number of local iterations per client. In particular the number of local samples determines the maximum step size (e.g
Figure BDA0003996024110000067
) While the local empirical loss determines the actual increase step size (e.g., p) based on the maximum step size. The adjustment strategy is conservative in that the actual step size of the increase in the number of iterations is decreasing during the local training adjustment. A greater number of local training iterations will result in a faster reduction in local experience loss, resulting in a ratio
Figure BDA0003996024110000068
I.e. the difference between the local experience loss of the customer and the local experience loss of the benchmark will gradually decrease.
FedCluster can naturally stop local training adjustment. After the first round of communication, clients with fewer local samples typically train the local model through multiple iterations, thus having a faster drop in local experience loss. Cumulative loss of such clients and benchmark clients * The gap between them will be smaller and smaller. But once they are smallIn the following
Figure BDA0003996024110000069
They are->
Figure BDA00039960241100000610
The gap between the client and the client will gradually increase, because the iteration number of the client will not increase. In summary, once the variance of the customer's cumulative loss is minimized, the FedCluster will stop the local training adjustment process and each customer will complete its local model training in this round.
2. Client adaptive clustering strategy
Unlike CFL work that previously required the input of k clusters in a priori, fedclumer can accomplish client clustering by using model similarity matrix alone without knowing the number of clusters.
The motivation for adaptive client cluster design is an important criterion for evaluating the clustering results, i.e. the minimum inter-cluster distance should not be smaller than the maximum intra-cluster distance, i.e.
min dist(G ,G )≥max dist(G ,G ) (4)
dist(G ,G ) Representing the model distance between any two clients from two different clusters, dist (G ,G ) Representing the model distance between any two clients within the same cluster. Such preconditions may ensure that all clients may be categorized into the appropriate cluster. Based on this precondition, fedclumter can simply search for cluster separation conditions instead of optimizing inter-cluster distance and intra-cluster distance.
Using a large model is all weights to compute model similarity between clients can result in a significant computational overhead. To reduce the computational cost of deriving the model similarity matrix, fedcmaster uses partially carefully selected weights to compute the model similarity between any two clients. The chosen weights will better reflect the differences between the two models, which is theoretically supported by some previous work. For example, a.rozantsev and j.yosinki propose that model high-level weights are more task-dependent than these low-level weights. Also, m.luo reports that when non-IID data is present, the neural network model has a greater model difference between the weights of the classifier layers than the model trained on IID data. Inspired by these efforts, the weight closest to the output layer is selected as the representative weight set of all model weights to calculate the similarity of the two models. The total weight of the model is denoted by ω, and the selected partial weight is denoted by ω'.
FedCluster groups clients using a model similarity matrix M, which is calculated from the adjusted local training of the stable local model of all clients. Specifically, each element M [ M, n]=dist(ω′ m ,ω′ n ) Using the formula
Figure BDA0003996024110000071
(|ω| is the weight number of the two models) to measure the model distance between any two clients. M < M > -M of M matrix]Representing model distance between a model of client M and all other client models, where M [ M, M]=0, and M [ M, n]> 0, n noteq.m. Taking M as input, fedCluster performs client clustering according to three key steps, namely cluster demarcation detection, weighted voting and voting-based clustering. The method comprises the following specific steps:
(1) Cluster demarcation detection
For client M, fedCluster will first send M [ M, ]]The model distance values in (1) are ordered according to ascending order to obtain M' [ M, ]]Wherein M' is [ M, n ]]≥M′[m,z]N > z. FedCluster calculations were then stored at M' [ M, ]]The difference between the distance values of any two adjacent models is obtained, and the maximum distance difference t is obtained m . Considering that t is given in the precondition that "good" cluster is represented in equation (4) m Clustering boundaries of all clients with reference to the client m. Based on t m M' [ M, ]]All clients indexed are divided into two groups P m,1 And P m,2 ,P m,1 The difference between any two adjacent values is less than t m ,P m,2 No difference between any two adjacent valuesLess than t m
Intuitively, P m,1 The clients in (a) can be assigned to the same cluster but not to the group P m,2 Because their model differences are relatively large. Furthermore, the maximum distance difference t calculated for the customer with fewer samples m Poor stability compared to more sample customers, and therefore cannot rely on t alone m And clustering clients. To solve these problems, a weighted voting mechanism may be added to adaptively determine the final customer clustering results.
(2) Weight voting
For each client M and its corresponding M' [ M, ]]At P m,1 The guests having the largest number of local samples among the clients:
client mmax =arg max{|D n |,n∈P m,1 } (5)
then FedCluster let P m,1 Each client n in (2) is a client according to equation (5) mmax Voting and updating client n pairs of clients mmax Total score of votes of (2)
Figure BDA0003996024110000081
Each customer maintains a list of accumulated vote scores for other customers who have been selected as having the most samples.
Generally, the voting weight is assigned according to the sample size of the client. Specifically, for a particular customer, more local samples are given greater voting weights. Weighted voting can further eliminate cluster instability caused by clients with unbalanced samples and reduce the chance of incorrect clustering.
(3) Voting-based clustering
By traversing all rows, after steps (1) and (2) are run on all clients, a final list of voting scores for each client can be derived. For a given client m, accumulated in the voting score list assigned to mHighest representative client * In the same cluster G * In, i.e
Figure BDA0003996024110000082
By scanning all clients and their list of voting scores, the enhanced CFL automatically selects some representative clients as cluster heads and assigns other clients to those clusters.
Figure 1 shows the workflow of fedcmaster,
Figure BDA0003996024110000083
representing the loss of experience local to client m in the ith round of communication.
It is similar to FedAvg and comprises three main steps: (1) the server broadcasting the global model to clients; (2) each client trains a model using its local data; (3) The server aggregates these updated local models from the clients.
FedCluster is improved on the basis of typical federal training, and two key strategies are added to solve the defects of the current cluster federal learning method.
1: the server issues federal learning tasks; after acquiring the federation learning task, the client sends a federation learning joining request containing identity information and data resource information to the server node;
2: after the server node verifies the identity and the data resource information of the client, the server broadcasts a global model;
3: after the client acquires the global model, the local training period of the client is adjusted according to the data volume of each client. Different customers may experience different numbers of local training phases and their resulting local models are used to calculate a model weight distance matrix (i.e., a model similarity matrix)
4: the clustering relation among the clients is revealed from the model similarity matrix, and the clients with similar data distribution are adaptively divided into the same cluster under the condition that the clustering quantity is not specified
5: fedCluster can obtain a stable client cluster in an efficient and adaptive manner in several rounds of communication between the client and the server.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.

Claims (4)

1. A cluster federation learning method based on model similarity is characterized in that: the method comprises the following steps:
s1: designing a local training adjustment strategy, and releasing a federal learning task by a server; after acquiring the federation learning task, the client sends a federation learning joining request containing identity information and data resource information to the server node;
s2: after the server node verifies the identity and the data resource information of the client, the server broadcasts a global model;
s3: after the client acquires the global model, the local training period of the client is adjusted according to the data volume of each client; different clients experience different numbers of local training stages, and the resulting local model is used to calculate a model weight distance matrix, i.e., a model similarity matrix;
s4: designing a client self-adaptive clustering strategy, revealing the clustering relation among clients from a model similarity matrix, and self-adaptively dividing the clients with similar data distribution into the same cluster under the condition that the clustering quantity is not specified;
s5: the FedCluster obtains a stable client cluster in communication between the client and the server.
2. A method of cluster federal learning based on model similarity according to claim 1, wherein: in the step S1, the local training adjustment policy specifically includes:
the cumulative loss of the client reflects the change of the local loss in the multi-round communication, and after the t-round communication, the cumulative loss of the client m is as follows:
Figure FDA0003996024100000011
Figure FDA0003996024100000012
representing a local experience loss of client m in the ith round of communication;
based on accumulated losses
Figure FDA0003996024100000013
FedCluster calculates the local iteration number of client m in the t-th round of communication loop +.>
Figure FDA0003996024100000014
The method comprises the following steps:
Figure FDA0003996024100000015
s * indicating that the client has the most local samples,
Figure FDA00039960241000000111
and |D m I represents the local number of samples s of the client, respectively * And m; the parameter α represents the iteration number increase rate controlling the iteration number increase rate, and the parameter ρ is defined as:
Figure FDA0003996024100000016
in equation (1), fedcmaster uses the number of local samples and the cumulative loss to calculate the number of local iterations per client; the number of local samples determines the maximum stepLong length
Figure FDA0003996024100000017
And the local experience loss determines the actual increase step ρ based on the maximum step; resulting in a ratio->
Figure FDA0003996024100000018
I.e. the difference between the local experience loss of the customer and the reference will gradually decrease;
after the first round of communication, the cumulative loss of clients with fewer local samples is compared to the baseline client s * The gap between them becomes smaller and smaller; less than
Figure FDA0003996024100000019
They are->
Figure FDA00039960241000000110
The gap between the client ends is gradually increased, and the iteration times of the client ends are not increased; once the variance of the customer's cumulative loss is minimized, the FedCluster will stop the local training adjustment process and each customer will complete its local model training in this round.
3. A method of cluster federal learning based on model similarity according to claim 2, wherein: in the step S4, the adaptive clustering strategy of the client is:
the minimum inter-cluster distance is not less than the maximum intra-cluster distance, i.e
min dist(G ,G )≥max dist(G ,G ) (4)
dist(G ,G ) Representing the model distance between any two clients from two different clusters, dist (G ,G ) Representing a model distance between any two clients within the same cluster;
selecting the weight closest to the output layer as a representative weight set of all model weights to calculate the similarity of the two models; the total weight of the model is represented by ω, and the selected partial weight is represented by ω';
the FedCluster groups the clients by using a model similarity matrix M, wherein the matrix is obtained by carrying out local training calculation on the regulated local models of all the clients; each element M [ M, n ]]=dist(ω′ m ,ω′ n ) Using the formula
Figure FDA0003996024100000021
To measure the model distance between any two clients; the ω is the weight number of the two models; taking M as input, fedCluster performs client clustering according to three steps, namely cluster demarcation detection, weighted voting and voting-based clustering.
4. A method of clustered federal learning based on model similarity according to claim 3, wherein: the group demarcation detection is:
for client M, fedCluster will first send M [ M, ]]The model distance values in (1) are ordered according to ascending order to obtain M' [ M, ]]Wherein M' is [ M, n ]]≥M′[m,z]N is greater than z; fedCluster calculations were then stored at M' [ M, ]]The difference between the distance values of any two adjacent models is obtained, and the maximum distance difference t is obtained m The method comprises the steps of carrying out a first treatment on the surface of the Let t m Clustering boundaries of all clients taking the client m as a reference; based on t m M' [ M, ]]All clients indexed are divided into two groups P m,1 And P m,2 ,P m,1 The difference between any two adjacent values is less than t m ,P m,2 The difference between any two adjacent values is not less than t m
P m,1 The clients in (a) can be allocated to the same cluster, cannot be assigned to the group belonging to P m,2 Clustering the clients of (1); adding weighted voting to adaptively determine a final client clustering result;
the weighted voting is as follows:
for each client M and its corresponding M' [ M; carrying out]At P m,1 Is the guest having the largest number of local samples among the clients of (a)The method comprises the following steps:
client mmax =arg max{|D n |,n∈P m,1 } (5)
then FedCluster let P m,1 Each client n in (2) is a client according to equation (5) mmax Voting and updating client n pairs of clients mmax Total score of votes of (2)
Figure FDA0003996024100000022
Each customer maintains a list of accumulated vote scores for other customers who have been selected as the customer with the most samples;
for a particular customer, more local samples are given greater voting weights;
the voting-based clustering is as follows:
after cluster demarcation detection and weighted voting are operated on all clients by traversing all rows, a final voting score list of each client is obtained; for a given client m, it is assigned to the representative client with the highest cumulative score in the voting score list for m * In the same cluster G * In, i.e
Figure FDA0003996024100000031
By scanning all clients and their list of voting scores, the enhanced CFL automatically selects some representative clients as cluster heads and assigns other clients to those clusters.
CN202211625268.5A 2022-12-13 2022-12-13 Cluster federation learning method based on model similarity Pending CN116167452A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211625268.5A CN116167452A (en) 2022-12-13 2022-12-13 Cluster federation learning method based on model similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211625268.5A CN116167452A (en) 2022-12-13 2022-12-13 Cluster federation learning method based on model similarity

Publications (1)

Publication Number Publication Date
CN116167452A true CN116167452A (en) 2023-05-26

Family

ID=86415453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211625268.5A Pending CN116167452A (en) 2022-12-13 2022-12-13 Cluster federation learning method based on model similarity

Country Status (1)

Country Link
CN (1) CN116167452A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117150255A (en) * 2023-10-26 2023-12-01 合肥工业大学 Clustering effect verification method, terminal and storage medium in cluster federation learning
CN117557870A (en) * 2024-01-08 2024-02-13 之江实验室 Classification model training method and system based on federal learning client selection

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117150255A (en) * 2023-10-26 2023-12-01 合肥工业大学 Clustering effect verification method, terminal and storage medium in cluster federation learning
CN117150255B (en) * 2023-10-26 2024-02-02 合肥工业大学 Clustering effect verification method, terminal and storage medium in cluster federation learning
CN117557870A (en) * 2024-01-08 2024-02-13 之江实验室 Classification model training method and system based on federal learning client selection
CN117557870B (en) * 2024-01-08 2024-04-23 之江实验室 Classification model training method and system based on federal learning client selection

Similar Documents

Publication Publication Date Title
CN116167452A (en) Cluster federation learning method based on model similarity
Petzka et al. On the regularization of wasserstein gans
Boutsis et al. On task assignment for real-time reliable crowdsourcing
Li et al. On social event organization
CN110457589A (en) A kind of vehicle recommended method, device, equipment and storage medium
CN110502704A (en) A kind of group recommending method and system based on attention mechanism
CN111222665B (en) Cloud manufacturing service combination optimization selection method based on preference NSGA-III algorithm
Gong et al. Adaptive client clustering for efficient federated learning over non-iid and imbalanced data
Laguel et al. A superquantile approach to federated learning with heterogeneous devices
Tekin et al. Adaptive ensemble learning with confidence bounds
CA2496278A1 (en) Statistical personalized recommendation system
CN114357455B (en) Trust method based on multidimensional attribute trust evaluation
Brando et al. Modelling heterogeneous distributions with an uncountable mixture of asymmetric laplacians
Xiong et al. A large-scale consensus model to manage non-cooperative behaviors in group decision making: A perspective based on historical data
CN115495771A (en) Data privacy protection method and system based on self-adaptive adjustment weight
Li et al. Heterogeneity-aware fair federated learning
CN117994635B (en) Federal element learning image recognition method and system with enhanced noise robustness
CN117252253A (en) Client selection and personalized privacy protection method in asynchronous federal edge learning
CN116595328A (en) Knowledge-graph-based intelligent construction device and method for data scoring card model
CN117235331A (en) Fair federation learning method for cross-domain social network node classification tasks
Tun et al. Federated learning with intermediate representation regularization
CN110322055A (en) A kind of method and system improving data risk model scoring stability
Tilahun et al. Fuzzy preference of multiple decision-makers in solving multi-objective optimisation problems using genetic algorithm
CN112487799A (en) Crowdsourcing task recommendation algorithm using extrinsic product attention
CN115392058B (en) Method for constructing digital twin model based on evolution game in industrial Internet of things

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination