CN115952860A - Heterogeneous statistics-oriented clustered federated learning method - Google Patents
Heterogeneous statistics-oriented clustered federated learning method
- Publication number
- CN115952860A (application CN202310060893.8A)
- Authority
- CN
- China
- Prior art keywords
- node
- clustering
- model
- edge
- cluster
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a heterogeneous statistics-oriented clustered federated learning method, comprising the following steps: step 1, constructing an edge node distribution classifier; step 2, determining a metric for edge node clustering; step 3, determining a clustering method for the node clusters; step 4, clustering the edge nodes with the clustering method; step 5, the server initializes the global model and sends it to the head node of each node cluster; step 6, after receiving the model, an edge node trains it on its local dataset, updates it, and sends the updated model to the next node in the cluster for training, until all nodes in each cluster have completed training, whereupon the updated model is uploaded to the server; step 7, the server receives the updated models of all clusters, takes their weighted average, and updates the global model; step 8, steps 6 and 7 are repeated until the global model converges. Compared with traditional federated learning methods, the method is more efficient and more widely applicable.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a heterogeneous statistics-oriented clustered federated learning method.
Background
Modern mobile and Internet of Things devices (e.g., smartphones, smart wearables, smart home devices) produce large amounts of data every day, which provides opportunities to build complex Machine Learning (ML) models for challenging artificial intelligence tasks. In conventional High Performance Computing (HPC), all data is collected and concentrated in one place for processing by a supercomputer with hundreds to thousands of compute nodes. However, concerns about security and privacy have led to new legislation, such as the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA), that prevents data from being transmitted to a centralized location, making traditional high performance computing difficult to apply to collecting and processing scattered data. Federated learning addresses these security and privacy challenges by using decentralized data: local models are trained on each client's (data owner's) local data, and a central aggregator accumulates the learning gradients of the local models to train a global model, opening up a new high performance computing paradigm. While the computing resources of a single client may be far weaker than the compute nodes of a traditional supercomputer, the computing power of a large number of clients can be accumulated to form a very powerful "decentralized virtual supercomputer". Federated learning has proven successful in a range of applications, from consumer devices (e.g., Gboard and keyword spotting) to the pharmaceutical, medical research, financial, and manufacturing industries.
The data in federated learning is owned by the clients and may vary widely in quantity and content, resulting in severe data heterogeneity that does not typically occur in data-center distributed learning, where the data distribution is well controlled. In data-center distributed learning, the classes and features of the training data are evenly distributed across all clients, i.e., Independent and Identically Distributed (IID). In federated learning, however, the distribution of data classes and features depends on the data owner, resulting in a non-uniform data distribution, referred to as non-Independent and Identically Distributed (non-IID) statistical heterogeneity. This heterogeneity greatly affects training time and accuracy, and a technical solution for this situation is needed.
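For illustration only (such code is not part of the disclosure, and all names are hypothetical), the label-skew form of non-IID data described above can be simulated with a toy partition in which each client only holds samples from a few classes:

```python
import random

def label_skew_partition(labels, num_clients, classes_per_client, seed=0):
    """Toy non-IID partition: each client is restricted to a few label classes,
    so class frequencies differ sharply across clients (label-skew non-IID)."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    # Each client draws a small, fixed subset of the available classes.
    client_classes = [set(rng.sample(classes, classes_per_client))
                      for _ in range(num_clients)]
    shards = [[] for _ in range(num_clients)]
    for idx, y in enumerate(labels):
        owners = [c for c in range(num_clients) if y in client_classes[c]]
        if owners:  # samples of a class no client holds are simply dropped
            shards[rng.choice(owners)].append(idx)
    return shards, client_classes
```

In such a partition, each client's local label distribution is supported on only a fraction of the classes, which is exactly the situation that degrades plain FedAvg.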
Disclosure of Invention
To address the problem that, in federated learning, the distribution of data classes and features depends on the data owners, making the data distribution non-uniform and thereby greatly affecting training time and accuracy, the invention provides a heterogeneous statistics-oriented clustered federated learning method, intended for federated learning environments with statistical data heterogeneity, realizing a more efficient and more widely applicable federated learning method.
In order to achieve the purpose, the invention adopts the following technical scheme:
a heterogeneous statistics-oriented clustered federated learning method comprises the following steps:
step 1, constructing an edge node distribution classifier;
step 2, determining a measurement index of edge node clustering;
step 3, determining a clustering method of the node cluster;
step 4, clustering the edge nodes by using a clustering method;
step 5, the server initializes the global model and sends the model to the head node of each node cluster;
step 6, after receiving the model, the edge node performs local training on a local data set and updates the model, sends the updated model to the next node in the cluster for training until all the nodes in each cluster complete training, and uploads the updated model to the server;
step 7, the server receives the updated models of all clusters, then carries out weighted average and updates the global model;
step 8, repeating step 6 and step 7 until the global model converges.
Further, the step 1 comprises:
the global model $f_\theta$ is split into a deep feature extractor $f_{\theta_{feat}}$ and a classifier $f_{\theta_{clf}}$, where $\theta = (\theta_{feat}, \theta_{clf})$ is the parameter set of the global model;
before federated learning formally starts, a pre-training phase is used to estimate the data distribution on the edge nodes participating in training; during this phase, every edge node $k$ starts from the same random initialization $\theta_0$ and performs $E$ rounds of training on its local dataset, updating the model to $\theta_k^E$;
the edge node distribution classifier is constructed either from the parameters $\psi_{clf}$ of the local classifier $f_{\theta_{clf}}$, or from its predictions $\psi_{conf}$ on a public dataset held at the server;
at the server side, the classifier $\psi$ is applied to the updated model $\theta_k^E$ of each edge node to obtain an estimate $\hat{\mathcal{D}}_k$ of that node's data distribution.
Further, the step 2 comprises:
starting from the approximation $\hat{\mathcal{D}}_k$ of the data distribution of edge node $k$, node clusters are built from nodes with different distributions, so that the distance between node clusters is minimized while the distance within each node cluster is maximized;
cosine and Euclidean distances are used to compare the client classifier weights $\psi_{clf}$; since the confidence vectors $\psi_{conf}$ are true probability distributions, the KL divergence is used as the metric for them.
Further, the clustering method comprises the following strategies:
strategy 1: clients are randomly assigned to node clusters until a defined stopping criterion is met;
strategy 2: first, $N_S$ homogeneous clusters are obtained with the K-means method; all node clusters are then formed by iteratively extracting one edge node at a time from each homogeneous cluster, until each node cluster $S$ satisfies the sample count $n_S \geq n_{S,min}$ and the edge-node count $K_S \leq K_{S,max}$;
strategy 3: an edge node $k_i$, $i \in [K]$, is randomly selected and assigned to the current node cluster $S$; a second edge node $k_j$ is then selected such that the distance between $k_i$ and $k_j$ is maximal, i.e. $j = \arg\max_j \tau(\psi_i, \psi_j)$; this process is repeated, iteratively maximizing the pairwise distances, until the preset maximum edge-node count $K_{S,max}$ and minimum sample count $n_{S,min}$ are reached, where $\tau$ is the clustering metric.
Compared with the prior art, the invention has the following beneficial effects:
the method is suitable for the federal learning environment with data statistics heterogeneity, can be conveniently deployed under the traditional two-layer framework of the server-edge node, and can also be expanded and deployed under the three-layer framework of the cloud-edge server-edge node. Compared with the traditional federal learning method, the method is more efficient and has stronger applicability.
Drawings
Fig. 1 is a schematic flow chart of a heterogeneous statistics-oriented clustered federated learning method according to an embodiment of the present invention;
fig. 2 is a second flow chart of a heterogeneous statistics-oriented clustered federated learning method according to an embodiment of the present invention.
Detailed Description
The invention is further illustrated by the following examples in conjunction with the accompanying drawings:
the goal of traditional federal learning is to learn a global modelEach edge node K ∈ [ K ]]Can be based on the local data set->To obtain n k The FedAvg is a method based on T communication rounds of iteration and aims to solve the problem of ^ er>WhereinIs a local empirical risk,/ k Is the cross entropy loss, n = ∑ Σ k n k Is the total amount of data involved in the training. At each round T e [ T ∈ [ ]]The server will theta t Is sent to a randomly selected->A part of the customer. Each clientUsing D by minimizing local objects k Performing local gradient descent to reduce theta t Is updated to be->And returns it to the server. The updated model is then summarized by the server to a new global model->Is medium, i.e.>However, in real-world scenarios, there is no guarantee that local datasets from different customers are independently extracted from the same underlying distribution.
To address these problems, as shown in fig. 1 and fig. 2, the invention provides a heterogeneous statistics-oriented clustered federated learning method, comprising the following steps:
step one, constructing an edge node distribution classifier psi. The invention integrates a global model f θ Separation into a depth feature extractorAnd a classifier->Wherein θ = (θ) feat ,θ clf ) Is a set of parameters of the global model. The classification output is selected by>It is given. Before the formal start of federal learning, a pre-training phase is used to estimate the data distribution on the edge nodes participating in training, during which each edge node k initializes θ from the same random 0 Initially, e rounds of training are performed on their local data sets to update the model to ≧>The invention uses two strategies, based on local classifier @, respectively>Radix Ginseng (radix Ginseng)Number psi clf Or its public data set at the server side->Upper prediction psi conf . For strategy one, assume that the weight of the classifier can represent the local distribution of each client and directly feed it back to the clustering method φ (.) . For strategy two, at a common "feature set">Up-test each pick>Wherein->Containing c e [ N [ ] C ]J samples of (a). Then according to class>The predictions are averaged and a confidence vector for the kth client is defined as ≥>On the server side, the classifier psi is used, and the updated model is combined with the updated edge node>An estimate of the data distribution of the node is obtained>
Step two, determining the clustering metric $\tau$. Starting from the approximation $\hat{\mathcal{D}}_k$ of the data distribution of edge node $k$, node clusters are built from nodes with different distributions, so as to minimize the distance between node clusters while maximizing the distance within each node cluster. Given $\hat{\mathcal{D}}_i$ and $\hat{\mathcal{D}}_j$, a metric $\tau(\hat{\mathcal{D}}_i, \hat{\mathcal{D}}_j)$ is needed that measures the distance between two distribution estimates. Cosine and Euclidean distances are used to compare the client classifier weights $\psi_{clf}$; since the confidence vectors $\psi_{conf}$ are true probability distributions, the KL divergence is used as the metric for them.
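The three candidate metrics named above (cosine and Euclidean distance for classifier weights, KL divergence for confidence vectors) can be sketched as follows; this is illustrative code, not part of the disclosure:

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity; suitable for comparing classifier weights."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for confidence vectors, which are probability distributions;
    eps clipping guards against log(0)."""
    p = np.clip(np.asarray(p, float), eps, None)
    q = np.clip(np.asarray(q, float), eps, None)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))
```

Note that the KL divergence is not symmetric; a symmetrized variant could be used if a true metric is needed.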
Step three, determining the clustering method $\phi$ of the node clusters. First, define $\mathcal{D}_S = \bigcup_{k \in S}\mathcal{D}_k$ as the combined data of the clients belonging to node cluster $S$. To find the maximum number $N_S$ of node clusters satisfying the constraints of a minimum sample count $n_{S,min}$ and a maximum client count $K_{S,max}$, given the edge node distribution classifier $\psi$ and the clustering metric $\tau$, the invention introduces three strategies that approximate this maximization problem. The first, the $\phi_{rand}$ strategy, is a simple and practical method: clients are randomly assigned to node clusters until a defined stopping criterion is met. The second, the $\phi_{kmeans}$ strategy, is based on the K-means algorithm: first, $N_S$ homogeneous clusters are obtained with the K-means method; then all node clusters are formed by iteratively extracting one edge node at a time from each homogeneous cluster, until each node cluster $S$ satisfies the sample count $n_S \geq n_{S,min}$ and the edge-node count $K_S \leq K_{S,max}$. Finally, the $\phi_{greedy}$ strategy follows a greedy approach to generating node clusters: initially, an edge node $k_i$, $i \in [K]$, is randomly selected and assigned to the current node cluster $S$; then a second edge node $k_j$ is selected such that the distance between $k_i$ and $k_j$ is maximal, i.e. $j = \arg\max_j \tau(\psi_i, \psi_j)$; this process is repeated, iteratively maximizing the pairwise distances, until the preset maximum edge-node count $K_{S,max}$ and minimum sample count $n_{S,min}$ are reached.
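A minimal sketch of the phi_greedy strategy, under stated assumptions (the metric tau is supplied as a precomputed pairwise distance matrix, and a cluster is closed once it reaches the node cap k_max or the sample floor n_min; the exact stopping rule is an implementation choice, not fixed by the text):

```python
import random

def greedy_clusters(dist, sizes, k_max, n_min, seed=0):
    """Greedy cluster construction: seed a cluster with a random node, then
    repeatedly add the unassigned node with the largest total distance
    (under the metric given by the matrix `dist`) to the current members."""
    rng = random.Random(seed)
    unassigned = set(range(len(sizes)))
    clusters = []
    while unassigned:
        i = rng.choice(sorted(unassigned))
        unassigned.discard(i)
        cluster = [i]
        # Grow while under the node cap and below the sample floor.
        while unassigned and len(cluster) < k_max and \
                sum(sizes[c] for c in cluster) < n_min:
            j = max(unassigned,
                    key=lambda u: sum(dist[u][c] for c in cluster))
            cluster.append(j)
            unassigned.discard(j)
        clusters.append(cluster)
    return clusters
```

Because each cluster greedily collects the most dissimilar remaining nodes, the aggregate data of each cluster tends toward the global distribution, which is the point of clustering heterogeneous nodes together.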
Step four, after pre-training is finished, all edge nodes participating in training are divided into $N_S$ node clusters according to the clustering method, grouping edge nodes with different distributions together while separating edge nodes with similar distributions.
Step five, the server initializes a global model $\theta_t$, communicates with all edge nodes participating in the training, and sends the model to the head node $k_{i,1}$ of every node cluster $S_i$, $i \in [N_S]$.
Step six, after node $k_{i,1}$ receives the model $\theta_t$, it performs $E_k$ rounds of training on its local data and updates the model to $\theta_t^{k_{i,1}}$; it then sends $\theta_t^{k_{i,1}}$ to the next edge node $k_{i,2}$ in the node cluster, and this process is repeated until the last client $k_{i,K_S}$ of the node cluster has received the model and completed local training. After $k_{i,K_S}$ finishes training, it sends the updated model $\theta_t^{S_i}$ to the head node $k_{i,1}$ of the cluster.
Step seven, the head node $k_{i,1}$ of the cluster receives the model $\theta_t^{S_i}$ and, depending on the training effect, decides whether to repeat step six for $E_S$ further passes; if no repetition is required, it sends the model $\theta_t^{S_i}$ to the server. After receiving the updated models returned by all node clusters, the server takes the weighted average of the model updates to obtain the new global model.
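To make steps six and seven concrete, a toy simulation of one communication round is sketched below. This is purely illustrative and not the patent's code: local training is replaced by gradient steps on a quadratic surrogate loss, with each node's `target` standing in for its local data.

```python
import numpy as np

def local_update(theta, target, lr=0.1, epochs=5):
    """Stand-in for E_k rounds of local training: gradient descent on the
    surrogate loss ||theta - target||^2."""
    for _ in range(epochs):
        theta = theta - lr * 2.0 * (theta - target)
    return theta

def clustered_round(theta, clusters, targets, sizes):
    """One round: within each cluster the model is trained sequentially,
    node to node; the server then takes the sample-weighted average of the
    per-cluster models (steps six and seven)."""
    models, weights = [], []
    for cluster in clusters:
        t = theta.copy()
        for k in cluster:              # sequential training inside the cluster
            t = local_update(t, targets[k])
        models.append(t)
        weights.append(sum(sizes[k] for k in cluster))
    w = np.asarray(weights, float) / sum(weights)
    return sum(wi * mi for wi, mi in zip(w, models))
```

Because each cluster's model visits several differently-distributed nodes before aggregation, each uploaded model has effectively been trained on a broader data mixture than in plain FedAvg.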
Step eight, repeating step six and step seven until the global model converges.
It should be noted that, if the method is applied to a three-tier cloud-edge server-edge node architecture, then in the edge server-edge node tier the edge server can simply be regarded as the server in the above steps. In the server-edge server tier, the edge servers can be regarded as the edge nodes in the above steps, and sequential training can be carried out between the edge servers without clustering. The rationale is that merging models is mainly useful when the models have been trained on a larger dataset; after $N_S$ rounds, each model may already have been trained on the entire dataset, so the performance of this strategy approaches that of a centralized strategy.
The above shows only the preferred embodiments of the present invention, and it should be noted that it is obvious to those skilled in the art that various modifications and improvements can be made without departing from the principle of the present invention, and these modifications and improvements should also be considered as the protection scope of the present invention.
Claims (4)
1. A heterogeneous statistics-oriented clustered federated learning method, characterized by comprising the following steps:
step 1, constructing an edge node distribution classifier;
step 2, determining a measurement index of edge node clustering;
step 3, determining a clustering method of the node cluster;
step 4, clustering the edge nodes by using a clustering method;
step 5, the server initializes the global model and sends the model to the head node of each node cluster;
step 6, after receiving the model, the edge node performs local training on a local data set and updates the model, sends the updated model to the next node in the cluster for training until all the nodes in each cluster complete training, and uploads the updated model to the server;
step 7, the server receives the updated models of all clusters, then carries out weighted average and updates the global model;
step 8, repeating step 6 and step 7 until the global model converges.
2. The heterogeneous statistics-oriented clustered federated learning method according to claim 1, wherein the step 1 comprises:
the global model $f_\theta$ is split into a deep feature extractor $f_{\theta_{feat}}$ and a classifier $f_{\theta_{clf}}$, where $\theta = (\theta_{feat}, \theta_{clf})$ is the parameter set of the global model;
before federated learning formally starts, a pre-training phase is used to estimate the data distribution on the edge nodes participating in training; during this phase, every edge node $k$ starts from the same random initialization $\theta_0$ and performs $E$ rounds of training on its local dataset, updating the model to $\theta_k^E$;
the edge node distribution classifier is constructed either from the parameters $\psi_{clf}$ of the local classifier, or from its predictions $\psi_{conf}$ on a public dataset held at the server;
3. The heterogeneous statistics-oriented clustered federated learning method according to claim 2, wherein the step 2 comprises:
starting from the approximation $\hat{\mathcal{D}}_k$ of the data distribution of edge node $k$, node clusters are built from nodes with different distributions, so that the distance between node clusters is minimized while the distance within each node cluster is maximized;
4. The heterogeneous statistics-oriented clustered federated learning method according to claim 1, wherein the clustering method comprises:
strategy 1: clients are randomly assigned to node clusters until a defined stopping criterion is met;
strategy 2: first, $N_S$ homogeneous clusters are obtained with the K-means method; all node clusters are then formed by iteratively extracting one edge node at a time from each homogeneous cluster, until each node cluster $S$ satisfies the sample count $n_S \geq n_{S,min}$ and the edge-node count $K_S \leq K_{S,max}$;
strategy 3: an edge node $k_i$, $i \in [K]$, is randomly selected and assigned to the current node cluster $S$; a second edge node $k_j$ is then selected such that the distance between $k_i$ and $k_j$ is maximal, i.e. $j = \arg\max_j \tau(\psi_i, \psi_j)$; this process is repeated, iteratively maximizing the pairwise distances, until the preset maximum edge-node count $K_{S,max}$ and minimum sample count $n_{S,min}$ are reached, where $\tau$ is the clustering metric.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310060893.8A CN115952860A (en) | 2023-01-17 | 2023-01-17 | Heterogeneous statistics-oriented clustering federal learning method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310060893.8A CN115952860A (en) | 2023-01-17 | 2023-01-17 | Heterogeneous statistics-oriented clustering federal learning method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115952860A true CN115952860A (en) | 2023-04-11 |
Family
ID=87282541
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310060893.8A Pending CN115952860A (en) | 2023-01-17 | 2023-01-17 | Heterogeneous statistics-oriented clustering federal learning method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115952860A (en) |
- 2023-01-17 CN CN202310060893.8A patent/CN115952860A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117806838A (en) * | 2024-02-29 | 2024-04-02 | 浪潮电子信息产业股份有限公司 | Heterogeneous data-based device clustering method, apparatus, device, system and medium |
CN117806838B (en) * | 2024-02-29 | 2024-06-04 | 浪潮电子信息产业股份有限公司 | Heterogeneous data-based device clustering method, apparatus, device, system and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhong et al. | Applying big data based deep learning system to intrusion detection | |
Chien et al. | Community detection in hypergraphs: Optimal statistical limit and efficient algorithms | |
Asad et al. | Evaluating the communication efficiency in federated learning algorithms | |
CN114186237A (en) | Truth-value discovery-based robust federated learning model aggregation method | |
CN115358487A (en) | Federal learning aggregation optimization system and method for power data sharing | |
Yi et al. | Fedlora: Model-heterogeneous personalized federated learning with lora tuning | |
CN113537509A (en) | Collaborative model training method and device | |
CN115952860A (en) | Heterogeneous statistics-oriented clustering federal learning method | |
CN114821237A (en) | Unsupervised ship re-identification method and system based on multi-stage comparison learning | |
CN114999635A (en) | circRNA-disease association relation prediction method based on graph convolution neural network and node2vec | |
CN115359298A (en) | Sparse neural network-based federal meta-learning image classification method | |
CN115114484A (en) | Abnormal event detection method and device, computer equipment and storage medium | |
CN116244484B (en) | Federal cross-modal retrieval method and system for unbalanced data | |
Zhang et al. | Federated multi-task learning with non-stationary heterogeneous data | |
CN108121912B (en) | Malicious cloud tenant identification method and device based on neural network | |
CN111160077A (en) | Large-scale dynamic face clustering method | |
Yang et al. | An academic social network friend recommendation algorithm based on decision tree | |
Liu et al. | Optimizing federated unsupervised person re-identification via camera-aware clustering | |
Basu et al. | Pareto optimal streaming unsupervised classification | |
CN112766336A (en) | Method for improving verifiable defense performance of model under maximum random smoothness | |
Nguyen et al. | Gradual federated learning using simulated annealing | |
Govindarajan et al. | Network Traffic Prediction Using Radial Kernelized-Tversky Indexes-Based Multilayer Classifier. | |
Tian et al. | FedACQ: adaptive clustering quantization of model parameters in federated learning | |
CN112085114B (en) | Online and offline identity matching method, device, equipment and storage medium | |
Fan et al. | Robust distributed swarm learning for intelligent iot |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||