CN115829027A - Comparative learning-based federated learning sparse training method and system


Info

Publication number
CN115829027A
Authority
CN
China
Prior art keywords: local, sparse, model, learning, global model
Prior art date: 2022-10-31
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Application number: CN202211349843.3A
Other languages: Chinese (zh)
Inventors: 陈家辉 (Chen Jiahui), 李峥明 (Li Zhengming), 徐培明 (Xu Peiming)
Current Assignee: CSG Electric Power Research Institute; Guangdong University of Technology
Original Assignee: CSG Electric Power Research Institute; Guangdong University of Technology
Priority date: 2022-10-31
Filing date: 2022-10-31
Publication date: 2023-03-21
Application filed by CSG Electric Power Research Institute and Guangdong University of Technology


Abstract

The invention discloses a comparative learning-based federated learning sparse training method and system, relating to the intersection of federated learning frameworks, sparse neural network training, and contrastive learning. The method comprises the following steps: the server sends a global model and a mask to the local clients; each local client generates a local sparse model from the received global model and mask and trains the local sparse model on its local data set; the local client computes a contrastive loss function, updates its local loss function and local sparse model, and uploads the updated local sparse model to the server; the server aggregates the local sparse models updated by the local clients, updates the global model, sends the updated global model back to the local clients, and starts a new round of communication training until the global model converges. By introducing sparse training and contrastive learning into federated learning, the method significantly reduces computation and communication overhead and improves the performance of the global model.

Description

Comparative learning-based federated learning sparse training method and system
Technical Field
The invention relates to the technical field of distributed machine learning, in particular to the intersection of federated learning frameworks, sparse neural network training, and contrastive learning, and more particularly to a federated learning sparse training method and system based on contrastive learning.
Background
Data silos, caused by privacy protection requirements, limited computing resources, and other factors, prevent access to the large-scale data needed to train artificial intelligence models.
As a distributed machine learning technology, federated learning has become a way to overcome data silos: a machine learning model is trained jointly by multiple clients. Because clients collaborate by exchanging models rather than sending their data to others, data privacy is protected, and federated learning is widely applied to medical analysis, natural language processing, credit card fraud detection, and similar tasks.
However, federated learning currently still faces the following problems:
(1) Heterogeneity: data heterogeneity, i.e., data that are not independent and identically distributed, can make the local models deviate from the global model and degrade the performance of the aggregated global model;
(2) Computation and communication overhead: in practice, many local clients are small devices such as mobile phones or personal laptops that lack the computing power to train large models, and their communication with the server is limited by bandwidth.
When resources are limited, these problems greatly reduce the training accuracy of federated learning.
Disclosure of Invention
The invention provides a dynamic sparse training method for federated learning based on contrastive learning, which aims to reduce the communication overhead of federated learning while ensuring model accuracy.
In order to solve the above technical problems, the technical solution of the invention is as follows:
In a first aspect, a comparative learning-based federated learning sparse training method includes:
the server side sends a global model and a mask to the local client side; wherein the mask is generated based on sparsity to indicate whether global model parameters are retained;
the local client generates a local sparse model according to the received global model and the mask, and trains the local sparse model by using a local data set;
in each round of training, the local client computes a contrastive loss function, updates the local loss function and the local sparse model, and uploads the updated local sparse model to the server;
and the server side aggregates the local sparse model updated by the local client side, updates the global model, sends the updated global model to the local client side, and starts a new round of communication training until the global model converges.
In this technical solution, the sparse model is trained directly during federated learning, which effectively reduces the amount of computation during training, lowers the storage cost on devices, accelerates training, and thus significantly reduces the computation and communication overhead of federated learning. In addition, a contrastive learning method is introduced into the federated learning process: common features among similar examples are learned, and a contrastive loss function is used to maximize the similarity of the same target under different data augmentations while minimizing the similarity between different targets. This alleviates the problem of data heterogeneity and improves model accuracy while reducing the computation and communication overhead of federated learning.
As a preferred scheme, the server sending the global model and the mask to the local clients includes:
the server initializes the global model θ_g^t and generates, according to the sparsity S, a mask m^t indicating whether each parameter of the global model is retained; wherein t denotes the federated learning round, and the sparsity S is the ratio of the number of parameters pruned from the global model to the total number of parameters;
the server randomly selects the local clients participating in this round of federated learning, and sends the global model θ_g^t and the mask m^t to the selected local clients.
As a preferred scheme, the local client generates a local sparse model from the received global model and mask, specifically:
the local client receives the global model θ_g^t and the mask m^t, and computes the Hadamard (element-wise) product of the global model θ_g^t and the mask m^t to obtain the local sparse model θ_k^t = θ_g^t ⊙ m^t, where t denotes the federated learning round and k denotes the index of the local client.
As a preferred scheme, training the local sparse model with the local data set includes:
the local client feeds the local data set into the local sparse model θ_k^t; the local sparse model θ_k^t makes predictions and the supervised loss function ℓ_sup is computed, where t denotes the federated learning round and k denotes the index of the local client;
the local sparse model θ_k^t is then updated according to a preset learning rate η.
As a possible design of the preferred embodiment, the local sparse model θ_k^t is updated with the preset learning rate η using the following operation:
θ_k^t ← θ_k^t − η ∇ℓ_sup(θ_k^t).
As a preferred scheme, in each round of training, the local client computing the contrastive loss function and updating the local loss function and the local sparse model includes:
the local data set is fed separately into the round-t local sparse model θ_k^t, the round-(t−1) local sparse model θ_k^{t−1}, and the round-t global model θ_g^t, yielding the corresponding feature vectors z, z_last and z_glob, respectively;
the contrastive loss function ℓ_con is computed from these feature vectors as
ℓ_con = −log[ exp(sim(z, z_glob)/τ) / ( exp(sim(z, z_glob)/τ) + exp(sim(z, z_last)/τ) ) ],
where τ is a preset temperature hyper-parameter and sim(·,·) denotes the similarity (e.g., cosine similarity) between feature vectors;
the local loss function is updated as
ℓ = ℓ_sup + ℓ_con,
where ℓ_sup denotes the supervised loss function of the local sparse model θ_k^t;
the updated local loss function ℓ is then used to update the local sparse model θ_k^t.
As a preferred scheme, in each round of training, after the local client computes the contrastive loss function and updates the local loss function and the local sparse model, mask adjustment is performed in preset communication rounds so that the network structure of the local sparse model evolves dynamically, and the dynamically updated local sparse model is then uploaded to the server.
In this preferred scheme, adjusting the mask in specific rounds dynamically updates the local sparse network, which makes it possible to search for a better sparse structure. Compared with static sparse training, dynamic sparse training can improve the precision of the local sparse model under high sparsity, and thereby improve the accuracy of the overall federated learning model.
As a possible design of the preferred scheme, mask adjustment is performed in preset communication rounds and the network structure of the local sparse model is updated by dynamic evolution, specifically:
in specific rounds of communication between the local client and the server, connections between some neuron nodes of the local sparse model are removed, adjusting the local sparse model to a higher sparsity S + (1 − S)·α_t, where α_t is a dynamic adjustment parameter given by
α_t = (α/2)·(1 + cos(t·π / T_end)),
where α denotes the preset dynamic adjustment parameter α_1 of the first round, t denotes the federated learning round, and T_end denotes the last learning round;
then, according to the instantaneous gradient information of the local sparse model, the same number of connections with the largest gradient magnitudes are regrown, restoring the sparsity of the model to the original sparsity S.
As a preferred scheme, the server aggregating the local sparse models updated by the local clients and updating the global model includes:
the server receives the local sparse models θ_k^t uploaded by a plurality of local clients;
the server aggregates the local sparse models θ_k^t in a unified way based on the FedAvg scheme to generate the updated global model θ_g^{t+1}, the aggregation being expressed as
θ_g^{t+1} = Σ_{k=1}^{K} (|D_k| / |D|) · θ_k^t,
where K denotes the number of local clients c_k participating in training in round t, D_k denotes the local data set of client c_k, D denotes the data set of all local clients, and k denotes the index of local client c_k.
In a second aspect, a comparative learning-based federated learning sparse training system applies the comparative learning-based federated learning sparse training method provided by any technical scheme of the first aspect, and comprises a server and local clients, wherein the server is connected to the local clients;
the server is used for sending the global model and the mask to the local clients, aggregating the local sparse models uploaded by the local clients, and updating the global model; the mask is generated based on sparsity and indicates whether the global model parameters are retained;
the local clients are used for receiving the global model and the mask to generate local sparse models, training the local sparse models with their local data sets, computing the contrastive loss function, updating the local loss function and the local sparse model, and uploading the updated local sparse models to the server.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention adopts a sparse training method in the process of federal learning, obviously reduces the calculation communication overhead, simultaneously introduces a comparative learning method, corrects the local model based on the similarity between model representations, trains a global model with smaller deviation, solves the problem of data heterogeneity in federal learning and improves the performance of the global model.
Drawings
FIG. 1 is a flow chart of a federated learning sparse training method;
FIG. 2 is a flow diagram of a federated learning sparse training method including mask adjustment;
FIG. 3 is a schematic diagram of a learning process framework of the federated learning sparse training method in embodiment 2;
FIG. 4 is a comparison of the test accuracy of the comparative learning-based federated learning sparse training method and other federated learning methods on the MNIST data set in embodiment 2.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the present embodiments, certain elements of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described with reference to the drawings and the embodiments.
Example 1
The embodiment provides a comparative learning-based federated learning sparse training method, which, referring to fig. 1, includes:
the server side sends the global model and the mask to the local client side; wherein the mask is generated based on sparsity to indicate whether global model parameters are retained;
the local client generates a local sparse model according to the received global model and the mask, and trains the local sparse model by using a local data set;
in each round of training, the local client computes a contrastive loss function, updates the local loss function and the local sparse model, and uploads the updated local sparse model to the server;
and the server side aggregates the updated local sparse models of the local client side, updates the global model, sends the updated global model to the local client side, and starts a new round of communication training until the global model is converged.
In this embodiment, a sparse training method is introduced into the federated learning process: a local sparse model is generated at each local client using the mask and trained directly, which effectively reduces the amount of computation in federated learning, lowers the storage cost on devices, accelerates training, and significantly reduces the computation and communication overhead of federated learning. At the same time, a contrastive learning method is introduced, and the local model is corrected based on the similarity between model representations, alleviating the problem of data heterogeneity. Through the combination of federated learning, sparse training, and contrastive learning, the accuracy of the global model is improved while the computation and communication overhead of federated learning is reduced.
In a preferred embodiment, the server sending the global model and the mask to the local clients includes:
the server initializes the global model θ_g^t and generates, according to the sparsity S, a mask m^t indicating whether each parameter of the global model is retained; wherein t denotes the federated learning round, and the sparsity S is the ratio of the number of parameters pruned from the global model to the total number of parameters;
the server randomly selects the local clients participating in this round of federated learning, and sends the global model θ_g^t and the mask m^t to the selected local clients.
In the preferred embodiment, the sparsity S is the ratio of the number of parameters pruned from the global model to the total number of parameters, and the mask is generated based on sparsity, which represents the structure of the sparse network.
In an alternative embodiment, the mask is in binary form.
As a non-limiting example, the mask is generated using a pruning algorithm based on sparsity.
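By way of illustration only (not part of the patent disclosure), the following sketch generates such a binary mask for each parameter tensor of a PyTorch model at a given sparsity S. The random choice of which entries to prune is an assumption, since the embodiment leaves the concrete pruning criterion open; the function name and mask layout are likewise illustrative.
```python
import torch

def generate_mask(model: torch.nn.Module, sparsity: float) -> dict:
    """Generate a binary mask per parameter tensor; a fraction `sparsity`
    of entries is set to 0 (pruned) and the rest to 1 (kept).
    Random topology is an illustrative assumption."""
    mask = {}
    for name, param in model.named_parameters():
        num_pruned = int(sparsity * param.numel())
        flat = torch.ones(param.numel())
        # choose `num_pruned` positions uniformly at random and zero them out
        pruned_idx = torch.randperm(param.numel())[:num_pruned]
        flat[pruned_idx] = 0.0
        mask[name] = flat.view_as(param)
    return mask
```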
In a preferred embodiment, the local client generates a local sparse model from the received global model and mask, specifically:
the local client receives the global model θ_g^t and the mask m^t, and computes the Hadamard (element-wise) product of the global model θ_g^t and the mask m^t to obtain the local sparse model θ_k^t, where t denotes the federated learning round and k denotes the index of the local client. That is,
θ_k^t = θ_g^t ⊙ m^t,
where ⊙ denotes the Hadamard inner product.
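A minimal sketch of this step, assuming the mask is stored as a dictionary keyed by parameter name (as in the sketch above); the function name and data layout are illustrative assumptions.
```python
import copy
import torch

def build_local_sparse_model(global_model: torch.nn.Module, mask: dict) -> torch.nn.Module:
    """Return a copy of the global model whose parameters are multiplied
    element-wise (Hadamard product) by the binary mask: theta_k = theta_g ⊙ m."""
    local_model = copy.deepcopy(global_model)
    with torch.no_grad():
        for name, param in local_model.named_parameters():
            param.mul_(mask[name])  # zero out pruned weights
    return local_model
```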
In a preferred embodiment, training the local sparse model with the local data set includes:
the local client feeds the local data set into the local sparse model θ_k^t; the local sparse model θ_k^t makes predictions and the supervised loss function ℓ_sup is computed;
the local sparse model θ_k^t is then updated according to a preset learning rate η.
In an optional embodiment, the local sparse model θ_k^t is updated with the preset learning rate η using the following operation:
θ_k^t ← θ_k^t − η ∇ℓ_sup(θ_k^t).
In a preferred embodiment, in each round of training, the local client computing the contrastive loss function and updating the local loss function and the local sparse model includes:
the local data set is fed separately into the round-t local sparse model θ_k^t, the round-(t−1) local sparse model θ_k^{t−1}, and the round-t global model θ_g^t, yielding the corresponding feature vectors z, z_last and z_glob, respectively; here z denotes the feature vector of a sample output by the projection head of the feature representation network;
the contrastive loss function ℓ_con is computed from these feature vectors as
ℓ_con = −log[ exp(sim(z, z_glob)/τ) / ( exp(sim(z, z_glob)/τ) + exp(sim(z, z_last)/τ) ) ],
where τ is a preset temperature hyper-parameter and sim(·,·) denotes the similarity (e.g., cosine similarity) between feature vectors;
the local loss function is updated as
ℓ = ℓ_sup + ℓ_con,
where ℓ_sup denotes the supervised loss function of the local sparse model θ_k^t;
the updated local loss function ℓ is then used to update the local sparse model θ_k^t.
In a preferred embodiment, in each round of training, after the local client computes the contrastive loss function and updates the local loss function and the local sparse model, mask adjustment is performed in preset communication rounds so that the network structure of the local sparse model evolves dynamically, and the dynamically updated local sparse model is then uploaded to the server.
In a specific implementation process, a sparse network structure is randomly selected in an initial training stage, and mask adjustment is performed in a subsequent sparse training process. Because the mask represents the structure of the sparse network, the structure of the sparse network can be continuously changed through mask adjustment, so that the purpose of searching for a better sparse structure is achieved.
In an optional embodiment, referring to fig. 2, mask adjustment is performed in preset communication rounds and the network structure of the local sparse model is updated by dynamic evolution, specifically:
in specific rounds of communication between the local client and the server, connections between some neuron nodes of the local sparse model are removed, so that the local sparse model is adjusted to a higher sparsity S + (1 − S)·α_t, where α_t is a dynamic adjustment parameter given by
α_t = (α/2)·(1 + cos(t·π / T_end)),
where α denotes the preset dynamic adjustment parameter α_1 of the first round, t denotes the federated learning round, and T_end denotes the last learning round;
then, according to the instantaneous gradient information of the local sparse model, the same number of connections with the largest gradient magnitudes are regrown, restoring the sparsity of the model to the original sparsity S.
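A sketch of one prune-and-regrow adjustment for a single parameter tensor, under stated assumptions: the removed connections are those with the smallest weight magnitudes (the embodiment only says connections between some neuron nodes are removed, so the magnitude criterion is an assumption), and the regrown connections are those with the largest instantaneous gradient magnitudes.
```python
import math
import torch

def cosine_alpha(alpha, t, t_end):
    """Cosine-decayed adjustment fraction: alpha_t = (alpha/2) * (1 + cos(t*pi/t_end))."""
    return 0.5 * alpha * (1.0 + math.cos(t * math.pi / t_end))

def adjust_mask(param, grad, mask, sparsity, alpha_t):
    """Prune-and-regrow step for one parameter tensor: temporarily raise the
    sparsity to S + (1 - S) * alpha_t, then regrow the same number of
    connections where the gradient magnitude is largest, restoring S."""
    n = param.numel()
    n_drop = int((1 - sparsity) * alpha_t * n)
    if n_drop == 0:
        return mask
    flat_mask = mask.flatten().clone()
    # 1) prune: among kept weights, remove those with the smallest magnitude
    kept = flat_mask.nonzero(as_tuple=True)[0]
    magnitudes = param.flatten()[kept].abs()
    drop_idx = kept[torch.argsort(magnitudes)[:n_drop]]
    flat_mask[drop_idx] = 0.0
    # 2) regrow: among pruned weights, add those with the largest gradient
    pruned = (flat_mask == 0).nonzero(as_tuple=True)[0]
    grads = grad.flatten()[pruned].abs()
    grow_idx = pruned[torch.argsort(grads, descending=True)[:n_drop]]
    flat_mask[grow_idx] = 1.0
    return flat_mask.view_as(mask)
```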
In a preferred embodiment, the server aggregating the local sparse models updated by the local clients to update the global model includes:
the server receives the local sparse models θ_k^t uploaded by a plurality of local clients;
the server aggregates the local sparse models θ_k^t in a unified way based on the FedAvg scheme to generate the updated global model θ_g^{t+1}, the aggregation being expressed as
θ_g^{t+1} = Σ_{k=1}^{K} (|D_k| / |D|) · θ_k^t,
where K denotes the number of local clients c_k participating in training in round t, D_k denotes the local data set of client c_k, D denotes the data set of all local clients, and k denotes the index of local client c_k.
In a specific implementation, after the server completes aggregation of the local sparse models, the newly generated global model is sent to the selected local clients and a new round of communication training begins, until the global model converges.
Example 2
In this embodiment, an experiment is performed on the comparative learning-based federated learning sparse training method proposed in example 1 using the public MNIST data set, with reference to fig. 1 to 4.
The MNIST data set (Modified National Institute of Standards and Technology database) is a large database of handwritten digits collected and organized by the National Institute of Standards and Technology, containing a training set of 60,000 examples and a test set of 10,000 examples.
Consider a typical federated learning setup: the global model is a convolutional neural network comprising two 5×5 convolutional layers, two max-pooling layers, and four fully-connected layers; there are 100 local clients in total, 20 of which are randomly selected in each communication round to participate in training; in each round every selected local client performs 10 iterations on its local data set with an SGD optimizer, and 50 communication rounds with the server are performed in total.
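For concreteness, a network matching this description might look as follows; the channel widths and hidden-layer sizes are assumptions, since the embodiment specifies only the layer types and counts.
```python
import torch.nn as nn

class MNISTConvNet(nn.Module):
    """Two 5x5 conv layers, two max-pooling layers, four fully-connected
    layers; channel and hidden sizes are illustrative assumptions."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(32 * 4 * 4, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        x = self.features(x)          # 28x28 -> 4x4 feature maps
        return self.classifier(x.flatten(1))
```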
As a non-limiting example, in the training process the server sets the sparsity S = 0.5, initializes the global model θ_g^t, and generates the mask m^t according to the sparsity;
the server randomly selects 20 local clients and sends the global model and the mask to the selected local clients;
after receiving the global model and the mask, each local client generates its local sparse model θ_k^t = θ_g^t ⊙ m^t;
the local model is trained on the local data set: the local data x are fed into the local sparse model in mini-batches of 32 samples, the local sparse model makes predictions, and the loss function ℓ_sup is computed;
with the preset learning rate η = 0.01, the local sparse model is updated as
θ_k^t ← θ_k^t − η ∇ℓ_sup(θ_k^t).
The local data x are fed separately into the current-round local sparse model θ_k^t, the previous-round local sparse model θ_k^{t−1}, and the current-round global model θ_g^t, yielding the corresponding feature vectors z, z_last and z_glob, respectively; with the preset temperature hyper-parameter τ = 1, the contrastive loss function is computed as
ℓ_con = −log[ exp(sim(z, z_glob)/τ) / ( exp(sim(z, z_glob)/τ) + exp(sim(z, z_last)/τ) ) ];
the local loss function is updated as
ℓ = ℓ_sup + ℓ_con,
and the updated local loss function ℓ is used to update the local sparse model θ_k^t.
The local client is set to perform mask adjustment once every ten rounds, dynamically updating the structure of the sparse network. With α = 0.01, in the preset communication rounds between the local client and the server after local training is completed, the local client adjusts the local sparse model to a higher sparsity S + (1 − S)·α_t by removing connections between some neuron nodes of the local sparse model; then, according to the instantaneous gradient information of the local sparse model, the same number of connections with the largest gradient magnitudes are regrown, restoring the sparsity of the local sparse model to S. Here α_t is the dynamic adjustment parameter, updated according to the cosine decay schedule
α_t = (α/2)·(1 + cos(t·π / T_end)),
which controls how the sparsity adjustment changes over the rounds.
After the selected local clients of the round have finished training and updating their local sparse models, the local models are uploaded to the server. The server aggregates the uploaded local sparse models in the FedAvg manner and generates the updated global model θ_g^{t+1}, completing one round of communication learning. The aggregation is performed as
θ_g^{t+1} = Σ_{k=1}^{K} (|D_k| / |D|) · θ_k^t.
After the server finishes aggregating the local sparse models uploaded by the 20 local clients participating in training, the newly generated global model θ_g^{t+1} is sent to the selected local clients and a new round of communication training begins, until the global model converges.
In addition, in this embodiment a convolutional neural network with the same structure and the same settings as the global model is used to perform the MNIST classification prediction task. With 20 local clients selected out of 100 and a given sparsity of S = 0.5, each local client performs 10 iterations on its local data set with an SGD optimizer per round and communicates with the server for 50 rounds, and prediction is carried out after federated learning training; the accuracy of the prediction results is shown in fig. 4. Clearly, compared with fedstt, FedAvg and FedProx, the model obtained by the comparative learning-based federated learning sparse training method provided in this embodiment performs better and reaches higher accuracy with fewer communication rounds.
Example 3
The embodiment provides a comparative learning-based federated learning sparse training system, which is applied to the comparative learning-based federated learning sparse training method provided in embodiment 1, and the comparative learning-based federated learning sparse training system comprises a server and a local client, wherein the server is connected with the local client;
the server is used for sending a global model and a mask to the local client, aggregating the local sparse models uploaded by the local client and updating the global model; the mask is generated based on sparsity and is used for representing whether the global model parameters are reserved or not;
the local client is used for receiving the global model and the mask to generate a local sparse model, training the local sparse model by using the local data set, calculating a contrast loss function, updating the local loss function and the local sparse model, and uploading the updated local sparse model to the server.
The same or similar reference numerals correspond to the same or similar parts;
the terms describing positional relationships in the drawings are for illustrative purposes only and are not to be construed as limiting the patent;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (10)

1. A comparative learning-based federated learning sparse training method is characterized by comprising the following steps:
the server side sends a global model and a mask to the local client side; wherein the mask is generated based on sparsity to indicate whether global model parameters are retained;
the local client generates a local sparse model according to the received global model and the mask, and trains the local sparse model by using a local data set;
in each round of training, the local client computes a contrastive loss function, updates the local loss function and the local sparse model, and uploads the updated local sparse model to the server;
and the server side aggregates the local sparse model updated by the local client side, updates the global model, sends the updated global model to the local client side, and starts a new round of communication training until the global model converges.
2. The comparative learning-based federated learning sparse training method according to claim 1, wherein the server sending the global model and the mask to the local client comprises:
the server initializes the global model θ_g^t and generates, according to the sparsity S, a mask m^t indicating whether each parameter of the global model is retained; wherein t denotes the federated learning round, and the sparsity S is the ratio of the number of parameters pruned from the global model to the total number of parameters;
the server randomly selects the local clients participating in this round of federated learning, and sends the global model θ_g^t and the mask m^t to the selected local clients.
3. The comparative learning-based federated learning sparse training method according to claim 1, wherein the local client generates a local sparse model from the received global model and mask, specifically:
the local client receives the global model θ_g^t and the mask m^t, and computes the Hadamard (element-wise) product of the global model θ_g^t and the mask m^t to obtain the local sparse model θ_k^t = θ_g^t ⊙ m^t, where t denotes the federated learning round and k denotes the index of the local client.
4. The comparative learning-based federated learning sparse training method of claim 1, wherein training the local sparse model with the local data set comprises:
the local client feeds the local data set into the local sparse model θ_k^t; the local sparse model θ_k^t makes predictions and the supervised loss function ℓ_sup is computed, where t denotes the federated learning round and k denotes the index of the local client;
the local sparse model θ_k^t is then updated according to a preset learning rate η.
5. The comparative learning-based federated learning sparse training method of claim 4, wherein the local sparse model θ_k^t is updated with the preset learning rate η using the following operation:
θ_k^t ← θ_k^t − η ∇ℓ_sup(θ_k^t).
6. The comparative learning-based federated learning sparse training method according to claim 1, wherein, in each round of training, the local client computing the contrastive loss function and updating the local loss function and the local sparse model comprises:
the local data set is fed separately into the round-t local sparse model θ_k^t, the round-(t−1) local sparse model θ_k^{t−1}, and the round-t global model θ_g^t, yielding the corresponding feature vectors z, z_last and z_glob, respectively;
the contrastive loss function ℓ_con is computed from these feature vectors as
ℓ_con = −log[ exp(sim(z, z_glob)/τ) / ( exp(sim(z, z_glob)/τ) + exp(sim(z, z_last)/τ) ) ],
where τ is a preset temperature hyper-parameter and sim(·,·) denotes the similarity between feature vectors;
the local loss function is updated as
ℓ = ℓ_sup + ℓ_con,
where ℓ_sup denotes the supervised loss function of the local sparse model θ_k^t;
the updated local loss function ℓ is then used to update the local sparse model θ_k^t.
7. The comparative learning-based federated learning sparse training method as claimed in claim 1, wherein, in each round of training, after the local client computes the contrastive loss function and updates the local loss function and the local sparse model, mask adjustment is performed in preset communication rounds so that the network structure of the local sparse model evolves dynamically, and the dynamically updated local sparse model is uploaded to the server.
8. The comparative learning-based federated learning sparse training method as claimed in claim 7, wherein mask adjustment is performed in preset communication rounds and the network structure of the local sparse model is updated by dynamic evolution, specifically:
in specific rounds of communication between the local client and the server, connections between some neuron nodes of the local sparse model are removed, so that the local sparse model is adjusted to a higher sparsity S + (1 − S)·α_t, where α_t is a dynamic adjustment parameter given by
α_t = (α/2)·(1 + cos(t·π / T_end)),
where α denotes the preset dynamic adjustment parameter α_1 of the first round, t denotes the federated learning round, and T_end denotes the last learning round;
according to the instantaneous gradient information of the local sparse model, the same number of connections with the largest gradient magnitudes are regrown, restoring the sparsity of the model to the original sparsity S.
9. The comparative learning-based federated learning sparse training method according to any one of claims 1 to 8, wherein the server aggregating the updated local sparse models of the local clients and updating the global model comprises:
the server receives the local sparse models θ_k^t uploaded by a plurality of local clients;
the server aggregates the local sparse models θ_k^t in a unified way based on the FedAvg scheme to generate the updated global model θ_g^{t+1}, the aggregation being expressed as
θ_g^{t+1} = Σ_{k=1}^{K} (|D_k| / |D|) · θ_k^t,
where K denotes the number of local clients c_k participating in training in round t, D_k denotes the local data set of client c_k, D denotes the data set of all local clients, and k denotes the index of local client c_k.
10. A comparative learning-based federated learning sparse training system, which applies the comparative learning-based federated learning sparse training method according to any one of claims 1 to 9, characterized by comprising a server and local clients, wherein the server is connected to the local clients;
the server is used for sending the global model and the mask to the local client, aggregating the local sparse models uploaded by the local client and updating the global model; the mask is generated based on sparsity and is used for representing whether the global model parameters are reserved or not;
the local client is used for receiving the global model and the mask to generate a local sparse model, training the local sparse model by using the local data set, calculating a contrast loss function, updating the local loss function and the local sparse model, and uploading the updated local sparse model to the server.
CN202211349843.3A 2022-10-31 2022-10-31 Comparative learning-based federated learning sparse training method and system Pending CN115829027A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211349843.3A CN115829027A (en) 2022-10-31 2022-10-31 Comparative learning-based federated learning sparse training method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211349843.3A CN115829027A (en) 2022-10-31 2022-10-31 Comparative learning-based federated learning sparse training method and system

Publications (1)

Publication Number Publication Date
CN115829027A true CN115829027A (en) 2023-03-21

Family

ID=85525940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211349843.3A Pending CN115829027A (en) 2022-10-31 2022-10-31 Comparative learning-based federated learning sparse training method and system

Country Status (1)

Country Link
CN (1) CN115829027A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116341689A (en) * 2023-03-22 2023-06-27 深圳大学 Training method and device for machine learning model, electronic equipment and storage medium
CN116341689B (en) * 2023-03-22 2024-02-06 深圳大学 Training method and device for machine learning model, electronic equipment and storage medium
CN116578674A (en) * 2023-07-07 2023-08-11 北京邮电大学 Federal variation self-coding theme model training method, theme prediction method and device
CN116578674B (en) * 2023-07-07 2023-10-31 北京邮电大学 Federal variation self-coding theme model training method, theme prediction method and device
CN117196014A (en) * 2023-09-18 2023-12-08 深圳大学 Model training method and device based on federal learning, computer equipment and medium
CN117196014B (en) * 2023-09-18 2024-05-10 深圳大学 Model training method and device based on federal learning, computer equipment and medium
CN117391187A (en) * 2023-10-27 2024-01-12 广州恒沙数字科技有限公司 Neural network lossy transmission optimization method and system based on dynamic hierarchical mask

Similar Documents

Publication Publication Date Title
CN115829027A (en) Comparative learning-based federated learning sparse training method and system
Lin et al. Network pruning using adaptive exemplar filters
CN115081532A (en) Federal continuous learning training method based on memory replay and differential privacy
CN112115967A (en) Image increment learning method based on data protection
CN115331069A (en) Personalized image classification model training method based on federal learning
CN110781912A (en) Image classification method based on channel expansion inverse convolution neural network
CN111694977A (en) Vehicle image retrieval method based on data enhancement
Gil et al. Quantization-aware pruning criterion for industrial applications
CN115600686A (en) Personalized Transformer-based federal learning model training method and federal learning system
CN115359298A (en) Sparse neural network-based federal meta-learning image classification method
CN113987236B (en) Unsupervised training method and unsupervised training device for visual retrieval model based on graph convolution network
CN109948589B (en) Facial expression recognition method based on quantum depth belief network
CN115278709A (en) Communication optimization method based on federal learning
CN111401193A (en) Method and device for obtaining expression recognition model and expression recognition method and device
Zhang et al. Stochastic approximation approaches to group distributionally robust optimization
Du et al. CGaP: Continuous growth and pruning for efficient deep learning
CN112836822A (en) Federal learning strategy optimization method and device based on width learning
CN117217328A (en) Constraint factor-based federal learning client selection method
CN111414937A (en) Training method for improving robustness of multi-branch prediction single model in scene of Internet of things
CN116010832A (en) Federal clustering method, federal clustering device, central server, federal clustering system and electronic equipment
Zhang et al. Federated multi-task learning with non-stationary heterogeneous data
Zhao et al. Exploiting channel similarity for network pruning
CN116168197A (en) Image segmentation method based on Transformer segmentation network and regularization training
CN113256507B (en) Attention enhancement method for generating image aiming at binary flow data
CN113033653B (en) Edge-cloud cooperative deep neural network model training method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination