CN115861705A - Federal learning method for eliminating malicious clients - Google Patents

Federal learning method for eliminating malicious clients

Info

Publication number
CN115861705A
Authority
CN
China
Prior art keywords
data set
resnet
clients
client
classification model
Prior art date
Legal status
Pending
Application number
CN202211638722.0A
Other languages
Chinese (zh)
Inventor
张剑飞
周超然
张婧
杨宏伟
冯欣
杨佳东
Current Assignee
Changchun University of Science and Technology
Original Assignee
Changchun University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Changchun University of Science and Technology filed Critical Changchun University of Science and Technology
Priority to CN202211638722.0A priority Critical patent/CN115861705A/en
Publication of CN115861705A publication Critical patent/CN115861705A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a federated learning method for eliminating malicious clients, and relates to the field of image classification. A preset proportion of clients is selected from all clients and sent a check command, while the ResNet-101 classification model of the current round is sent to the unselected clients. The selected clients perform data set distillation based on the ResNet-101 classification model of the previous round and their local data sets, and the server rejects any selected client whose cumulative score is below the score threshold. The unselected clients train locally, and the server updates the ResNet-101 classification model according to the gradients uploaded by all unselected clients. The final ResNet-101 classification model is obtained after multiple iterations. The method screens out and eliminates malicious clients that intentionally upload erroneous gradients, and ensures the reliability and safety of training among multiple clients without reducing training accuracy.

Description

Federated learning method for eliminating malicious clients
Technical Field
The invention relates to the field of image classification, and in particular to a federated learning method for eliminating malicious clients.
Background
In recent years, with the rapid development of big data and the Internet of Things, the volume of data has grown explosively, and artificial intelligence technology based on big data has developed rapidly as well. Within this mass of data, the problem of protecting user privacy is especially prominent. In ordinary distributed machine learning, the local data of many clients is sent directly to a server for centralized training, which undoubtedly increases the risk of leaking user privacy. Moreover, because of industry competition, privacy and security concerns, and complex administrative procedures, data often exists in isolated islands, and the data of each client is difficult to use directly for training. Researchers have therefore considered how to complete the training process without uploading clients' local data, and federated learning arose accordingly. The federated learning technique was first proposed by Google in 2016 and then applied to training and deploying the next-word prediction model for a mobile-device input method. The design goal of federated learning is to train a machine learning model without concentrating all data on a central server, while ensuring the safety of the data of every user participating in training. A client's data therefore completes model training only locally, which solves the privacy protection problem to a certain extent, allows clients' data to be modeled and learned from through secure interaction among multiple clients, and achieves a desirable common benefit.
However, in the practical application environment of federated learning, a malicious client can affect federated training in the following ways: degrading model performance, causing privacy leakage, disrupting the global model aggregation process, and uploading false gradients so that the model cannot converge quickly. The presence of lazy clients also creates a risk of privacy leakage.
Therefore, how to screen clients is of great significance for improving the security of the federated learning process and ensuring its efficiency.
Disclosure of Invention
The invention aims to provide a federated learning method for eliminating malicious clients, which ensures the reliability and safety of training among multiple clients without reducing training accuracy.
In order to achieve the purpose, the invention provides the following scheme:
A federated learning method for eliminating malicious clients comprises the following steps:
dividing the ImageNet data set into a plurality of subsets and respectively distributing the subsets to each client as a local data set of each client;
selecting clients with a preset proportion from all the clients, sending a check command, and sending the ResNet-101 classification model of the current turn to the unselected clients;
the selected client side carries out data set distillation according to the ResNet-101 classification model and the local data set of the previous round, and a distillation data set obtained after the data set distillation is uploaded to the server;
the server scores according to the distillation data set and by combining the ResNet-101 classification model of the previous round and the gradient uploaded by the selected client in the previous round, and obtains the accumulated score of the selected client;
enabling the selected clients whose cumulative scores are greater than or equal to the score threshold to participate in the next round of federated training, and rejecting the selected clients whose cumulative scores are below the score threshold;
the unselected clients perform local training based on the local data sets of the clients and the ResNet-101 classification model of the current round, and upload the gradient obtained after training to the server;
the server calculates a ResNet-101 classification model of the next round according to the gradients uploaded by all the unselected clients;
and when the number of rounds reaches the global iteration count, stopping the federated training to obtain the final ResNet-101 classification model.
Optionally, the selected client performing data set distillation according to the ResNet-101 classification model of the previous round and the local data set specifically includes:
randomly initializing, by the selected client, the learning rate $\eta$ and a distillation data set $\tilde{D}$ consisting of m data;
randomly selecting, by the selected client, b data from the local data set D to form a mini-batch $D_{batch}$;
updating the parameters of the ResNet-101 classification model of the previous round by gradient descent according to the learning rate $\eta$ and the distillation data set $\tilde{D}$, to obtain updated model parameters $\theta_{upd}$;
updating the distillation data set $\tilde{D}$ and the learning rate $\eta$ by gradient descent based on the mini-batch $D_{batch}$ and the updated model parameters $\theta_{upd}$;
computing the cross-entropy loss function $f(\tilde{D}; \theta_{upd})$ based on the updated distillation data set and the updated model parameters $\theta_{upd}$;
if $f(\tilde{D}; \theta_{upd}) > end_f$ and the maximum number of iterations T has not been reached, replacing the learning rate $\eta$ with the updated learning rate, replacing the distillation data set $\tilde{D}$ with the updated distillation data set, and returning to the step "randomly selecting, by the selected client, b data from the local data set D to form a mini-batch $D_{batch}$", where $end_f$ is the upper bound on the loss function;
if $f(\tilde{D}; \theta_{upd}) \le end_f$ or the maximum number of iterations T has been reached, ending the data set distillation and outputting the updated distillation data set.
Optionally, updating the parameters of the ResNet-101 classification model of the previous round by gradient descent according to the learning rate $\eta$ and the distillation data set $\tilde{D}$, to obtain the updated model parameters $\theta_{upd}$, specifically includes:
computing, from the distillation data set $\tilde{D}$, the gradient $\nabla_{\theta_{orig}} f(\tilde{D}; \theta_{orig})$ of the parameters $\theta_{orig}$ of the ResNet-101 classification model of the previous round, where $f(\tilde{D}; \theta_{orig})$ is the cross-entropy loss function computed from the distillation data set $\tilde{D}$ and the parameters $\theta_{orig}$, and $\nabla_{\theta_{orig}} f(\tilde{D}; \theta_{orig})$ is the gradient obtained by taking the partial derivative of $f(\tilde{D}; \theta_{orig})$ with respect to $\theta_{orig}$;
computing the updated model parameters $\theta_{upd} = \theta_{orig} - \eta\, \nabla_{\theta_{orig}} f(\tilde{D}; \theta_{orig})$ according to the learning rate $\eta$ and the parameters $\theta_{orig}$ of the ResNet-101 classification model of the previous round.
Optionally, the server performs scoring according to the distillation data set by combining the ResNet-101 classification model of the previous round and the gradient uploaded by the selected client in the previous round, to obtain a cumulative score of the selected client, which specifically includes:
the server calculates the gradient of the ResNet-101 classification model of the previous round according to the distillation data set and the ResNet-101 classification model of the previous round;
calculating cosine similarity between the gradient of the ResNet-101 classification model of the previous round and the gradient uploaded by the selected client in the previous round;
calculating the score of the selected client in the current turn according to the cosine similarity;
and adding the score of the current turn with the accumulated score before the current turn to obtain the updated accumulated score of the selected client.
Optionally, the cosine similarity is calculated as
$C_k = \dfrac{\nabla\theta_{t-1}^{\tilde{D}_k} \cdot \nabla\theta_{t-1}^{k}}{\left\|\nabla\theta_{t-1}^{\tilde{D}_k}\right\| \left\|\nabla\theta_{t-1}^{k}\right\|}$
where $C_k$ is the cosine similarity, $\nabla\theta_{t-1}^{\tilde{D}_k}$ is the gradient of the ResNet-101 classification model of round t-1 computed by the server from the distillation data set $\tilde{D}_k$ uploaded by the client k selected in the current (t-th) round, and $\nabla\theta_{t-1}^{k}$ is the gradient uploaded by the selected client k in round t-1;
the score $S_{k,t}$ of the selected client in the current round is calculated from the cosine similarity $C_k$, where L is the scaling factor of the score, L > 0, Q is the tolerance factor, B is the critical factor, V is the fully malicious factor, K is the speed factor of the score, and K > 1.
Optionally, the ResNet-101 classification model of the next round, $\theta_{t+1}$, is obtained from the ResNet-101 classification model of round t, $\theta_t$, by a gradient-descent update that aggregates the gradients $\nabla\theta_t^i$ uploaded by the unselected clients i in round t, weighted by their local data set sizes $n_i$ and restricted to clients whose cumulative score $S_i$ in round t is not less than the score threshold $S_{limit}$; $\alpha$ is the learning rate used when the model parameters are updated by gradient descent, and $S_{init}$ is the initial cumulative score of client i.
Optionally, before selecting a preset proportion of clients from all the clients and sending the check command, and sending the ResNet-101 classification model of the current round to the unselected clients, the method further includes:
the server initializing the ResNet-101 classification model $\theta_0$ and initializing the cumulative score of every client to $S_{init}$;
if the current round is the first round, sending the ResNet-101 classification model $\theta_0$ to each client.
Optionally, the same client cannot be selected in two consecutive rounds.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention discloses a federal learning method for eliminating malicious clients, which comprises the steps of firstly, selecting clients with a preset proportion from all the clients and sending a check command, and simultaneously sending ResNet-101 classification models of the current round to unselected clients, secondly, carrying out data set distillation on the selected clients according to the ResNet-101 classification models and a local data set of the previous round, grading the selected clients by a server according to a distillation data set, and eliminating the selected clients with accumulated grades smaller than a grading threshold value; the unselected clients perform local training based on local data sets of the unselected clients and the ResNet-101 classification model of the current turn, and the server updates the ResNet-101 classification model according to gradients uploaded by all the unselected clients; and finally, obtaining a final ResNet-101 classification model through multiple iterations. The method and the system can screen and eliminate malicious clients which intentionally upload wrong gradients, and ensure the reliability and safety of training among multiple clients on the premise of not reducing the training accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
Fig. 1 is a flowchart of a federated learning method for eliminating malicious clients according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a federated learning method for eliminating a malicious client according to an embodiment of the present invention;
FIG. 3 is a framework diagram of a federated learning process provided by an embodiment of the present invention;
FIG. 4 is a flow chart of data set distillation provided by an embodiment of the present invention;
fig. 5 is a flowchart of a cumulative scoring process according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
A malicious client is a client with offensive or threatening behavior; such clients pose a serious threat to the federated learning algorithm and, in severe cases, can cause client privacy to be leaked. Therefore, in order to eliminate malicious clients, the invention provides a federated learning method for eliminating malicious clients, which mainly screens out and eliminates malicious clients that intentionally upload erroneous gradients and lazy clients that only obtain the model without participating in training. The method ensures the reliability and safety of training among multiple clients without reducing training accuracy.
The federated learning method for eliminating malicious clients provided by the embodiment of the invention performs federated training between the server and the clients, scores each client during training, retains the clients that pass the evaluation, and eliminates the malicious clients. As shown in fig. 1 to 3, the method comprises the following steps:
step S1, dividing the ImageNet data set into a plurality of subsets and respectively distributing the subsets to each client as a local data set of each client.
Step S2, selecting clients with a preset proportion from all the clients, sending a check command, and sending the ResNet-101 classification model of the current round to the unselected clients.
The server initializes the ResNet-101 classification model (hereinafter referred to as the global model) $\theta_0$ and initializes the cumulative score of every client to $S_{init}$; in this embodiment $S_{init} = 5$.
If the current round is the first round, the initial global model $\theta_0$ of the round is sent to each client. Otherwise, a proportion p of clients is selected from all the clients (the same client cannot be selected in two consecutive rounds), a check command is sent to the selected clients, and the global model $\theta_t$ of the current round is sent to the other, unselected clients.
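The selection rule of step S2 (a proportion p of clients, never repeating the previous round's selection) can be sketched as follows; the function name select_clients and the data structures are assumptions made for illustration.

    import random

    def select_clients(all_clients, p, last_selected, seed=None):
        """Step S2 sketch: pick a fraction p of the clients for checking,
        excluding the clients that were selected in the previous round."""
        rng = random.Random(seed)
        eligible = [c for c in all_clients if c not in last_selected]
        k = max(1, int(round(p * len(all_clients))))
        k = min(k, len(eligible))               # cannot select more than are eligible
        selected = rng.sample(eligible, k)
        unselected = [c for c in all_clients if c not in selected]
        return selected, unselected

    # Example: 10 clients, 20% checked per round, clients 2 and 7 were checked last round.
    selected, unselected = select_clients(list(range(10)), 0.2, last_selected={2, 7}, seed=0)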
Step S3, the selected clients perform data set distillation according to the ResNet-101 classification model of the previous round and their local data sets, and upload the distillation data sets obtained by the data set distillation to the server.
Data set distillation is performed separately for each selected client, and the resulting distillation data set is used to score that client. The parameters required for training are assumed to be: the client's local data set D, the model parameters $\theta$, the distillation data set $\tilde{D}$, the number of distillation data m, the learning rate $\eta$, the mini-batch $D_{batch}$, the mini-batch size b, the loss function f, the end condition $end_f$, and the maximum number of iterations T. Here D denotes the data set formed by all local data of the client participating in training; in this embodiment the ImageNet (ILSVRC 2012) data set is divided into a plurality of subsets that are respectively allocated to the clients. $\theta$ denotes the parameters of the ResNet-101 classification model used, where $\theta_{orig}$ denotes the initial parameters of the model before data set distillation starts, i.e. in this method the global model parameters of the previous round, and $\theta_{upd}$ denotes the model parameters obtained by updating the initial parameters $\theta_{orig}$ with gradient descent. $\tilde{D}$ denotes the distillation data set obtained by data set distillation; m denotes the number of data in $\tilde{D}$; $\eta$ denotes the hyperparameter controlling the extent of the parameter update when the model parameters are updated; $D_{batch}$ denotes a mini-batch randomly selected from the local data set; b denotes the number of data in the mini-batch; f denotes the loss function adopted in model training, which in this embodiment is the cross-entropy loss function; $end_f$ denotes the end condition of the data set distillation, i.e. the upper bound on the loss of the model parameters $\theta_{upd}$ on the distillation data set $\tilde{D}$ at the end of training, which in this embodiment is the loss function $f(\theta_{t-1}; D_k)$ of the client at the end of the previous round; and T denotes the maximum number of iterations of the data set distillation.
Referring to fig. 4, the detailed process of data set distillation is:
3.1 The client randomly initializes the learning rate $\eta$ and a distillation data set $\tilde{D}$ consisting of m data.
3.2 The client randomly selects b data from the local data set D to form a mini-batch $D_{batch}$.
3.3 Based on the distillation data set $\tilde{D}$, the gradient of the initial model parameters $\theta_{orig}$ is computed and the parameters are updated by gradient descent, giving the updated model parameters $\theta_{upd}$. The gradient-descent update is:
$f(\tilde{D}; \theta_{orig}) = -\frac{1}{n} \sum_{x \in \tilde{D}} \sum_{i=1}^{c} y_i \log p_{\theta_{orig}}(x)_i$  (1)
$\nabla_{\theta_{orig}} f(\tilde{D}; \theta_{orig}) = \frac{\partial f(\tilde{D}; \theta_{orig})}{\partial \theta_{orig}}$  (2)
$\theta_{upd} = \theta_{orig} - \eta\, \nabla_{\theta_{orig}} f(\tilde{D}; \theta_{orig})$  (3)
where $f(\tilde{D}; \theta_{orig})$ is the cross-entropy loss function computed from the distillation data set $\tilde{D}$ and the initial model parameters $\theta_{orig}$; n is the number of data in the distillation data set $\tilde{D}$; c is the total number of labels; y is the original label list of data x in the distillation data set $\tilde{D}$, and $y_i$ is the i-th original label of data x; $p_{\theta}(x)$ is the label list of data x predicted by the model with parameters $\theta$, and $p_{\theta}(x)_i$ is the i-th predicted label of data x; $\nabla_{\theta_{orig}} f(\tilde{D}; \theta_{orig})$ is the gradient obtained by taking the partial derivative of the loss function with respect to $\theta_{orig}$; and $\theta_{upd}$ are the model parameters obtained after updating by gradient descent.
3.4 Based on the mini-batch $D_{batch}$ and the updated model parameters $\theta_{upd}$, the loss function is computed and the distillation data set $\tilde{D}$ and the learning rate $\eta$ are updated by gradient descent:
$\tilde{D} = \tilde{D} - \lambda\, \nabla_{\tilde{D}} f(\theta_{upd}; D_{batch})$  (4)
$\eta = \eta - \lambda\, \nabla_{\eta} f(\theta_{upd}; D_{batch})$  (5)
where $f(\theta_{upd}; D_{batch})$ is the loss function computed by the client from the mini-batch $D_{batch}$ and the updated model parameters $\theta_{upd}$; $\nabla_{\tilde{D}} f(\theta_{upd}; D_{batch})$ is the gradient obtained by taking the partial derivative of the loss function with respect to the distillation data set $\tilde{D}$; $\nabla_{\eta} f(\theta_{upd}; D_{batch})$ is the gradient obtained by taking the partial derivative of the loss function with respect to $\eta$; and $\lambda$ is the step size (learning rate) used when updating the distillation data set $\tilde{D}$ and the learning rate $\eta$ by gradient descent.
3.5 Based on the distillation data set $\tilde{D}$ and the updated model parameters $\theta_{upd}$, the loss function $f(\tilde{D}; \theta_{upd})$ is computed. If $f(\tilde{D}; \theta_{upd}) \le end_f$ is satisfied or the maximum number of iterations T is reached, the data set distillation ends and $\tilde{D}$ is returned; otherwise, the procedure returns to step 3.2 for the next iteration.
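Steps 3.1 to 3.5 can be summarized in the following PyTorch-style sketch. It is only an illustrative reading of the procedure under simplifying assumptions: a single weight matrix of a linear classifier stands in for the ResNet-101 parameters, the synthetic labels are kept fixed, and the helper name distill_dataset is an assumption.

    import torch
    import torch.nn.functional as F

    def distill_dataset(theta_orig, local_x, local_y, num_classes,
                        m=10, b=32, T=100, end_f=0.5, lam=0.01):
        """Sketch of steps 3.1-3.5: learn m synthetic samples (and a learning
        rate eta) such that one gradient step on them behaves like training
        on the real local data. theta_orig is the previous-round model,
        here a (d x num_classes) weight matrix with requires_grad=True."""
        d = local_x.shape[1]
        # 3.1 random initialisation of the distilled data and the learning rate eta
        syn_x = torch.randn(m, d, requires_grad=True)
        syn_y = torch.randint(0, num_classes, (m,))
        eta = torch.tensor(0.1, requires_grad=True)

        def loss(theta, x, y):                  # cross-entropy f(D; theta), eq. (1)
            return F.cross_entropy(x @ theta, y)

        for _ in range(T):
            # 3.2 sample a real mini-batch D_batch of size b
            idx = torch.randperm(local_x.shape[0])[:b]
            xb, yb = local_x[idx], local_y[idx]

            # 3.3 one gradient step on the distilled data gives theta_upd (eqs. 2-3)
            f_syn = loss(theta_orig, syn_x, syn_y)
            g = torch.autograd.grad(f_syn, theta_orig, create_graph=True)[0]
            theta_upd = theta_orig - eta * g

            # 3.4 update the distilled data and eta from the real mini-batch (eqs. 4-5)
            f_real = loss(theta_upd, xb, yb)
            g_x, g_eta = torch.autograd.grad(f_real, [syn_x, eta])
            with torch.no_grad():
                syn_x -= lam * g_x
                eta -= lam * g_eta

            # 3.5 stop once the loss of theta_upd on the distilled data is below end_f
            if loss(theta_upd, syn_x, syn_y).item() <= end_f:
                break
        return syn_x.detach(), syn_y, eta.detach()

    # Example with random data standing in for a client's local set:
    # theta = torch.zeros(64, 10, requires_grad=True)
    # x, y = torch.randn(500, 64), torch.randint(0, 10, (500,))
    # syn_x, syn_y, eta = distill_dataset(theta, x, y, num_classes=10)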
Step S4, the server scores according to the distillation data set, in combination with the ResNet-101 classification model of the previous round and the gradient uploaded by the selected client in the previous round, to obtain the cumulative score of the selected client.
Each client is scored based on the gradient and the distillation data set it uploads, and its cumulative score is calculated. Fig. 5 is a flowchart of the cumulative scoring process. The parameters required here are assumed to be: the global model $\theta_{t-1}$, the cumulative score S, the cosine similarity C, the tolerance factor Q, the critical factor B, and the fully malicious factor V. Here $\theta_{t-1}$ denotes the global model used when scoring the client, which in this embodiment is the initial global model of the previous round; S denotes the score the client accumulates over multiple rounds of checking; C denotes the cosine similarity between the gradient uploaded by the client and the gradient computed from the distillation data set and the initial global model $\theta_{t-1}$; Q denotes the minimum cosine similarity at which the client is judged to be completely normal, which in this embodiment is cos 15°; B denotes the critical cosine similarity at which the client is judged to be normal or malicious, which in this embodiment is cos 30°; and V denotes the maximum cosine similarity at which the client is judged to be fully malicious, which in this embodiment is cos 90°.
Referring to fig. 5, the cumulative scoring process is as follows:
4.1 After the server receives the distillation data set $\tilde{D}_k$ of client k, it performs one step of gradient descent based on the global model $\theta_{t-1}$ of the previous round and $\tilde{D}_k$, and computes the gradient $\nabla\theta_{t-1}^{\tilde{D}_k}$ of the global model.
4.2 Compute the cosine similarity $C_k$ between this gradient and the gradient $\nabla\theta_{t-1}^{k}$ uploaded by the client in the previous round. The cosine similarity $C_k$ is expressed as:
$C_k = \dfrac{\nabla\theta_{t-1}^{\tilde{D}_k} \cdot \nabla\theta_{t-1}^{k}}{\left\|\nabla\theta_{t-1}^{\tilde{D}_k}\right\| \left\|\nabla\theta_{t-1}^{k}\right\|}$  (6)
where $\nabla\theta_{t-1}^{k}$ is the gradient uploaded by client k in the previous round (round t-1).
4.3 Compute the client's score $S_{k,t}$ for this round from the cosine similarity $C_k$ by formula (7), an expression in $C_k$ involving the tolerance factor Q, the critical factor B and the fully malicious factor V. In this expression, L > 0 is the scaling factor of the score and limits $S_{k,t}$ to the interval [-L, L]; in this embodiment L = 5. K > 1 is the scoring speed factor; besides keeping the argument of the logarithm function within its domain, it controls how quickly $S_{k,t}$ decreases: the farther $C_k$ is from B, the faster the decrease, and the smaller K is, the greater the acceleration of $S_{k,t}$ with respect to the distance between $C_k$ and B. In this embodiment K = 2.
4.4 Update the cumulative score $S_k$ of this client:
$S_k = S_k + S_{k,t}$  (8)
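Steps 4.1 to 4.4 can be sketched as follows. Because the exact scoring formula (7) is not reproduced in this text, the function score_from_similarity below is only an illustrative stand-in that respects the stated properties (score bounded in [-L, L], positive above the critical factor B, fully negative at or below the fully malicious factor V); it is not the patented formula, and the function names are assumptions.

    import numpy as np

    L_SCALE = 5.0                                  # scaling factor L (embodiment: 5)
    Q = np.cos(np.radians(15))                     # tolerance factor
    B = np.cos(np.radians(30))                     # critical factor
    V = np.cos(np.radians(90))                     # fully malicious factor

    def cosine_similarity(g_server, g_client):
        """Eq. (6): cosine similarity between the gradient the server derives
        from the client's distilled data and the gradient the client uploaded
        in the previous round."""
        g_server, g_client = np.ravel(g_server), np.ravel(g_client)
        return float(g_server @ g_client /
                     (np.linalg.norm(g_server) * np.linalg.norm(g_client)))

    def score_from_similarity(c):
        """Illustrative stand-in for formula (7): +L above Q, -L at or below V,
        and a monotone interpolation crossing zero at B in between."""
        if c >= Q:
            return L_SCALE
        if c <= V:
            return -L_SCALE
        if c >= B:
            return L_SCALE * (c - B) / (Q - B)
        return -L_SCALE * (B - c) / (B - V)

    def update_cumulative_score(scores, k, g_server, g_client_prev):
        """Steps 4.2-4.4: compute C_k, the per-round score S_{k,t}, and add it
        to the client's cumulative score S_k (eq. 8); S_init = 5 in the embodiment."""
        c_k = cosine_similarity(g_server, g_client_prev)
        scores[k] = scores.get(k, 5.0) + score_from_similarity(c_k)
        return scores[k]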
and S5, enabling the selected client with the accumulated score larger than or equal to the score threshold to participate in next round of federal training, and rejecting the selected client with the accumulated score smaller than the score threshold.
The malicious clients with the accumulated scores smaller than the score threshold value and intentionally uploading error gradients can be removed, the interference of the malicious clients on the federal learning process is reduced, the accuracy of the federal learning algorithm can be further improved, and the model can quickly reach the convergence state.
Step S6, the unselected clients perform local training based on their own local data sets and the ResNet-101 classification model of the current round, and upload the gradients obtained after training to the server.
After an unselected client receives the global model sent by the server, it performs E rounds of local training based on the local data it holds and the global model, where E is the number of times each unselected client updates the model parameters locally by gradient descent in each round.
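Step S6 can be sketched as below, with a generic PyTorch classifier standing in for ResNet-101. Reporting (theta_before - theta_after) / lr as the uploaded "gradient" after E local epochs is an assumption about the upload format made for illustration.

    import copy
    import torch
    import torch.nn.functional as F

    def local_training(global_model, data_loader, E=5, lr=0.01):
        """Step S6 sketch: copy the current global model, run E epochs of
        local SGD on the client's own data, and return the update that
        would be uploaded to the server."""
        model = copy.deepcopy(global_model)
        before = [p.detach().clone() for p in model.parameters()]
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        model.train()
        for _ in range(E):
            for x, y in data_loader:
                opt.zero_grad()
                F.cross_entropy(model(x), y).backward()
                opt.step()
        return [(b - p.detach()) / lr for b, p in zip(before, model.parameters())]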
Step S7, the server calculates the ResNet-101 classification model of the next round according to the gradients uploaded by all the unselected clients.
The ResNet-101 classification model of the next round, $\theta_{t+1}$, is obtained from the ResNet-101 classification model of round t, $\theta_t$, by a gradient-descent update that aggregates the gradients $\nabla\theta_t^i$ uploaded by the unselected clients i in round t, weighted by their local data set sizes $n_i$ and restricted to clients whose cumulative score $S_i$ in round t is not less than the score threshold $S_{limit}$; $\alpha$ is the learning rate used when the model parameters are updated by gradient descent, and $S_{init}$ is the initial cumulative score of client i.
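Under that reading, step S7 amounts to a data-size-weighted aggregation of the uploaded gradients restricted to clients whose cumulative score has not fallen below the threshold, as in the sketch below; normalising the weights over the surviving clients only is an assumption where the original formula is not reproduced.

    import torch

    def aggregate(global_params, uploads, data_sizes, scores, s_limit, alpha=1.0):
        """Step S7 sketch: theta_{t+1} = theta_t - alpha * sum_i w_i * grad_i,
        where client i gets weight n_i / sum(n_j) over the clients whose
        cumulative score S_i >= S_limit; rejected clients are skipped."""
        kept = [i for i in uploads if scores[i] >= s_limit]
        new_params = [p.detach().clone() for p in global_params]
        if not kept:
            return new_params                    # no trusted update this round
        total = sum(data_sizes[i] for i in kept)
        for i in kept:
            w = data_sizes[i] / total
            for p_new, g in zip(new_params, uploads[i]):
                p_new -= alpha * w * g           # gradient-descent step on the global model
        return new_params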
Step S8, when the number of rounds reaches the global iteration count, the federated training stops and the final ResNet-101 classification model is obtained.
During training there may be lazy clients that never train the model but keep obtaining the global model. To do so, such lazy clients tend to send a random gradient close to 0. Because they cannot predict the correct update direction of the model, the random gradients they upload are likely to deviate from the correct direction. If the scoring rule is set strictly (in this embodiment, the score is positive only when the angle between the two gradients is within 30 degrees), the score will be negative with high probability; moreover, if the uploaded gradient is 0, the score is directly negative. The method therefore also removes lazy clients to a certain extent and can prevent eavesdropping by such clients to a certain degree.
In the client training process, the method first requires each client to record the global model of the previous round and the loss function obtained in its last local update of that round. Second, at the start of each round, the server randomly selects a certain proportion of the clients participating in training and requires them not to perform the next round of updates but instead to perform data set distillation based on the global model of the previous round and their local data sets, with the termination condition that the loss is smaller than the client's loss at the end of the previous round. The server then uses the distillation data set uploaded by each selected client to compute a gradient of the model, computes the cosine similarity between this gradient and the gradient the client uploaded in the previous round, and updates the client's cumulative score. If the cumulative score is smaller than the specified threshold, the client is prohibited from (rejected for) subsequent federated learning, and the global model is updated only with the gradients uploaded by clients that pass the evaluation, until the model converges. The method not only eliminates malicious clients that intentionally upload erroneous gradients and lazy clients that only obtain the model without participating in training, but also ensures to a certain extent that training accuracy is not reduced, effectively safeguarding the security of the federated learning process.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principle and the implementation of the present invention are explained by applying specific examples in the embodiment, and the description of the above embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (8)

1. A federated learning method for eliminating malicious clients, characterized by comprising the following steps:
dividing the ImageNet data set into a plurality of subsets and respectively distributing the subsets to each client as a local data set of each client;
selecting clients with a preset proportion from all the clients, sending a check command, and sending the ResNet-101 classification model of the current turn to the unselected clients;
the selected client side carries out data set distillation according to the ResNet-101 classification model and the local data set of the previous round, and a distillation data set obtained after the data set distillation is uploaded to the server;
the server scores according to the distillation data set and by combining the ResNet-101 classification model of the previous round and the gradient uploaded by the selected client in the previous round, and obtains the accumulated score of the selected client;
enabling the selected clients whose cumulative scores are greater than or equal to the score threshold to participate in the next round of federated training, and rejecting the selected clients whose cumulative scores are below the score threshold;
the unselected clients perform local training based on local data sets of the clients and the ResNet-101 classification model of the current turn, and upload gradients obtained after training to the server;
the server calculates a ResNet-101 classification model of the next round according to the gradients uploaded by all the unselected clients;
and when the number of rounds reaches the global iteration count, stopping the federated training to obtain the final ResNet-101 classification model.
2. The federated learning method for eliminating malicious clients as claimed in claim 1, wherein the selected client performing data set distillation according to the ResNet-101 classification model of the previous round and the local data set specifically comprises:
randomly initializing, by the selected client, the learning rate $\eta$ and a distillation data set $\tilde{D}$ consisting of m data;
randomly selecting, by the selected client, b data from the local data set D to form a mini-batch $D_{batch}$;
updating the parameters of the ResNet-101 classification model of the previous round by gradient descent according to the learning rate $\eta$ and the distillation data set $\tilde{D}$ to obtain updated model parameters $\theta_{upd}$;
updating the distillation data set $\tilde{D}$ and the learning rate $\eta$ by gradient descent based on the mini-batch $D_{batch}$ and the updated model parameters $\theta_{upd}$;
computing the cross-entropy loss function $f(\tilde{D}; \theta_{upd})$ based on the updated distillation data set and the updated model parameters $\theta_{upd}$;
if $f(\tilde{D}; \theta_{upd}) > end_f$ and the maximum number of iterations T has not been reached, replacing the learning rate $\eta$ with the updated learning rate, replacing the distillation data set $\tilde{D}$ with the updated distillation data set, and returning to the step of "randomly selecting, by the selected client, b data from the local data set D to form a mini-batch $D_{batch}$", wherein $end_f$ is the upper bound of the loss function;
if $f(\tilde{D}; \theta_{upd}) \le end_f$ or the maximum number of iterations T has been reached, ending the data set distillation and outputting the updated distillation data set.
3. The federated learning method for eliminating malicious clients as claimed in claim 2, wherein updating the parameters of the ResNet-101 classification model of the previous round by gradient descent according to the learning rate $\eta$ and the distillation data set $\tilde{D}$ to obtain the updated model parameters $\theta_{upd}$ specifically comprises:
computing, from the distillation data set $\tilde{D}$, the gradient $\nabla_{\theta_{orig}} f(\tilde{D}; \theta_{orig})$ of the parameters $\theta_{orig}$ of the ResNet-101 classification model of the previous round, wherein $f(\tilde{D}; \theta_{orig})$ is the cross-entropy loss function computed from the distillation data set $\tilde{D}$ of the selected client and the parameters $\theta_{orig}$, and $\nabla_{\theta_{orig}} f(\tilde{D}; \theta_{orig})$ is the gradient obtained by taking the partial derivative of $f(\tilde{D}; \theta_{orig})$ with respect to $\theta_{orig}$;
computing the updated model parameters $\theta_{upd} = \theta_{orig} - \eta\, \nabla_{\theta_{orig}} f(\tilde{D}; \theta_{orig})$ according to the learning rate $\eta$ and the parameters $\theta_{orig}$ of the ResNet-101 classification model of the previous round.
4. The federated learning method for eliminating malicious clients as claimed in claim 1, wherein the server scoring according to the distillation data set, in combination with the ResNet-101 classification model of the previous round and the gradient uploaded by the selected client in the previous round, to obtain the cumulative score of the selected client specifically comprises:
the server calculates the gradient of the ResNet-101 classification model of the previous round according to the distillation data set and the ResNet-101 classification model of the previous round;
calculating cosine similarity between the gradient of the ResNet-101 classification model of the previous round and the gradient uploaded by the selected client in the previous round;
calculating the grade of the selected client in the current turn according to the cosine similarity;
and adding the score of the current turn with the accumulated score before the current turn to obtain the updated accumulated score of the selected client.
5. The federated learning method for eliminating malicious clients as claimed in claim 4, wherein the cosine similarity is calculated as
$C_k = \dfrac{\nabla\theta_{t-1}^{\tilde{D}_k} \cdot \nabla\theta_{t-1}^{k}}{\left\|\nabla\theta_{t-1}^{\tilde{D}_k}\right\| \left\|\nabla\theta_{t-1}^{k}\right\|}$
wherein $C_k$ is the cosine similarity, $\nabla\theta_{t-1}^{\tilde{D}_k}$ is the gradient of the ResNet-101 classification model of round t-1 computed by the server from the distillation data set $\tilde{D}_k$ uploaded by the client k selected in the t-th round, and $\nabla\theta_{t-1}^{k}$ is the gradient uploaded by the selected client k in round t-1;
and the score $S_{k,t}$ of the selected client in the current round is calculated from the cosine similarity $C_k$, wherein L is the scaling factor of the score, L > 0, Q is the tolerance factor, B is the critical factor, V is the fully malicious factor, K is the speed factor of the score, and K > 1.
6. The federated learning method for eliminating malicious clients as claimed in claim 1, wherein the ResNet-101 classification model of the next round, $\theta_{t+1}$, is obtained from the ResNet-101 classification model of round t, $\theta_t$, by a gradient-descent update that aggregates the gradients $\nabla\theta_t^i$ uploaded by the unselected clients i in round t, weighted by their local data set sizes $n_i$ and restricted to clients whose cumulative score $S_i$ in round t is not less than the score threshold $S_{limit}$, wherein $\alpha$ is the learning rate used when the model parameters are updated by gradient descent and $S_{init}$ is the initial cumulative score of client i.
7. The federated learning method for eliminating malicious clients as claimed in claim 1, wherein before selecting a preset proportion of clients from all the clients and sending the check command, and sending the ResNet-101 classification model of the current round to the unselected clients, the method further comprises:
the server initializing the ResNet-101 classification model $\theta_0$ and initializing the cumulative score of every client to $S_{init}$;
if the current round is the first round, sending the ResNet-101 classification model $\theta_0$ to each client.
8. The federated learning method for eliminating malicious clients as claimed in claim 1, wherein the same client cannot be selected in two consecutive rounds.
CN202211638722.0A 2022-12-20 2022-12-20 Federal learning method for eliminating malicious clients Pending CN115861705A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211638722.0A CN115861705A (en) 2022-12-20 2022-12-20 Federal learning method for eliminating malicious clients

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211638722.0A CN115861705A (en) 2022-12-20 2022-12-20 Federal learning method for eliminating malicious clients

Publications (1)

Publication Number Publication Date
CN115861705A true CN115861705A (en) 2023-03-28

Family

ID=85674359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211638722.0A Pending CN115861705A (en) 2022-12-20 2022-12-20 Federal learning method for eliminating malicious clients

Country Status (1)

Country Link
CN (1) CN115861705A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117436515A (en) * 2023-12-07 2024-01-23 四川警察学院 Federal learning method, system, device and storage medium
CN117436515B (en) * 2023-12-07 2024-03-12 四川警察学院 Federal learning method, system, device and storage medium

Similar Documents

Publication Publication Date Title
CN108875807B (en) Image description method based on multiple attention and multiple scales
CN106415594B (en) Method and system for face verification
CN107506799B (en) Deep neural network-based mining and expanding method and device for categories of development
JP6159489B2 (en) Face authentication method and system
CN110443351B (en) Generating natural language descriptions of images
CN110659723B (en) Data processing method and device based on artificial intelligence, medium and electronic equipment
CN105205448A (en) Character recognition model training method based on deep learning and recognition method thereof
CN111400452B (en) Text information classification processing method, electronic device and computer readable storage medium
CN103729459A (en) Method for establishing sentiment classification model
CN110462638B (en) Training neural networks using posterior sharpening
CN109948149A (en) A kind of file classification method and device
CN113642431A (en) Training method and device of target detection model, electronic equipment and storage medium
CN111104941B (en) Image direction correction method and device and electronic equipment
CN106155327A (en) Gesture identification method and system
CN112446331A (en) Knowledge distillation-based space-time double-flow segmented network behavior identification method and system
Seo et al. FaNDeR: fake news detection model using media reliability
CN115861705A (en) Federal learning method for eliminating malicious clients
CN114742224A (en) Pedestrian re-identification method and device, computer equipment and storage medium
CN114065834B (en) Model training method, terminal equipment and computer storage medium
CN112861601A (en) Method for generating confrontation sample and related equipment
CN116416212B (en) Training method of road surface damage detection neural network and road surface damage detection neural network
WO2024015591A1 (en) Efficient decoding of output sequences using adaptive early exiting
CN111783688A (en) Remote sensing image scene classification method based on convolutional neural network
CN113722477B (en) Internet citizen emotion recognition method and system based on multitask learning and electronic equipment
CN114743049A (en) Image classification method based on course learning and confrontation training

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination