CN116029369A - Back door attack defense method and system based on federal learning - Google Patents

Back door attack defense method and system based on federal learning

Info

Publication number
CN116029369A
Authority
CN
China
Prior art keywords
global
client
model
local
update
Prior art date
Legal status
Pending
Application number
CN202310096388.9A
Other languages
Chinese (zh)
Inventor
王晓东
李晓璇
魏志强
杨昊宇
Current Assignee
Ocean University of China
Original Assignee
Ocean University of China
Priority date
Filing date
Publication date
Application filed by Ocean University of China
Priority to CN202310096388.9A
Publication of CN116029369A


Landscapes

  • Computer And Data Communications (AREA)

Abstract

The invention belongs to the technical field of machine learning and discloses a backdoor attack defense method and system based on federal learning. The method comprises the following steps: determining backdoor image data for training, initializing a training task at the server side and distributing a global model; each client receives the global model issued by the server, trains its local model, and uploads the local model update to the server; the global learning rate of the server side is adjusted according to the sign information of the clients' local model updates; and the global model is updated according to a dynamic weight aggregation rule based on historical global updates. The invention solves the problem that traditional federal learning backdoor defense methods are limited by data distribution and attack types, achieves a better defense effect without affecting model performance, and improves the security and robustness of federal learning.

Description

Back door attack defense method and system based on federal learning
Technical Field
The invention belongs to the technical field of machine learning, and particularly relates to a backdoor attack defense method and system based on federal learning.
Background
Current machine learning development faces data islanding and data privacy protection issues. The data island problem greatly limits the usability of data: trust among institutions is lacking, and sufficient sharing of data is difficult to realize. Meanwhile, as privacy regulations such as the General Data Protection Regulation (GDPR) continue to emerge, the problem of data privacy disclosure has become prominent. With the rapid growth of machine learning model complexity and the demand for large-scale training datasets, more and more research is turning to distributed machine learning. In this context, federal learning has emerged as a new distributed learning paradigm. Unlike traditional machine learning methods, federal learning allows multiple clients to cooperatively train a neural network model without sharing raw data: several clients train to generate local updates, and a central server iteratively optimizes and distributes the federal global model to each client. It has been widely used in fields such as autonomous driving, edge computing, and medical care.
However, federal learning is vulnerable to adversarial machine learning attacks by malicious clients, since clients fully control their local data and local training process. For example, an attacker may gradually replace the global model with a backdoor model by poisoning local data or by replacing some of the updated parameters that the client passes to the server, a process known as a backdoor attack. The goal of a backdoor attack is to make the global model classify malicious inputs into a class designated by the attacker while leaving the classification of normal data unaffected, so that the overall performance on the main training task remains good. Backdoor attacks pose a potential threat to the learning system; they are triggered only when the model receives a specific input, are highly concealed, and bring great challenges to attack defense.
Because federal learning places high importance on client privacy, the central server has no access rights to client data and cannot determine whether the local updates uploaded by a client were generated correctly according to the privacy protocol, which makes backdoor attacks harder to detect. Existing defense methods against backdoor attacks mainly include Byzantine-robust aggregation rules, clustering-based anomaly detection, similarity-measurement-based defenses, and differential privacy defenses, but the concealed nature of backdoor attacks leaves these measures lacking in effectiveness. Robust aggregation rules based on statistical characteristics such as the mean and median are limited to specific attack models, and their defense effect depends on the data distribution and environment settings. Clustering-based anomaly detection with a small number of clusters cannot cover diverse attack types: by injecting multiple backdoor triggers, an attacker can ensure that at least one backdoor model update is aggregated together with the benign model updates, so detecting all backdoors is difficult; moreover, when no malicious client exists, such methods may misjudge benign local updates with deviating data distributions as malicious, and thus handle the no-malicious-client case poorly. When the target update directions of honest clients are similar, it is difficult to identify malicious model updates by computing similarity alone, and the case of only a single malicious client cannot be defended. Defenses based on differential privacy are applicable to general attack models, but ensuring that the added noise effectively eliminates the backdoor also degrades the benign performance of the aggregated model to a certain extent, so a balance between privacy and accuracy cannot be obtained.
In summary, there is currently no federal learning backdoor defense method that effectively reduces the success rate of backdoor attacks while maintaining global model performance and without relying on specific settings such as a particular data distribution. Therefore, how to design an effective defense scheme against backdoor attacks is an important problem that federal learning needs to solve.
Disclosure of Invention
Aiming at the defects existing in the prior art, the invention provides a backdoor attack defense method and system based on federal learning. The influence of local model updates on the global model is quantified by calculating the sign information of the clients' local updates, and the global learning rate of the server side is adjusted accordingly, so that the model is pushed to update in a direction away from the malicious target. A dynamic weight aggregation rule based on historical global updates is designed, which considers the influence of both the gradient magnitude and the gradient direction on the final model: on the one hand, the contribution of malicious clients to the global model is limited by controlling the magnitude of the local model updates while benign model updates are encouraged to take effect; on the other hand, cosine similarity is calculated between each client's model update and the sum of the historical global updates, which is more accurate and stable than calculating similarity between clients, effectively reduces the backdoor attack success rate, weakens the backdoor attack effect, and guarantees the comprehensiveness of the defense. This lightweight backdoor defense method is suitable for federal learning scenarios, solves the problem that traditional federal learning backdoor defenses are limited by data distribution and attack types, achieves a better defense effect without affecting model performance, and improves the security and robustness of federal learning.
In order to solve the technical problems, the invention adopts the following technical scheme:
First, the invention provides a backdoor attack defense method based on federal learning, which comprises the following steps: step S1, determining back door image data for training, initializing a training task at the server side and distributing a global model;
s2, the client receives the global model issued by the server, trains the local model by using the local data, and uploads the generated local model update to the server;
s3, adjusting the global learning rate of the server side according to the symbol information updated by the local model of the client side;
s4, processing the local model update of each client so as to scale the local model update to the same magnitude as the global model update;
step S5, updating the global model according to a dynamic weight aggregation rule based on historical global updating:
the server side aggregates and generates global model update by utilizing the local model update after the scaling treatment, calculates the client side aggregation score for training rounds of which the global learning rate is kept unchanged, screens and removes the client side local update of which the cosine similarity with the sum of the historical global updates is smaller than zero, and calculates the client side aggregation score according to the negative value of the cosine similarity between the client side local model update and the sum of the historical global updates for the training rounds of which the global learning rate is negative; and finally, carrying out weighted average on the scaled local model update of the client according to the aggregation score to generate global model update, and carrying out iterative training until a final global model is obtained.
Further, in step S1, the server first determines the back door image dataset and the model training target, initializes the global model w_0 and the hyper-parameters, and broadcasts the initial global model w_0 to the clients. The back door image dataset is generated by adding triggers of different shapes to data samples and is assigned to the designated clients according to different data distributions. A set of clients C = {c_1, c_2, ..., c_k} is set, and each client i holds a local dataset D_i. The global model training objective can be regarded as solving for the optimal global model based on empirical risk minimization, i.e. finding a set of optimal model parameters w* satisfying

w* = argmin_w Σ_{i=1}^{k} (N_i / N) · F(w; D_i),

where w ∈ R^d represents a d-dimensional weight vector, N_i represents the size of dataset D_i, the total amount of training data is N = Σ_{i=1}^{k} N_i, and F(w; D_i) represents the empirical loss function of client i.
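For illustration, a minimal sketch of how back door image data could be constructed follows. The square trigger in the image corner, its size, and the poisoning fraction are assumptions made only for this example; the patent states only that triggers of some shape are added to data samples and that the triggered samples are mapped to the attacker's target class.

```python
import numpy as np

def add_backdoor_trigger(images, labels, target_class, trigger_size=3, poison_frac=0.2):
    """Stamp a square trigger onto a fraction of (N, H, W) images and relabel them.

    The white bottom-right square, the trigger size and the poisoning fraction are
    illustrative assumptions, not values specified by the patent.
    """
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_frac)
    idx = np.random.choice(len(images), size=n_poison, replace=False)
    images[idx, -trigger_size:, -trigger_size:] = images.max()  # stamp the trigger patch
    labels[idx] = target_class                                  # relabel to the target class
    return images, labels
```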
Further, in step S2, each client receives the previous-round global model w_{t-1} issued by the server as its local model, w_{i,t} = w_{t-1}, where w_{i,t} represents the local model of client i at round t. Client i then optimizes the current local model with its local dataset D_i by gradient descent,

w_{i,t} ← w_{i,t} − l · ∇F(w_{i,t}; D_i),

and obtains the parameter difference between the local models generated by client i in adjacent training rounds, called the client local model update, Δw_{i,t} = w_{i,t} − w_{i,t-1}, where w_{i,t} and w_{i,t-1} represent the local models of client i at rounds t and t−1 respectively, l represents the client local learning rate, and ∇F(w; D_i) represents the gradient of the client loss function. After local training is completed, each client uploads the generated local model update Δw_{i,t} to the server.
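The client-side step could be sketched as below on a simple linear model. The mean-squared-error loss and the fixed number of local epochs are assumptions made only to keep the example self-contained; the update returned is taken relative to the global model received at the start of the round.

```python
import numpy as np

def client_local_update(w_global, X, y, local_lr=0.01, local_epochs=5):
    """Optimize a linear model on the client's local data (X, y), starting from the
    received global model, and return the local model update uploaded to the server."""
    w = w_global.copy()
    for _ in range(local_epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)   # gradient of the mean-squared-error loss
        w -= local_lr * grad                      # w_{i,t} <- w_{i,t} - l * grad F(w; D_i)
    return w - w_global                           # local model update Δw_{i,t}
```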
Further, in step S3, each time a round of global training is completed, the server obtains the local model updates uploaded by all clients and calculates the sign sum S_t of the local model updates, defined as:

S_t = Σ_{i=1}^{n} sgn(Δw_{i,t}),

where n is the number of clients, Δw_{i,t} is the local model update generated by client i at round t, and sgn(Δw_{i,t}) is the sign function: if Δw_{i,t} > 0 then sgn(Δw_{i,t}) = 1, otherwise sgn(Δw_{i,t}) = −1.

A learning rate threshold θ is introduced. If the sign sum S_t of the current round's local model updates is smaller than the threshold θ, the global learning rate of the server side is adjusted to −η, maximizing the model training loss; if the sign sum is greater than or equal to the threshold θ, the global learning rate is unchanged and the update proceeds normally, namely:

η_{θ,t} = −η, if S_t < θ; η_{θ,t} = η, if S_t ≥ θ,

where η represents the initial global learning rate of the server side, S_t represents the sign sum of the local model updates, and η_{θ,t} is the global learning rate of round t given the threshold θ.

The learning rate threshold θ is set according to the number of clients and the proportion of malicious clients; the value range of θ must satisfy CP + 1 < θ < C(1 − P), where C is the number of clients and P is the proportion of malicious clients.
Further, in step S4, before the aggregation operation is executed, the magnitude of each client's local model update is processed so that it is scaled to the same magnitude as the global model update. The scaled local model update Δw'_{i,t} is calculated by:

Δw'_{i,t} = Δw_{i,t} · ||Δw_{t-1}|| / ||Δw_{i,t}||,

where t represents the training round, Δw_{t-1} represents the global model update generated by the previous round of training, Δw_{i,t} represents the local model update generated by client i at round t, and ||·|| represents the vector two-norm.
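A sketch of the scaling step under the reconstruction above; the exact formula is inferred from the requirement that local updates be brought to the same magnitude as the previous global update.

```python
import numpy as np

def scale_local_update(local_update, prev_global_update, eps=1e-12):
    """Rescale a local update so its L2 norm matches that of the previous global update Δw_{t-1}."""
    factor = np.linalg.norm(prev_global_update) / (np.linalg.norm(local_update) + eps)
    return local_update * factor
```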
Further, in step S5, the server side aggregates the scaled local model updates to generate the global update, as follows:

S51, the server obtains all global updates generated in the previous rounds of the current training task, called the historical global update sum, Δw_{pre,t-1} = Δw_1 + Δw_2 + ... + Δw_{t-1}, and calculates for each client the cosine similarity CS_{i,t} between its local model update and the historical global update sum:

CS_{i,t} = (Δw'_{i,t} · Δw_{pre,t-1}) / (||Δw'_{i,t}|| · ||Δw_{pre,t-1}||),

where Δw_{t-1} represents the global model update of round t−1, Δw_{pre,t-1} represents the historical global update sum, and Δw'_{i,t} represents the local model update of client i after the t-th round of scaling; if the cosine similarity is negative, the local model update generated by the current client i has a negative influence on the global model;

S52, for training rounds in which the global learning rate is kept unchanged, malicious local model updates whose direction is opposite to the global update, i.e. local model updates whose cosine similarity is negative, need to be filtered out during model aggregation. Specifically, the cosine similarity processed by the Relu operation is defined as the aggregation score T_{i,t} of the current client i at round t and serves as the aggregation basis of the current round's global model update. For training rounds in which the global learning rate is negative, the cosine similarity processed by the Relu operation is also used; the difference is that, since the global learning rate is negative, the model training loss must be maximized so that the model is updated in a direction away from the malicious target, and the client aggregation score T_{i,t} is therefore calculated from the negative value of the cosine similarity. The client aggregation score T_{i,t} is calculated by:

T_{i,t} = Relu(CS_{i,t}), if η_{θ,t} = η; T_{i,t} = Relu(−CS_{i,t}), if η_{θ,t} = −η,

where Δw_{pre,t-1} represents the historical global update sum, Δw'_{i,t} represents the local model update of client i after the t-th round of scaling, Relu(CS_{i,t}) = CS_{i,t} if CS_{i,t} > 0 and Relu(CS_{i,t}) = 0 otherwise, and CS_{i,t} represents the cosine similarity between the local model update of client i and the historical global update sum;

S53, from the scaled local updates and the aggregation scores of the clients, the server calculates a weighted average as the aggregated global model update:

Δw_t = (Σ_{i=1}^{n} T_{i,t} · Δw'_{i,t}) / (Σ_{i=1}^{n} T_{i,t}),

where n represents the number of clients, T_{i,t} is the aggregation score of the current client i at round t, and Δw'_{i,t} is the scaled local model update of client i.
Further, since the historical global update of previous rounds cannot be calculated in the first round of training, the median of the local model updates is taken as the first-round global model update, Δw_1 = median(Δw_{i,1}).
Further, in step S5, the server side finally updates the global model according to the calculated global learning rate and the aggregated global model update to obtain the final global model:

w_t = w_{t-1} + η_{θ,t} · (Σ_{i=1}^{n} T_{i,t} · Δw'_{i,t}) / (Σ_{i=1}^{n} T_{i,t}),

where n represents the number of clients, η represents the initial global learning rate of the server side, S_t is the sign sum of the local model updates (which determines η_{θ,t}), Δw_{t-1} is the global model update generated by the previous round of training, CS_{i,t} is the cosine similarity between the local model update of client i and the historical global update sum (which determines T_{i,t}), and Δw'_{i,t} is the scaled local model update of client i.
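A server-side round combining steps S3-S5 could be sketched as below, reusing the helper sketches given earlier (adjust_global_lr, scale_local_update). The scalar sign-sum reduction, the scaling formula, and the fallback when every update is filtered out are assumptions; only the median bootstrap for the first round and the Relu-scored weighted average follow the text directly.

```python
import numpy as np

def aggregate_round(w_prev, local_updates, prev_global_update, hist_global_sum,
                    eta, theta, round_t):
    """One server round of the dynamic-weight aggregation rule (illustrative sketch)."""
    if round_t == 1 or hist_global_sum is None:
        # Round 1: no history yet, so take the coordinate-wise median of local updates.
        global_update = np.median(np.stack(local_updates), axis=0)
        return w_prev + eta * global_update, global_update

    lr = adjust_global_lr(local_updates, eta, theta)                              # step S3 sketch
    scaled = [scale_local_update(u, prev_global_update) for u in local_updates]   # step S4 sketch

    def cos_sim(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    cs = np.array([cos_sim(u, hist_global_sum) for u in scaled])                  # step S51
    # Step S52: Relu(CS) when the learning rate stays +eta, Relu(-CS) when it is flipped to -eta.
    scores = np.maximum(cs, 0.0) if lr > 0 else np.maximum(-cs, 0.0)
    if scores.sum() == 0:
        return w_prev, np.zeros_like(w_prev)   # all updates filtered out (assumed fallback)

    global_update = sum(s * u for s, u in zip(scores, scaled)) / scores.sum()     # step S53
    return w_prev + lr * global_update, global_update                             # final model update
```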
The invention further provides a backdoor attack defense system based on federal learning, which is used for realizing the above backdoor attack defense method based on federal learning and comprises: a local training module, a parameter uploading module, a learning rate modification module, a parameter processing module and an aggregation updating module,
the local training module is used for training the local model by the client and generating corresponding local model update;
the parameter uploading module is connected with the local training module and is used for uploading the local model update of each client to the server side; the server side receives all the local model updates of the current training round, denoted W_t = {Δw_{i,t} | i ∈ n}, where t is the model training round, n is the number of clients, and Δw_{i,t} is the local model update generated by client i at round t;
the learning rate modification module is connected with the parameter uploading module and is used for judging, according to the local model updates of all clients, whether the global learning rate of the current training round needs to be modified;
the parameter processing module is connected with the parameter uploading module and is used for processing the magnitude of each client's local model update so that it is scaled to the same magnitude as the global model update, reducing the contribution of malicious clients to the global model and encouraging benign model updates to take effect;
the aggregation updating module is connected with the learning rate modification module and the parameter processing module and is used for dynamically aggregating weights based on historical global updating if the system starts the backdoor attack defending method provided by the invention: for training rounds with the global learning rate kept unchanged, the server calculates the client aggregation score, screens and removes the client local update with cosine similarity smaller than zero with the sum of the historical global updates, and for training rounds with the global learning rate negative, calculates the client aggregation score according to the negative value of the cosine similarity; the client local model update after scaling treatment is weighted and averaged according to the aggregation score to generate a global model update, and iterative training is carried out until a final global model is obtained; if the system does not start the defending method, the aggregation updating module is connected with the parameter uploading module and is used for a federal average aggregation algorithm, and the federal average aggregation algorithm directly uses the average value updated by each client model as a global model to update the global model.
Compared with the prior art, the invention has the advantages that:
(1) The invention provides a backdoor attack defense method based on federal learning, in which multiple clients train to generate local updates and the global learning rate of the server side is adjusted in each round according to the sign information of the clients' local updates. The method improves the traditional FedAvg federal average aggregation algorithm: in each training round, the cosine similarity between each client's local model update and the sum of the previous rounds' historical global updates is calculated to obtain that client's aggregation score for the round, which serves as the aggregation basis for the current round's global model update, thereby reducing the influence of malicious local updates on the backdoor defense effect, guaranteeing the comprehensiveness of the defense, and improving the robustness of federal learning.
The method is applicable to most federal learning frameworks and data distribution scenarios and is sufficiently general. It is an unsupervised method deployed on the federal learning server side, does not occupy the computing resources of terminal devices, and has simple implementation steps and low calculation cost, whereas existing research is often limited to a particular data distribution scenario and incurs a large calculation cost.
(2) The method adjusts the global learning rate of the server side according to the overall direction of the clients' local model updates, pushing the model to update in the direction opposite to the malicious target and weakening the backdoor attack effect; the method is independent and can defend against different types of backdoor attacks when combined with different model aggregation rules. Few existing methods defend against backdoors by changing the learning rate; some studies prevent backdoor attacks by reducing the learning rate of malicious clients, but, limited by detection accuracy, they disturb the learning rate of some honest clients during training. (3) The dynamic weight aggregation rule based on historical global updates considers the influence of both the gradient magnitude and the gradient direction on the final model: it limits the contribution of malicious clients to the global model by controlling the magnitude of the local model updates, and it calculates similarity using the cosine distance between each client's model update and the sum of the historical global updates, which is more accurate and stable than calculating similarity between clients and effectively reduces the backdoor attack success rate. Existing model aggregation rules often estimate the general range of the global update with statistical methods such as the mean or median and cannot fundamentally remove the influence of backdoor local updates on the global model.
(4) The invention improves the existing federal learning training process and can be applied to federal learning scenarios with untrusted clients. The sign sum of the clients' local model updates is calculated to adjust the global learning rate of the server side, the client aggregation scores are calculated from the historical global updates, and weighted aggregation yields the final model, so that the contribution of suspicious local updates to the global model is reduced and comprehensive, effective backdoor attack defense is realized. The defense method can also be combined with other defense methods (such as clipping updates and noise addition) to enhance the defense effect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow diagram of a federal learning-based backdoor attack defense method of the present invention;
FIG. 2 is a flowchart illustrating a specific process for adjusting a global learning rate at a server according to the present invention;
FIG. 3 is a schematic diagram of a specific flow of a dynamic weight aggregation rule based on historical global updates according to the present invention;
fig. 4 is a schematic diagram of a backdoor attack defense system based on federal learning according to the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific examples.
Aiming at the safety problem caused by local data abnormality in the federal learning training process, the invention provides a federal learning-based backdoor attack defense method and a federal learning-based backdoor attack defense system, which are improved based on the general federal learning training process, and the improved training process can effectively resist backdoor attack initiated by a client and improve the safety and robustness of a federal learning system. The following description is made in connection with specific embodiments.
Example 1
Referring to fig. 1, the present embodiment provides a back door attack defending method based on federal learning, which includes the following steps:
step S1, determining back door image data for training, initializing a training task at a server side and distributing a global model;
s2, the client receives the global model issued by the server, trains a local model by using local data, and updates and uploads the local model to the server;
s3, adjusting the global learning rate of the server side according to the symbol information updated by the local model of the client side;
s4, processing the local model update of each client so as to scale the local model update to the same magnitude as the global model update;
step S5, updating the global model according to a dynamic weight aggregation rule based on historical global updating:
the server side aggregates and generates global model update by utilizing the local model update after the scaling treatment, calculates the client side aggregation score for training rounds of which the global learning rate is kept unchanged, screens and removes the client side local update of which the cosine similarity with the sum of the historical global updates is smaller than zero, and calculates the client side aggregation score according to the negative value of the cosine similarity between the client side local model update and the sum of the historical global updates for the training rounds of which the global learning rate is negative; and finally, carrying out weighted average on the scaled local model update of the client according to the aggregation score to generate global model update, and carrying out iterative training until a final global model is obtained. In a preferred embodiment, in step S1, the server first determines the back door image dataset and the model training target, initializes the global model w_0 and the hyper-parameters (such as the number of training rounds, the learning rate, the number of clients participating in each round of training, the malicious client proportion, etc.), and broadcasts the initial global model w_0 to the clients.
The method is applicable to both independent and identically distributed (IID) and non-IID data scenarios. In the IID scenario, training data are randomly and uniformly distributed across all clients; in the non-IID scenario, data are generated independently by each data owner, and the label classes or sample counts are unevenly distributed across clients. The selected dataset is partitioned into a non-IID form using a Dirichlet distribution with hyper-parameter α = 0.05.
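A sketch of the non-IID partition via a Dirichlet distribution (α = 0.05) follows; the per-class allocation scheme below is one common way to realize such a split and is an assumption rather than the patent's exact procedure.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha=0.05, seed=0):
    """Assign sample indices to clients so that each class is split according to
    client proportions drawn from a Dirichlet(alpha) distribution (non-IID)."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())
    return client_indices
```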
The back door image dataset is generated by adding triggers of different shapes to the data samples and assigning them to the designated clients according to different data distributions. A set of clients C = {c_1, c_2, ..., c_k} is set, and each client i holds a local dataset D_i. The global model training objective can be regarded as solving for the optimal global model based on empirical risk minimization, i.e. finding a set of optimal model parameters w* satisfying

w* = argmin_w Σ_{i=1}^{k} (N_i / N) · F(w; D_i),

where w ∈ R^d represents a d-dimensional weight vector, N_i represents the size of dataset D_i, the total amount of training data is N = Σ_{i=1}^{k} N_i, and F(w; D_i) represents the empirical loss function of client i.
In step S2, each client (including the malicious clients) receives the previous-round global model w_{t-1} issued by the server as its local model, w_{i,t} = w_{t-1}, where w_{i,t} represents the local model of client i at round t. The client then optimizes the current local model with its local dataset D_i by gradient descent,

w_{i,t} ← w_{i,t} − l · ∇F(w_{i,t}; D_i),

and obtains the parameter difference between the local models generated by client i in adjacent training rounds, called the client local model update, Δw_{i,t} = w_{i,t} − w_{i,t-1}, where w_{i,t} and w_{i,t-1} represent the local models of client i at rounds t and t−1 respectively, l represents the client local learning rate, and ∇F(w; D_i) represents the gradient of the client loss function.

After local training is completed, each client (including the malicious clients) uploads the generated local model update Δw_{i,t} to the server.
Example 2
With reference to fig. 2, the present embodiment improves the backdoor attack defense effect by adjusting the global learning rate of the server side. In step S3, each time a round of global training is completed, the server obtains the local model updates uploaded by all clients and calculates the sign sum S_t of the local model updates, defined as:

S_t = Σ_{i=1}^{n} sgn(Δw_{i,t}),

where n is the number of clients, Δw_{i,t} is the local model update generated by client i at round t, and sgn(Δw_{i,t}) is the sign function: if Δw_{i,t} > 0 then sgn(Δw_{i,t}) = 1, otherwise sgn(Δw_{i,t}) = −1.

In this embodiment, a learning rate threshold θ is introduced. If the sign sum S_t of the current round's local model updates is smaller than the threshold θ, the update directions of the uploaded local models are inconsistent and the probability that the local training process has suffered a backdoor attack is high; the global learning rate of the server side is therefore adjusted to −η, maximizing the model training loss and preventing the global model from converging towards the malicious target. If the sign sum is greater than or equal to the threshold θ, the global learning rate is unchanged and the update proceeds normally, namely:

η_{θ,t} = −η, if S_t < θ; η_{θ,t} = η, if S_t ≥ θ,

where η represents the initial global learning rate of the server side, S_t represents the sign sum of the local model updates, and η_{θ,t} is the global learning rate of round t given the threshold θ.

It should be noted that the learning rate threshold θ is set according to the number of clients and the proportion of malicious clients, and the value range of θ must satisfy CP + 1 < θ < C(1 − P), where C is the number of clients and P is the proportion of malicious clients.
Example 3
In order to accurately identify backdoor local updates with high attack influence and eliminate their influence on the global model, this embodiment provides a model aggregation method based on historical global updates, which serves as the aggregation basis of the current round's global model update. Because malicious clients can dominate the aggregated global model update by scaling up their malicious model updates, the magnitude of each client's local model update needs to be processed and scaled to the same magnitude as the global model update before the aggregation operation is performed.
With reference to fig. 3, in step S4 the local model update of each client is resized to the same magnitude as the global model update before the aggregation operation. On the one hand, shrinking the vector ensures that a single local model update does not have too large an effect on the aggregated global model update, which to some extent reduces the contribution of malicious clients to the global model. On the other hand, to maximize the attack influence an attacker needs to generate malicious model updates of larger magnitude, so small-magnitude local model updates are, in theory, more likely to come from honest clients; appropriately enlarging the vector scale of such model updates helps benign updates take effect and reduces the effect of the backdoor attack to the greatest extent.
The scaled local model update Δw'_{i,t} is calculated by:

Δw'_{i,t} = Δw_{i,t} · ||Δw_{t-1}|| / ||Δw_{i,t}||,

where t represents the training round, Δw_{t-1} represents the global model update generated by the previous round of training, Δw_{i,t} represents the local model update generated by client i at round t, and ||·|| represents the vector two-norm.
In step S5, using the scaled local model updates, the server side aggregates to generate the global update; the specific method is as follows:

S51, the server obtains all global updates generated in the previous rounds of the current training task, called the historical global update sum, Δw_{pre,t-1} = Δw_1 + Δw_2 + ... + Δw_{t-1}, and calculates for each client the cosine similarity CS_{i,t} between its local model update and the historical global update sum:

CS_{i,t} = (Δw'_{i,t} · Δw_{pre,t-1}) / (||Δw'_{i,t}|| · ||Δw_{pre,t-1}||),

where Δw_{t-1} represents the global model update of round t−1, Δw_{pre,t-1} represents the historical global update sum, and Δw'_{i,t} represents the local model update of client i after the t-th round of scaling; if the cosine similarity is negative, the local model update generated by the current client i has a negative influence on the global model.

S52, for training rounds in which the global learning rate is kept unchanged, malicious local model updates whose direction is opposite to the global update, i.e. local model updates whose cosine similarity is negative, need to be filtered out during model aggregation. Specifically, the cosine similarity processed by the Relu operation is defined as the aggregation score T_{i,t} of the current client i at round t and serves as the aggregation basis of the current round's global model update. For training rounds in which the global learning rate is negative, the cosine similarity processed by the Relu operation is also used; the difference is that, since the global learning rate is negative, the model training loss must be maximized so that the model is updated in a direction away from the malicious target, and the client aggregation score T_{i,t} is therefore calculated from the negative value of the cosine similarity. The client aggregation score T_{i,t} is calculated by:

T_{i,t} = Relu(CS_{i,t}), if η_{θ,t} = η; T_{i,t} = Relu(−CS_{i,t}), if η_{θ,t} = −η,

where Δw_{pre,t-1} represents the historical global update sum, Δw'_{i,t} represents the local model update of client i after the t-th round of scaling, Relu(CS_{i,t}) = CS_{i,t} if CS_{i,t} > 0 and Relu(CS_{i,t}) = 0 otherwise, and CS_{i,t} represents the cosine similarity between the local model update of client i and the historical global update sum.

S53, from the scaled local updates and the aggregation scores of the clients, the server calculates a weighted average as the aggregated global model update:

Δw_t = (Σ_{i=1}^{n} T_{i,t} · Δw'_{i,t}) / (Σ_{i=1}^{n} T_{i,t}),

where n represents the number of clients, T_{i,t} is the aggregation score of the current client i at round t, and Δw'_{i,t} is the scaled local model update of client i.
Here, since the historical global update of previous rounds cannot be calculated in the first round of training, the median of the local model updates is taken as the first-round global model update, Δw_1 = median(Δw_{i,1}).
Finally, the server side updates the global model according to the calculated global learning rate and the aggregated global model update to obtain the global model of the current round:

w_t = w_{t-1} + η_{θ,t} · (Σ_{i=1}^{n} T_{i,t} · Δw'_{i,t}) / (Σ_{i=1}^{n} T_{i,t}),

where n represents the number of clients, η represents the initial global learning rate of the server side, S_t is the sign sum of the local model updates (which determines η_{θ,t}), Δw_{t-1} is the global model update generated by the previous round of training, CS_{i,t} is the cosine similarity between the local model update of client i and the historical global update sum (which determines T_{i,t}), and Δw'_{i,t} is the scaled local model update of client i.
Example 4
Referring to fig. 4, the present embodiment provides a backdoor attack defense system based on federal learning, including a local training module, a parameter uploading module, a learning rate modification module, a parameter processing module and an aggregation updating module. The system can be used to implement the backdoor attack defense method based on federal learning described in the previous embodiments, so a detailed description is omitted here.
The local training module is used for training the local model by the client and generating corresponding local model update;
the parameter uploading module is connected with the local training module and is used for uploading the parameters of each clientThe local model update is uploaded to a server side, and the server side receives all local model updates of the current training round and marks w as the local model updates t ={Δw i,t I e n, where t is the model training round, n is the number of clients, Δw i,t And updating the local model generated by the client i at the t-th round.
The learning rate modification module is connected with the parameter uploading module and is used for judging, according to the local model updates of all clients, whether the global learning rate of the current training round needs to be modified.
The parameter processing module is connected with the parameter uploading module and is used for processing the magnitude of each client's local model update so that it is scaled to the same magnitude as the global model update, reducing the contribution of malicious clients to the global model and encouraging benign model updates to take effect.
The aggregation updating module is connected with the learning rate modification module and the parameter processing module and is used for dynamic weight aggregation based on historical global updates when the system enables the backdoor attack defense method provided by the invention: for training rounds in which the global learning rate is kept unchanged, the server calculates the client aggregation scores and filters out client local updates whose cosine similarity with the sum of the historical global updates is smaller than zero; for training rounds in which the global learning rate is negative, the client aggregation scores are calculated from the negative values of the cosine similarities. The scaled client local model updates are weighted and averaged according to the aggregation scores to generate the global model update, and iterative training proceeds until the final global model is obtained. If the system does not enable the defense method, the aggregation updating module is connected with the parameter uploading module and uses the default federal average aggregation algorithm (FedAvg, which is not the design gist of the present invention and is not detailed here); the federal average aggregation algorithm directly uses the average of the clients' model updates as the global model update.
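For comparison, a minimal sketch of the FedAvg fallback path (plain averaging of client updates, used when the defense is disabled) is given below; weighting all clients equally here is an assumption, since FedAvg is often weighted by local dataset size.

```python
import numpy as np

def fedavg_aggregate(w_prev, local_updates, eta=1.0):
    """No-defense path: the global update is the plain average of the client updates."""
    global_update = np.mean(np.stack(local_updates), axis=0)
    return w_prev + eta * global_update
```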
Example 5
The embodiment provides an application of the backdoor attack defense system based on federal learning: backdoor attack defense is carried out through the backdoor attack defense system based on federal learning, the global model is updated, and the updated global model is used to output recognition and prediction results.
In summary, aiming at the problem of defending against back door attacks in federal learning, the existing method cannot meet the following requirements at the same time: 1. the method has enough universality and is suitable for different data distribution scenes. 2. The method is simple and easy to understand, and the calculation cost is not excessive. 3. The method needs to identify and remove the back door local updates rather than just reducing the impact of back door attacks. 4. The method needs to reduce detection errors as much as possible and does not influence the behavior of the honest client, thereby ensuring the accuracy of the global model.
In a backdoor attack, an attacker adds specific triggers to the target dataset and trains with the poisoned dataset so that the global model identifies target samples as the target class. For the finally trained global model, a backdoor attack is effective only if the client model updates have a negative impact on the global model and the influence of the attacker's backdoor updates is greater than the influence of the honest clients' benign updates. To meet the above four requirements, the invention starts from two aspects and provides a backdoor attack defense method and system based on federal learning. First, to quantify the influence of the currently aggregated model updates on the global model, the invention adjusts the learning rate of the server side according to the sign sum information of the local model updates, preventing the model from converging towards the malicious target. Then, to remove suspicious local model updates before the global model update is aggregated and to prevent the model from converging towards the malicious target, the method calculates the clients' aggregation scores based on the historical global updates, reducing the degree of influence of abnormal local updates on the global model. In addition, the model replacement backdoor attack is another backdoor attack in federal learning; to achieve its goal, the attacker needs to roughly know the system parameters and the current model training state so as to replace the global model update with a backdoor model update when the model approaches convergence. Such an attack is usually accompanied by operations such as parameter scaling, which makes it easy to detect with defense methods based on gradient clipping and noise addition, and the attacker's manipulation of the parameters must balance concealment against the backdoor attack success rate, which places higher requirements on the attacker's capability. Furthermore, the proposed defense method can be combined with defense methods based on gradient clipping and noise addition to enhance the defense effect. Finally, the invention provides a backdoor attack defense system based on federal learning comprising a local training module, a parameter uploading module, a learning rate modification module, a parameter processing module and an aggregation updating module; the defense system corresponds one-to-one with the proposed defense method, reducing the attack success rate and improving the system's resistance to backdoor attacks.
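As a complementary measure mentioned above, update clipping plus noise addition could look like the following sketch; the clipping bound and Gaussian noise scale are assumptions, not values given in the patent.

```python
import numpy as np

def clip_and_add_noise(update, clip_norm=1.0, noise_std=0.01, seed=None):
    """Bound a client update's L2 norm and add Gaussian noise, a complementary defense
    against scaled (model-replacement style) backdoor updates."""
    rng = np.random.default_rng(seed)
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + rng.normal(0.0, noise_std, size=update.shape)
```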
It should be understood that the above description is not intended to limit the invention to the particular embodiments disclosed, and that various changes, modifications, additions and substitutions can be made by those skilled in the art without departing from the spirit and scope of the invention.

Claims (9)

1. A backdoor attack defense method based on federal learning is characterized by comprising the following steps:
step S1, determining back door image data for training, initializing a training task at a server side and distributing a global model;
s2, the client receives the global model issued by the server, trains the local model by using the local data, and uploads the generated local model update to the server;
s3, adjusting the global learning rate of the server side according to the symbol information updated by the local model of the client side;
s4, processing the local model update of each client so as to scale the local model update to the same magnitude as the global model update;
step S5, updating the global model according to a dynamic weight aggregation rule based on historical global updating:
the server side aggregates and generates global model update by utilizing the local model update after the scaling treatment, calculates the client side aggregation score for training rounds of which the global learning rate is kept unchanged, screens and removes the client side local update of which the cosine similarity with the sum of the historical global updates is smaller than zero, and calculates the client side aggregation score according to the negative value of the cosine similarity between the client side local model update and the sum of the historical global updates for the training rounds of which the global learning rate is negative; and finally, carrying out weighted average on the scaled local model update of the client according to the aggregation score to generate global model update, and carrying out iterative training until a final global model is obtained.
2. The federal learning-based back door attack defense method according to claim 1, wherein in step S1, the server first determines the back door image dataset and the model training target, initializes the global model w_0 and the hyper-parameters, and broadcasts the initial global model w_0 to the clients; wherein the back door image dataset is generated by adding triggers of different shapes to the data samples and is assigned to the designated clients according to different data distributions; a set of clients C = {c_1, c_2, ..., c_k} is set, and each client i holds a local dataset D_i; the global model training objective can be regarded as solving for the optimal global model based on empirical risk minimization, i.e. finding a set of optimal model parameters w* satisfying

w* = argmin_w Σ_{i=1}^{k} (N_i / N) · F(w; D_i),

wherein w ∈ R^d represents a d-dimensional weight vector, N_i represents the size of dataset D_i, the total amount of training data is N = Σ_{i=1}^{k} N_i, and F(w; D_i) represents the empirical loss function of client i.
3. The federal learning-based back door attack defense method according to claim 1, wherein in step S2, each client receives the previous-round global model w_{t-1} issued by the server as its local model, w_{i,t} = w_{t-1}, wherein w_{i,t} represents the local model of client i at round t; client i then optimizes the current local model with its local dataset D_i by gradient descent, w_{i,t} ← w_{i,t} − l · ∇F(w_{i,t}; D_i), and obtains the parameter difference between the local models generated by client i in adjacent training rounds, called the client local model update, Δw_{i,t} = w_{i,t} − w_{i,t-1}, wherein w_{i,t} and w_{i,t-1} represent the local models of client i at rounds t and t−1 respectively, l represents the client local learning rate, and ∇F(w; D_i) represents the gradient of the client loss function; after local training is completed, each client uploads the generated local model update Δw_{i,t} to the server.
4. The federal learning-based back door attack defense method according to claim 1, wherein in step S3, each time a round of global training is completed, the server side obtains the local model updates uploaded by all clients and calculates the sign sum S_t of the local model updates, defined as:

S_t = Σ_{i=1}^{n} sgn(Δw_{i,t}),

wherein n is the number of clients, Δw_{i,t} is the local model update generated by client i at round t, and sgn(Δw_{i,t}) is the sign function: if Δw_{i,t} > 0 then sgn(Δw_{i,t}) = 1, otherwise sgn(Δw_{i,t}) = −1;

a learning rate threshold θ is introduced; if the sign sum S_t of the current round's local model updates is smaller than the threshold θ, the global learning rate of the server side is adjusted to −η, maximizing the model training loss; if the sign sum is greater than or equal to the threshold θ, the global learning rate is unchanged and the update proceeds normally, namely:

η_{θ,t} = −η, if S_t < θ; η_{θ,t} = η, if S_t ≥ θ,

wherein η represents the initial global learning rate of the server side, S_t represents the sign sum of the local model updates, and η_{θ,t} is the global learning rate of round t given the threshold θ;

the learning rate threshold θ is set according to the number of clients and the proportion of malicious clients, and the value range of θ must satisfy CP + 1 < θ < C(1 − P), wherein C is the number of clients and P is the proportion of malicious clients.
5. The federal learning-based back door attack defense method according to claim 1, wherein in step S4, before the aggregation operation is executed, the magnitude of each client's local model update is processed so that it is scaled to the same magnitude as the global model update; the scaled local model update Δw'_{i,t} is calculated by:

Δw'_{i,t} = Δw_{i,t} · ||Δw_{t-1}|| / ||Δw_{i,t}||,

wherein t represents the training round, Δw_{t-1} represents the global model update generated by the previous round of training, Δw_{i,t} represents the local model update generated by client i at round t, and ||·|| represents the vector two-norm.
6. The federal learning-based back door attack defense method according to claim 5, wherein in step S5, the server side aggregates the scaled local model updates to generate the global update, as follows:

S51, the server obtains all global updates generated in the previous rounds of the current training task, called the historical global update sum, Δw_{pre,t-1} = Δw_1 + Δw_2 + ... + Δw_{t-1}, and calculates for each client the cosine similarity CS_{i,t} between its local model update and the historical global update sum:

CS_{i,t} = (Δw'_{i,t} · Δw_{pre,t-1}) / (||Δw'_{i,t}|| · ||Δw_{pre,t-1}||),

wherein Δw_{t-1} represents the global model update of round t−1, Δw_{pre,t-1} represents the historical global update sum, and Δw'_{i,t} represents the local model update of client i after the t-th round of scaling; if the cosine similarity is negative, the local model update generated by the current client i has a negative influence on the global model;

S52, for training rounds in which the global learning rate is kept unchanged, malicious local model updates whose direction is opposite to the global update, i.e. local model updates whose cosine similarity is negative, need to be filtered out during model aggregation; specifically, the cosine similarity processed by the Relu operation is defined as the aggregation score T_{i,t} of the current client i at round t and serves as the aggregation basis of the current round's global model update; for training rounds in which the global learning rate is negative, the cosine similarity processed by the Relu operation is also used, the difference being that, since the global learning rate is negative, the model training loss must be maximized so that the model is updated in a direction away from the malicious target, and the client aggregation score T_{i,t} is therefore calculated from the negative value of the cosine similarity; the client aggregation score T_{i,t} is calculated by:

T_{i,t} = Relu(CS_{i,t}), if η_{θ,t} = η; T_{i,t} = Relu(−CS_{i,t}), if η_{θ,t} = −η,

wherein Δw_{pre,t-1} represents the historical global update sum, Δw'_{i,t} represents the local model update of client i after the t-th round of scaling, Relu(CS_{i,t}) = CS_{i,t} if CS_{i,t} > 0 and Relu(CS_{i,t}) = 0 otherwise, and CS_{i,t} represents the cosine similarity between the local model update of client i and the historical global update sum;

S53, from the scaled local updates and the aggregation scores of the clients, the server calculates a weighted average as the aggregated global model update:

Δw_t = (Σ_{i=1}^{n} T_{i,t} · Δw'_{i,t}) / (Σ_{i=1}^{n} T_{i,t}),

wherein n represents the number of clients, T_{i,t} is the aggregation score of the current client i at round t, and Δw'_{i,t} is the scaled local model update of client i.
7. The federal learning-based back door attack defense method according to claim 6, wherein, since the historical global update of previous rounds cannot be calculated in the first round of training, the median of the local model updates is taken as the first-round global model update, Δw_1 = median(Δw_{i,1}).
8. The federal learning-based back door attack defense method according to claim 6, wherein in step S5, the server side finally updates the global model according to the calculated global learning rate and the aggregated global model update to obtain the final global model:

w_t = w_{t-1} + η_{θ,t} · (Σ_{i=1}^{n} T_{i,t} · Δw'_{i,t}) / (Σ_{i=1}^{n} T_{i,t}),

wherein n represents the number of clients, η represents the initial global learning rate of the server side, S_t is the sign sum of the local model updates, Δw_{t-1} is the global model update generated by the previous round of training, CS_{i,t} is the cosine similarity between the local model update of client i and the historical global update sum, and Δw'_{i,t} is the scaled local model update of client i.
9. The backdoor attack defense system based on federal learning, which is used for implementing the backdoor attack defense method based on federal learning according to any one of claims 1 to 8, and comprises: a local training module, a parameter uploading module, a learning rate modifying module, a parameter processing module and an aggregation updating module,
the local training module is used for training the local model by the client and generating corresponding local model update;
the parameter uploading module is connected with the local training module and is used for uploading the local model update of each client to the server side; the server side receives all the local model updates of the current training round, denoted W_t = {Δw_{i,t} | i ∈ n}, wherein t is the model training round, n is the number of clients, and Δw_{i,t} is the local model update generated by client i at round t;
the learning rate modification module is connected with the parameter uploading module and is used for judging whether the global learning rate of the round of training needs to be modified according to the local model update of the global client;
the parameter processing module is connected with the parameter uploading module and is used for processing the size of the local model update of the client so as to enable the local model update to be scaled to be the same as the global model update, reduce the contribution of a malicious client to the global model and promote the benign model update to play a role;
the aggregation updating module is connected with the learning rate modification module and the parameter processing module and is used for dynamic weight aggregation based on the historical global update if the system enables the federal learning-based backdoor attack defense method: for training rounds in which the global learning rate is kept unchanged, the server calculates the client aggregation scores and screens out the client local updates whose cosine similarity with the sum of the historical global updates is less than zero, and for training rounds in which the global learning rate is negative, the server calculates the client aggregation scores from the negative values of the cosine similarities;

the scaled client local model updates are weighted and averaged according to the aggregation scores to generate the global model update, and iterative training is carried out until the final global model is obtained; if the system does not enable the defense method, the aggregation updating module is connected with the parameter uploading module and defaults to the federated averaging aggregation algorithm, which directly uses the average of the client model updates as the global model update (a sketch of this default path follows below).
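A minimal sketch of the default path when the defense is disabled, assuming uniform client weights (the data-size weighting that federated averaging often uses is omitted here for brevity).

```python
import numpy as np

def fedavg_update(local_updates):
    """Default path (sketch): the global model update is the plain average of the
    clients' local model updates, with uniform client weights assumed."""
    return np.stack(local_updates).mean(axis=0)
```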
CN202310096388.9A 2023-02-10 2023-02-10 Back door attack defense method and system based on federal learning Pending CN116029369A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310096388.9A CN116029369A (en) 2023-02-10 2023-02-10 Back door attack defense method and system based on federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310096388.9A CN116029369A (en) 2023-02-10 2023-02-10 Back door attack defense method and system based on federal learning

Publications (1)

Publication Number Publication Date
CN116029369A true CN116029369A (en) 2023-04-28

Family

ID=86075856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310096388.9A Pending CN116029369A (en) 2023-02-10 2023-02-10 Back door attack defense method and system based on federal learning

Country Status (1)

Country Link
CN (1) CN116029369A (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116822647A (en) * 2023-05-25 2023-09-29 大连海事大学 Model interpretation method based on federal learning
CN116822647B (en) * 2023-05-25 2024-01-16 大连海事大学 Model interpretation method based on federal learning
CN117094410A (en) * 2023-07-10 2023-11-21 西安电子科技大学 Model repairing method for poisoning damage federal learning
CN117094410B (en) * 2023-07-10 2024-02-13 西安电子科技大学 Model repairing method for poisoning damage federal learning
CN116739114B (en) * 2023-08-09 2023-12-19 山东省计算中心(国家超级计算济南中心) Federal learning method and device for resisting model poisoning attack deployed on server
CN116739114A (en) * 2023-08-09 2023-09-12 山东省计算中心(国家超级计算济南中心) Robust federal learning aggregation method and device for resisting model poisoning attack
CN117313898A (en) * 2023-11-03 2023-12-29 湖南恒茂信息技术有限公司 Federal learning malicious model updating detection method based on key period identification
CN117708877B (en) * 2023-12-07 2024-07-12 重庆市科学技术研究院 Personalized federal learning method and system for hybrid multi-stage private model
CN117708877A (en) * 2023-12-07 2024-03-15 重庆市科学技术研究院 Personalized federal learning method and system for hybrid multi-stage private model
CN117454381A (en) * 2023-12-26 2024-01-26 山东省计算中心(国家超级计算济南中心) Progressive attack method for federal learning under non-independent co-distributed data
CN117454381B (en) * 2023-12-26 2024-06-04 山东省计算中心(国家超级计算济南中心) Progressive attack method for federal learning under non-independent co-distributed data
CN118094566A (en) * 2024-03-28 2024-05-28 南开大学 Historical information-based dynamic joint learning defense framework for poisoning attack of non-target model
CN118094566B (en) * 2024-03-28 2024-09-17 南开大学 Historical information-based dynamic joint learning defense framework for poisoning attack of non-target model
CN118445817A (en) * 2024-07-08 2024-08-06 山东省计算中心(国家超级计算济南中心) Method and device for enhancing federal learning model defense based on historical global model and readable computer storage medium

Similar Documents

Publication Publication Date Title
CN116029369A (en) Back door attack defense method and system based on federal learning
WO2021026805A1 (en) Adversarial example detection method and apparatus, computing device, and computer storage medium
CN115333825B (en) Defense method for federal learning neuron gradient attack
CN111680292A (en) Confrontation sample generation method based on high-concealment universal disturbance
CN115907029B (en) Method and system for defending against federal learning poisoning attack
CN116308762B (en) Credibility evaluation and trust processing method based on artificial intelligence
CN111178504B (en) Information processing method and system of robust compression model based on deep neural network
CN117454381B (en) Progressive attack method for federal learning under non-independent co-distributed data
CN116739114B (en) Federal learning method and device for resisting model poisoning attack deployed on server
CN117272306A (en) Federal learning half-target poisoning attack method and system based on alternate minimization
CN115456192A (en) Pond learning model virus exposure defense method, terminal and storage medium
CN112434213A (en) Network model training method, information pushing method and related device
CN117940936A (en) Method and apparatus for evaluating robustness against
CN114494771B (en) Federal learning image classification method capable of defending back door attack
CN114024738A (en) Network defense method based on multi-stage attack and defense signals
CN111507396B (en) Method and device for relieving error classification of unknown class samples by neural network
CN116389093A (en) Method and system for defending Bayesian attack in federal learning scene
Yang et al. DeMAC: Towards detecting model poisoning attacks in federated learning system
CN116824232A (en) Data filling type deep neural network image classification model countermeasure training method
CN116523078A (en) Horizontal federal learning system defense method
Chen et al. MagicGAN: multiagent attacks generate interferential category via GAN
CN116050546A (en) Federal learning method of Bayesian robustness under data dependent identical distribution
Guo et al. Not all Minorities are Equal: Empty-Class-Aware Distillation for Heterogeneous Federated Learning
Li et al. Contribution-wise Byzantine-robust aggregation for Class-Balanced Federated Learning
Zhang et al. MF2POSE: Multi-task Feature Fusion Pseudo-Siamese Network for intrusion detection using Category-distance Promotion Loss

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination