CN116150745A - Back door attack defense method based on multidimensional index dynamic identification federal learning - Google Patents
- Publication number
- CN116150745A (application number CN202310019902.9A)
- Authority
- CN
- China
- Prior art keywords
- model
- distance
- gradient
- round
- back door
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a backdoor attack defense method for federated learning based on dynamic multidimensional-index identification, which comprises the following steps: first, the Manhattan distance is introduced, and the proposed distance-based defensive measure shows remarkable performance against hidden backdoors by using the Manhattan distance. To cope with various attacks, malicious gradients are identified through the cooperation of multiple indexes; the Mahalanobis distance is applied to generate dynamic weights, which handle both the non-IID distribution of participants and the different scales brought by the different distances; finally, a score is calculated for each submitted gradient, and only the benign gradients are aggregated according to the scores. Multidimensional indexes are used to dynamically identify backdoor attacks in federated learning, so that different backdoor attack means can be resisted as far as possible while the main-task accuracy and model performance are preserved; the system can adapt to non-independent and identically distributed (non-IID) data, and can defend against hidden backdoor attacks carefully designed by attackers.
Description
Technical Field
The invention relates to the field of artificial-intelligence data security, and in particular to a backdoor attack defense method for federated learning based on dynamic multidimensional-index identification.
Background
Federated learning is a collaborative machine learning framework that satisfies privacy protection and complies with regulations such as data-security and personal-privacy protection laws, but it is susceptible to backdoor attacks from malicious parties because the parties' data cannot be inspected. As shown in fig. 1, under the federated learning framework each party trains a model with its own data, and only the parameters updated in each model iteration, such as gradients, are aggregated and updated in a public area; each party's data stays local and never moves out, so privacy is not leaked and regulations are not violated. Multiple participants combine their data to build a virtual common model and benefit together; the identities and positions of all participants are equal; and the modeling effect of federated learning is the same as, or not much different from, that of modeling the entire data set in one place. Federated learning systems are nevertheless susceptible to backdoor attacks, also known as targeted data-poisoning attacks, which steer the model toward targeted behavior on inputs chosen by an adversary. Backdoor attacks are harder to discover than untargeted data-poisoning attacks, because they do not affect the normal function of the model and their gradients are more similar to benign ones. Because federated learning protects privacy, the central server cannot access the users' local data or training processes, which makes it more vulnerable to attack. Model-replacement attacks successfully inject a backdoor into the global model through a single attack; some well-designed attacks strategically target weaknesses in the defense, such as the PGD attack, which scales and projects gradients, or the DBA attack, which splits the trigger before uploading; in addition, edge-case PGD attacks modify both the poisoned data and the models.
Clearly, these attacks pose a significant challenge to the security of federated learning systems.
Existing defense methods against backdoor attacks in federated learning mainly follow two approaches: the first rejects the gradients uploaded by attackers by classifying them; the second adds noise to the federated model by means of differential privacy to gradually eliminate the backdoor. The problem with the first approach is that it cannot identify hidden backdoor attacks, especially backdoor attacks after model scaling, and at the same time it struggles to cope with different attack strategies and to adapt to clients whose data are not independent and identically distributed. The problem with the second approach is that the added noise has a large influence on the performance of the model's main task, defeating the purpose of collaborative training.
The main disadvantages of existing methods for defending against backdoor attacks are as follows.
1. Most defenses tend to assume a specific data distribution. For example, the Krum defense culls client models whose L2 norms are higher than the others' by comparing the L2 norms between the client models, which requires assuming that the data distribution is independent and identically distributed (Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent, Peva Blanchard, El Mahdi El Mhamdi, Rachid Guerraoui, Julien Stainer); the FoolsGold defense compares the cosine similarity between a client model and the history (The Limitations of Federated Learning in Sybil Settings, Clement Fung, Chris J. M. Yoon, Ivan Beschastnikh), holding that models uploaded by attackers are consistent with one another while benign clients' models are not, and thus assumes that the data distribution is non-independent and identically distributed; both therefore have low adaptability;
2. Many current defenses identify malicious gradients based on Euclidean distance in vector space, such as the Krum and RFA defense algorithms; the RFA algorithm replaces plain averaging with the geometric median of the clients. But Euclidean distance is strongly affected by the curse of dimensionality in high-dimensional spaces and discriminates poorly there, and a neural network is a very typical high-dimensional space, so the identification accuracy is low (Robust Aggregation for Federated Learning, Krishna Pillutla, Sham M. Kakade, and Zaid Harchaoui);
3. Most existing defenses cannot defend against model-scaled and hidden backdoor data attacks such as the Edge-case PGD attack, and can only defend by means of differential privacy plus noise, as in the Weak-DP defense algorithm (Can You Really Backdoor Federated Learning?, Ziteng Sun, Peter Kairouz, Ananda Theertha Suresh, H. Brendan McMahan), which reduces the accuracy of the model's primary task and results in a high defense cost.
4. Most defenses are rigid and single-purpose: they can only defend against specific attacks in a targeted manner and struggle to defend effectively against a variety of attack means, so their defense efficiency is poor. For example, Krum and RFA defend through Euclidean distance and FoolsGold defends through cosine similarity, and each is easy for an attacker to break in a targeted manner;
Disclosure of Invention
The invention aims to provide a backdoor attack defense method for federated learning based on dynamic multidimensional-index identification. It first introduces the Manhattan distance, which is theoretically more discriminative than the Euclidean distance in high-dimensional space, and the distance-based defensive measure using the Manhattan distance shows remarkable performance against hidden backdoors. To cope with various attacks, multiple indexes cooperate to identify malicious gradients. In addition, the Mahalanobis distance is applied and dynamic weights are generated to handle the non-IID distribution of participants and the different scales brought by the different distances. Finally, a score is calculated for each submitted gradient and only the benign gradients are aggregated according to the scores. Multidimensional indexes are used to dynamically identify backdoor attacks in federated learning, so that different backdoor attack means can be resisted as far as possible while the main-task accuracy and model performance are preserved; the system adapts to non-IID data distributions and can defend against hidden backdoor attacks carefully designed by attackers.
The invention is realized by at least one of the following technical schemes.
A backdoor attack defense method based on multidimensional index dynamic identification in federal learning comprises the following steps:
S1, initializing a federal learning framework, defining and calculating gradient characteristics, and defining and calculating characteristic values of each client;
S2, calculating the dynamic weights and scores of the clients using the Mahalanobis distance, and sorting the clients according to the distance scores d_i;
s3, eliminating attack gradients, aggregating benign gradients and adding noise.
Further, the gradient features are defined as follows:

x_i^Man = ||w_i − w_0||_1    (1)
x_i^Eul = ||w_i − w_0||_2    (2)
x_i^Cosine = 1 − (w_i · w_0)/(||w_i||·||w_0||)    (3)

where i denotes the ith user, w_i is the local model trained by the ith user, and w_0 is the global model issued after the previous round of aggregation; the gradient features of the ith user's training are the Manhattan distance x_i^Man, the Euclidean distance x_i^Eul and the cosine distance x_i^Cosine. The user gradient feature information is expressed by the following formula:

x = (x^Man, x^Eul, x^Cosine)    (4)
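As a concrete sketch of formulas (1)-(4), the three per-client features can be computed from flattened model vectors as follows; the function name is illustrative, and the cosine distance is assumed to be one minus the cosine similarity between w_i and w_0:

```python
import numpy as np

def gradient_features(w_i, w_0):
    """Compute the three distance features of formulas (1)-(3) between a
    client's local model w_i and the previous global model w_0
    (both given as flattened vectors)."""
    diff = w_i - w_0
    x_man = np.abs(diff).sum()                 # formula (1): L1 norm
    x_eul = np.sqrt((diff ** 2).sum())         # formula (2): L2 norm
    cos_sim = np.dot(w_i, w_0) / (np.linalg.norm(w_i) * np.linalg.norm(w_0))
    x_cos = 1.0 - cos_sim                      # formula (3): cosine distance
    return np.array([x_man, x_eul, x_cos])     # feature vector x, formula (4)
```

For instance, a local model equal to the global model yields the zero feature vector, while an orthogonal update maximizes the cosine feature.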
Further, the dispersion is selected as the basis for judging whether a gradient is an outlier, and the characteristic value of each client's model is defined and calculated:

x'^Man_i = Σ_{j≠i} |x_j^Man − x_i^Man|    (5)
x'^Eul_i = Σ_{j≠i} |x_j^Eul − x_i^Eul|    (6)
x'^Cosine_i = Σ_{j≠i} |x_j^Cosine − x_i^Cosine|    (7)

where x'^Man_i, x'^Eul_i and x'^Cosine_i are the redefined characteristic values, x_i^Man, x_i^Eul and x_i^Cosine are the original characteristic values of the gradient model, and x_j ranges over all gradients uploaded in this round other than x_i.
Further, after defining the gradient features, the feature information of the user gradient is expressed by the following formula:

x'_(i) = (x'^Man_i, x'^Eul_i, x'^Cosine_i)    (8)

where x'_(i) denotes the dispersion vector of the ith client; x'^Man_i = Σ_{j≠i} |x_j^Man − x_i^Man| is the dispersion of the ith client against the remaining clients x_j among the K clients selected in this round on the Manhattan-distance index, i.e. the sum of absolute values of the Manhattan-distance differences, and x'^Eul_i and x'^Cosine_i are the dispersions on the Euclidean-distance and cosine-distance indexes, respectively.
Further, dynamic weights and scores are calculated for the gradient features using Mahalanobis-distance weighting:

d_i = sqrt( x'_(i) Σ^{-1} x'_(i)^T )

where Σ is the covariance matrix of the gradient features of all users participating in training in this round, whose inverse serves as the dynamic weight of the round; d_i is the score obtained from the dispersion vector x'_(i) of the ith client, and x'_(i)^T is the transpose of x'_(i).
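The Mahalanobis scoring step can be sketched as follows, taking the K×3 matrix of a round's dispersion vectors and returning d_i for every client. Using `np.linalg.pinv` instead of a plain inverse is an implementation choice of this sketch, to stay robust when the covariance matrix is near-singular:

```python
import numpy as np

def mahalanobis_scores(disp):
    """disp: (K, 3) matrix whose rows are the dispersion vectors x'_(i).
    Computes d_i = sqrt(x'_(i) Sigma^-1 x'_(i)^T), where Sigma is the
    covariance matrix of this round's dispersion vectors and its
    inverse acts as the dynamic weight of the round."""
    sigma = np.cov(disp, rowvar=False)    # covariance over the K clients
    weight = np.linalg.pinv(sigma)        # dynamic weight matrix
    # quadratic form per row, then square root
    return np.sqrt(np.einsum('ij,jk,ik->i', disp, weight, disp))
```

A client whose dispersion vector lies far outside the bulk of the round receives the largest score even when the three indexes have very different scales.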
Further, initializing a federated learning framework consisting of N clients comprises the following steps:
(1) The central server issues the current-round initial global model w_0 to the K clients participating in this round of training;
(2) Each client trains the model w_0 locally using its local data and uploads the trained local model w_i to the server, where w_i is the local model uploaded by the ith client;
(3) The server receives the K trained local models.
Further, defining and calculating the characteristic value of each client comprises the following steps:
1) For the local model w_i of each ith user, calculate the Manhattan distance x_i^Man, Euclidean distance x_i^Eul and cosine distance x_i^Cosine of the model vector;
2) For the local model w_i of each ith user, define and calculate the feature vector of its model vector x_i = (x_i^Man, x_i^Eul, x_i^Cosine);
3) For the local model w_i of each ith user, calculate the Manhattan-distance dispersion of the model vector x'^Man_i = Σ_{j≠i} |x_j^Man − x_i^Man|, the sum of absolute values of the Manhattan-distance differences between the ith user and the other users;
4) For the local model w_i of each ith user, calculate the Euclidean-distance dispersion of the model vector x'^Eul_i = Σ_{j≠i} |x_j^Eul − x_i^Eul|, the sum of absolute values of the Euclidean-distance differences between the ith user and the other users;
5) For the local model w_i of each ith user, calculate the cosine-distance dispersion of the model vector x'^Cosine_i = Σ_{j≠i} |x_j^Cosine − x_i^Cosine|, the sum of absolute values of the cosine-distance differences between the ith user and the other users;
6) Redefine and calculate the characteristic value of each client as x'_(i) = (x'^Man_i, x'^Eul_i, x'^Cosine_i);
7) Calculate the matrix X = [x'_(1), x'_(2), ..., x'_(K)]^T composed of the eigenvalue vectors of the K clients in the current round, where T denotes the transpose of the matrix.
Further, step S2 comprises the following steps:
S21, calculating the covariance matrix Σ of the matrix X composed of the eigenvalue vectors of the K clients, the inverse of which serves as the adaptive weight matrix of the round;
S22, for the local model w_i of each ith user, calculating its Mahalanobis distance d_i = sqrt(x'_(i) Σ^{-1} x'_(i)^T) as its distance score; the lower the score, the more representative the ith user's local model is of the population of all users' local models;
S23, sorting the K clients participating in this round of training according to the distance scores d_i.
Further, step S3 comprises the following steps:
S31, eliminating the local models of the users with the highest distance scores, retaining a proportion p of the gradients;
S32, aggregating the remaining user models as the global model w* of this round of federated training;
S33, adding differential-privacy noise to the global model w*;
S34, exiting training if the preset number of federated training rounds has been reached, and otherwise performing a new round of federated training.
Further, differential-privacy noise with standard deviation σ, i.e. Gaussian noise N(0, σ²), is added to the global model w* to obtain the new global model w̃ = w* + N(0, σ²).
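Steps S31-S33 can be sketched together: drop the highest-scoring local models, average the rest into w*, and add Gaussian noise with standard deviation σ. The retained-proportion parameter p and the plain averaging are assumptions of this sketch:

```python
import numpy as np

def trim_aggregate_noise(models, scores, p, sigma, rng=None):
    """models: (K, D) array of flattened local models; scores: the K
    Mahalanobis distance scores d_i. Keeps the proportion p of models
    with the LOWEST scores (S31), averages them into the new global
    model w* (S32), and adds differential-privacy noise N(0, sigma^2)
    (S33)."""
    rng = np.random.default_rng() if rng is None else rng
    k_keep = max(1, int(round(p * len(models))))
    keep = np.argsort(scores)[:k_keep]   # lowest score = most representative
    w_star = np.asarray(models)[keep].mean(axis=0)
    return w_star + rng.normal(0.0, sigma, size=w_star.shape)
```

With sigma = 0 the function reduces to trimmed averaging, which makes its selection behavior easy to verify in isolation.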
Compared with the prior art, the invention has the beneficial effects that:
the invention can effectively identify various back door attacks while meeting the requirement of maintaining the performance of the main task, and is not constrained by the data distribution property.
The invention uses a plurality of different indexes to simultaneously identify the back door gradient so as to cope with various back door attacks and realize the high efficiency of the back door attack defense;
According to the invention, a new dynamic weight is generated in each iteration round through the dynamic matrix adaptively produced by the Mahalanobis distance, to solve the problem of large differences in gradient features caused by different data distributions and achieve highly adaptable defense;
the invention introduces the Manhattan distance to participate in identifying the attacker's backdoor gradient, alleviating the poor discrimination in high-dimensional space caused by the curse of dimensionality and achieving highly accurate defense;
the invention accurately identifies hidden backdoors, so that hidden backdoor attacks carefully designed by attackers can be defended against without resorting to differential privacy and noise, keeping the defense cost low.
Drawings
In order to further explain the technical means, principles, purposes and characteristics of the invention more clearly and plainly, the following embodiments are described with the aid of the accompanying drawings;
FIG. 1 is a diagram of a federal learning framework of the present invention;
FIG. 2 is a flowchart of a method for defending a back door attack in dynamic recognition federal learning based on multidimensional index according to the present invention;
FIG. 3 is a schematic diagram of defining gradient vector features in the present invention;
fig. 4 is a diagram illustrating the reason why the covariance matrix is used as the weight in the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the following description will be given in detail with reference to the accompanying drawings and detailed description. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The present invention will be further described with reference to the accompanying drawings and detailed description, wherein it is to be understood that, on the premise of no conflict, the following embodiments or technical features may be arbitrarily combined to form new embodiments.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
As shown in fig. 2, the method for defending a back door attack in federal learning based on multi-dimensional index dynamic identification of the embodiment includes the following steps:
s1, initializing a federal learning framework, defining characteristics of gradients, calculating dispersion of characteristic values of each user model, and taking the dispersion as a new characteristic value;
the federal learning framework is based on the FedAvg federal averaging algorithm, and has the following formula:
wherein w is * Is a new global model, w t As a global model of the t-th round,the method comprises the steps that (1) a local model which is trained by an ith user in the t+1st round is obtained, N is the number of all users, K is the number of clients participating in one round of training, and eta is the training learning rate;
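A minimal sketch of one FedAvg aggregation round, assuming the common delta-averaging form of the update consistent with the symbols defined above (the exact formula was lost in extraction, so this form is an assumption):

```python
import numpy as np

def fedavg_round(w_t, local_models, eta=1.0):
    """w_t: flattened global model of round t; local_models: list of
    the K clients' trained models w_i^{t+1}. Returns the new global
    model w* = w^t + (eta / K) * sum_i (w_i^{t+1} - w^t)."""
    deltas = np.stack(local_models) - w_t   # per-client model deltas
    return w_t + eta * deltas.mean(axis=0)  # averaged update applied to w^t
```

The defense described below replaces this plain average over all K clients with an average over the subset of clients judged benign.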
the framework of federal learning training can enable users to train a common efficient artificial intelligence model in cooperation with multiple users without leaving the local data, but if malicious users (attackers) exist to make poisoning in the model, the model safety is greatly compromised. The invention focuses on the back door attack, because the back door attack only responds on the back door and does not appear on the main performance, the back door attack has concealment, is easy to be deployed in the actual production environment and causes great harm to society.
The backdoor attack method f(·) processes a normal sample x_i with a trigger (backdoor pattern) δ to obtain a poisoned sample x_b, i.e. x_b = f(x_i, δ), and assigns the poisoned sample a target label y_t. A number of poisoned data pairs (x_b, y_t) together with normal data (x_i, y_i) then form a new training data set used to train the neural network model, yielding a model M_b with an embedded backdoor. When the neural network model predicts on a normal sample x_i^test, it still obtains the correct prediction result, M_b(x_i^test) = y_i, while when it predicts on a poisoned sample carrying the trigger δ, the model outputs the target class label specified by the attacker, i.e. M_b(f(x_i^test, δ)) = y_t.
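The poisoning pipeline above can be sketched as follows. The pixel-mask form of the trigger and the names `apply_trigger` and `poison_dataset` are illustrative assumptions, since the patent only posits some backdoor function f(x_i, δ):

```python
import numpy as np

def apply_trigger(x, delta, mask):
    """f(x, delta): overwrite sample x with the trigger pattern delta
    wherever mask == 1, producing the poisoned sample x_b."""
    return np.where(mask == 1, delta, x)

def poison_dataset(xs, ys, delta, mask, y_target, ratio, rng):
    """Replace a `ratio` fraction of (x_i, y_i) pairs with poisoned
    pairs (x_b, y_t); the rest of the data stays clean."""
    xs, ys = xs.copy(), ys.copy()
    n_poison = int(ratio * len(xs))
    idx = rng.choice(len(xs), size=n_poison, replace=False)
    for i in idx:
        xs[i] = apply_trigger(xs[i], delta, mask)
        ys[i] = y_target
    return xs, ys
```

Training on the mixed data set produced this way yields the backdoored model M_b described above.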
Under the federated learning framework, an attacker trains on backdoor data as described above to obtain a backdoor model and then uploads it to the server, so as to poison the global model and jeopardize the federated learning framework. Simple backdoor attacks are easily defended by existing defense methods, but elaborate, advanced, hidden attack means cannot be. The novel defense method constructed by the invention can defend against existing attack strategies in a general way.
First, the features of the gradient model uploaded by a user are defined and calculated, namely the Manhattan distance, Euclidean distance and cosine distance of the model vector, as shown in fig. 3; the gradient features are:

x_i^Man = ||w_i − w_0||_1    (1)
x_i^Eul = ||w_i − w_0||_2    (2)
x_i^Cosine = 1 − (w_i · w_0)/(||w_i||·||w_0||)    (3)

where i is the ith user, w_i is the local model trained by the ith user, and w_0 is the global model issued after the previous round of aggregation; the gradient features of the ith user's training include the Manhattan distance x_i^Man, the Euclidean distance x_i^Eul and the cosine distance x_i^Cosine. The gradient feature values are defined with these multiple indexes: if an attacker uses common means such as model replacement, causing the L2 norm of the gradient to grow, the Euclidean-distance feature of formula (2) becomes larger; if the proportion of backdoor (poisoned) data in the attacker's backdoor data set is large, the cosine-distance feature value of formula (3) becomes larger; and even if an attacker uses a scaling method such as the projection method PGD to make its gradient more hidden in Euclidean space, it can still be detected through the Manhattan-distance feature of formula (1).
The user gradient feature information is expressed by the following formula:

x = (x^Man, x^Eul, x^Cosine)    (4)
The dispersion is selected as the basis for judging whether a gradient is an outlier, and the characteristic value of each client's model is then redefined and calculated:

x'^Man_i = Σ_{j≠i} |x_j^Man − x_i^Man|    (5)
x'^Eul_i = Σ_{j≠i} |x_j^Eul − x_i^Eul|    (6)
x'^Cosine_i = Σ_{j≠i} |x_j^Cosine − x_i^Cosine|    (7)

The population relationship of all gradients can be observed through formulas (5) to (7): if the value is high, the model gradient is a relative outlier on that characteristic index, and if the value is low, the model gradient is relatively close to the cluster center. Clearly, the backdoor model gradient carries a sample-label mapping that a normal model gradient does not have, and as long as the number of attackers is not greater than the number of normal participating users, the attackers' model gradients are outliers; that is, if the characteristic value of a gradient model is high, it is considered an attack gradient model.
After redefining the gradient features, the feature information of the user gradient can be redefined using the following formula:

x'_(i) = (x'^Man_i, x'^Eul_i, x'^Cosine_i)    (8)

where x'_(i) denotes the dispersion vector of the ith client, and x'^Man_i = Σ_{j≠i} |x_j^Man − x_i^Man| is the dispersion of the ith client against the remaining clients x_j among the K clients selected in this round on the Manhattan-distance index.
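Formulas (5)-(8) amount to, for every client and every index, summing the absolute differences to the other clients' feature values. A vectorized sketch (summing over all j is equivalent to summing over j ≠ i, since the i = j term is zero):

```python
import numpy as np

def dispersion_vectors(feats):
    """feats: (K, 3) matrix whose row i is (x_i^Man, x_i^Eul, x_i^Cosine).
    Returns the (K, 3) matrix whose row i is the dispersion vector
    x'_(i) = (x'^Man_i, x'^Eul_i, x'^Cosine_i) of formula (8)."""
    # |x_i - x_j| for every pair (i, j) and every index, summed over j
    return np.abs(feats[:, None, :] - feats[None, :, :]).sum(axis=1)
```

A row of this matrix is large exactly when the corresponding client's features sit far from the other clients' on that index, which is the outlier criterion described above.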
The basic logic of distance-based defense is to define indicators that distinguish malicious gradients from benign ones well and to remove hostile updates from the aggregation. The core problem thus becomes how to define an index that can identify malicious gradient features. However, the widely used Euclidean distance suffers from the curse of dimensionality and has proven to discriminate poorly in high-dimensional spaces, so the Manhattan distance is introduced as a discriminator. Existing methods often adopt discriminating metrics such as Euclidean distance and cosine distance in Euclidean space, but extensive practical work has found that these cannot effectively distinguish the distances between neighbors in high-dimensional space.
Meanwhile, current methods defend on the basis of a single index. With only one metric, a subtle attacker can bypass them without difficulty using a carefully designed gradient; moreover, in most real working environments federated aggregation is performed under a white-box framework, meaning a malicious attacker knows which defense index the server will adopt as its strategy and can design against it purposefully, so a widely applicable, general defense strategy is urgently needed. In addition, attackers operate under different environments and data distributions, so malicious gradients have different characteristics that a single index cannot handle. To this end, the invention proposes multiple metrics: cosine similarity defines the gradient's angular feature, Euclidean distance defines the gradient's length, and the Manhattan distance complements them in high-dimensional space, cooperatively identifying malicious gradients.
S2, dynamically calculating the Mahalanobis distance of each user model through the inverse of the covariance matrix, so as to compute the dynamic weight and score of each update. Two factors are considered in how to weight each feature: the first is that the different scales of the three distance metrics are the first obstacle to using them cooperatively, and since the metrics are correlated, a new regularization method is needed instead of the usual max-normalization; the second is that different data distributions (e.g., different degrees of non-IID) make the gradients of both malicious and benign clients differ. Dynamic weighting is therefore required to cope with various environments and attacks and achieve a general, universal defense.
First, the matrix composed of the gradient-model eigenvalues of all current users is calculated:

X = [x'_(1), x'_(2), ..., x'_(K)]^T

where X is the matrix of the eigenvalue vectors of the K clients, T denotes the transpose of the matrix, and x'_(1), x'_(2), ..., x'_(K) are the eigenvalue vectors of the K clients calculated in step S1.
The next step is to calculate the covariance matrix Σ of the matrix X = [x'_(1), x'_(2), ..., x'_(K)]^T, whose inverse is taken as the weight of the round.
Taking the above two factors into account, the Mahalanobis distance is applied to dynamically weight and score the gradient features, as shown in the following formula:

d_i = sqrt( x'_(i) Σ^{-1} x'_(i)^T )

where d_i is the score obtained from the dispersion vector x'_(i) of the ith client, x'_(i)^T is the transpose of x'_(i), and Σ is the covariance matrix of the gradient features of all users participating in training in this round. Its inverse, also called the precision matrix or concentration matrix, reflects the relationship between the different metrics. Using the inverse of the covariance as a weight can eliminate the differences in scale and cluster information across the different index features. It removes the discrepancies between different dimensional standards and the errors caused by variance: if the variance on the Manhattan-distance feature is large and the variance on the Euclidean-distance feature is small, a gradient model that is only slightly outlying on the Manhattan-distance feature but strongly outlying on the Euclidean-distance feature must score as more anomalous than one that is strongly outlying on the Manhattan-distance feature but only slightly outlying on the Euclidean-distance feature.
As shown in fig. 4, point C is the center of the cluster. Point A is closer to C than point B in Euclidean distance, so B would be judged the outlier; but it is obvious that point B is more consistent with the cluster's characteristics, and the outlier that should be removed is A. A weight matrix is therefore needed to eliminate the differences along the cluster direction. The inverse of the covariance matrix is calculated from the selected gradients, and it dynamically changes the weights of the features according to the feature distribution; it is therefore a "dynamic weight". With the dynamic weight, the system can better adapt to different environments and resist various attacks. The obtained Mahalanobis distance d is taken as the score of each gradient: the greater the distance, the greater the score, and the greater the degree of abnormality of the gradient; the gradients with the largest scores are rejected.
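The Fig. 4 argument can be reproduced numerically with a hypothetical elongated cluster (all numbers below are illustrative, not from the patent): point A is closer to the centre C in Euclidean distance than point B, yet A is the true outlier because B lies along the cluster direction, and the covariance-weighted distance gets this right:

```python
import numpy as np

# Hypothetical elongated 2-D cluster (stand-in for Fig. 4); C is its centre.
xs = np.linspace(-4.0, 4.0, 9)
ys = 0.05 * np.array([1, -1, 1, -1, 0, 1, -1, 1, -1])
cluster = np.stack([xs, ys], axis=1)
C = cluster.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(cluster, rowvar=False))  # dynamic weight matrix

A = np.array([0.0, 1.5])   # near C in Euclidean terms, but off the cluster axis
B = np.array([3.5, 0.0])   # farther from C, but along the cluster axis

def euclid(p):
    """Plain Euclidean distance to the cluster centre."""
    return float(np.linalg.norm(p - C))

def mahal(p):
    """Covariance-weighted (Mahalanobis) distance to the cluster centre."""
    v = p - C
    return float(np.sqrt(v @ inv_cov @ v))
```

Here euclid(A) < euclid(B), so a Euclidean rule would reject B, while mahal(A) > mahal(B): the covariance-weighted score correctly singles out A.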
S3, aggregating the benign gradients: taking the Mahalanobis distance as the score, rejecting the gradients with larger scores, and aggregating the remaining user models into the new global model. After the score of each gradient is obtained, the gradients with lower scores are aggregated, since a lower score means the gradient is less divergent among all gradients. The proportion of selected gradients is indicated by a fixed proportion p (p ∈ (0,1)), an adjustable hyper-parameter. Intuitively, the performance of the model and the convergence rate of training are positively correlated with p. In contrast, the relationship between backdoor-task accuracy and p is much more complex. On the one hand, increasing p increases the probability of selecting a backdoor gradient for training, which is detrimental to defending against backdoor attacks; on the other hand, increasing p dilutes the effect of any selected backdoor gradient, which is advantageous for defense. However, for lack of knowledge about the attacker (e.g. the number of attackers), the best p cannot simply be determined. For simplicity, a predefined fixed p is used, and it has been empirically demonstrated that the proposed defense does not depend on the choice of p. Practice shows that any p not exceeding 0.9 gives good defensive performance; it is preferably set to 0.3-0.5, which balances the benefits and losses.
The detailed flow steps of the method of the invention include the following steps:
(1) Initializing a federal learning framework consisting of N clients, wherein the following steps are specific steps of each round until a preset training round is reached;
(2) The central server issues the initial global model w_0 of the current round to the K clients participating in the current round of training;
(3) Each client trains the model w_0 locally with its local data and uploads the trained local model w_i to the server, where w_i is the local model uploaded by the i-th client;
(4) The server receives the K trained local models;
(5) For the local model w_i of the i-th user, calculating the Manhattan distance x_i^Man, the Euclidean distance x_i^Eul and the cosine distance x_i^Cosine of its model vector;
(6) For the local model w_i of the i-th user, defining and calculating the feature vector of its model vector, x_i = (x_i^Man, x_i^Eul, x_i^Cosine);
(7) For the local model w_i of the i-th user, calculating the Manhattan-distance dispersion x'_i^Man of its model vector, which represents the sum of the absolute values of the Manhattan-distance differences between the i-th user and the other users;
(8) For the local model w_i of the i-th user, calculating the Euclidean-distance dispersion x'_i^Eul of its model vector, which represents the sum of the absolute values of the Euclidean-distance differences between the i-th user and the other users;
(9) For the local model w_i of the i-th user, calculating the cosine-distance dispersion x'_i^Cosine of its model vector, which represents the sum of the absolute values of the cosine-distance differences between the i-th user and the other users;
(10) Redefining the feature vector of the i-th user as the dispersion vector x'_i = (x'_i^Man, x'_i^Eul, x'_i^Cosine);
(11) Calculating the matrix X = [x'_1, x'_2, ..., x'_K]^T formed by the feature vectors of the K clients in the current round, where T denotes the transpose of the matrix;
(12) Calculating the covariance matrix Σ of the matrix X, the inverse of which serves as the adaptive weight matrix of the round;
(13) For the local model w_i of the i-th user, calculating its Mahalanobis distance d_i as its distance score; a lower score indicates that the model is more representative of the entire model population;
(14) Sorting the K clients participating in this round of training by the distance score d_i;
(15) Rejecting the user models with higher distance scores according to the proportion p;
(16) Aggregating the remaining user models into the global model w* of this round of federated training;
(17) If the round reaches the preset number of federated training rounds, training exits; otherwise, jumping to step (2) for a new round of federated training.
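The dispersion steps (7)-(9) of the flow above can be sketched in NumPy as follows; the function name is hypothetical, and the input is assumed to be a (K, 3) matrix holding one row per client with the (Manhattan, Euclidean, cosine) feature values:

```python
import numpy as np

def dispersion_features(X):
    """Map the raw (K, 3) feature matrix (one row per client, columns
    Manhattan / Euclidean / cosine) to the dispersion features: each
    entry becomes the sum of absolute differences between that client's
    value and every other client's value on the same index."""
    X = np.asarray(X, dtype=float)
    # |x_i - x_j| summed over all j (the j == i term contributes zero)
    return np.abs(X[:, None, :] - X[None, :, :]).sum(axis=1)
```

A client whose feature value sits far from the others accumulates a large dispersion, which is the basis for the outlier judgment.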
The decentralization and privacy protection of federated learning make it vulnerable to backdoor attacks, whose purpose is to manipulate the behavior of the resulting model on specific inputs chosen by an adversary. However, most existing defenses based on statistical differences are only effective against specific attacks, especially when the malicious gradient is similar to the benign gradient or the data is not independently and identically distributed (non-IID). In the present invention, a simple and effective defense strategy is presented that adaptively identifies backdoors with multiple metrics and dynamic weighting. Furthermore, the novel defense of the present invention does not rely on predefined assumptions about the attack settings or data distribution and has little impact on benign performance. The method overcomes the limitations of existing defense methods, such as low defense efficiency, poor adaptability, poor identification ability and high defense cost.
Meanwhile, practical implementation finds that on the CIFAR-10 dataset, in a highly complex environment with 200 participating users of which only 10 are selected per round, highly non-IID data (Dirichlet parameter 0.5), and the severe condition of 4% attackers mounting Edge-case PGD, an advanced attack that modifies both data and models, the backdoor accuracy of the federated learning model under the defense of the invention is only 3.06% after 1500 rounds of training, while the main-task performance remains very good, far exceeding the other SOTA (state-of-the-art) methods.
Example 2
The basic logic of distance-based defense is to define indicators that distinguish well between malicious and benign gradients and to remove hostile updates from the aggregation. Thus, the core problem becomes how to define an index that can identify the features of malicious gradients. However, the widely used Euclidean distance suffers from the "curse of dimensionality" and has proven less discriminative in high-dimensional spaces, so the Manhattan distance is introduced as an additional discriminator. Existing methods often adopt Euclidean-space measures such as the Euclidean distance or the cosine distance, but extensive practical work has found that these cannot effectively distinguish neighbors in high-dimensional spaces.
Meanwhile, current methods defend on the basis of a single index. Because there is only one metric, a subtle attacker can bypass them without difficulty using a carefully designed gradient. In most practical federated learning scenarios the federated aggregation is a white box, meaning a malicious attacker knows which defense index the server will adopt as its strategy and can design attacks accordingly, so a widely applicable, general defense strategy is urgently needed. In addition, attackers act under different environments and data distributions, so malicious gradients have different characteristics that a single index cannot handle. To this end, the invention proposes multiple metrics: the cosine similarity defines the angular feature of the gradient, the Euclidean distance defines the length of the gradient, and the Manhattan distance complements them in high-dimensional space, cooperatively identifying malicious gradients.
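As a rough illustration of the three cooperating metrics, the sketch below computes them from flattened model vectors. Since formulas (1)-(3) of the patent are rendered as images in the source, the exact arguments, in particular whether the cosine term compares the local model with the global model or the updates, are assumptions, and NumPy is used purely for illustration:

```python
import numpy as np

def gradient_features(w_local, w_global):
    """Three distance metrics between a flattened local model w_i and
    the global model w_0 of the previous round: Manhattan distance
    (the high-dimensional complement), Euclidean distance (gradient
    length) and cosine distance (gradient angle)."""
    w_local = np.asarray(w_local, dtype=float)
    w_global = np.asarray(w_global, dtype=float)
    diff = w_local - w_global                 # the local update ("gradient")
    x_man = np.abs(diff).sum()                # Manhattan (L1) distance
    x_eul = np.linalg.norm(diff)              # Euclidean (L2) distance
    cos_sim = (w_local @ w_global) / (
        np.linalg.norm(w_local) * np.linalg.norm(w_global) + 1e-12)
    x_cos = 1.0 - cos_sim                     # cosine distance
    return np.array([x_man, x_eul, x_cos])
```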
Unlike Example 1, the time complexity of this implementation changes, because computing the dispersion requires that each gradient model's feature values be differenced against those of every other gradient model, which increases the time cost. The method specifically comprises the following steps: S1, defining the features of the gradients and calculating the feature value of each user model; the gradient features are defined as follows:
where i is the i-th user, w_i is the local model trained by the i-th user, and w_0 is the global model issued after the previous round of aggregation; the gradient features of the i-th user's training include the Manhattan distance x^Man, the Euclidean distance x^Eul and the cosine distance x^Cosine. The user gradient feature information is expressed by the following formula:
x = (x^Man, x^Eul, x^Cosine)    (4)
Firstly, the average of the gradient-model feature values of all users is calculated:
where mean_Man, mean_Eul and mean_Cosine are respectively the averages of the Manhattan-distance, Euclidean-distance and cosine-distance feature values of all users' gradient models in the round, and x_i^Man, x_i^Eul and x_i^Cosine are respectively the Manhattan-distance, Euclidean-distance and cosine-distance feature values of the i-th of the K gradient models.
The average feature-value vector can then be obtained as:
mean = (mean_Man, mean_Eul, mean_Cosine)    (8)
where mean is the average vector of the round, and mean_Man, mean_Eul, mean_Cosine are respectively the averages of the Manhattan-distance, Euclidean-distance and cosine-distance feature values of all users' gradient models in the round.
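Formulas (5)-(8) amount to a column-wise average over the K feature vectors of the round, which can be sketched as:

```python
import numpy as np

def mean_feature_vector(X):
    """Column-wise average of the (K, 3) feature matrix, giving
    mean = (mean_Man, mean_Eul, mean_Cosine) as in formula (8)."""
    return np.asarray(X, dtype=float).mean(axis=0)
```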
S2, dynamically calculating the Mahalanobis distance of each user model through the inverse of the covariance matrix, so as to compute the dynamic weight and score of each update. Two factors are mainly considered in how to weight each feature: first, the different scales of the three distance metrics are an obstacle to using them cooperatively, and since the metrics are correlated with one another, a new regularization method is needed instead of the usual max-normalization; second, different data distributions (e.g. different degrees of non-IID) make the gradients of both malicious and benign clients differ. Thus, dynamic weighting is required to cope with various environments and attacks and achieve a general, universal defense.
The matrix formed by the gradient-model feature values of all current users is calculated:
where X is the matrix of the feature vectors of the K clients, T denotes the transpose of the matrix, and x_1, x_2, ..., x_K are the feature vectors of the K clients computed in step S1.
Taking the two factors into consideration, the invention applies the Mahalanobis distance to dynamically weight and score the gradient features, as in the following formula:
where d_i is the score calculated from the difference between the gradient-model feature value x_i of the i-th client and the average gradient-model feature value mean.
where Σ is the covariance matrix of the matrix of gradient features of all users selected to participate in training in the round; the inverse matrix obtained by inverting it, also called the precision matrix or concentration matrix, reflects the relations between the different metrics. As shown in fig. 4, point C is the center of the cluster; in Euclidean terms A is closer to C than B, so B would be judged an outlier, but point B is clearly more consistent with the cluster's distribution, and the outlier that should be removed is A. A weight matrix is therefore needed to eliminate the scale difference along the cluster's directions. The inverse of the covariance matrix is computed from the selected gradients, so it changes the weights of the features dynamically according to the feature distribution, hence "dynamic weight". With the dynamic weight, the system adapts better to different environments and resists various attacks. The obtained Mahalanobis distance d serves as the score of each gradient: the greater the distance and the score, the greater the degree of abnormality of the gradient, and the models with the largest scores are rejected.
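A minimal sketch of the dynamic weighting and scoring of S2, assuming NumPy; using a pseudo-inverse to guard against a singular covariance matrix is an implementation choice not specified in the text:

```python
import numpy as np

def mahalanobis_scores(X):
    """Score each client's feature vector by its Mahalanobis distance
    to the round's mean feature vector; the inverse covariance matrix
    rescales and de-correlates the metrics, acting as the dynamic
    weight, so no fixed per-metric weights are needed."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    cov_inv = np.linalg.pinv(cov)   # pseudo-inverse: robust to singular covariance
    diffs = X - mean
    # quadratic form (x - mean)^T Σ^{-1} (x - mean) for every row at once
    return np.sqrt(np.einsum('ij,jk,ik->i', diffs, cov_inv, diffs))
```

A gradient whose features deviate from the cluster in a low-variance direction receives a large score even if its Euclidean distance to the mean is modest, which is exactly the behavior motivated by the A/B example above.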
Step S3 is the same as in example 1.
The detailed flow steps of the method of the invention include the following steps:
(1) Initializing a federal learning framework consisting of N clients, wherein the following steps are specific steps of each round until a preset training round is reached;
(2) The central server issues the initial global model w_0 of the current round to the K clients participating in the current round of training;
(3) Each client trains the model w_0 locally with its local data and uploads the trained local model w_i to the server, where w_i is the local model uploaded by the i-th client;
(4) The server receives the K trained local models;
(5) For the local model w_i of the i-th user, calculating the Manhattan distance x_i^Man, the Euclidean distance x_i^Eul and the cosine distance x_i^Cosine of its model vector;
(6) For the local model w_i of the i-th user, defining and calculating the feature vector of its model vector, x_i = (x_i^Man, x_i^Eul, x_i^Cosine);
(7) Calculating the average value mean_Man of the Manhattan-distance feature values of the gradient models of the K users;
(8) Calculating the average value mean_Eul of the Euclidean-distance feature values of the gradient models of the K users;
(9) Calculating the average value mean_Cosine of the cosine-distance feature values of the gradient models of the K users;
(10) Calculating the feature-value average vector of the gradient models of the K users, mean = (mean_Man, mean_Eul, mean_Cosine);
(11) Calculating the matrix X = [x_1, x_2, ..., x_K]^T formed by the feature vectors of the K clients in the current round, where T denotes the transpose of the matrix;
(12) Calculating the covariance matrix Σ of the matrix X, the inverse of which serves as the adaptive weight matrix of the round;
(13) For the local model w_i of the i-th user, calculating its Mahalanobis distance d_i as its distance score; a lower score indicates that the model is more representative of the entire model population;
(14) Sorting the K clients participating in this round of training by the distance score d_i;
(15) Rejecting the user models with higher distance scores according to the proportion p;
(16) Aggregating the remaining user models into the global model w* of this round of federated training;
(17) If the round reaches the preset number of federated training rounds, training exits; otherwise, jumping to step (2) for a new round of federated training.
Example 3
Unlike Embodiments 1 and 2, Embodiment 3 adds differential-privacy noise after the aggregation in step S3, which not only protects the privacy security of the users' gradient models, but also gradually eliminates, through the noise, the small number of backdoors that escape the defense of steps S1 and S2.
S3, aggregating benign gradients: taking the Mahalanobis distance as the score of each model, rejecting the models with larger scores, and aggregating the remaining user models into a new global model. After the score of each gradient is obtained, the gradients with lower scores are aggregated, since a lower score means that the gradient is less divergent among all gradients. The proportion of selected gradients is controlled by a fixed proportion p (p ∈ (0, 1)), which is an adjustable hyper-parameter. Intuitively, the performance of the model and the convergence rate of training are positively correlated with p. In contrast, the relationship between backdoor-task accuracy and p is much more complex. On the one hand, increasing p increases the probability of selecting a backdoor gradient for training, which is detrimental to defending against backdoor attacks. On the other hand, increasing p dilutes the effect of any backdoor gradient that is selected, which is advantageous for the defense. However, owing to the lack of knowledge about the attacker (e.g. the number of attackers), the best p cannot be determined in advance. For simplicity, a predefined fixed p is used, and it has been empirically demonstrated that the proposed defense does not depend on the choice of p.
The aggregated global model is w*; a sufficient amount of Gaussian noise is added to the global model as follows:
where w*' is the new global model, w* is the old global model, and N(0, σ²) denotes Gaussian noise with mean 0, variance σ² and standard deviation σ. Even under a strict defense algorithm, over a large number of training rounds a small number of backdoors may still slip into the model; the added noise perturbs the neurons the model attends to, initiating a certain amount of retraining that gradually eliminates the backdoors. Extensive experiments prove that backdoors remaining in the global model after luckily surviving previous rounds of defense can be gradually eliminated even with a standard deviation of only 0.0025. Adding Gaussian noise thus gradually eliminates the backdoor and improves the security of the federated learning model, while also defending against differential attacks initiated by "curious participants" and protecting the data security and privacy security of the participating users.
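The noise-addition step can be sketched as follows, assuming element-wise i.i.d. Gaussian noise on a flattened parameter vector; the function name and the NumPy generator are illustrative:

```python
import numpy as np

def add_dp_noise(w_global, sigma=0.0025, seed=None):
    """Add zero-mean Gaussian noise with standard deviation sigma to
    every parameter of the aggregated global model: w*' = w* + N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    w_global = np.asarray(w_global, dtype=float)
    return w_global + rng.normal(0.0, sigma, size=w_global.shape)
```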
The detailed flow steps of the method of the invention include the following steps:
(1) Initializing a federal learning framework consisting of N clients, wherein the following steps are specific steps of each round until a preset training round is reached;
(2) The central server issues the initial global model w_0 of the current round to the K clients participating in the current round of training;
(3) Each client trains the model w_0 locally with its local data and uploads the trained local model w_i to the server, where w_i is the local model uploaded by the i-th client;
(4) The server receives the K trained local models;
(5) For the local model w_i of the i-th user, calculating the Manhattan distance x_i^Man, the Euclidean distance x_i^Eul and the cosine distance x_i^Cosine of its model vector;
(6) For the local model w_i of the i-th user, defining and calculating the feature vector of its model vector, x_i = (x_i^Man, x_i^Eul, x_i^Cosine);
(7) For the local model w_i of the i-th user, calculating the Manhattan-distance dispersion x'_i^Man of its model vector, which represents the sum of the absolute values of the Manhattan-distance differences between the i-th user and the other users;
(8) For the local model w_i of the i-th user, calculating the Euclidean-distance dispersion x'_i^Eul of its model vector, which represents the sum of the absolute values of the Euclidean-distance differences between the i-th user and the other users;
(9) For the local model w_i of the i-th user, calculating the cosine-distance dispersion x'_i^Cosine of its model vector, which represents the sum of the absolute values of the cosine-distance differences between the i-th user and the other users;
(10) Redefining the feature vector of the i-th user as the dispersion vector x'_i = (x'_i^Man, x'_i^Eul, x'_i^Cosine);
(11) Calculating the matrix X = [x'_1, x'_2, ..., x'_K]^T formed by the feature vectors of the K clients in the current round, where T denotes the transpose of the matrix;
(12) Calculating the covariance matrix Σ of the matrix X, the inverse of which serves as the adaptive weight matrix of the round;
(13) For the local model w_i of the i-th user, calculating its Mahalanobis distance d_i as its distance score; a lower score indicates that the model is more representative of the entire model population;
(14) Sorting the K clients participating in this round of training by the distance score d_i;
(15) Rejecting the user models with higher distance scores according to the proportion p;
(16) Aggregating the remaining user models into the global model w* of this round of federated training;
(17) Adding differential-privacy noise with standard deviation σ to the global model w*, obtaining the new global model w*';
(18) If the round reaches the preset number of federated training rounds, training exits; otherwise, jumping to step (2) for a new round of federated training.
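Putting the server-side steps together, one round of the pipeline can be sketched as below. For brevity the sketch follows the mean-based feature variant of Example 2 plus the noise of Example 3, and reads p as the proportion of models kept; both choices, and all names, are assumptions for illustration (the dispersion variant of Examples 1 and 3 would first transform the feature matrix as in steps (7)-(10)):

```python
import numpy as np

def defense_round(local_models, w_global, p=0.4, sigma=0.0025, seed=None):
    """One server-side round: (S1) per-client distance features,
    (S2) Mahalanobis scoring with the inverse covariance as dynamic
    weight, (S3) keep the proportion p of lowest-scoring models,
    average them, and add Gaussian noise."""
    W = np.asarray(local_models, dtype=float)   # (K, d) flattened local models
    w0 = np.asarray(w_global, dtype=float)
    diff = W - w0
    # S1: Manhattan, Euclidean and cosine distance features per client
    man = np.abs(diff).sum(axis=1)
    eul = np.linalg.norm(diff, axis=1)
    cos = 1.0 - (W @ w0) / (np.linalg.norm(W, axis=1) * np.linalg.norm(w0) + 1e-12)
    X = np.stack([man, eul, cos], axis=1)
    # S2: Mahalanobis distance to the round mean, inverse covariance as weight
    mean = X.mean(axis=0)
    cov_inv = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    d = np.sqrt(np.einsum('ij,jk,ik->i', X - mean, cov_inv, X - mean))
    # S3: keep the proportion p with the lowest scores, aggregate, add noise
    keep = np.argsort(d)[:max(1, int(round(p * len(d))))]
    w_new = W[keep].mean(axis=0)
    rng = np.random.default_rng(seed)
    return w_new + rng.normal(0.0, sigma, size=w_new.shape)
```

With nine benign clients near the global model and one far-away malicious model, the malicious model receives the largest score and is excluded from the aggregation.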
The preferred embodiments of the invention disclosed above are intended only to assist in the explanation of the invention. The preferred embodiments are not exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. The invention is limited only by the claims and the full scope and equivalents thereof.
Claims (10)
1. The method for dynamically identifying the backdoor attack defense in the federal learning based on the multidimensional index is characterized by comprising the following steps:
S1, initializing a federated learning framework, defining and calculating the gradient features, and defining and calculating the feature values of each client;
S2, calculating the dynamic weights and scores of the clients by using the Mahalanobis distance, and sorting according to the distance scores d_i;
S3, eliminating attack gradients, aggregating benign gradients and adding noise.
2. The method for dynamically identifying back door attack defense in federal learning based on multidimensional metrics according to claim 1, wherein the gradient is defined as follows:
where i is the i-th user, w_i is the local model trained by the i-th user, and w_0 is the global model issued after the previous round of aggregation; the gradient features of the i-th user's training include the Manhattan distance x^Man, the Euclidean distance x^Eul and the cosine distance x^Cosine; the user gradient feature information is expressed by the following formula:
x = (x^Man, x^Eul, x^Cosine)    (4).
3. The method for defending against back door attacks in federal learning based on multi-dimensional index dynamic identification according to claim 2, wherein the dispersion is selected as the basis for judging whether a gradient is an outlier, and the feature value of each client model is defined and calculated:
4. The method for dynamically identifying back door attack defense in federal learning based on multidimensional index according to claim 3, wherein after the gradient features are defined, the feature information of the user gradient is expressed by the following formula:
where x'_(i) denotes the dispersion vector of the i-th client, and x'_(i)^Man denotes the sum of the absolute values of the Manhattan-distance differences between the i-th client and the remaining clients x_j among the K clients selected in the round, used as the dispersion on that index; x'_(i)^Eul and x'_(i)^Cosine are respectively the dispersions on the Euclidean-distance and cosine-distance indices.
5. The method for defending against a back door attack in dynamically identifying federal learning based on multidimensional metrics of claim 1, wherein the dynamic weights and scores are calculated by using the Mahalanobis distance to weight the gradient features:
where Σ is the covariance matrix of the matrix of gradient features of all users participating in training in the round, the inverse of which serves as the dynamic weight of the round; d_i is the score obtained from the dispersion vector x'_(i) of the i-th client, and x'_(i)^T is the transpose of x'_(i).
6. The method for dynamically identifying back door attack defense in federal learning based on multidimensional metrics according to claim 1, wherein initializing the federated learning framework composed of N clients comprises the following steps:
(1) The central server issues the initial global model w_0 of the current round to the K clients participating in the current round of training;
(2) Each client trains the model w_0 locally with its local data and uploads the trained local model w_i to the server, where w_i is the local model uploaded by the i-th client;
(3) The server receives the K trained local models.
7. The method for defending against a back door attack in dynamically identifying federal learning based on multidimensional index according to claim 1, wherein the feature value of each client is defined and calculated through the following steps:
1) For the local model w_i of the i-th user, calculating the Manhattan distance x_i^Man, the Euclidean distance x_i^Eul and the cosine distance x_i^Cosine of its model vector;
2) For the local model w_i of the i-th user, defining and calculating the feature vector of its model vector, x_i = (x_i^Man, x_i^Eul, x_i^Cosine);
3) For the local model w_i of the i-th user, calculating the Manhattan-distance dispersion x'_i^Man of its model vector, which represents the sum of the absolute values of the Manhattan-distance differences between the i-th user and the other users;
4) For the local model w_i of the i-th user, calculating the Euclidean-distance dispersion x'_i^Eul of its model vector, which represents the sum of the absolute values of the Euclidean-distance differences between the i-th user and the other users;
5) For the local model w_i of the i-th user, calculating the cosine-distance dispersion x'_i^Cosine of its model vector, which represents the sum of the absolute values of the cosine-distance differences between the i-th user and the other users;
6) Redefining and calculating the feature value of each client as the dispersion vector x'_i = (x'_i^Man, x'_i^Eul, x'_i^Cosine).
8. The method for defending a back door attack in dynamically identifying federal learning based on multidimensional index according to claim 1, wherein step S2 comprises the following steps:
S21, calculating the covariance matrix Σ of the matrix X formed by the feature vectors of the K clients, the inverse of which serves as the adaptive weight matrix of the round;
S22, for the local model w_i of the i-th user, calculating its Mahalanobis distance d_i as its distance score, where the lower the score, the more representative the local model of the i-th user is of the population of all users' local models;
S23, sorting the K clients participating in this round of training by the distance score d_i.
9. The method for dynamically identifying back door attack defense in federal learning based on multidimensional metrics according to any one of claims 1 to 8, wherein step S3 comprises the following steps:
S31, rejecting the local models of the users with higher distance scores according to the proportion p;
S32, aggregating the remaining user models into the global model w* of this round of federated training;
S33, adding differential-privacy noise to the global model w*;
S34, if the round reaches the preset number of federated training rounds, training exits; otherwise, a new round of federated training is performed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310019902.9A CN116150745A (en) | 2023-01-06 | 2023-01-06 | Back door attack defense method based on multidimensional index dynamic identification federal learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310019902.9A CN116150745A (en) | 2023-01-06 | 2023-01-06 | Back door attack defense method based on multidimensional index dynamic identification federal learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116150745A true CN116150745A (en) | 2023-05-23 |
Family
ID=86361309
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310019902.9A Pending CN116150745A (en) | 2023-01-06 | 2023-01-06 | Back door attack defense method based on multidimensional index dynamic identification federal learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116150745A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117278305A (en) * | 2023-10-13 | 2023-12-22 | 北方工业大学 | Data sharing-oriented distributed GAN attack and defense method and system |
CN117596592A (en) * | 2023-12-01 | 2024-02-23 | 广西大学 | Gradient selection method for unmanned aerial vehicle federal learning based on blockchain |
CN117278305B (en) * | 2023-10-13 | 2024-06-11 | 深圳市互联时空科技有限公司 | Data sharing-oriented distributed GAN attack and defense method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Weng et al. | Evaluating the robustness of neural networks: An extreme value theory approach | |
Rieger et al. | Deepsight: Mitigating backdoor attacks in federated learning through deep model inspection | |
Shen et al. | Auror: Defending against poisoning attacks in collaborative deep learning systems | |
CN111460443A (en) | Security defense method for data manipulation attack in federated learning | |
Liu et al. | A fuzzy logic based reputation model against unfair ratings | |
Zhang | Machine learning with feature selection using principal component analysis for malware detection: a case study | |
CN116150745A (en) | Back door attack defense method based on multidimensional index dynamic identification federal learning | |
CN112329009B (en) | Defense method for noise attack in joint learning | |
CN111783845A (en) | Hidden false data injection attack detection method based on local linear embedding and extreme learning machine | |
CN112883874A (en) | Active defense method aiming at deep face tampering | |
Li et al. | On the adversarial robustness of subspace learning | |
CN116049570A (en) | Double-tower social recommendation method based on federal contrast learning | |
Williams et al. | Detecting profile injection attacks in collaborative filtering: a classification-based approach | |
CN117272306A (en) | Federal learning half-target poisoning attack method and system based on alternate minimization | |
Xue et al. | Use the spear as a shield: an adversarial example based privacy-preserving technique against membership inference attacks | |
CN113297574B (en) | Activation function adaptive change model stealing defense method based on reinforcement learning reward mechanism | |
Wei et al. | Multi-objective evolving long–short term memory networks with attention for network intrusion detection | |
Sadeghzadeh et al. | HODA: Hardness-Oriented Detection of Model Extraction Attacks | |
Yuan et al. | A simple framework to enhance the adversarial robustness of deep learning-based intrusion detection system | |
Zhao et al. | Defense against poisoning attack via evaluating training samples using multiple spectral clustering aggregation method | |
Yu et al. | Security and Privacy in Federated Learning | |
Umer et al. | Vulnerability of covariate shift adaptation against malicious poisoning attacks | |
CN116738270A (en) | Unsupervised federal learning toxin-throwing defense method based on high-dimensional spatial clustering | |
Bakir et al. | Comparisons on intrusion detection and prevention systems in distributed databases | |
Qi et al. | A novel shilling attack detection model based on particle filter and gravitation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |