CN113779563A - Method and device for defending against backdoor attacks in federated learning - Google Patents

Method and device for defending against backdoor attacks in federated learning

Info

Publication number
CN113779563A
Authority
CN
China
Prior art keywords: model, initial local, learning, local model, target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110897437.XA
Other languages
Chinese (zh)
Inventor
杨会峰
辛锐
陈连栋
郭少勇
魏勇
阮琳娜
程凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202110897437.XA
Publication of CN113779563A
Legal status: Pending (current)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 - Detecting local intrusion or implementing counter-measures
    • G06F21/554 - Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 - Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 - Protecting personal data, e.g. for financial or medical purposes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for defending against backdoor attacks in federated learning, wherein the method comprises the following steps: establishing and training an image classification model based on federated learning; classifying images to be classified according to the trained image classification model. The model is trained as follows: for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in the target round of model learning; determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters; updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client; and calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for the target round of model learning. In this way, the trained model keeps good performance, and the accuracy of the model in practical applications is ensured.

Description

Method and device for defending against backdoor attacks in federated learning
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a method and a device for defending against backdoor attacks in federated learning.
Background
With the development of big data, edge computing, large-scale cloud computing platforms and various open-source frameworks, artificial intelligence technologies such as machine learning are being applied across industries at an unprecedented speed. However, alongside the opportunities created by artificial intelligence come new challenges such as data privacy and security. In order to strengthen the privacy protection of user data and the secure management of data, the use of data is restricted, so that Internet data remains dispersed across different enterprises and organizations, forming "data islands" whose data cannot be directly shared or exchanged. The recently emerging paradigm of federated learning is a viable direction for solving the data-island problem while ensuring data privacy and security.
Federated learning is an emerging learning paradigm in which a learning model is trained in a distributed manner among a set of clients under the coordination of a central server, while the training data remain segregated on the clients. It is a method designed to protect user privacy in decentralized scenarios. At the same time, since modern machine learning algorithms may suffer various adversarial attacks in practical application scenarios, including poisoning of the data and of the model-update process, model evasion, model stealing, and inference attacks against users' private training data, federated learning is also easily damaged by adversarial attacks; moreover, because the distributed nature of the process and the inaccessibility of the data hinder defense against such malicious attacks, adversarial attacks on federated learning are more covert and harder to discover.
At present, most defense schemes applied to the federated learning scenario are based on adjusting the aggregation operator, because attacks on the learning model are usually executed by clients; a representative measure is to perform more robust update aggregation, such as Byzantine-robust aggregation rules. However, owing to the concealment of backdoor attacks, these defensive measures are not effective enough against them.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a method and a device for defending against backdoor attacks in federated learning.
In a first aspect, the present invention provides a method for defending against backdoor attacks in federated learning, comprising:
establishing and training an image classification model based on federated learning;
classifying images to be classified according to the trained image classification model;
the image classification model is trained as follows:
for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in the target round of model learning;
determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters;
updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client;
calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for the target round of model learning;
and updating the model with the aggregated model parameters of each model-update dimension obtained in each round of model learning, to obtain the image classification model after training.
Optionally, the determining of abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters includes:
if the difference between a target initial local model parameter and the average value of the initial local model parameters is greater than or equal to a preset multiple of the standard deviation, determining the value of the target initial local model parameter to be an abnormal value.
Optionally, the preset multiple is 3.
Optionally, the distribution of each of the initial local model parameters follows a Gaussian distribution.
In a second aspect, the present invention further provides a defense apparatus against backdoor attacks in federated learning, including:
the training module is used for establishing and training an image classification model based on federated learning;
the processing module is used for carrying out classification processing on the images to be classified according to the trained image classification model;
wherein, the training module is specifically configured to:
for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in the target round of model learning;
determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters;
updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client;
calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for the target round of model learning;
and updating the model with the aggregated model parameters of each model-update dimension obtained in each round of model learning, to obtain the image classification model after training.
Optionally, the training module is specifically configured to:
if the difference between a target initial local model parameter and the average value of the initial local model parameters is greater than or equal to a preset multiple of the standard deviation, determining the value of the target initial local model parameter to be an abnormal value.
Optionally, the preset multiple is 3.
Optionally, the distribution of each of the initial local model parameters follows a Gaussian distribution.
In a third aspect, the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the backdoor attack defense method for federated learning according to the first aspect.
The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the backdoor attack defense method for federated learning as defined in the first aspect above.
According to the method and the device for defending against backdoor attacks in federated learning, potentially adversarial clients are identified by detection and filtering during model training, and the model updates of these risky clients are filtered out, so that the influence and harm of backdoor attacks on the model are eliminated, the trained model keeps good performance, and the accuracy of the model in practical applications is ensured.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of the backdoor attack defense method for federated learning provided by the present invention;
FIG. 2 is a schematic flow chart of the RFOut-1d algorithm provided by the present invention;
FIG. 3 is a schematic representation of the pattern keys used in a pattern-key backdoor attack provided by the present invention;
FIG. 4 is a schematic structural diagram of the backdoor attack defense apparatus for federated learning provided by the present invention;
FIG. 5 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In federated learning applications, adversarial attacks can be roughly divided into two categories according to their targets: untargeted attacks and targeted attacks. The goal of an untargeted attack is to break the model so that it cannot achieve optimal performance on the main task. In a targeted attack (often referred to as a backdoor attack), the adversary's goal is to have the model maintain good overall performance on the main task while performing poorly on certain specific subtasks. Because backdoor attacks are more covert and potentially more destructive, they are the focus and main target of the proposed scheme.
Adversarial clients have the ability to mislead the behavior of the federated learning model, inject backdoor attacks, or compromise data privacy. The purpose of the invention is to identify potentially adversarial clients through detection and filtering, and to filter out the model updates of these risky clients, thereby eliminating the influence and harm of backdoor attacks on the model.
Fig. 1 is a schematic flow chart of the backdoor attack defense method for federated learning provided by the present invention; as shown in Fig. 1, the method includes the following steps:
Step 100: establishing and training an image classification model based on federated learning;
the image classification model is trained as follows:
for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in that round of model learning;
determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters;
updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client;
calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for that round of model learning;
updating the model with the aggregated model parameters of each model-update dimension obtained in each round of model learning, to obtain the image classification model after training;
specifically, since various backdoor attack modes may exist in a federated learning scene, in order to enable a model obtained based on federated learning training to effectively defend against backdoor attacks, the invention provides an RFOut-1d (robust filter of one-dimensional outliers) federated aggregation operator based on filtering outliers in a client model update distribution, so as to generate more robust aggregation in each learning round.
In order to facilitate a clearer understanding of the technical solution of the present invention, the backdoor attack manners that may exist in a federated learning scenario are first described below.
To facilitate the description of the attack and the defense of the present invention, a series of symbols will be used to represent the corresponding objects. Let $E^t$ denote the global model (also referred to as the aggregate model) in the t-th round of learning, let $L_j^t$ denote the local model of the j-th client in the t-th round of learning, let n be the total number of clients selected for each aggregation, and let η be the server learning rate. Accordingly, the update of the global model in the t-th round of learning is performed according to equation (1):

$$E^t = E^{t-1} + \frac{\eta}{n}\sum_{j=1}^{n}\left(L_j^t - E^{t-1}\right) \tag{1}$$
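For illustration only, a minimal sketch of this aggregation rule could be written as follows; the flattened-parameter representation and the function name are assumptions made for the example, not part of the patent:

```python
import numpy as np

def fedavg_update(global_model, local_models, eta):
    """Global update of equation (1): E^t = E^{t-1} + (eta / n) * sum_j (L_j^t - E^{t-1}).

    global_model : 1-D array holding the flattened parameters E^{t-1}.
    local_models : list of 1-D arrays, one flattened local model L_j^t per client.
    eta          : server learning rate.
    """
    n = len(local_models)
    deltas = np.stack([lm - global_model for lm in local_models])  # shape (n, m)
    return global_model + (eta / n) * deltas.sum(axis=0)
```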
in this case, the present invention defines a backdoor attack scenario as one or more clients coordinated to inject secondary or backdoor tasks into the global model. Often, these attacks do not negatively impact the performance of the original task, making them more difficult to identify. Due to the distributed nature of the learning process, there are a large number of clients participating in each aggregation, and if the proportion of resistant clients is significantly lower than normal clients, the impact of the resistant clients will be ignored by the rest of the clients, and no effective attack will occur. For this reason, the present invention focuses on model poisoning backdoor attacks based on a model replacement paradigm.
Suppose that in the t-th round of learning a single adversarial client is selected, and that it tries to replace the global model $E^t$ with its backdoor model $\tilde{L}_{adv}^t$, a model that optimizes both the original task and the backdoor task. Instead of a normal update, it sends to the federated learning server the boosted model of equation (2):

$$\hat{L}_{adv}^t = \beta\left(\tilde{L}_{adv}^t - E^{t-1}\right) + E^{t-1} \tag{2}$$

where β is a boosting factor and $\tilde{L}_{adv}^t$ is the backdoor model with which the adversarial client optimizes the backdoor task. Then, substituting equation (2) into equation (1) yields:

$$E^t = E^{t-1} + \frac{\eta}{n}\left[\beta\left(\tilde{L}_{adv}^t - E^{t-1}\right) + \sum_{j \neq adv}\left(L_j^t - E^{t-1}\right)\right] \tag{3}$$

According to the definition of federated learning, the federated learning model eventually converges to a solution, so the invention can assume for the normal clients that

$$\sum_{j \neq adv}\left(L_j^t - E^{t-1}\right) \approx 0.$$

Therefore, with the boosting factor chosen as β = n/η, equation (3) can be rewritten as:

$$E^t \approx E^{t-1} + \frac{\eta}{n}\,\beta\left(\tilde{L}_{adv}^t - E^{t-1}\right) = \tilde{L}_{adv}^t \tag{4}$$

Thus, the adversarial client succeeds in replacing the global model with its backdoor model.
If multiple adversarial clients participate in the same round of learning, it is assumed that they can coordinate the attack by distributing the boosting factor β among the attackers.
The invention considers that, in the simulated attack scenario, the model updates of normal clients minimize the global task loss, while the model updates of adversarial clients optimize both the global and the backdoor task losses. Therefore, the present invention proposes RFOut-1d (Robust Filtering of One-Dimensional Outliers), a federated aggregation operator based on filtering outliers in the distribution of client model updates, with the goal of producing a more robust aggregation in each learning round t.
In the embodiment of the invention, an image classification application scenario is taken as an example. An image classification model based on federated learning is first established; when the model is trained, potentially adversarial clients are identified by detection and filtering, and the model updates of these risky clients are filtered out, thereby eliminating the influence and harm of backdoor attacks on the model.
In particular, due to the high dimensionality of the updates (which typically come from neural networks), and in order to avoid the information loss of applying dimensionality reduction techniques, the present invention performs univariate anomaly detection in each dimension of every model update. Thus, for each dimension i ∈ {1, ..., m}, where m is the dimensionality of the model update vector, the present invention considers the vector formed by the local model updates of the clients in that dimension, $U_i = \{L_1^t[i], L_2^t[i], \ldots, L_n^t[i]\}$, where n is the number of clients participating in the aggregation.
For any round of model learning, the initial local model parameters in any model-update dimension i obtained by each client in that round, namely the vector $U_i$, can be acquired; outliers among the initial local model parameters are then determined based on their average value and standard deviation.
Optionally, assuming that the distribution of the initial local model parameters follows a Gaussian distribution, the outliers among them may be determined with a method for detecting univariate outliers in a Gaussian distribution. In embodiments of the present invention it may be assumed that, for a given round of learning, the model updates of the clients follow a Gaussian distribution, because the global aggregate model tends to converge to a common solution. This is an intuitive argument based on the central limit theorem, which states that the sum of independent random variables is very close to a Gaussian distribution: letting the distributions of the clients' local weights be the random variables, their linear combination is close to Gaussian, and round after round the aggregation converges towards a Gaussian distribution. In particular, the data distribution of each dimension of the update converges to a univariate Gaussian distribution.
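Under this Gaussian assumption only a very small fraction of benign parameters lies three or more standard deviations from the mean (about 0.27% for an exact Gaussian), which is why the filter described next rarely disturbs honest clients. A small illustrative check, with an arbitrary sample size chosen for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=100_000)   # stand-in for one update dimension
outside = np.mean(np.abs(samples - samples.mean()) >= 3 * samples.std())
print(f"fraction beyond 3 sigma: {outside:.4%}")          # roughly 0.27% for Gaussian data
```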
Optionally, determining an abnormal value in each initial local model parameter based on the mean and the standard deviation of each initial local model parameter includes:
if the difference between a target initial local model parameter and the average value of the initial local model parameters is greater than or equal to a preset multiple of the standard deviation, determining the value of the target initial local model parameter to be an abnormal value.
Specifically, a standard-deviation method may be applied for univariate anomaly detection: it filters out those values whose difference from the average value is greater than or equal to δ times the standard deviation. Formally, it replaces with $\mu_i$ those values that satisfy $\left|L_j^t[i] - \mu_i\right| \geq \delta\,\sigma_i$, where $\mu_i$ and $\sigma_i$ are respectively the average value and the standard deviation of $U_i$, and $L_j^t[i]$ is the parameter of dimension i of the model update of client j in the t-th round of learning. δ may be set according to practical model-training experience and is not limited herein; optionally, δ may take the value 3. The invention uses the mean as the replacement value because, when the clients' model updates are subsequently aggregated, it removes the contribution of the outliers from the aggregation.
Finally, the RFOut-1d federated aggregation consists of a one-dimensional average of the unfiltered parameters. Formally, the aggregated model parameter $E^t[i]$ obtained in each dimension i in the t-th round of learning is:

$$\hat{L}_j^t[i] = \begin{cases} \mu_i, & \text{if } \left|L_j^t[i] - \mu_i\right| \geq \delta\,\sigma_i \\ L_j^t[i], & \text{otherwise} \end{cases} \tag{5}$$

$$E^t[i] = \frac{1}{n}\sum_{j=1}^{n}\hat{L}_j^t[i] \tag{6}$$

where $\hat{U}_i = \{\hat{L}_1^t[i], \ldots, \hat{L}_n^t[i]\}$ is the result vector obtained after applying the rule of equation (5) to $U_i$. In this way, the aggregated model parameter of the corresponding model-update dimension is obtained in each round of model learning, and the image classification model is obtained after training is finished.
FIG. 2 is a schematic flow chart of the RFOut-1d algorithm provided by the present invention. As shown in FIG. 2, with δ set to 3, for any round of model learning the initial local model parameters of each client in any model-update dimension i obtained in that round are acquired; the abnormal values whose difference from the average value is greater than or equal to 3 times the standard deviation are filtered out and replaced with $\mu_i$, yielding a new local model parameter for each client; finally, the average value of the new local model parameters is calculated to obtain the aggregated model parameter of that model-update dimension for that round of model learning.
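Putting the filtering and averaging steps together, the whole RFOut-1d aggregation of FIG. 2 can be sketched in vectorized form as below. This is a reading of the procedure described above, not the patent's reference implementation; the default δ = 3 mirrors the optional value given earlier:

```python
import numpy as np

def rfout_1d_aggregate(local_models, delta=3.0):
    """RFOut-1d aggregation.

    local_models : array-like of shape (n, m), one flattened local model per client.
    Returns the aggregated model of shape (m,). Per dimension, entries lying at
    least delta standard deviations from the mean are replaced by that mean
    (equation (5)), and the resulting values are averaged (equation (6)).
    """
    updates = np.asarray(local_models, dtype=float)
    mu = updates.mean(axis=0)                     # per-dimension mean mu_i
    sigma = updates.std(axis=0)                   # per-dimension standard deviation sigma_i
    outliers = np.abs(updates - mu) >= delta * sigma
    filtered = np.where(outliers, mu, updates)    # rule of equation (5)
    return filtered.mean(axis=0)                  # one-dimensional average, equation (6)
```

A server would call such a routine once per learning round on the stacked client models; clients whose updates are boosted far away from the benign distribution, as in the model replacement attack above, then contribute nothing beyond the dimension-wise mean.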
In addition to filtering out those clients that may be attackers, RFOut-1d can also optimize the learning process by enabling faster convergence to a common solution. Furthermore, it can be combined with other aggregation-based defense mechanisms, such as norm thresholding of updates or weak differential privacy.
Step 101: classifying the images to be classified according to the trained image classification model.
Specifically, after the federated-learning-based image classification model has been trained, the images to be classified can be classified according to the trained image classification model.
According to the backdoor attack defense method for federated learning, potentially adversarial clients are identified by detection and filtering during model training, and the model updates of these risky clients are filtered out, so that the influence and harm of backdoor attacks on the model are eliminated, the trained model keeps good performance, and the accuracy of the model in practical applications is ensured.
In order to better understand the backdoor attack defense method provided by the present invention, the attack environment simulated by the invention to verify the performance of the defense mechanism is introduced below.
1. Adversarial attacks
With respect to attacks, the present invention defines two taxonomies:
1) Model-poisoning and data-poisoning attacks are distinguished according to which part of the federated learning scheme is attacked. In practice the two are almost equivalent, since data poisoning leads to model poisoning. However, data-poisoning attacks and some model-poisoning attacks may be ineffective, because they tend to be cancelled out in the aggregation of numerous clients. For this reason, these attacks are often combined with model replacement techniques, which boost the adversarial model (or models) so that it replaces the global model in the aggregation.
2) Untargeted attacks (Byzantine attacks), which aim to degrade the performance of the model, are distinguished from targeted attacks (backdoor attacks), which aim to secretly inject a secondary or backdoor task into the global model, according to the purpose of the attack. The second type of attack can be more damaging because, if it goes undiscovered, it compromises the integrity of the global model. Furthermore, because the adversarial client's model optimizes both the original task and the adversarial task, such attacks are also more difficult to detect during the aggregation process. The second type is also the type of attack the present invention is primarily directed against.
With respect to backdoor attacks, a backdoor attack has a set of hyper-parameters or attributes that configure its behavior, defined as follows:
number of backdoor tasks: in an input instance backdoor attack, the present invention treats each client's sample as a specific backdoor task due to differences in client distribution. The number of backdoor tasks corresponds to the number of clients, and a sample of backdoor data sets, called S, is selected from the clientsbackdoor. In a pattern key back door attack, since the attack should be generalized, it can be considered that it can be solved by only one back door task (one pattern).
Number of adversarial clients: the number of clients that are compromised and adjusted to perform the backdoor task. The local training dataset of each adversarial client j is the union $D_j \cup S_{backdoor}$ of its original training dataset $D_j$ and the backdoor dataset $S_{backdoor}$.
In an input-instance backdoor attack, $S_{backdoor}$ corresponds to the set of samples of each backdoor task. In a pattern-key backdoor attack, $S_{backdoor}$ consists of all the samples adversarially altered according to the particular pattern.
Sampling and frequency of adversarial clients: how often adversarial clients appear in the client subset selected for each aggregation is a critical factor. According to the conclusions of the relevant literature, the proportion of adversarial clients required for an attack to be effective under random sampling is too high to be practical. Therefore, the scheme of the invention focuses on fixed-frequency attacks, in which the number of adversarial clients participating in each aggregation is determined autonomously by the attacker.
2. Problem description of the simulated attack
The problem that the invention aims to solve by simulating an attack environment is to resist the attacks executed in the corresponding attack environment and to ensure the security and integrity of the model.
In order to simulate the attack environment, the invention selects two backdoor attacks based on data manipulation. These attacks differ in how the data are poisoned; specifically:
Input-instance backdoor attack: the goal of this attack is to cause the federated learning model to misclassify certain specific samples of the input distribution in favor of a certain target. For example, in a face recognition system controlling entry to a room, a person who does not have an entry permission (a specific input) is allowed to enter.
Pattern-key backdoor attack: in this case, the goal is to misclassify, in favor of a particular target, any sample modified according to a particular pattern. For example, in the same setting as the previous example, every person wearing purple glasses (the pattern) is allowed to enter. A hedged sketch of both forms of data poisoning follows below.
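The following sketch shows how an adversarial client's local data could be poisoned under these two attacks; the single-channel image layout, array names and trigger placement are assumptions made only for illustration:

```python
import numpy as np

def poison_with_pattern(images, labels, pattern_mask, pattern_value, target_label):
    """Pattern-key poisoning: stamp the trigger pattern onto every image and
    relabel the stamped samples with the attacker's target label.

    images       : array of shape (N, H, W)
    pattern_mask : boolean array of shape (H, W) marking the trigger pixels
    """
    poisoned = images.copy()
    poisoned[:, pattern_mask] = pattern_value            # apply the pixel pattern
    poisoned_labels = np.full(len(labels), target_label)
    return poisoned, poisoned_labels

def poison_input_instances(labels, instance_indices, target_label):
    """Input-instance poisoning: the selected samples are left unchanged and
    only their labels are switched to the target label."""
    poisoned_labels = np.asarray(labels).copy()
    poisoned_labels[instance_indices] = target_label
    return poisoned_labels
```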
3. Attack scenario configuration
A. Data set
Regarding the datasets needed to model the scenario, the general idea of the invention is that classical machine learning datasets can be used and distributed among clients according to different data distributions. However, while it is possible to simulate the non-IID nature of the data distribution, it is quite complex to simulate the specialization of the data on each client so that it represents the client's own characteristics. For this reason, the invention decides to use already-federated datasets, selecting the following image classification datasets contained in the LEAF benchmark:
1) Digits FEMNIST: the federated version of the EMNIST digits dataset, where each client corresponds to an original writer.
2) CelebA: an image classification dataset consisting of celebrity face images, each image having 40 binary attribute annotations; each celebrity is associated with a client. The invention uses it as a binary image classification dataset, selecting specific attributes as targets, namely Smiling (CelebA-S) and Attractive (CelebA-A).
Using federated datasets may result in an insufficient amount of data for some clients. Therefore, the invention sets a minimum number of samples k per client and discards the clients that do not meet this condition. For the CelebA dataset the invention uses k = 30, and for FEMNIST it sets k = 8, since that is the minimum number of samples per client.
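A hedged sketch of this per-client filtering step (the client-dictionary layout is an assumption made for the example):

```python
def filter_clients(clients, min_samples):
    """Keep only the clients whose local dataset holds at least min_samples
    examples (k = 30 for CelebA, k = 8 for FEMNIST in the settings above)."""
    return {cid: data for cid, data in clients.items() if len(data) >= min_samples}
```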
B. Backdoor attack settings
Input-instance backdoor attack settings: the invention sets a target label and a sample set $S_{backdoor}$ whose samples come from clients and belong to other classes (their original labels). The attack consists of having the largest possible number of these samples classified with the target label, without modifying any of the samples. Owing to the particularity of each client, the invention sets the number of backdoor tasks to the number of clients from which the samples of the backdoor dataset $S_{backdoor}$ are taken, and sets the number of adversarial clients to the number of clients holding the backdoor dataset in their data, together with the attack frequency. Based on these parameters, the input-instance backdoor attacks for the FEMNIST, CelebA-S and CelebA-A datasets are defined with the following settings:
(Table: input-instance backdoor attack settings for the FEMNIST, CelebA-S and CelebA-A datasets.)
Pattern-key backdoor attack settings: the invention sets a target label and a pattern key. The attack consists of classifying all samples contaminated with the pattern key as the target label. Fig. 3 is a schematic diagram of the pattern keys used in the pattern-key backdoor attack provided by the present invention. As shown in Fig. 3, in order to show the countermeasure behavior of RFOut-1d, the invention uses three patterns of different difficulty levels, characterized by their number of pixels: from left to right, a 1-pixel pattern, an 8-pixel pattern (a cross with arms of length 4) and a 25-pixel pattern (a square of side 5). The settings of the pattern-key backdoor attacks for the FEMNIST, CelebA-S and CelebA-A datasets are defined as follows:
(Table: pattern-key backdoor attack settings for the FEMNIST, CelebA-S and CelebA-A datasets.)
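For illustration, boolean trigger masks with the stated pixel counts could be built as follows; the exact pixel positions of the original patterns in Fig. 3 are not specified here, so the coordinates below are assumptions:

```python
import numpy as np

def make_pattern_masks(height, width):
    """Return three boolean trigger masks with 1, 8 and 25 active pixels,
    loosely mirroring the three difficulty levels described above.
    Assumes height >= 6 and width >= 6 (e.g. 28x28 FEMNIST images)."""
    one_pixel = np.zeros((height, width), dtype=bool)
    one_pixel[1, 1] = True                      # 1-pixel pattern

    cross = np.zeros((height, width), dtype=bool)
    cross[1, 1:5] = True                        # horizontal arm, 4 pixels
    cross[2:6, 2] = True                        # vertical arm, 4 pixels (8 in total)

    square = np.zeros((height, width), dtype=bool)
    square[1:6, 1:6] = True                     # 5 x 5 square, 25 pixels

    return one_pixel, cross, square
```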
based on the simulated attack environment, the performance verification is carried out on the backdoor attack defense method, and experimental results show that RFOut-1d is a very effective defense scheme, the influence of backdoor attacks can be weakened to the extent of (almost) making the backdoor attacks ineffective in all learning rounds, compared with other defense measures, the method can not obstruct the federal learning process by maintaining (even improving) the performance of the model in the original task, and the combination of the model and a general solution is accelerated and optimized by screening out a client deviating from the solution, so that the influence of the backdoor attacks is eliminated, the performance of the federal learning model is maintained, and the performance of the method is superior to most of defense means at present.
The backdoor attack defense apparatus for federated learning provided by the invention is described below; the apparatus described below and the backdoor attack defense method described above may be referred to correspondingly.
Fig. 4 is a schematic structural diagram of the backdoor attack defense apparatus for federated learning provided by the present invention; as shown in Fig. 4, the apparatus includes:
the training module 400, used for establishing and training an image classification model based on federated learning;
the processing module 410 is configured to perform classification processing on the images to be classified according to the trained image classification model;
wherein the training module 400 is specifically configured to:
for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in the target round of model learning;
determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters;
updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client;
calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for the target round of model learning;
and updating the model with the aggregated model parameters of each model-update dimension obtained in each round of model learning, to obtain the image classification model after training.
Optionally, the training module 400 is specifically configured to:
if the difference between a target initial local model parameter and the average value of the initial local model parameters is greater than or equal to a preset multiple of the standard deviation, determining the value of the target initial local model parameter to be an abnormal value.
Optionally, the preset multiple is 3.
Optionally, the distribution of each initial local model parameter follows a Gaussian distribution.
It should be noted that, the apparatus provided in the present invention can implement all the method steps implemented by the method embodiments and achieve the same technical effects, and detailed descriptions of the same parts and beneficial effects as the method embodiments in this embodiment are omitted here.
Fig. 5 is a schematic structural diagram of an electronic device provided by the present invention. As shown in Fig. 5, the electronic device may include: a processor 510, a communications interface 520, a memory 530 and a communication bus 540, wherein the processor 510, the communications interface 520 and the memory 530 communicate with each other via the communication bus 540. The processor 510 may invoke logic instructions in the memory 530 to perform the steps of the backdoor attack defense method for federated learning provided by any of the embodiments described above, for example: establishing and training an image classification model based on federated learning; classifying images to be classified according to the trained image classification model; wherein the image classification model is trained as follows: for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in that round; determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters; updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client; calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for the target round of model learning; and updating the model with the aggregated model parameters of each model-update dimension obtained in each round of model learning, to obtain the image classification model after training.
Furthermore, the logic instructions in the memory 530 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the steps of the backdoor attack defense method for federated learning provided by any of the embodiments described above, for example: establishing and training an image classification model based on federated learning; classifying images to be classified according to the trained image classification model; wherein the image classification model is trained as follows: for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in that round; determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters; updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client; calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for the target round of model learning; and updating the model with the aggregated model parameters of each model-update dimension obtained in each round of model learning, to obtain the image classification model after training.
In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of any of the backdoor attack defense methods for federated learning described above, for example: establishing and training an image classification model based on federated learning; classifying images to be classified according to the trained image classification model; wherein the image classification model is trained as follows: for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in that round; determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters; updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client; calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for the target round of model learning; and updating the model with the aggregated model parameters of each model-update dimension obtained in each round of model learning, to obtain the image classification model after training.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for defending against backdoor attacks in federated learning is characterized by comprising the following steps:
establishing and training an image classification model based on federated learning;
classifying images to be classified according to the trained image classification model;
the image classification model is trained as follows:
for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in the target round of model learning;
determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters;
updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client;
calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for the target round of model learning;
and updating the model with the aggregated model parameters of each model-update dimension obtained in each round of model learning, to obtain the image classification model after training.
2. The backdoor attack defense method for federated learning of claim 1, wherein the determining of abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters comprises:
if the difference between a target initial local model parameter and the average value of the initial local model parameters is greater than or equal to a preset multiple of the standard deviation, determining the value of the target initial local model parameter to be an abnormal value.
3. The backdoor attack defense method for federated learning of claim 2, wherein the preset multiple is 3.
4. The backdoor attack defense method for federated learning of claim 1, wherein the distribution of each of the initial local model parameters follows a Gaussian distribution.
5. A backdoor attack defense apparatus for federated learning, characterized by comprising:
the training module is used for establishing and training an image classification model based on federated learning;
the processing module is used for carrying out classification processing on the images to be classified according to the trained image classification model;
wherein, the training module is specifically configured to:
for a target round of model learning, respectively acquiring the initial local model parameters in a target model-update dimension obtained by each client in the target round of model learning;
determining abnormal values among the initial local model parameters based on the average value and the standard deviation of the initial local model parameters;
updating the initial local model parameter of the client corresponding to each abnormal value to the average value of the initial local model parameters, to obtain a new local model parameter for that client;
calculating the average value of the new local model parameters to obtain the aggregated model parameter of the target model-update dimension for the target round of model learning;
and updating the model with the aggregated model parameters of each model-update dimension obtained in each round of model learning, to obtain the image classification model after training.
6. The backdoor attack defense apparatus for federated learning as claimed in claim 5, wherein the training module is specifically configured to:
determine, if the difference between a target initial local model parameter and the average value of the initial local model parameters is greater than or equal to a preset multiple of the standard deviation, that the value of the target initial local model parameter is an abnormal value.
7. The backdoor attack defense apparatus for federated learning as claimed in claim 6, wherein the preset multiple is 3.
8. The backdoor attack defense apparatus for federated learning as claimed in claim 5, wherein the distribution of each of the initial local model parameters follows a Gaussian distribution.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the backdoor attack defense method for federated learning as claimed in any one of claims 1 to 4.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the backdoor attack defense method for federated learning as claimed in any one of claims 1 to 4.
CN202110897437.XA 2021-08-05 2021-08-05 Method and device for defending against backdoor attack of federal learning Pending CN113779563A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110897437.XA CN113779563A (en) 2021-08-05 2021-08-05 Method and device for defending against backdoor attack of federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110897437.XA CN113779563A (en) 2021-08-05 2021-08-05 Method and device for defending against backdoor attack of federal learning

Publications (1)

Publication Number Publication Date
CN113779563A true CN113779563A (en) 2021-12-10

Family

ID=78836948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110897437.XA Pending CN113779563A (en) 2021-08-05 2021-08-05 Method and device for defending against backdoor attack of federal learning

Country Status (1)

Country Link
CN (1) CN113779563A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460443A (en) * 2020-05-28 2020-07-28 南京大学 Security defense method for data manipulation attack in federated learning
CN112257063A (en) * 2020-10-19 2021-01-22 上海交通大学 Cooperative game theory-based detection method for backdoor attacks in federal learning
CN112446025A (en) * 2020-11-23 2021-03-05 平安科技(深圳)有限公司 Federal learning defense method and device, electronic equipment and storage medium
CN112365005A (en) * 2020-12-11 2021-02-12 浙江工业大学 Neuron distribution characteristic-based federal learning poisoning detection method
CN112714106A (en) * 2020-12-17 2021-04-27 杭州趣链科技有限公司 Block chain-based federal learning casual vehicle carrying attack defense method

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114494771A (en) * 2022-01-10 2022-05-13 北京理工大学 Federal learning image classification method capable of defending backdoor attacks
CN114548428A (en) * 2022-04-18 2022-05-27 杭州海康威视数字技术股份有限公司 Intelligent attack detection method and device of federated learning model based on instance reconstruction
CN114548428B (en) * 2022-04-18 2022-08-16 杭州海康威视数字技术股份有限公司 Intelligent attack detection method and device of federated learning model based on instance reconstruction
CN115907029A (en) * 2022-11-08 2023-04-04 北京交通大学 Defense method and system for federal learning virus attack
CN115731424A (en) * 2022-12-03 2023-03-03 北京邮电大学 Image classification model training method and system based on enhanced federal domain generalization
CN115731424B (en) * 2022-12-03 2023-10-31 北京邮电大学 Image classification model training method and system based on enhanced federal domain generalization
CN116542342A (en) * 2023-05-16 2023-08-04 江南大学 Asynchronous federal optimization method capable of defending Bayesian attack
CN116527393A (en) * 2023-06-06 2023-08-01 北京交通大学 Method, device, equipment and medium for defending against federal learning poisoning attack
CN116527393B (en) * 2023-06-06 2024-01-16 北京交通大学 Method, device, equipment and medium for defending against federal learning poisoning attack

Similar Documents

Publication Publication Date Title
CN113779563A (en) Method and device for defending against backdoor attack of federal learning
CN111310802B (en) Anti-attack defense training method based on generation of anti-network
CN110334742B (en) Graph confrontation sample generation method based on reinforcement learning and used for document classification and adding false nodes
US20220067432A1 (en) Robustness assessment for face recognition
Wang et al. Adversarial attacks and defenses in machine learning-empowered communication systems and networks: A contemporary survey
Maranhão et al. Noise-robust multilayer perceptron architecture for distributed denial of service attack detection
Chen et al. Patch selection denoiser: An effective approach defending against one-pixel attacks
CN115687758A (en) User classification model training method and user detection method
Nguyen et al. Backdoor attacks and defenses in federated learning: Survey, challenges and future research directions
Macas et al. Adversarial examples: A survey of attacks and defenses in deep learning-enabled cybersecurity systems
EP4060572A1 (en) Computer-implemented method for accelerating convergence in the training of generative adversarial networks (gan) to generate synthetic network traffic, and computer programs of same
Naseer The efficacy of Deep Learning and Artificial Intelligence Framework in Enhancing Cybersecurity, Challenges and Future Prospects
Takahashi et al. Breaching FedMD: image recovery via paired-logits inversion attack
Alsubaei et al. Enhancing phishing detection: A novel hybrid deep learning framework for cybercrime forensics
Ali et al. A survey on attacks and their countermeasures in deep learning: Applications in deep neural networks, federated, transfer, and deep reinforcement learning
Yu et al. Security and Privacy in Federated Learning
Şeker Use of Artificial Intelligence Techniques/Applications in Cyber Defense
Alrajhi A survey of Artificial Intelligence techniques for cybersecurity improvement
CN111026087A (en) Weight-containing nonlinear industrial system fault detection method and device based on data
CN113837398A (en) Graph classification task poisoning attack method based on federal learning
Yang et al. DeMAC: Towards detecting model poisoning attacks in federated learning system
Li et al. Few pixels attacks with generative model
Li et al. A First Order Meta Stackelberg Method for Robust Federated Learning (Technical Report)
Xie et al. A survey on vulnerability of federated learning: A learning algorithm perspective
Chen et al. Byzantine-resilient Federated Learning via Gradient Memorization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination