CN112528281A - Poisoning attack detection method, device and equipment for federal learning - Google Patents
Poisoning attack detection method, device and equipment for federal learning
- Publication number
- CN112528281A CN202011463000.7A
- Authority
- CN
- China
- Prior art keywords
- model
- aggregation
- patch
- data set
- poisoning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 231100000572 poisoning Toxicity 0.000 title claims abstract description 79
- 230000000607 poisoning effect Effects 0.000 title claims abstract description 79
- 238000001514 detection method Methods 0.000 title claims description 16
- 230000002776 aggregation Effects 0.000 claims abstract description 163
- 238000004220 aggregation Methods 0.000 claims abstract description 163
- 238000000034 method Methods 0.000 claims abstract description 53
- 238000012545 processing Methods 0.000 claims abstract description 15
- 238000011156 evaluation Methods 0.000 claims description 47
- 210000002569 neuron Anatomy 0.000 claims description 34
- 230000006870 function Effects 0.000 claims description 23
- 238000003860 storage Methods 0.000 claims description 13
- 239000002574 poison Substances 0.000 claims description 11
- 231100000614 poison Toxicity 0.000 claims description 11
- 231100000331 toxic Toxicity 0.000 claims description 11
- 230000002588 toxic effect Effects 0.000 claims description 11
- 230000004931 aggregating effect Effects 0.000 claims description 2
- 238000006116 polymerization reaction Methods 0.000 abstract description 10
- 238000013473 artificial intelligence Methods 0.000 abstract description 6
- 238000012549 training Methods 0.000 description 45
- 238000010586 diagram Methods 0.000 description 10
- 238000013528 artificial neural network Methods 0.000 description 8
- 230000000694 effects Effects 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 239000000243 solution Substances 0.000 description 5
- 238000004590 computer program Methods 0.000 description 3
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 238000010801 machine learning Methods 0.000 description 3
- 238000004422 calculation algorithm Methods 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 238000002347 injection Methods 0.000 description 2
- 239000007924 injection Substances 0.000 description 2
- 206010000117 Abnormal behaviour Diseases 0.000 description 1
- 238000012935 Averaging Methods 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 235000019800 disodium phosphate Nutrition 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 238000012821 model calculation Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 238000007637 random forest analysis Methods 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Security & Cryptography (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Computer Hardware Design (AREA)
- Life Sciences & Earth Sciences (AREA)
- Virology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application provides a method, a device and equipment for detecting model poisoning attacks in federated learning, and relates to the technical field of artificial intelligence. The method comprises the following steps: receiving model parameters of a terminal-device-side model sent by at least one terminal device participating in federated learning; performing aggregation processing according to the model parameters of the terminal-device-side models to obtain an aggregation model on the server side; inverting the aggregation model to obtain an inversion data set of the aggregation model; generating a poisoning patch according to the inversion data set of the aggregation model; and determining, from the poisoning patch, whether the aggregation model is a normal model. The method ensures the efficiency of detecting model poisoning attacks in federated learning while also preserving the security and privacy of the local data on each terminal device.
Description
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a poisoning attack detection method, device and equipment for federated learning.
Background
Federated learning is a new foundational artificial intelligence technology proposed by Google in 2016, originally for updating local models of Android mobile phone users. Its design goal is to carry out efficient machine learning among multiple parties or computing nodes while guaranteeing information security during big data exchange, protecting terminal data and personal data privacy, and ensuring legal compliance. The machine learning algorithms usable in federated learning are not limited to neural networks and also include important algorithms such as random forests. However, researchers have found that federated learning is vulnerable to attacks by inside participants, known as poisoning attacks: an attacker can pose as a benign participant and upload poisoned updates to the server, thereby easily influencing the performance of the global model.
At present, the industry lacks an effective method for detecting poisoning attacks on federated learning, so malicious participants cannot be prevented from attacking the federated learning model, and their attack success rate is high. Therefore, how to detect poisoning attacks on federated learning has become a problem that urgently needs to be solved.
Disclosure of Invention
The application aims to provide a poisoning attack detection method, device and equipment for federated learning to address the defects in the prior art, thereby solving the problem that the prior art lacks a method for detecting federated learning poisoning attacks, improving the efficiency of searching for poisoning back doors, and preventing malicious participants from attacking the federated learning model.
In order to achieve the above purpose, the technical solutions adopted in the embodiments of the present application are as follows:
in a first aspect, an embodiment of the present application provides a method for detecting a poisoning attack of a federated learning model, where the method includes:
receiving model parameters of a terminal equipment side model sent by at least one terminal equipment participating in federal learning;
performing aggregation processing according to the model parameters of each terminal equipment side model to obtain an aggregation model of the server side;
inverting the aggregation model to obtain an inversion data set of the aggregation model; generating a poisoned patch according to the inversion data set of the aggregation model;
determining whether the aggregated model is a normal model according to the poisoned patch.
Optionally, the inverting the aggregation model to obtain an inverted data set of the aggregation model includes:
and inverting the aggregation model by using a preset first objective function to obtain an inversion data set of the aggregation model.
Optionally, the generating a poison patch according to the inverse data set of the aggregation model includes:
and generating a toxic patch according to the inversion data set of the aggregation model and the damaged neuron in the aggregation model.
Optionally, before the generating the toxic patch according to the inverted data set of the aggregated model and the damaged neuron in the aggregated model, the method further includes:
obtaining damaged neurons in the aggregation model by adopting a preset second objective function; wherein the damaged neuron is abnormally activated at a specific input.
Optionally, the aggregating the model parameters of each terminal device side model to obtain an aggregation model of the server side includes:
and adopting a preset model aggregation function to aggregate the model parameters of the terminal equipment side models to obtain an aggregation model of the server side.
Optionally, the determining whether the aggregated model is a normal model according to the poison patch includes:
determining a patch evaluation index value of the poisoned patch; and determining whether the aggregation model is a normal model according to the patch evaluation index value.
Optionally, the determining whether the aggregation model is a normal model according to the patch evaluation index value includes:
and if the patch evaluation index value is smaller than a preset threshold value, determining that the aggregation model is a normal model.
Optionally, the determining whether the aggregation model is a normal model according to the patch evaluation index value includes:
and if the patch evaluation index value is greater than or equal to a preset threshold value, determining that the aggregation model is a poisoning model.
In a second aspect, an embodiment of the present application further provides a device for detecting a model poisoning attack of federated learning, where the device includes: the device comprises a receiving module, an aggregation module, an inversion module, a generation module and a determination module;
the receiving module is used for receiving the model parameters of the terminal equipment side model sent by at least one terminal equipment participating in federal learning;
the aggregation module is used for carrying out aggregation processing according to the model parameters of the terminal equipment side models to obtain an aggregation model of the server side;
the inversion module is used for inverting the aggregation model to obtain an inversion data set of the aggregation model;
the generating module is used for generating a poisoning patch according to the inversion data set of the aggregation model;
the determining module is used for determining whether the aggregation model is a normal model according to the poisoning patch.
Optionally, the aggregation module is specifically configured to:
and inverting the aggregation model by using a preset first objective function to obtain an inversion data set of the aggregation model.
Optionally, the generating module is specifically configured to:
and generating a toxic patch according to the inversion data set of the aggregation model and the damaged neuron in the aggregation model.
Optionally, the generating module is further configured to:
obtaining damaged neurons in the aggregation model by adopting a preset second objective function; wherein the damaged neuron is abnormally activated at a specific input.
Optionally, the generating module is further configured to:
and adopting a preset model aggregation function to aggregate the model parameters of the terminal equipment side models to obtain an aggregation model of the server side.
Optionally, the determining module is specifically configured to:
determining a patch evaluation index value of the poisoned patch; and determining whether the aggregation model is a normal model or not according to the patch evaluation index value.
Optionally, the determining module is further configured to:
and if the patch evaluation index value is smaller than a preset threshold value, determining that the aggregation model is a normal model.
Optionally, the determining module is further configured to:
and if the patch evaluation index value is greater than or equal to a preset threshold value, determining that the aggregation model is a poisoning model.
In a third aspect, this embodiment of the present application further provides an electronic device, including a processor, a storage medium, and a bus, where the storage medium stores program instructions executable by the processor, and when the electronic device runs, the processor communicates with the storage medium through the bus, and the processor executes the program instructions to perform the steps of the method provided in the embodiment of the first aspect.
In a fourth aspect, this application further provides a computer-readable storage medium, where a computer program is stored on the storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the method provided in the embodiment of the first aspect.
The beneficial effect of this application is:
the application provides a method, a device and equipment for detecting model poisoning attack of federated learning, wherein the method comprises the following steps: receiving model parameters of a terminal equipment side model sent by at least one terminal equipment participating in federal learning; performing aggregation processing according to the model parameters of the models at the terminal equipment sides to obtain an aggregation model at the server side; inverting the polymerization model to obtain an inverted data set of the polymerization model; generating a poisoned patch according to the inversion data set of the aggregation model; from the poisoned patch, it is determined whether the aggregated model is a normal model. According to the method, the inversion data set is obtained by inverting the aggregation model obtained at the server side, and the poisoning patch is generated according to the inversion data set, so that whether the aggregation model is a normal model is judged according to the generated poisoning patch, and thus the poisoning patch does not need to be generated according to the local data set in the terminal equipment, whether the aggregation model obtained at the server side is a normal model is judged, the efficiency of detecting model poisoning attacks in the Union learning is ensured, and the safety and privacy of the local data in each terminal equipment can also be ensured.
In addition, after the aggregation model is determined to be a normal model according to the patch evaluation index value, the parameters of the aggregation model are issued to each terminal device participating in federal learning, so that a model with better performance is obtained through collaborative training, the accuracy of identification of a plurality of terminal devices participating in federal learning is improved, and the usability of federal learning is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a schematic structural diagram of a server according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a method for detecting poisoning attacks by a federated learning model according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a framework for federated learning provided in the embodiment of the present application;
fig. 4 is a schematic flowchart of another federally learned model poisoning attack detection method according to an embodiment of the present application;
FIG. 5 is a block diagram of a method for detecting a model poisoning attack for federated learning according to an embodiment of the present application;
fig. 6 is a frame diagram of another federally learned model poisoning attack detection method provided in an embodiment of the present application;
fig. 7 is a schematic structural diagram of a device for detecting a model poisoning attack of federated learning according to an embodiment of the present application.
Icon: 100-an electronic device; 101-a processor; 102-memory.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments.
The application provides a plurality of embodiments to detect back-door model poisoning attacks in federated learning. The method performs model inversion on the aggregated model to obtain an artificially synthesized training data set, then searches for possible poisoning patches in that data set, and evaluates the found patches to judge whether the aggregated model is poisoned, thereby effectively improving the efficiency of poisoning attack detection. This is explained below through a number of examples.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure; the electronic device may be, for example, a server. For convenience of description, the following embodiments of the present application are all described by taking a server as the electronic device. As shown in fig. 1, the electronic device 100 includes: a processor 101 and a memory 102.
The memory 102 is used for storing a program, and the processor 101 calls the program stored in the memory 102 to execute the method for detecting a federated learning model poisoning attack provided in the following embodiments. The method is described in detail below through a plurality of specific embodiments.
Fig. 2 is a schematic flowchart of a method for detecting poisoning attacks by a federated learning model according to an embodiment of the present application; the method may be implemented by a processor in the server provided by the above embodiments. As shown in fig. 2, the method includes:
s201, receiving model parameters of a terminal device side model sent by at least one terminal device participating in federal learning.
Fig. 3 is a schematic structural diagram of a framework for federated learning provided in the embodiment of the present application; as shown in FIG. 3, Federal learning is a machine learning environment that aims to train a high quality centralized model while the training data is still distributed across a large number of end devices.
For example, the number of terminals participating in federated learning is K, that is, terminal 1, terminal 2, terminal 3, terminal 4, ..., terminal K. Each terminal obtains an initial training model from the server and performs model training using its local data; after several rounds of training, each terminal uploads its model training parameters to the server, and the server aggregates the K received sets of model parameters to obtain the latest model and issues the latest global model parameters to each terminal.
In some embodiments, the server may receive the model parameters of the terminal-device-side models sent by the plurality of terminal devices participating in federated learning, for example, as follows:
1) initializing a server side, and setting the current t to be 0, so that a plurality of terminal devices participating in federal learning obtain an initial training model from the server, for example, a VGG16(Visual Geometry Group) model, an AlexNet (a neural network) model, and the like.
In another possible implementation manner, a plurality of initial training models of the terminal equipment participating in the federal learning can be manually set.
2) Initializing a plurality of terminal devices participating in federal learning so that the plurality of terminal devices participating in federal learning can perform model training by using local data, and setting the number of rounds E, the learning rate epsilon and the data volume B of each round of training at the terminal device side.
For example, the total number of terminal devices participating in federated learning is K, and F (F ≤ K) terminal devices are selected to train on their local data sets D_i (i ≤ F), obtaining local training models L_i^{t+1}.
3) After E rounds of training, the trained model parameters are uploaded to the server, so that the server receives in time the model parameters of the terminal-device-side models sent by the F terminal devices participating in federated learning (a minimal client-side sketch of steps 1)-3) follows this list).
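As an illustration of steps 1)-3), the following sketch assumes a PyTorch-style model; the helper name local_train and its arguments are hypothetical, not taken from the patent. It shows one terminal device training a copy of the server-issued model on its local data set D_i for E rounds and returning the parameters to be uploaded:

```python
import copy
import torch

def local_train(global_model, local_loader, epochs=10, lr=0.1):
    """Hypothetical client-side step: train a copy of the server-issued model on local data."""
    model = copy.deepcopy(global_model)                      # start from the initial/global model
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)   # learning rate epsilon
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):                                  # E rounds of local training
        for x, y in local_loader:                            # mini-batches of size B from D_i
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    return model.state_dict()                                # parameters to be uploaded to the server
```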
S202, carrying out aggregation processing according to the model parameters of the models on the terminal equipment side to obtain an aggregation model on the server side.
In some embodiments, after receiving the model parameters of the terminal device side models sent by the plurality of terminal devices participating in federal learning, the server may perform aggregation processing according to the model parameters of the terminal device side models to obtain an aggregation model on the server side.
For example, 1) the objective of a benign participant on the terminal side is:

w_i* = argmax_{w_i} Σ_{(x_j, y_j) ∈ D_i} P[G_{t+1}(x_j) = y_j]    (1)

where (x_j, y_j) are the data and labels in the data set D_i, P[G_{t+1}(x_j) = y_j] is the success rate with which the model G_{t+1} predicts the data x_j as its true label y_j, w_i is the weight parameter of the local model L_i^{t+1}, w_i* is the weight parameter after model optimization, and G_{t+1} is the aggregated server-side model.
2) The objective of a malicious participant on the terminal side is:

w_i* = argmax_{w_i} Σ_{(x_j, y_j) ∈ D_i} P[G_{t+1}(R(x_j, φ)) = τ]    (2)

where (x_j, y_j) are the data and labels in the data set D_i, w_i is the weight parameter of the local model L_i^{t+1}, w_i* is the weight parameter after model optimization, φ is the added poisoning patch, R is the adopted poisoning attack method, R(x_j, φ) is the poisoning data generated using the poisoning attack method R and the patch φ, τ is the label of the attack target, P[G_{t+1}(R(x_j, φ)) = τ] is the success rate with which the model G_{t+1} predicts the poisoning data as the poisoning label τ, and G_{t+1} is the aggregated server-side model.
It should be noted that the adopted poisoning attack method R may include: BadNets (backdoored neural network) poisoning attacks, in which, without knowledge of the structure of the original network, pixel values in the training images are modified so that the neural network exhibits abnormal behavior in the test stage; and AIS (Access Injection Strategy) and BIS (Blended Injection Strategy) poisoning attacks, in which, even if the attacker does not know the attacked model or its training data, a targeted attack can be realized by adding a small number of poisoning samples.
Model aggregation is carried out according to the model parameters uploaded by each terminal device, and the aggregation formula is as follows:

G_{t+1} = G_t + (1/F) Σ_{i=1}^{F} (L_i^{t+1} − G_t)    (3)

where G_{t+1} is the aggregated model after server-side aggregation, G_t is the aggregated model of the previous training round, L_i^{t+1} is the local training model, and 1/F is the weight scaling, which can be used to ensure that a back door survives the averaging.
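Under the assumption that each model is represented as a parameter dictionary (e.g., a PyTorch state dict), a minimal sketch of this aggregation formula is as follows; the helper name aggregate is illustrative and not taken from the patent:

```python
def aggregate(global_state, local_states):
    """Hypothetical server-side aggregation: G_{t+1} = G_t + (1/F) * sum_i (L_i^{t+1} - G_t)."""
    F = len(local_states)
    new_state = {}
    for name, g_param in global_state.items():
        # sum of the differences between each uploaded local model and the previous global model
        delta = sum(local[name] - g_param for local in local_states)
        new_state[name] = g_param + delta / F    # 1/F weight scaling over the F participants
    return new_state
```

The updated parameters G_{t+1} can then be issued back to the participating terminal devices for the next training round.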
S203, inverting the aggregation model to obtain an inversion data set of the aggregation model.
Optionally, after the aggregation model is obtained, the server side updates the global model according to the aggregation parameters of the aggregation model to obtain a latest model, and issues the latest global model parameters to the plurality of terminal devices participating in federal learning, so that the safety and privacy of local data in each terminal device are effectively guaranteed, and meanwhile, the effect of obtaining a model with better performance through the collaborative training of the plurality of participating terminal devices can be achieved.
In this embodiment, after the server-side aggregation model is obtained, the aggregation model is inverted to obtain an inversion data set of the aggregation model. Here, inversion refers to an image-based method of reversely deriving the model's training set from a known model and labels: a noise image is initialized and then continuously optimized against the model confidence and a target label so that the confidence of the noise image for the target label keeps increasing; when the optimization converges, a noise image similar to the original training image is obtained.
It should be noted that the inversion data set obtained by inverting the aggregation model is not local data of a plurality of terminal devices participating in federal learning, so that the security and privacy of the local data set in each terminal device are effectively ensured.
And S204, generating a poisoning patch according to the inversion data set of the aggregation model.
Optionally, an AI (Artificial Intelligence)-based neural network scanning technique may be used to determine whether a trained model has suffered a back-door attack. First, a trained neural network is acquired, and damaged neurons are found from a training data set. Then, a poisoning patch is generated from the damaged neurons and the training data set, and the patch is evaluated, so that whether the model is a poisoning model can be judged from the evaluation index. This technique is mainly applied to the training of a single model and requires a large amount of local training data. However, when detecting a poisoning attack on a model in federated learning, the local data of the individual terminal devices cannot be acquired, so this technique cannot solve poisoning attack detection for the aggregation model.
In this embodiment, after obtaining the inverse data set of the aggregate model, the poisoning patch may be generated according to the inverse data set of the aggregate model.
It should be noted that, in this embodiment, the poisoning patch is generated from the inversion data set, which differs from the AI-based neural network scanning technique that finds damaged neurons from a local training data set. That is to say, compared with general intrusion detection models, generating the poisoning patch from the inversion data set of the aggregation model effectively ensures the security and privacy of the local data of each terminal device.
S205, determining whether the aggregation model is a normal model according to the poisoned patch.
In this embodiment, after the poisoning patch is generated according to the inversion data set of the aggregation model, whether the aggregation model is a normal model or not can be further determined according to the poisoning patch, so that efficiency of searching for a poisoning backdoor is greatly improved, and in addition, data security and privacy are also ensured by the method.
To sum up, the embodiment of the present application provides a method for detecting model poisoning attacks in federated learning, which includes: receiving model parameters of a terminal-device-side model sent by at least one terminal device participating in federated learning; performing aggregation processing according to the model parameters of the terminal-device-side models to obtain an aggregation model on the server side; inverting the aggregation model to obtain an inversion data set of the aggregation model; generating a poisoning patch according to the inversion data set of the aggregation model; and determining, from the poisoning patch, whether the aggregation model is a normal model. In this method, the inversion data set of the aggregation model is obtained by inverting the aggregation model obtained on the server side, and the poisoning patch is generated from the inversion data set, so that whether the aggregation model is a normal model is judged from the generated poisoning patch. The poisoning patch therefore does not need to be generated from the local data sets on the terminal devices in order to judge whether the server-side aggregation model is normal, which ensures the efficiency of model poisoning attack detection in federated learning while also preserving the security and privacy of the local data on each terminal device.
Optionally, inverting the aggregation model to obtain an inverted data set of the aggregation model, including:
and inverting the aggregation model by using a preset first objective function to obtain an inversion data set of the aggregation model.
For example, the preset first objective function is as follows:

X_MI = argmax_x f(G_{t+1}, x, τ)    (4)

where f is the probability that the server-side aggregation model G_{t+1} classifies the input x as the target label τ.

Specifically, after the aggregation model is obtained, the preset first objective function may be used to invert the aggregation model, so as to obtain the inversion data set D_MI(X_MI, Y_MI) of the aggregation model.
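A possible realisation of this inversion step is sketched below, assuming the aggregation model is a PyTorch classifier; the function name invert_class and the input shape are illustrative assumptions. The sketch performs gradient ascent on a noise image so that the model's confidence for the target label τ keeps increasing, as described above:

```python
import torch

def invert_class(agg_model, target_label, shape=(1, 1, 28, 28), steps=500, lr=0.1):
    """Hypothetical model inversion: synthesise an input that the aggregation model classifies as target_label."""
    x = torch.randn(shape, requires_grad=True)       # start from random noise
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = agg_model(x)
        # maximise the (log-)probability of the target class tau, i.e. minimise its negative
        loss = -torch.log_softmax(logits, dim=1)[0, target_label]
        loss.backward()
        optimizer.step()
    return x.detach()
```

Repeating this for every label yields the synthetic inversion data set D_MI(X_MI, Y_MI) without touching any local data on the terminal devices.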
Optionally, finding the damaged neuron in the aggregate model according to the inversion data set of the aggregate model specifically includes:
for example, the preset second objective function is adopted as follows:
c_com = max(NSF(c, θ, τ) − NSF(c, μ, τ))    (5)

where c is a neuron, c_com is a damaged neuron, NSF is the state function of the neuron, and θ and μ are the sampled values of the inversion data set image and the patch-added image, respectively.
Specifically, since the damaged neuron is abnormally activated at a specific input, a preset second objective function may be used to obtain the damaged neuron in the aggregation model.
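A minimal sketch of this search, under the assumption that a neuron's "state" is approximated by its activation value and that compromised neurons are those whose activation rises most when a candidate patch is added; the hook-based helpers below are illustrative, not the patent's exact NSF definition:

```python
import torch

def layer_activations(model, layer, x):
    """Record the output activations of one layer for input x via a forward hook."""
    recorded = {}
    handle = layer.register_forward_hook(lambda mod, inp, out: recorded.update(act=out.detach()))
    model(x)
    handle.remove()
    return recorded["act"]

def find_compromised_neurons(model, layer, clean_x, patched_x, top_k=5):
    """Score units by activation increase on patched inputs vs. clean inputs (cf. equation (5))."""
    diff = layer_activations(model, layer, patched_x) - layer_activations(model, layer, clean_x)
    score = diff.mean(dim=0).flatten()       # average over the batch, one score per unit
    return torch.topk(score, top_k).indices  # indices of the most abnormally activated units
```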
Optionally, generating a poison patch from the inverted data set of the aggregated model comprises:
and generating a toxic patch according to the inversion data set of the aggregation model and the damaged neuron in the aggregation model.
For example, the damaged neurons and the inversion data set D_MI(X_MI, Y_MI) are used to obtain a poisoning patch:

M = argmax_M (g(x_MI; c_com; M; τ) · Δ)    (6)

where M is the poisoning patch generated using the gradient of the damaged neurons, c_com is the damaged neuron, g is the probability that, after the patch is added, the inversion data set of the aggregation model under the current damaged neuron is classified into the target class τ, and Δ is the hyper-parameter setting of the patch.
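The sketch below is a simplified mask-and-pattern search in the spirit of equation (6): it optimises a small patch over the inversion data set so that patched inputs are classified as the target class τ, but it drives the target class directly rather than scoring the damaged-neuron term; all names and the regularisation weight are assumptions:

```python
import torch

def generate_patch(agg_model, x_inv, target_label, steps=300, lr=0.05, reg_weight=0.01):
    """Hypothetical patch search over the inversion data set (simplified variant of equation (6))."""
    mask = torch.zeros_like(x_inv[:1], requires_grad=True)    # where the patch is applied
    pattern = torch.rand_like(x_inv[:1], requires_grad=True)  # what the patch looks like
    optimizer = torch.optim.Adam([mask, pattern], lr=lr)
    targets = torch.full((x_inv.size(0),), target_label, dtype=torch.long)
    for _ in range(steps):
        optimizer.zero_grad()
        m = torch.sigmoid(mask)                                # keep the mask in [0, 1]
        patched = (1 - m) * x_inv + m * pattern
        cls_loss = torch.nn.functional.cross_entropy(agg_model(patched), targets)
        reg = m.abs().sum()                                    # prefer small patches
        (cls_loss + reg_weight * reg).backward()
        optimizer.step()
    return torch.sigmoid(mask).detach(), pattern.detach()
```

The resulting mask size per target class is what the patch evaluation index described below is computed over.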
Optionally, performing aggregation processing according to model parameters of each terminal device side model to obtain an aggregation model of the server side, where the aggregation model includes: and adopting a preset model aggregation function to aggregate the model parameters of the models at the terminal equipment sides to obtain an aggregation model at the server side.
For example, the model aggregation function employed is equation (3) above:

G_{t+1} = G_t + (1/F) Σ_{i=1}^{F} (L_i^{t+1} − G_t)

where G_{t+1} is the aggregated model after server-side aggregation, G_t is the aggregated model of the previous training round, L_i^{t+1} is the local training model, and 1/F is the weight scaling.

Specifically, training of the deep neural network is distributed among the K participating terminal devices. In each iteration round t, the server randomly selects a subset of F participants and sends them the current joint model G_t; each selected participant trains a new local model L_i^{t+1} based on the data set stored on its local terminal device and sends the difference L_i^{t+1} − G_t to the server, so that the server updates the global aggregation model according to the received differences, obtaining the aggregation model G_{t+1}.
Fig. 4 is a schematic flowchart of another federally learned model poisoning attack detection method according to an embodiment of the present application; as shown in fig. 4, in step S205: determining whether the aggregation model is a normal model according to the poisoned patch, specifically comprising:
s401, determining a patch evaluation index value of the poisoned patch.
To detect outliers, a patch evaluation index value for a poisoned patch based on median absolute deviation may be used.
First, the absolute deviation between each data point and the median is computed; the median of these absolute deviations, called the MAD, provides a robust measure of dispersion.

Then, the patch evaluation index value of a data point is defined as its absolute deviation divided by the MAD, as follows:

df = |mask_l − median(mask)| / MAD    (7)

where df is the patch evaluation index value. The MAD may be calculated by the following formula:

MAD = median_l(|mask_l − median(mask)|)    (8)

where MAD is the median absolute deviation, mask_l is the mask of the candidate poisoning patch for the l-th class (filled with 0s and 1s; there are N masks for the N classes of the data set), and l is the index of the mask.

Since a smaller mask value is more likely to indicate poisoning, the patch evaluation index value df is computed for the mask with the smallest value, using its absolute deviation and the MAD.
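A minimal sketch of this evaluation, assuming one candidate mask per class and that an unusually small mask indicates a back door; the helper names, the epsilon guard, and the default threshold are assumptions:

```python
import numpy as np

def patch_evaluation_index(mask_sizes):
    """Hypothetical df computation: absolute deviation from the median, divided by the MAD (cf. equations (7)-(8))."""
    sizes = np.asarray(mask_sizes, dtype=float)  # L1 size of the candidate mask for each of the N classes
    med = np.median(sizes)
    mad = np.median(np.abs(sizes - med))         # median absolute deviation
    return np.abs(sizes - med) / (mad + 1e-12)   # df per class

def is_poisoned(mask_sizes, threshold=2.0):
    """Flag the aggregation model if the smallest mask is an outlier, i.e. its df meets the threshold."""
    df = patch_evaluation_index(mask_sizes)
    smallest = int(np.argmin(mask_sizes))
    return df[smallest] >= threshold
```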
S402, determining whether the aggregation model is a normal model or not according to the patch evaluation index value.
In some embodiments, after the patch evaluation index value is obtained, whether the aggregation model is a normal model may be further determined according to the patch evaluation index value df; for example, if the patch evaluation index value is greater than or equal to a preset threshold, it is determined that the aggregation model is a poisoning model.
For example, if the calculated patch evaluation index value df is 3 and the preset threshold is 2, it may be determined that the patch evaluation index value is greater than the preset threshold, that is, 3>2, and it is determined that the aggregation model is the poisoning model.
Optionally, determining whether the aggregation model is a normal model according to the patch evaluation index value includes: and if the patch evaluation index value is smaller than a preset threshold value, determining that the aggregation model is a normal model.
In a possible implementation manner, for example, if the calculated patch evaluation index value df is 1 and the preset threshold is 2, it may be determined that the patch evaluation index value is smaller than the preset threshold, that is, 1<2, it is determined that the aggregation model is a normal model, and the parameters of the aggregation model are issued to each participating terminal device, so that a model with better performance can be obtained by cooperatively training a plurality of participating terminal devices, thereby improving the accuracy of the identification of the participating terminal devices and improving the availability of federal learning.
Optionally, determining whether the aggregation model is a normal model according to the patch evaluation index value includes: and if the patch evaluation index value is greater than or equal to the preset threshold value, determining that the aggregation model is the poisoning model.
In another possible implementation manner, for example, if the calculated patch evaluation index value is greater than or equal to a preset threshold, it may be determined that the aggregation model is a poisoning model, and the efficiency of detecting backdoor poisoning may be improved.
The detection process of the model poisoning attack of federal learning of the present application is described below with reference to a specific data set example, and specific implementation processes and technical effects thereof are referred to above and will not be described again below.
FIG. 5 is a block diagram of a method for detecting a model poisoning attack for federated learning according to an embodiment of the present application; as shown in fig. 5, there are 16 terminal devices participating in federated learning, of which only 5 are shown in the figure. The data set in each terminal device is the MNIST (Modified National Institute of Standards and Technology) data set, a picture data set of handwritten digits, and the selected training model is AlexNet. The specific process is as follows:
1) initializing the server side, and setting the current t to be 0
2) Each terminal device is initialized, and the MNIST data set in the 5 terminal devices is trained with the selected AlexNet model.
The number of training rounds E set on the terminal device side is 10, the learning rate ε is 0.1, and the data amount B per round of training is 16.
3) The local data sets D_i (i ≤ 4) in 4 selected terminal devices are used for training to obtain local training models L_i^{t+1}.
For example, the objective of a benign participating terminal device is given by equation (1) above, and the objective of a malicious participating terminal device by equation (2) above.
4) And uploading model parameters obtained by training the selected 4 terminal devices to a server, and carrying out model aggregation in the server, wherein the aggregation formula is the formula (3).
5) After aggregation, an aggregation model G_{t+1} is obtained, and model inversion is performed on the aggregation model to obtain an inversion data set D_MI(X_MI, Y_MI); the objective function of the model inversion is equation (4) above.
6) Searching for damaged neurons in the aggregate model, wherein the damaged neurons are abnormally activated under specific input, and the objective function is the formula (5).
7) A poisoning patch is obtained from the model inversion data set D_MI(X_MI, Y_MI) and the damaged neurons; the calculation formula is equation (6) above.
8) After the poisoning patch is obtained, the patch evaluation index value of the poisoning patch is determined; the calculation formulas are equations (7) and (8) above.
For example, it may be determined whether the aggregation model is a normal model according to the patch evaluation index value, and if the patch evaluation index value is smaller than the preset threshold, it is determined that the aggregation model is a normal model, and the aggregation model may be issued to the 5 terminal devices participating in federal learning.
And if the patch evaluation index value is greater than or equal to the preset threshold value, determining that the aggregation model is the poisoning model.
Fig. 6 is a frame diagram of another federally learned model poisoning attack detection method provided in an embodiment of the present application; as shown in fig. 6, there are 16 terminal devices participating in federated learning, of which only 5 are shown in the figure. The data set in each terminal device is the CIFAR10 data set (a computer vision data set for generic object recognition), and the selected training model is VGG16. The specific process is as follows:
1) initializing the server side, and setting the current t to be 0
2) Initializing each terminal device, and training a CIFAR10 data set in the 5 terminal devices by adopting the selected VGG16 model.
The number of training rounds E set on the terminal device side is 30, the learning rate ε is 0.01, and the data amount B per round of training is 32.
3) The local data sets D_i (i ≤ 4) in 4 selected terminal devices are used for training to obtain local training models L_i^{t+1}.
For example, the objective of a benign participating terminal device is given by equation (1) above, and the objective of a malicious participating terminal device by equation (2) above.
4) And uploading model parameters obtained by training the selected 4 terminal devices to a server, and carrying out model aggregation in the server, wherein the aggregation formula is the formula (3).
5) After aggregation, an aggregation model G_{t+1} is obtained, and model inversion is performed on the aggregation model to obtain an inversion data set D_MI(X_MI, Y_MI); the objective function of the model inversion is equation (4) above.
6) Searching for damaged neurons in the aggregate model, wherein the damaged neurons are abnormally activated under specific input, and the objective function is the formula (5).
7) A poisoning patch is obtained from the model inversion data set D_MI(X_MI, Y_MI) and the damaged neurons; the calculation formula is equation (6) above.
8) After the poisoning patch is obtained, the patch evaluation index value of the poisoning patch is determined; the calculation formulas are equations (7) and (8) above.
For example, whether the aggregation model is a normal model is determined according to the patch evaluation index value, and if the patch evaluation index value is smaller than the preset threshold value, it is determined that the aggregation model is a normal model, and the aggregation model may be issued to the 5 terminal devices participating in federal learning.
And if the patch evaluation index value is greater than or equal to the preset threshold value, determining that the aggregation model is the poisoning model.
The following describes an apparatus, a storage medium, and the like for performing the model poisoning attack detection for federal learning provided in the present application, and specific implementation procedures and technical effects thereof are described above and will not be described again below.
Fig. 7 is a schematic structural diagram of a device for detecting poisoning attacks on a federated learning model provided in the embodiment of the present application; as shown in fig. 7, the apparatus includes: a receiving module 701, an aggregation module 702, an inversion module 703, a generation module 704 and a determination module 705;
a receiving module 701, configured to receive a model parameter of a terminal device side model sent by at least one terminal device participating in federal learning;
an aggregation module 702, configured to perform aggregation processing according to the model parameters of each terminal device side model to obtain an aggregation model on the server side;
an inversion module 703, configured to invert the aggregation model to obtain an inversion data set of the aggregation model;
a generating module 704, configured to generate a poisoned patch according to the inversion data set of the aggregation model;
the determining module 705 is configured to determine whether the aggregation model is a normal model according to the poison patch.
Optionally, the aggregation module 702 is specifically configured to:
and inverting the aggregation model by using a preset first objective function to obtain an inversion data set of the aggregation model.
Optionally, the generating module 704 is specifically configured to:
and generating a toxic patch according to the inversion data set of the aggregation model and the damaged neuron in the aggregation model.
Optionally, the generating module 704 is further configured to:
obtaining damaged neurons in the aggregation model by adopting a preset second objective function; wherein the damaged neuron is abnormally activated by a specific input.
Optionally, the generating module 704 is further configured to:
and adopting a preset model aggregation function to aggregate the model parameters of the models at the terminal equipment sides to obtain an aggregation model at the server side.
Optionally, the determining module 705 is specifically configured to:
determining a patch evaluation index value of the poisoned patch; and determining whether the aggregation model is a normal model or not according to the patch evaluation index value.
Optionally, the determining module 705 is further configured to:
if the patch evaluation index value is smaller than the preset threshold value, determining that the aggregation model is a normal model.
Optionally, the determining module 705 is further configured to:
and if the patch evaluation index value is greater than or equal to the preset threshold value, determining that the aggregation model is the poisoning model.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
These modules may be one or more integrated circuits configured to implement the above methods, such as: one or more Application Specific Integrated Circuits (ASICs), one or more digital signal processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs), among others. As another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. These modules may also be integrated together and implemented in the form of a system-on-a-chip (SOC).
Optionally, the present application also provides a program product, such as a computer readable storage medium, comprising a program which, when being executed by a processor, is adapted to carry out the above-mentioned method embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to perform some steps of the methods according to the embodiments of the present application. And the aforementioned storage medium includes: a U disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Claims (10)
1. A method for detecting model poisoning attack of federated learning is characterized by comprising the following steps:
receiving model parameters of a terminal equipment side model sent by at least one terminal equipment participating in federal learning;
performing aggregation processing according to the model parameters of each terminal equipment side model to obtain an aggregation model of the server side;
inverting the aggregation model to obtain an inversion data set of the aggregation model;
generating a poisoned patch according to the inversion data set of the aggregation model;
determining whether the aggregated model is a normal model according to the poisoned patch.
2. The method of claim 1, the inverting the aggregate model to obtain an inverted data set of the aggregate model, comprising:
and inverting the aggregation model by using a preset first objective function to obtain an inversion data set of the aggregation model.
3. The method of claim 1, the generating a poison patch from an inverted data set of the aggregate model, comprising:
and generating a toxic patch according to the inversion data set of the aggregation model and the damaged neuron in the aggregation model.
4. The method of claim 3, further comprising, prior to the generating a toxic patch from the inverted data set of the aggregated model and the damaged neurons in the aggregated model:
obtaining damaged neurons in the aggregation model by adopting a preset second objective function; wherein the damaged neuron is abnormally activated at a specific input.
5. The method according to any one of claims 1 to 4, wherein the aggregating the model parameters of each terminal device side model to obtain a server side aggregate model comprises:
and adopting a preset model aggregation function to aggregate the model parameters of the terminal equipment side models to obtain an aggregation model of the server side.
6. The method of any of claims 1-4, the determining whether the aggregated model is a normal model from the poison patch, comprising:
determining a patch evaluation index value of the poisoned patch;
and determining whether the aggregation model is a normal model or not according to the patch evaluation index value.
7. The method of claim 6, the determining whether the aggregated model is a normal model according to the patch evaluation index value, comprising:
and if the patch evaluation index value is smaller than a preset threshold value, determining that the aggregation model is a normal model.
8. The method of claim 6, wherein determining whether the aggregated model is a normal model based on the patch evaluation index value comprises:
and if the patch evaluation index value is greater than or equal to a preset threshold value, determining that the aggregation model is a poisoning model.
9. A model poisoning attack detection device for federated learning, characterized in that the device comprises: a receiving module, an aggregation module, an inversion module, a generation module and a determination module;
the receiving module is used for receiving the model parameters of the terminal equipment side model sent by at least one terminal equipment participating in federal learning;
the aggregation module is used for carrying out aggregation processing according to the model parameters of the terminal equipment side models to obtain an aggregation model of the server side;
the inversion module is used for inverting the aggregation model to obtain an inversion data set of the aggregation model;
the generating module is used for generating a poisoning patch according to the inversion data set of the aggregation model;
the determining module is used for determining whether the aggregation model is a normal model according to the poisoning patch.
10. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the method according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011463000.7A CN112528281B (en) | 2020-12-11 | 2020-12-11 | Poisoning attack detection method, device and equipment for federal learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011463000.7A CN112528281B (en) | 2020-12-11 | 2020-12-11 | Poisoning attack detection method, device and equipment for federal learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112528281A true CN112528281A (en) | 2021-03-19 |
CN112528281B CN112528281B (en) | 2024-08-27 |
Family
ID=74999325
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011463000.7A Active CN112528281B (en) | 2020-12-11 | 2020-12-11 | Poisoning attack detection method, device and equipment for federal learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112528281B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113360897A (en) * | 2021-06-03 | 2021-09-07 | 哈尔滨工业大学 | Free Rider attack method under horizontal federated learning architecture |
CN115333825A (en) * | 2022-08-10 | 2022-11-11 | 浙江工业大学 | Defense method aiming at gradient attack of federal learning neurons |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110598400A (en) * | 2019-08-29 | 2019-12-20 | 浙江工业大学 | Defense method for high hidden poisoning attack based on generation countermeasure network and application |
CN111259404A (en) * | 2020-01-09 | 2020-06-09 | 鹏城实验室 | Toxic sample generation method, device, equipment and computer readable storage medium |
CN111598143A (en) * | 2020-04-27 | 2020-08-28 | 浙江工业大学 | Credit evaluation-based defense method for federal learning poisoning attack |
CN111914256A (en) * | 2020-07-17 | 2020-11-10 | 华中科技大学 | Defense method for machine learning training data under toxic attack |
-
2020
- 2020-12-11 CN CN202011463000.7A patent/CN112528281B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110598400A (en) * | 2019-08-29 | 2019-12-20 | 浙江工业大学 | Defense method for high hidden poisoning attack based on generation countermeasure network and application |
CN111259404A (en) * | 2020-01-09 | 2020-06-09 | 鹏城实验室 | Toxic sample generation method, device, equipment and computer readable storage medium |
CN111598143A (en) * | 2020-04-27 | 2020-08-28 | 浙江工业大学 | Credit evaluation-based defense method for federal learning poisoning attack |
CN111914256A (en) * | 2020-07-17 | 2020-11-10 | 华中科技大学 | Defense method for machine learning training data under toxic attack |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113360897A (en) * | 2021-06-03 | 2021-09-07 | 哈尔滨工业大学 | Free Rider attack method under horizontal federated learning architecture |
CN115333825A (en) * | 2022-08-10 | 2022-11-11 | 浙江工业大学 | Defense method aiming at gradient attack of federal learning neurons |
CN115333825B (en) * | 2022-08-10 | 2024-04-09 | 浙江工业大学 | Defense method for federal learning neuron gradient attack |
Also Published As
Publication number | Publication date |
---|---|
CN112528281B (en) | 2024-08-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111046379B (en) | Anti-attack monitoring method and device | |
CN111914256B (en) | Defense method for machine learning training data under toxic attack | |
Aïvodji et al. | Gamin: An adversarial approach to black-box model inversion | |
CN114186237A (en) | Truth-value discovery-based robust federated learning model aggregation method | |
CN113645197B (en) | Decentralized federal learning method, device and system | |
CN114331829A (en) | Countermeasure sample generation method, device, equipment and readable storage medium | |
CN112528281B (en) | Poisoning attack detection method, device and equipment for federal learning | |
CN112365005B (en) | Federal learning poisoning detection method based on neuron distribution characteristics | |
CN111340144A (en) | Risk sample detection method and device, electronic equipment and storage medium | |
CN108400972A (en) | A kind of method for detecting abnormality and device | |
CN112231570A (en) | Recommendation system trust attack detection method, device, equipment and storage medium | |
CN110912874A (en) | Method and system for effectively identifying machine access behaviors | |
CN114021188A (en) | Method and device for interactive security verification of federated learning protocol and electronic equipment | |
CN113688387A (en) | Defense method for federal learning poisoning attack based on server and client dual detection | |
CN115422537A (en) | Method for resisting turnover attack of federal learning label | |
Yin et al. | Detecting CAN overlapped voltage attacks with an improved voltage-based in-vehicle intrusion detection system | |
CN116737850A (en) | Graph neural network model training method for APT entity relation prediction | |
CN115758365A (en) | Neuron activation dependency graph-based federated learning model poisoning attack detection method | |
Battiato et al. | Computational data analysis for first quantization estimation on JPEG double compressed images | |
CN114694222A (en) | Image processing method, image processing device, computer equipment and storage medium | |
CN114360002A (en) | Face recognition model training method and device based on federal learning | |
Sami et al. | An automated framework for finding fake accounts on Facebook | |
US20220405585A1 (en) | Training device, estimation device, training method, and training program | |
CN112269987B (en) | Intelligent model information leakage degree evaluation method, system, medium and equipment | |
WO2022222143A1 (en) | Security test method and apparatus for artificial intelligence system, and terminal device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |