CN113420289B - Hidden poisoning attack defense method and device for deep learning model


Info

Publication number
CN113420289B
Authority
CN
China
Prior art keywords
deep learning
learning model
poisoning
image
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110675083.4A
Other languages
Chinese (zh)
Other versions
CN113420289A (en)
Inventor
陈晋音
邹健飞
熊晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202110675083.4A priority Critical patent/CN113420289B/en
Publication of CN113420289A publication Critical patent/CN113420289A/en
Application granted granted Critical
Publication of CN113420289B publication Critical patent/CN113420289B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/55 Detecting local intrusion or implementing counter-measures
    • G06F 21/554 Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses a hidden poisoning attack defense method for a deep learning model, which comprises the steps of: (1) obtaining an image data set and a deep learning model; (2) screening with the deep learning model to obtain a clean image data set; (3) generating poisoning samples, concealing the generation process within the image preprocessing procedure; (4) inputting the generated poisoning samples into the deep learning model to poison it, so that the model misjudges trigger samples in the testing stage; (5) inputting the generated poisoning samples, labeled with their correct class labels, into the deep learning model for reinforcement training so as to repair it. The invention also discloses a hidden poisoning attack defense device for the deep learning model, used to implement the method. The method realizes a hidden poisoning attack on the model by generating poisoning samples, and then uses the generated samples to repair the model, thereby improving the model's security and robustness.

Description

Hidden poisoning attack defense method and device for deep learning model
Technical Field
The invention relates to the technical field of poisoning detection, in particular to a hidden poisoning attack defense method and device for a deep learning model.
Background
Deep learning has gradually become a research hotspot and the mainstream development direction in the field of artificial intelligence. Deep learning is a machine learning technique that uses a computational model composed of multiple processing layers to learn data representations with multiple levels of abstraction. It represents the main development direction of machine learning and artificial intelligence research and has brought revolutionary progress to fields such as machine learning and computer vision. Breakthroughs of artificial intelligence technology in computer vision, natural language processing and other fields have led artificial intelligence into a new round of explosive development, and deep learning is the key to these breakthroughs. Image classification based on deep convolutional networks has already exceeded human-eye accuracy, speech recognition based on deep neural networks has reached 95% accuracy, and machine translation based on deep neural networks approaches the average level of human translators. With this rapid improvement in accuracy, computer vision and natural language processing have entered the industrialization stage and have driven the rise of emerging industries.
Artificial intelligence models based on neural networks are widely used in applications such as face recognition, object detection and autonomous driving, and have proved superior to traditional computing methods. More and more people believe that applying artificial intelligence models to all aspects of life plays a crucial role. As complexity and functionality increase, training such models requires significant effort in collecting training data and optimizing performance. Pre-trained models have therefore become valuable goods that suppliers (e.g., Google) and developers distribute, share, reuse, and even sell for profit. For example, thousands of pre-trained models are released and shared on the Caffe Model Zoo, the ONNX Model Zoo, and the BigML model market, just as traditional software is shared on GitHub. These models may be trained by well-reputed suppliers, institutions, or even individuals.
However, a pre-trained intelligent system model may contain a backdoor injected through training or through modification of internal neuron weights. Such a trojaned model works normally on regular inputs, but inputs stamped with a special trigger pattern are misclassified into a specific output label. The concealment of the poisoning samples in current poisoning attack methods is poor, and their effect in practical applications is limited. This patent therefore provides a highly concealed poisoning attack method, which hides the generation of poisoning samples within the image preprocessing process, together with a defense method against this hidden poisoning attack, thereby contributing to the improvement of model security and robustness.
Disclosure of Invention
The invention aims to provide a hidden poisoning attack defense method and device for a deep learning model, which conceal the process of generating poisoning samples within the image preprocessing process by means of an algorithm so that the poisoning process is more covert, and which simultaneously provide a defense method against this hidden poisoning attack, thereby improving the security and robustness of the deep learning model.
A hidden poisoning attack defense method facing a deep learning model comprises the following steps:
(1) acquiring an image data set and a deep learning model;
(2) identifying the image data set by using a deep learning model, screening to obtain images which can be identified correctly, and forming a clean image data set;
(3) generating a poisoning sample, and concealing the process of generating the poisoning sample in the image preprocessing process;
(4) inputting the generated poisoning sample into the deep learning model to poison the model, so that the model misjudges the trigger sample in the testing stage;
(5) inputting the generated poisoning sample labeled with the correct class label into the deep learning model for reinforcement training, so as to repair the deep learning model and improve the accuracy of its recognition results.
According to the scheme, the hidden poisoning attack on the model is realized by generating the poisoning sample, and then the generated poisoning sample is used for repairing the model, so that the safety and the robustness of the model are improved.
Preferably, the image dataset is a MNIST dataset, a CIFAR10 dataset, or a Driving dataset; the deep learning model is a LeNet deep learning model, a VGG19 deep learning model or a ResNet50 deep learning model.
Preferably, the step (2) is specifically:
inputting the image data set of step (1) into the deep learning model, which outputs a predicted class label for each input image; if the predicted class label is consistent with the true class label of the image, the deep learning model identifies the image correctly and the image is put into the clean image data set.
Preferably, in step (3), the image preprocessing specifically uses an interpolation method, and the poisoning sample is generated by using a resize process in the image preprocessing.
Further preferably, on the basis of interpolation linearization, inverse interpolation is adopted to solve the following problem:

W_row ρ W_col = 0

wherein W_row and W_col are linearly independent, the core image I_c is set by the attacker, and the perturbation ρ is a solution of the above formula; when the output size of the interpolation is smaller than the input size, the formula becomes an underdetermined equation and the solution space is non-empty; without changing the output, the perturbation is manipulated on the basis of the solution space, the equation being divided into W_row ρ = 0 and ρ W_col = 0, the basis of the solution space being the union of the bases of the two subspaces; the perturbation matrix is computed by ρ = B_row X_row + X_col B_col, where B_row and B_col are the bases of the null spaces of the weight matrices and the matrices X_row and X_col are coordinates.
Further preferably, the image I_c is modified by inverse interpolation so that it is visually similar to another image I_b, the loss function being calculated as:

L = || I_c + B_row X_row + X_col B_col − I_b ||²

The final generated image input into the model for training is I_c + (B_row X_row + X_col B_col).
Preferably, step (4) is specifically represented by:
f(I)[c]=b
wherein f(I)[c] = b denotes that, in the testing stage, the input image I with true class label c is predicted by the deep learning model as class label b.
A covert poisoning attack defense device facing a deep learning model, comprising:
the acquisition module is used for acquiring an image data set and a deep learning model;
the acquisition clean image data set module is used for screening by utilizing a deep learning model to acquire a clean image data set;
the generating module is used for concealing the process of generating the poisoning sample in the image preprocessing process;
the poisoning module is used for inputting the generated poisoning sample into the deep learning model to poison the model, so that the model misjudges the trigger sample in the testing stage;
and the repairing module is used for inputting the generated poisoning sample labeled with the correct class label into the deep learning model for reinforcement training so as to repair the deep learning model.
The invention has the beneficial effects that:
according to the hidden poisoning attack defense method for the deep learning model, the process of generating the poisoning sample is hidden in the image preprocessing process according to the algorithm, so that the poisoning process is more hidden, and the method for the hidden poisoning attack has strong concealment. And performing reinforcement training on the original deep learning model by using the obtained toxic image to repair the deep learning model so as to improve the safety and robustness of the deep learning model.
Drawings
FIG. 1 is a flow chart of the method for defending against the hidden poisoning attack facing the deep learning model provided by the present invention.
FIG. 2 is a structural block diagram of the device for defending against the hidden poisoning attack facing the deep learning model provided by the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
For automatic driving models and face recognition models, the safety requirements are high, but such models are easily affected by poisoning attacks. To improve their safety, the hidden poisoning attack defense method for the deep learning model generates poisoning samples to realize a hidden poisoning attack on the model, and then uses the generated poisoning samples to repair the model, thereby improving the model's security and robustness.
A hidden poisoning attack defense method facing a deep learning model comprises the following steps:
(1) obtaining an image dataset and a deep learning model
The image data set is an MNIST data set, a CIFAR10 data set or a Driving data set.
The deep learning model is a LeNet deep learning model, a VGG19 deep learning model or a ResNet50 deep learning model.
(2) Recognizing the image data set by using a deep learning model, screening to obtain images which can be correctly recognized, and forming a clean image data set
The data set from step (1) is input into the deep learning model, which outputs a predicted class label for each input image; if the predicted class label is consistent with the true class label of the image, the deep learning model identifies the image correctly and the image is put into the clean image data set.
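For illustration only, a minimal Python sketch of this screening step follows; the names model and loader are assumptions (e.g., a ResNet50 and a CIFAR10 DataLoader), not part of the patent.

    import torch

    @torch.no_grad()
    def build_clean_dataset(model, loader, device="cpu"):
        """Keep only the images whose predicted class label matches the true one."""
        model.eval().to(device)
        clean_images, clean_labels = [], []
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1).cpu()
            keep = preds == labels              # predicted label == true class label
            clean_images.append(images[keep])
            clean_labels.append(labels[keep])
        return torch.cat(clean_images), torch.cat(clean_labels)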
(3) Generating a poisoning sample, and concealing the process of generating the poisoning sample in an image preprocessing process
Referring to fig. 1, generating a poisoning sample is accomplished using the resize process in image preprocessing. Interpolation is a key image preprocessing technique: it resizes the image I obtained in step (2) to a target size. This can be represented as I_H,W = f_i(I_h,w), where h and H are the heights before and after interpolation, and w and W are the corresponding widths. As a linear operation, the interpolation computation is equivalent to left- and right-multiplying by the corresponding weight matrices; it can be expressed as f_i(I) = W_row I W_col, where W_row and W_col are the weight matrices for row sampling and column sampling respectively.
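As a hedged sketch of this linearization, the following Python code builds the 0/1 weight matrices of a nearest-neighbour resize (chosen here only because its weights are easy to write down; the patent does not fix a particular interpolation kernel) and checks that left- and right-multiplication reproduces the direct resize.

    import numpy as np

    def nearest_weights(n_out, n_in):
        """0/1 sampling matrix of a nearest-neighbour resize from n_in to n_out points."""
        W = np.zeros((n_out, n_in))
        for i in range(n_out):
            W[i, int(i * n_in / n_out)] = 1.0
        return W

    h, w, H, W_ = 8, 8, 4, 4
    W_row = nearest_weights(H, h)        # (H, h): samples rows
    W_col = nearest_weights(W_, w).T     # (w, W_): samples columns

    I = np.random.rand(h, w)
    assert np.allclose(W_row @ I @ W_col, I[::2, ::2])   # matrix form == direct resize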
On the basis of interpolation linearization, either sampled or non-sampled pixels can be modified, and the interpolation result is manipulated through the weight matrices. When this idea is applied to a poisoning attack, the perturbation is added to the non-sampled area of the input image while the sampled area is left unchanged, eventually covering the original content of the non-sampled area. Once the proportion of modified pixels reaches a certain ratio, the image actually fed to the interpolation cannot be distinguished from the disturbance image by eye, while the output of the interpolation is unchanged. This process is called inverse interpolation. The following problem is solved:
W_row ρ W_col = 0

wherein W_row and W_col are linearly independent. The core image I_c is set by the attacker, and the perturbation ρ is a solution of the above equation. When the output size of the interpolation is smaller than the input size, the above formula becomes an underdetermined equation and the solution space is non-empty. The perturbation can therefore be manipulated through the basis of the solution space without changing the output; the key is to find the optimal coordinates. The equation can be divided into W_row ρ = 0 and ρ W_col = 0, and the basis of the solution space is the union of the bases of the two subspaces. The perturbation matrix is computed by ρ = B_row X_row + X_col B_col, where B_row and B_col are the bases of the null spaces of the weight matrices. The matrices X_row and X_col are coordinates, and the above equation holds regardless of their values, because the weight matrices are orthogonal to their corresponding bases, so the perturbation after interpolation is zero. This is of great significance for extending the attack.
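A hedged numerical sketch of this construction, continuing the nearest-neighbour example above (the basis computation via scipy.linalg.null_space is an implementation choice, not mandated by the patent):

    import numpy as np
    from scipy.linalg import null_space

    def nearest_weights(n_out, n_in):
        W = np.zeros((n_out, n_in))
        for i in range(n_out):
            W[i, int(i * n_in / n_out)] = 1.0
        return W

    h, w, H, W_ = 8, 8, 4, 4
    W_row = nearest_weights(H, h)
    W_col = nearest_weights(W_, w).T

    B_row = null_space(W_row)            # columns span {v : W_row v = 0}
    B_col = null_space(W_col.T).T        # rows span    {u : u W_col = 0}

    X_row = np.random.randn(B_row.shape[1], w)   # coordinates: any values work
    X_col = np.random.randn(h, B_col.shape[0])
    rho = B_row @ X_row + X_col @ B_col          # rho = B_row X_row + X_col B_col

    assert np.allclose(W_row @ rho @ W_col, 0)   # perturbation vanishes after resize

Whatever values the coordinate matrices take, W_row ρ W_col = 0, which is exactly the orthogonality property the description relies on.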
The purpose of inverse interpolation is to modify the image I_c so that it is visually similar to another image I_b. The loss function is calculated as:

L = || I_c + B_row X_row + X_col B_col − I_b ||²

Stochastic gradient descent (SGD) is selected to compute the optimal coordinates. The overall optimized inverse interpolation (I_c, I_b) corresponds to hiding the core image I_c inside the cover image I_b: the result appears to be image I_b, but in the forward propagation of the neural network the cover image's pixels are filtered out by the interpolation sampling. The generated image effectively becomes the core image I_c, which is equivalent to assigning the class label of image I_b to the input image I_c. The final generated image input into the model for training is I_c + (B_row X_row + X_col B_col).
(4) Inputting the generated poisoning sample into the deep learning model to poison the model, causing the model to misjudge the trigger sample in the testing stage, where the misjudgment is expressed as:

f(I)[c]=b

wherein f(I)[c] = b denotes that, in the testing stage, the input image I with true class label c is predicted by the deep learning model as class label b.
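A short evaluation sketch of this misjudgment f(I)[c] = b (names are illustrative assumptions: trigger_images are the attack images generated above, and cover_label_b is the class label of the cover image I_b):

    import torch

    @torch.no_grad()
    def attack_success_rate(model, trigger_images, cover_label_b):
        """Fraction of trigger samples of true class c predicted as the cover class b."""
        preds = model(trigger_images).argmax(dim=1)
        return (preds == cover_label_b).float().mean().item()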
(5) Inputting the generated poisoning sample labeled with the correct class label into the deep learning model for reinforcement training, so as to repair the deep learning model and improve the accuracy of its recognition results.
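A hedged sketch of this repair step (hyperparameters and names are illustrative; poison_images are the generated poisoning samples and correct_labels their true class labels):

    import torch
    import torch.nn.functional as F

    def repair_model(model, poison_images, correct_labels, epochs=5, lr=1e-4):
        """Reinforcement training: fine-tune on correctly relabeled poisoning samples."""
        model.train()
        opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        for _ in range(epochs):
            opt.zero_grad()
            loss = F.cross_entropy(model(poison_images), correct_labels)
            loss.backward()
            opt.step()
        return model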
Referring to fig. 2, a hidden poisoning attack defense device facing a deep learning model comprises:
the acquisition module is used for acquiring an image data set and a deep learning model;
the acquisition clean image data set module is used for screening by utilizing a deep learning model to acquire a clean image data set;
the generating module is used for concealing the process of generating the poisoning sample in the image preprocessing process;
the poisoning module is used for inputting the generated poisoning sample into the deep learning model to poison the model, so that the model misjudges the trigger sample in the testing stage;
and the repairing module is used for inputting the generated poisoning sample labeled with the correct class label into the deep learning model for reinforcement training so as to repair the deep learning model.
It should be noted that, when the device for defending against the deep learning model-oriented covert poisoning attack according to the foregoing embodiment performs defense against the deep learning model-oriented covert poisoning attack, the division of the functional modules is taken as an example, and the function distribution may be completed by different functional modules as needed, that is, the internal structure of the terminal or the server is divided into different functional modules to complete all or part of the functions described above. In addition, the device for defending against hidden poisoning attacks facing the deep learning model and the method for defending against hidden poisoning attacks facing the deep learning model provided in the embodiments belong to the same concept, and specific implementation processes thereof are detailed in the embodiment of the method for defending against hidden poisoning attacks facing the deep learning model, and are not described herein again.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes and modifications can be made to the embodiments, and that some features can be replaced by equivalents, without departing from the spirit and scope of the invention.

Claims (6)

1. A hidden poisoning attack defense method facing a deep learning model is characterized by comprising the following steps:
(1) acquiring an image data set and a deep learning model;
(2) identifying the image data set by using a deep learning model, screening to obtain correctly identified images, and forming a clean image data set;
(3) generating a poisoning sample, and concealing the process of generating the poisoning sample in the image preprocessing process;
(4) inputting the generated poisoning sample into the deep learning model to poison the model, so that the model misjudges the trigger sample in the testing stage;
(5) inputting the generated poisoning sample labeled with the correct class label into the deep learning model for reinforcement training so as to repair the deep learning model;
in the step (3), the image preprocessing specifically adopts an interpolation method, and a resize process in the image preprocessing is utilized to generate a poisoning sample;
on the basis of interpolation linearization, inverse interpolation is adopted to solve the following problem:

W_row ρ W_col = 0

wherein W_row and W_col are linearly independent, the core image I_c is set by the attacker, and the perturbation ρ is a solution of the above formula; when the output size of the interpolation is smaller than the input size, the formula becomes an underdetermined equation and the solution space is non-empty; without changing the output, the perturbation is manipulated on the basis of the solution space, the equation being divided into W_row ρ = 0 and ρ W_col = 0, the basis of the solution space being the union of the bases of the two subspaces; the perturbation matrix is computed by ρ = B_row X_row + X_col B_col, where B_row and B_col are the bases of the null spaces of the weight matrices and the matrices X_row and X_col are coordinates.
2. The deep learning model-oriented covert poisoning attack defense method according to claim 1, wherein the image data set is an MNIST data set, a CIFAR10 data set or a Driving data set; the deep learning model is a LeNet deep learning model, a VGG19 deep learning model or a ResNet50 deep learning model.
3. The method for defending against hidden poisoning attacks of the deep learning model according to claim 1 or 2, wherein the step (2) is specifically:
inputting the image data set of step (1) into the deep learning model, which outputs a predicted class label for each input image; if the predicted class label is consistent with the true class label of the image, the deep learning model identifies the image correctly and the image is put into the clean image data set.
4. The deep learning model-oriented hidden poisoning attack defense method according to claim 1, characterized in that the image I_c is modified by inverse interpolation so that it is visually similar to another image I_b, the loss function being calculated as:

L = || I_c + B_row X_row + X_col B_col − I_b ||²

the final generated image input into the model for training being I_c + (B_row X_row + X_col B_col).
5. The method for defending against hidden poisoning attacks of deep learning models according to claim 1, wherein the step (4) is specifically expressed as:
f(I)[c]=b
wherein f(I)[c] = b denotes that, in the testing stage, the input image I with true class label c is predicted by the deep learning model as class label b.
6. A hidden poisoning attack defense device facing a deep learning model is characterized by comprising:
the acquisition module is used for acquiring an image data set and a deep learning model;
the acquisition clean image data set module is used for screening by utilizing a deep learning model to acquire a clean image data set;
the generating module is used for concealing the process of generating the poisoning sample in the image preprocessing process;
specifically, the image preprocessing adopts an interpolation method, and a resize process in the image preprocessing is utilized to generate a poisoning sample;
on the basis of interpolation linearization, inverse interpolation is adopted to solve the following problem:

W_row ρ W_col = 0

wherein W_row and W_col are linearly independent, the core image I_c is set by the attacker, and the perturbation ρ is a solution of the above formula; when the output size of the interpolation is smaller than the input size, the formula becomes an underdetermined equation and the solution space is non-empty; without changing the output, the perturbation is manipulated on the basis of the solution space, the equation being divided into W_row ρ = 0 and ρ W_col = 0, the basis of the solution space being the union of the bases of the two subspaces; the perturbation matrix is computed by ρ = B_row X_row + X_col B_col, where B_row and B_col are the bases of the null spaces of the weight matrices, and the matrices X_row and X_col are coordinates;
the poisoning module is used for inputting the generated poisoning sample into the deep learning model to poison the model, so that the model misjudges the trigger sample in the testing stage;
and the repairing module is used for inputting the generated poisoning sample labeled with the correct class label into the deep learning model for reinforcement training so as to repair the deep learning model.
CN202110675083.4A 2021-06-17 2021-06-17 Hidden poisoning attack defense method and device for deep learning model Active CN113420289B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110675083.4A CN113420289B (en) 2021-06-17 2021-06-17 Hidden poisoning attack defense method and device for deep learning model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110675083.4A CN113420289B (en) 2021-06-17 2021-06-17 Hidden poisoning attack defense method and device for deep learning model

Publications (2)

Publication Number Publication Date
CN113420289A CN113420289A (en) 2021-09-21
CN113420289B true CN113420289B (en) 2022-08-26

Family

ID=77789044

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110675083.4A Active CN113420289B (en) 2021-06-17 2021-06-17 Hidden poisoning attack defense method and device for deep learning model

Country Status (1)

Country Link
CN (1) CN113420289B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI814213B (en) * 2022-01-17 2023-09-01 國立清華大學 Data poisoning method and data poisoning apparatus
CN114462031B (en) * 2022-04-12 2022-07-29 北京瑞莱智慧科技有限公司 Back door attack method, related device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598400A (en) * 2019-08-29 2019-12-20 浙江工业大学 Defense method for high hidden poisoning attack based on generation countermeasure network and application
CN111753986A (en) * 2020-06-28 2020-10-09 浙江工业大学 Dynamic testing method and device for deep learning model
CN112905997A (en) * 2021-01-29 2021-06-04 浙江工业大学 Method, device and system for detecting poisoning attack facing deep learning model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11514297B2 (en) * 2019-05-29 2022-11-29 Anomalee Inc. Post-training detection and identification of human-imperceptible backdoor-poisoning attacks

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598400A (en) * 2019-08-29 2019-12-20 浙江工业大学 Defense method for high hidden poisoning attack based on generation countermeasure network and application
CN111753986A (en) * 2020-06-28 2020-10-09 浙江工业大学 Dynamic testing method and device for deep learning model
CN112905997A (en) * 2021-01-29 2021-06-04 浙江工业大学 Method, device and system for detecting poisoning attack facing deep learning model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Survey of Poisoning Attacks and Defenses on Deep Learning Models; 陈晋音; Journal of Cyber Security (《信息安全学报》); 2020-07-31; pp. 14-25 *

Also Published As

Publication number Publication date
CN113420289A (en) 2021-09-21

Similar Documents

Publication Publication Date Title
US11908244B2 (en) Human posture detection utilizing posture reference maps
CN107945204B (en) Pixel-level image matting method based on generation countermeasure network
Wang et al. SaliencyGAN: Deep learning semisupervised salient object detection in the fog of IoT
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
CN107239733A (en) Continuous hand-written character recognizing method and system
CN113420289B (en) Hidden poisoning attack defense method and device for deep learning model
CN111681178B (en) Knowledge distillation-based image defogging method
CN113344806A (en) Image defogging method and system based on global feature fusion attention network
CN106874879A (en) Handwritten Digit Recognition method based on multiple features fusion and deep learning network extraction
CN110245754A (en) A kind of knowledge distillating method based on position sensing figure
CN110826056A (en) Recommendation system attack detection method based on attention convolution self-encoder
CN109712108A (en) It is a kind of that vision positioning method is directed to based on various distinctive candidate frame generation network
CN116052218B (en) Pedestrian re-identification method
CN110969089A (en) Lightweight face recognition system and recognition method under noise environment
CN111739037B (en) Semantic segmentation method for indoor scene RGB-D image
CN108062559A (en) A kind of image classification method based on multiple receptive field, system and device
CN112906520A (en) Gesture coding-based action recognition method and device
CN112364747A (en) Target detection method under limited sample
CN112084895A (en) Pedestrian re-identification method based on deep learning
CN115861306B (en) Industrial product abnormality detection method based on self-supervision jigsaw module
CN117115911A (en) Hypergraph learning action recognition system based on attention mechanism
CN109583584A (en) The CNN with full articulamentum can be made to receive the method and system of indefinite shape input
CN114495163A (en) Pedestrian re-identification generation learning method based on category activation mapping
CN114638408A (en) Pedestrian trajectory prediction method based on spatiotemporal information
CN113128425A (en) Semantic self-adaptive graph network method for human action recognition based on skeleton sequence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant