CN109086884B - Neural network attack defense method based on gradient reverse countermeasure sample restoration - Google Patents
- Publication number
- CN109086884B
- Authority
- CN
- China
- Prior art keywords
- sample
- algorithm
- classification
- samples
- countermeasure
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
A neural network attack defense method based on gradient-reverse adversarial sample restoration. Adversarial samples are detected in a sample set and then restored to ordinary samples by applying an attack to them, which optimizes the sample set.
Description
Technical Field
The invention relates to a technique in the field of adversarial artificial intelligence, and in particular to a method for restoring an adversarial sample to a non-adversarial sample by applying a gradient-based reverse attack to it.
Background
Artificial intelligence is being applied widely across many fields, and as the technology matures its security problems are increasingly exposed. These problems are especially serious for classifiers: an attacker can cause a classifier to misclassify by adding a carefully constructed perturbation to a sample. Many studies have tried to resist adversarial-sample attacks by training a sufficiently robust model, but satisfactory results have been difficult to achieve. More recent studies try to detect adversarial samples from their characteristics, but detection alone still cannot improve accuracy.
Disclosure of Invention
To address the problem of how to handle adversarial samples, the invention provides a neural network attack defense method based on gradient-reverse adversarial sample restoration. A perturbation is added to an adversarial sample so that it crosses the decision boundary and is restored to a normal sample, after which it can be treated as a normal sample. This also improves the degree of reuse of the system.
The invention is realized by the following technical scheme:
The invention relates to a neural network attack defense method based on gradient-reverse adversarial sample restoration, which detects adversarial samples in a sample set and then restores them to ordinary samples by means of an attack, thereby optimizing the sample set.
The adversarial samples in the sample set are generated by, but not limited to, the FGSM, C&W, DeepFool and JSMA algorithms.
The attack used for restoration may be any of: the Fast Gradient Sign Method (FGSM), the optimization-based Carlini & Wagner attack (C&W), DeepFool, or the Jacobian-based Saliency Map Attack (JSMA). DeepFool is preferred for restoring adversarial samples.
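As a point of reference for the first attack listed, FGSM perturbs an input by one step along the sign of the loss gradient. The sketch below applies that step to a hypothetical logistic-regression classifier; the weights `w`, bias `b`, input `x` and step size `eps` are illustrative values, not taken from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step against a logistic-regression classifier.

    x: input vector, y: true label in {0, 1}, eps: per-coordinate step size.
    """
    # Gradient of the cross-entropy loss with respect to the input x.
    grad_x = (sigmoid(w @ x + b) - y) * w
    # Move every coordinate by eps in the direction that increases the loss.
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0])   # hypothetical weights
b = 0.0                     # hypothetical bias
x = np.array([0.5, 0.1])    # score w @ x + b = 0.3 > 0, so class 1
x_adv = fgsm(x, y=1, w=w, b=b, eps=0.2)
```

With these toy values the perturbed point lands on the other side of the boundary `w @ x + b = 0`, which is the kind of input the restoration step is meant to push back.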
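The recovery comprises the following specific steps:
step 1, calculating the minimum perturbation distance, that is, the shortest distance from the current input point to the separating plane; the perturbation generation method is derived for a binary classification model and then extended to multi-class classification;
step 2, adding the minimum-norm adversarial perturbation through iterative calculation, gradually pushing the image that lies inside the classification boundary out of the boundary until misclassification occurs, at which point the image has been restored to a normal sample.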
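For an affine binary classifier, the minimum perturbation distance (the shortest distance from the input to the decision surface) has a closed form. The following is the standard point-to-hyperplane projection used by DeepFool, given here only as an illustration; the symbols $w$, $b$ and $r^{*}$ are notation introduced for this sketch and do not appear in the patent text.

```latex
f(x) = w^{\top} x + b, \qquad
r^{*}(x) = \arg\min_{r} \|r\|_2 \ \ \text{s.t.}\ \ f(x + r) = 0
         = -\frac{f(x)}{\|w\|_2^{2}}\, w, \qquad
\|r^{*}(x)\|_2 = \frac{|f(x)|}{\|w\|_2}.
```

For a general differentiable $f$, the same expression can be applied to the linearization of $f$ at the current point, with $\nabla f(x)$ in place of $w$.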
The iterative computation is as follows: initialize $x_0 = x_{adv}$ and $i = 0$, where $x_{adv}$ is the adversarial sample. While $\arg\max f(x_i) = \arg\max f(x_0)$, compute the minimum-norm perturbation $r_i$ and update $x_{i+1} = x_i + r_i$, $i = i + 1$, until $f(x_i)$ and $f(x_0)$ have different signs. The resulting $x_i$ is the restored sample.
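The loop described above can be sketched for a binary classifier whose predicted class is the sign of a scalar score. This is a minimal DeepFool-style illustration, not the patent's implementation: the step formula for `r` is the standard linearized projection, and the names `restore_binary`, `overshoot`, `max_iter`, and the circular toy classifier are assumptions of the sketch.

```python
import numpy as np

def restore_binary(x_adv, f, grad_f, overshoot=0.02, max_iter=50):
    """Push x_adv across the decision boundary f(x) = 0 in small steps.

    f returns a scalar score whose sign is the predicted class;
    grad_f returns the gradient of f with respect to x.
    overshoot and max_iter are assumptions of this sketch.
    """
    x0 = np.asarray(x_adv, dtype=float)
    x = x0.copy()
    start_sign = np.sign(f(x0))
    for _ in range(max_iter):
        if np.sign(f(x)) != start_sign:
            break                       # label changed: x is the restored sample
        g = grad_f(x)
        # Minimum-norm step onto the boundary of f linearized at x.
        r = -f(x) / (g @ g + 1e-12) * g
        x = x + (1.0 + overshoot) * r   # step slightly past the boundary
    return x

# Toy classifier (hypothetical): inside the unit circle is one class, outside the other.
f_circle = lambda x: x @ x - 1.0
grad_circle = lambda x: 2.0 * x

x_adv = np.array([0.5, 0.0])            # currently scored as "inside"
x_rest = restore_binary(x_adv, f_circle, grad_circle)
```

The small `overshoot` factor is a common practical choice so that the iterate ends strictly on the far side of the boundary instead of exactly on it.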
Technical effects
Compared with the prior art, the method adds a perturbation to the adversarial sample by way of a reverse attack to restore it to a normal sample. This solves the problem of how to handle adversarial samples and makes up for the limitation of a detector that can only detect. In the experiment, the classifier correctly classified 90.2% of the adversarial samples after the perturbation was added, so the method can effectively improve the robustness of a neural network classifier.
Drawings
FIG. 1 is a schematic diagram of the embodiment;
FIG. 2 is a schematic diagram of the DeepFool method used in the embodiment.
Detailed Description
As shown in FIG. 1, this embodiment uses the BelgiumTS traffic sign recognition data set, specifically BelgiumTSC_Testing (76.5 MBytes), which can be downloaded from the "BelgiumTS for Classification" section of http://btsd.ethz.ch/shareddata/.
The detector in this embodiment detects adversarial samples using an LID-based detection method. LID (Local Intrinsic Dimensionality) characterizes the dimensional properties of the space around a sample. Experiments have shown that the LID values of adversarial samples are significantly higher than those of normal samples; that is, the adversarial region has a higher intrinsic dimensionality than the normal sample region, and the LID value increases as a sample moves from normal to adversarial. The LID-based detection method performs well, with a detection accuracy of about 95.2% on BelgiumTS.
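The text does not state how LID is estimated. A commonly used choice is the maximum-likelihood estimate from the distances to the k nearest neighbours, sketched below; the function name `lid_mle`, the neighbourhood size `k`, and the synthetic data are assumptions made for illustration and say nothing about the detector actually used in the embodiment.

```python
import numpy as np

def lid_mle(x, reference, k=20):
    """Maximum-likelihood LID estimate of x from its k nearest neighbours."""
    dists = np.linalg.norm(reference - x, axis=1)
    dists = np.sort(dists[dists > 0])[:k]       # drop x itself if it is in the set
    # -1 / mean(log(r_i / r_k)), where r_k is the largest of the k distances.
    return -1.0 / np.mean(np.log(dists / dists[-1]))

rng = np.random.default_rng(0)
flat = np.zeros((2000, 10))
flat[:, :2] = rng.random((2000, 2))             # a 2-D sheet embedded in 10-D
full = rng.random((2000, 10))                   # data filling all 10 dimensions

# Average the estimate over a few query points to reduce its variance.
lid_flat = np.mean([lid_mle(p, flat) for p in flat[:50]])
lid_full = np.mean([lid_mle(p, full) for p in full[:50]])
```

On this synthetic data the estimate for the embedded sheet comes out well below the estimate for the data that fills the whole space, which mirrors the property a LID-based detector relies on: a threshold or a small classifier over LID values separates the two groups.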
The embodiment specifically comprises the following steps:
First, 70% of the BelgiumTS data set is placed directly into the sample set (because in practice most samples are normal). The remaining 30% is divided into four parts of 7.5% each; adversarial samples are generated from the four parts with the FGSM, C&W, DeepFool and JSMA algorithms respectively and then added to the sample set.
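The 70% / 4 x 7.5% split can be sketched as an index partition. The function name `split_sample_set`, the random seed, and the sample count of 2000 are hypothetical; the attack implementations themselves are not shown.

```python
import numpy as np

def split_sample_set(n_samples, seed=0):
    """Return indices of clean samples and of the four attack subsets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_clean = int(round(0.70 * n_samples))
    clean_idx = idx[:n_clean]
    # Split the remaining 30% into four near-equal parts, one per attack.
    parts = np.array_split(idx[n_clean:], 4)
    attacks = ["FGSM", "C&W", "DeepFool", "JSMA"]
    return clean_idx, dict(zip(attacks, parts))

clean_idx, attack_idx = split_sample_set(2000)
```

Each entry of `attack_idx` lists the samples to be replaced by adversarial versions produced with the named attack before they are merged back into the sample set.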
Second, the samples in the sample set are input to the detector. A sample confirmed not to be adversarial goes directly to the classifier for normal classification. A sample detected as adversarial has the minimum-norm adversarial perturbation added through iterative calculation so that it is restored to a normal sample, and then enters the classifier.
The classifier uses, but is not limited to, a neural network model with a five-layer structure.
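The routing in this step (detector, optional restoration, classifier) can be expressed as a small function. The three callables below are deliberately trivial one-dimensional stand-ins invented for the sketch; they are not the LID detector, the restoration routine, or the five-layer network of the embodiment.

```python
from typing import Callable, TypeVar

X = TypeVar("X")

def defend(x: X,
           detector: Callable[[X], bool],
           restore: Callable[[X], X],
           classifier: Callable[[X], int]) -> int:
    """Route one sample through detector -> (optional) restore -> classifier."""
    if detector(x):          # True means "flagged as adversarial"
        x = restore(x)
    return classifier(x)

# Hypothetical 1-D stand-ins: the decision boundary is at x = 0.
toy_classifier = lambda x: int(x > 0)
toy_detector = lambda x: abs(x) < 0.5        # flags samples close to the boundary
toy_restore = lambda x: x - 1.02 * x         # minimal step across x = 0, 2% overshoot

label_flagged = defend(0.3, toy_detector, toy_restore, toy_classifier)
label_clean = defend(2.0, toy_detector, toy_restore, toy_classifier)
```

Keeping the three components as separate callables means the detector or the restoration attack can be swapped without touching the classifier.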
The iterative computation is as follows: initialize $x_0 = x_{adv}$ and $i = 0$, where $x_{adv}$ is the adversarial sample. While $\arg\max f(x_i) = \arg\max f(x_0)$, compute the minimum-norm perturbation $r_i$ and update $x_{i+1} = x_i + r_i$, $i = i + 1$, until $f(x_i)$ and $f(x_0)$ have different signs. The resulting $x_i$ is the restored sample.
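For a multi-class classifier, the same iteration can be sketched in the DeepFool style: at each step, find the nearest boundary between the current label and any other class under a linear approximation, and step just across it. The function `restore_multiclass`, the `overshoot` and `max_iter` parameters, and the affine toy model `W`, `b` are assumptions of this illustration and are not taken from the patent.

```python
import numpy as np

def restore_multiclass(x_adv, f, jac, overshoot=0.02, max_iter=50):
    """f(x) returns the logit vector; jac(x) returns its Jacobian (classes x dims)."""
    x0 = np.asarray(x_adv, dtype=float)
    x = x0.copy()
    k0 = int(np.argmax(f(x0)))                  # label given to the adversarial input
    for _ in range(max_iter):
        logits = f(x)
        if int(np.argmax(logits)) != k0:
            break                               # label changed: restoration done
        J = jac(x)
        w = np.delete(J - J[k0], k0, axis=0)        # grad f_k - grad f_k0, k != k0
        gap = np.delete(logits - logits[k0], k0)    # f_k - f_k0, k != k0
        norms = np.linalg.norm(w, axis=1) + 1e-12
        l = int(np.argmin(np.abs(gap) / norms))     # closest linearized boundary
        r = (np.abs(gap[l]) + 1e-4) / norms[l] ** 2 * w[l]
        x = x + (1.0 + overshoot) * r
    return x

# Hypothetical affine 3-class model.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
f_lin = lambda x: W @ x + b
jac_lin = lambda x: W

x_adv = np.array([1.0, 0.6])                    # currently labelled class 0
x_rest = restore_multiclass(x_adv, f_lin, jac_lin)
```

In this toy case the nearest boundary is the one with class 1, so a single short step moves the label from class 0 to class 1.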
In the experimental environment, FGSM and BIM are used to repair the samples, and the repair success rate is tested on attack samples generated by different attack methods; the repair success rate is the proportion of attack samples that become normal samples after the method is applied. In the table, each row indicates the repair method used and each column indicates the attack sample set on which the test was run. Experimental data for this example are given for both the MNIST and CIFAR data sets.
Those skilled in the art may modify the foregoing embodiments in many ways without departing from the spirit and scope of the invention. The scope of the invention is defined by the appended claims, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.
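The repair success rate defined above can be computed as a simple ratio. The sketch assumes that "becoming a normal sample" means the classifier's prediction on the repaired sample equals the ground-truth label; that reading, the function name, and the example labels are assumptions, not details given in the text.

```python
import numpy as np

def repair_success_rate(pred_after_repair, true_labels):
    """Fraction of attack samples classified correctly after repair."""
    pred = np.asarray(pred_after_repair)
    true = np.asarray(true_labels)
    return float(np.mean(pred == true))

# Made-up labels for illustration only.
rate = repair_success_rate([1, 2, 3, 0], [1, 2, 0, 0])
```

Computing this value once per (repair method, attack set) pair yields one cell of a table laid out as described.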
Claims (1)
1. A neural network attack defense method based on gradient-reverse adversarial sample restoration, characterized in that adversarial samples are detected in a sample set and then restored to ordinary samples by means of an attack, thereby optimizing the sample set;
the adversarial samples in the sample set are generated by the FGSM algorithm, the C&W algorithm, the DeepFool algorithm and the JSMA algorithm;
the attack comprises: the Fast Gradient Sign Method, the optimization-based Carlini & Wagner attack, DeepFool, and the Jacobian-based Saliency Map Attack;
the recovery comprises the following specific steps:
step 1, calculating the minimum perturbation distance, which is the shortest distance from the current input point to the separating plane, deriving the perturbation generation method for a binary classification model, and extending it from binary to multi-class classification;
step 2, adding the minimum-norm adversarial perturbation through iterative calculation, gradually pushing the image that lies inside the classification boundary out of the boundary until misclassification occurs, at which point the image is restored to a normal sample.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810781467.2A CN109086884B (en) | 2018-07-17 | 2018-07-17 | Neural network attack defense method based on gradient reverse countermeasure sample restoration |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810781467.2A CN109086884B (en) | 2018-07-17 | 2018-07-17 | Neural network attack defense method based on gradient reverse countermeasure sample restoration |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109086884A CN109086884A (en) | 2018-12-25 |
CN109086884B true CN109086884B (en) | 2020-09-01 |
Family
ID=64838063
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810781467.2A Active CN109086884B (en) | 2018-07-17 | 2018-07-17 | Neural network attack defense method based on gradient reverse countermeasure sample restoration |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109086884B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109784411B (en) * | 2019-01-23 | 2021-01-05 | 四川虹微技术有限公司 | Defense method, device and system for confrontation sample and storage medium |
CN111488898B (en) * | 2019-01-28 | 2023-09-19 | 北京达佳互联信息技术有限公司 | Countermeasure data acquisition method, device, equipment and storage medium |
CN110768959B (en) * | 2019-09-20 | 2021-12-21 | 浙江工业大学 | Defense method based on signal boundary exploration attack |
CN111209370A (en) * | 2019-12-27 | 2020-05-29 | 同济大学 | Text classification method based on neural network interpretability |
CN114724014B (en) * | 2022-06-06 | 2023-06-30 | 杭州海康威视数字技术股份有限公司 | Deep learning-based method and device for detecting attack of countered sample and electronic equipment |
CN114861893B (en) * | 2022-07-07 | 2022-09-23 | 西南石油大学 | Multi-channel aggregated countermeasure sample generation method, system and terminal |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106022273A (en) * | 2016-05-24 | 2016-10-12 | 华东理工大学 | Handwritten form identification system of BP neural network based on dynamic sample selection strategy |
CN107463951A (en) * | 2017-07-19 | 2017-12-12 | 清华大学 | A kind of method and device for improving deep learning model robustness |
CN108198179A (en) * | 2018-01-03 | 2018-06-22 | 华南理工大学 | A kind of CT medical image pulmonary nodule detection methods for generating confrontation network improvement |
US10007866B2 (en) * | 2016-04-28 | 2018-06-26 | Microsoft Technology Licensing, Llc | Neural network image classifier |
Non-Patent Citations (1)
Title |
---|
EVALUATING THE ROBUSTNESS OF NEURAL NETWORKS: AN EXTREME VALUE THEORY APPROACH; Tsui-Wei Weng et al.; arXiv; 20180131; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN109086884A (en) | 2018-12-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109086884B (en) | Neural network attack defense method based on gradient reverse countermeasure sample restoration | |
Liu et al. | Detection based defense against adversarial examples from the steganalysis point of view | |
Hosseini et al. | Google's cloud vision api is not robust to noise | |
CN109543760B (en) | Confrontation sample detection method based on image filter algorithm | |
CN109961444B (en) | Image processing method and device and electronic equipment | |
CN110348475B (en) | Confrontation sample enhancement method and model based on spatial transformation | |
CN113554089A (en) | Image classification countermeasure sample defense method and system and data processing terminal | |
CN112396129B (en) | Challenge sample detection method and universal challenge attack defense system | |
CN111753290B (en) | Software type detection method and related equipment | |
CN108416343B (en) | Face image recognition method and device | |
Lv et al. | Chinese character CAPTCHA recognition based on convolution neural network | |
CN117134958A (en) | Information processing method and system for network technology service | |
CN117152486A (en) | Image countermeasure sample detection method based on interpretability | |
Kaur et al. | Performance Evaluation of various thresholding methods using canny edge detector | |
US11349856B2 (en) | Exploit kit detection | |
Kang et al. | Identification of multiple image steganographic methods using hierarchical ResNets | |
CN111209567B (en) | Method and device for judging perceptibility of improving robustness of detection model | |
CN115631333B (en) | Countermeasure training method for improving robustness of target detection model and target detection method | |
CN113139187B (en) | Method and device for generating and detecting pre-training language model | |
WO2021098801A1 (en) | Data cleaning device, data cleaning method and face verification method | |
WO2024115580A1 (en) | A method of assessing inputs fed to an ai model and a framework thereof | |
EP4328813A1 (en) | Detection device, detection method, and detection program | |
Hanyu et al. | Incremental Training of SVM-Based Human Detector | |
Chua et al. | Using Adversarial Defences Against Image Classification CAPTCHA | |
Rangslang et al. | Feature Space Perturbation for Transferable Adversarial Examples in Image Forensics Networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||