CN111858343A - Countermeasure sample generation method based on attack capability - Google Patents
Countermeasure sample generation method based on attack capability
- Publication number
- CN111858343A (application number CN202010712768.7A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- data
- sample
- data set
- confrontation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
An adversarial sample generation method based on attack capability comprises a data set and neural network model processing module, an adversarial sample generation module, and a distance evaluation module. The data set and neural network model processing module converts the data set into Python numpy data for numerical computation in the project. The adversarial sample generation module randomly selects a subset of the data, generates adversarial samples with several different generation methods, and measures the average time each method takes to generate a single adversarial sample. The distance evaluation module analyzes the distribution of the output information along two dimensions: the number of misclassifications and the highest misclassification probability. If the original data has no label, its correct label cannot be known, and validity can be calculated only from the number of output classes and the classification probability distribution.
Description
Technical Field
The invention belongs to the field of software testing. It relates in particular to the generation of adversarial samples and to evaluating the quality of the generated adversarial samples based on their attack capability against a neural network.
Background
In recent years, deep learning and deep neural networks have developed rapidly, and deep learning models now touch many aspects of social life. Testing techniques that guarantee the quality of deep learning models are therefore increasingly important. Traditional software testing techniques cannot cope with deep learning models, which are complex, varied, and difficult to interpret, so new testing techniques for deep learning models have become an important research topic in both academia and industry. Practice has shown that, in some fields, adversarial samples can effectively expose defects in a deep learning model and so help improve its quality. An adversarial sample is produced by deliberately adding subtle, imperceptible perturbations to an input sample so that the model gives an erroneous output with high confidence. Generating an adversarial sample requires only computation on the original data: an effective adversarial sample keeps the properties of the original data yet causes the neural network to err, which saves a large amount of labor cost.
In deep learning model testing, adversarial sample generation has become the mainstream method because deep learning data is fragmented, deep learning models are hard to interpret, and test scenarios are diverse. However, the openness of adversarial sample generation also tends to produce results of poor quality. Quality control is a significant challenge for adversarial sample generation, especially for generation methods that run over multiple iterations. Quality control strategies for adversarial sample generation mainly consider the attack capability of the adversarial sample, the magnitude of its perturbation, and similar criteria. For most data (even for ordinary application users), adversarial samples are easy to obtain through iterative calculation. However, an unverified adversarial sample may fail to make the neural network err; such a low-quality adversarial sample cannot be used to repair the neural network. These problems impose considerable inconvenience and burden on locating and repairing defects in deep learning models. At present there is no effective means of quality management for adversarial samples, so an auxiliary means of evaluating their properties is needed to achieve that purpose.
Some researchers have done preliminary work on adversarial samples, but existing work is limited to generating adversarial samples and offers poor support for evaluating their quality.
Building on this work, the invention mines the information that the deep learning model outputs for adversarial samples. This information is simple and understandable for a person. However, further technical processing is required before a machine can automatically recognize this information and use it to identify failed adversarial samples. Therefore, based on existing research results, the invention summarizes and modifies the corresponding adversarial sample generation techniques, adds a new technical method, and applies it to the generation and evaluation of adversarial samples for deep learning models, strengthening the quality screening of adversarial samples and generating them automatically.
Disclosure of Invention
The problem the invention aims to solve is that the adversarial samples produced by generation methods in deep learning model testing are of low quality and are therefore of little use for defect repair. The invention evaluates the attack capability of adversarial samples against the neural network and automatically generates high-quality adversarial samples, solving the problem of low adversarial sample quality.
The technical scheme of the invention is a method for generating adversarial samples based on attack capability assessment, characterized in that reliable adversarial samples can be generated from raw deep learning data. The generation method comprises the following three modules/steps:
1) The data set and neural network model processing module: first, the data set and neural network model file supplied by the user are obtained. The structural information, number of neurons, weight of each neuron, and activation functions of the neural network model are read, and the model is restored so that the user's neural network model is regenerated locally. The input data set is then decoded and converted into Python numpy data for numerical calculation in the project.
2) The adversarial sample generation module: the system first randomly selects a subset of the data set, generates adversarial samples for it with each of the different adversarial sample generation methods, and measures the average time each method takes to generate a single adversarial sample. Any generation method whose average time exceeds the threshold is eliminated, because that method may be too costly for this data set and would raise the cost of the subsequent overall generation process. For each item in the data set, adversarial samples are then generated with the remaining methods; computing the gradient at each iteration ensures that every iteration applies an effective adversarial perturbation. At the end of the iteration cycle, each generation method produces one adversarial sample by default, although the user can set a different number. Each data item therefore yields adversarial samples from several different generation methods.
3) The distance evaluation module: the generated adversarial samples are first input into the neural network, and the resulting output information is analyzed. The output information is processed differently depending on whether the original data has a label. If the original data has a label, the label of the adversarial sample should be consistent with that of the original data, so the distribution of the output information is analyzed along two dimensions: the number of misclassifications and the highest misclassification probability. If the original data has no label, its correct label cannot be known, and validity can be calculated only from the number of output classes and the classification probability distribution.
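The two branches of step 3) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `evaluate_outputs`, the dictionary keys, and the reading of "highest misclassification probability" as the largest probability assigned to any incorrect class are hypothetical choices, not details given in the patent.

```python
import numpy as np

def evaluate_outputs(probs, labels=None):
    """Summarize the network's output distribution for a batch of adversarial samples.

    probs  : (n_samples, n_classes) class probabilities (hypothetical input format).
    labels : optional (n_samples,) original labels; None means the data is unlabeled.
    """
    probs = np.asarray(probs, dtype=float)
    predicted = probs.argmax(axis=1)

    if labels is not None:
        labels = np.asarray(labels)
        # Dimension 1: how many adversarial samples are misclassified.
        n_misclassified = int((predicted != labels).sum())
        # Dimension 2: highest probability assigned to any incorrect class
        # (an assumed interpretation of "highest misclassification probability").
        masked = probs.copy()
        masked[np.arange(len(labels)), labels] = 0.0
        return {"n_misclassified": n_misclassified,
                "highest_wrong_prob": float(masked.max())}

    # Unlabeled case: only the number of output classes and the distribution are known.
    return {"n_output_classes": int(np.unique(predicted).size),
            "mean_distribution": probs.mean(axis=0)}

probs = [[0.7, 0.2, 0.1],
         [0.1, 0.6, 0.3],
         [0.2, 0.3, 0.5]]
labelled = evaluate_outputs(probs, labels=[0, 0, 2])
unlabelled = evaluate_outputs(probs)
```

In the labeled call, only the second sample is misclassified, so the count is 1; the unlabeled call can report only how many distinct classes were predicted and how probability mass is spread on average.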
The invention is characterized in that:
1. Time-based evaluation of adversarial sample generation techniques is proposed for the first time in the field of adversarial sample generation.
2. Attack capability assessment is applied to adversarial sample generation for the first time.
Drawings
Fig. 1 is a general flow chart of the implementation of the present invention.
Fig. 2 is a flow chart of key step 1.
Fig. 3 is a flow chart of key step 2.
Fig. 4 is a flow chart of key step 3.
Detailed Description
The embodiments of the present invention are described below with reference to specific examples, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure of the present specification.
The method automatically generates adversarial samples based on distance evaluation. It relies mainly on adversarial sample generation techniques; the specific key technologies involved are the deep convolutional neural network (CNN), gradient descent, and attack capability evaluation.
1. Feature extraction
In the invention, a convolutional neural network extracts features from the data files uploaded by the user, converting each picture into a feature vector. Convolutional neural networks are a class of feed-forward neural networks that include convolution operations and have a deep structure. They have representation learning ability and can classify input information in a translation-invariant way according to their hierarchical structure. The convolutional neural network is modeled on the visual perception mechanism of living beings and supports both supervised and unsupervised learning. Because convolution kernels in the hidden layers share parameters and inter-layer connections are sparse, a convolutional neural network can learn from grid-structured features with a relatively small amount of computation.
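As a rough illustration of parameter sharing and of turning a picture into a feature vector, the sketch below applies a single kernel to a small image, passes the result through a ReLU, and flattens it. The function names, the 4x4 image, and the kernel values are hypothetical; a real feature extractor would stack many learned kernels and layers.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one shared kernel over a 2-D image (no padding, stride 1).

    As in most deep learning libraries, this is cross-correlation:
    the kernel is not flipped.
    """
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same kernel weights are reused at every position (parameter sharing).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def extract_features(image, kernel):
    """Convolution, then ReLU, then flatten into a feature vector."""
    return np.maximum(conv2d_valid(image, kernel), 0.0).ravel()

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "picture"
edge_kernel = np.array([[-1.0, 1.0],
                        [-1.0, 1.0]])               # hypothetical horizontal-difference kernel
features = extract_features(image, edge_kernel)
```

The kernel has only four weights regardless of image size, which is the source of the small computation cost mentioned above.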
2. Deep neural network restoration
In the invention, the target network is a deep neural network composed mainly of convolution, pooling, and activation functions. Because a deep neural network is hard to interpret, the adequacy of the original data set is difficult to measure from a white-box testing standpoint. However, neurons are a feature common to almost all deep neural networks: their structure and weights embody the network's cognitive behavior, and their outputs ultimately determine the network's output. Therefore, when the deep neural network is loaded, the neurons are instrumented and the change in each neuron's output after data is input is calculated, from which the gradient direction after the data is perturbed is computed.
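One way to picture deriving a gradient direction from observed changes in neuron outputs is a finite-difference estimate on a single restored layer. This is a sketch under assumptions: the layer shape, the weights, and the helper names are hypothetical, and the patent does not state that finite differences are the mechanism it uses.

```python
import numpy as np

def dense_relu(x, weights, bias):
    """Outputs of one layer of neurons: ReLU(W x + b)."""
    return np.maximum(weights @ x + bias, 0.0)

def neuron_output_change(x, delta, weights, bias):
    """Change in each neuron's output when the input x is perturbed by delta."""
    return dense_relu(x + delta, weights, bias) - dense_relu(x, weights, bias)

def finite_difference_gradient(x, neuron, weights, bias, eps=1e-6):
    """Estimate d(output of one neuron)/d(x) from observed output changes."""
    grad = np.zeros_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        grad[k] = neuron_output_change(x, step, weights, bias)[neuron] / eps
    return grad

# Hypothetical restored layer: two neurons, three inputs.
W = np.array([[1.0, -2.0, 0.5],
              [0.0,  1.0, 1.0]])
b = np.array([0.1, -0.2])
x = np.array([1.0, 0.2, 0.4])

grad = finite_difference_gradient(x, neuron=0, weights=W, bias=b)
direction = np.sign(grad)   # direction in which to perturb each input component
```

Because neuron 0 is active at this input, the estimate recovers its weight row, and the sign of each entry gives the perturbation direction for that input component.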
3. Time assessment of adversarial sample generation methods
In the invention, an adversarial sample generation method performs computation on the original data and produces the adversarial sample through iterative perturbation. Time is an important cost of adversarial sample generation, so the time each generation method requires is measured, and methods whose time cost is unacceptable are excluded. Instead of using the whole data set, the invention samples it and calculates the average generation time of each method on the sampled data, which gives the approximate time cost of that method for the data set.
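The sampling-based time screening described above might look like the following sketch. The two stand-in "generation methods", the subset size, and the 5 ms threshold are hypothetical placeholders chosen only so the example runs quickly.

```python
import random
import time

def average_generation_time(method, samples):
    """Average wall-clock seconds that `method` needs per sample."""
    start = time.perf_counter()
    for sample in samples:
        method(sample)
    return (time.perf_counter() - start) / len(samples)

def screen_methods(methods, dataset, sample_size, threshold_s, seed=0):
    """Keep only the methods whose average time on a random subset is within the threshold."""
    subset = random.Random(seed).sample(dataset, min(sample_size, len(dataset)))
    timings = {name: average_generation_time(fn, subset) for name, fn in methods.items()}
    kept = {name: fn for name, fn in methods.items() if timings[name] <= threshold_s}
    return kept, timings

# Two stand-in generation methods (hypothetical): one cheap, one deliberately slow.
def fast_method(x):
    return [v + 0.01 for v in x]

def slow_method(x):
    time.sleep(0.02)          # simulates an expensive iterative attack
    return [v + 0.01 for v in x]

dataset = [[float(i), float(i + 1)] for i in range(50)]
kept, timings = screen_methods({"fast": fast_method, "slow": slow_method},
                               dataset, sample_size=5, threshold_s=0.005)
```

Timing a small random subset keeps the screening cost low while still giving an estimate of what each method would cost on the full data set.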
4. Adversarial sample generation
In the invention, gradient calculation is used to perturb the image feature vector produced by the convolutional neural network into an adversarial sample. In machine learning algorithms, a loss function can be minimized by solving iteratively, step by step, with a gradient method, which yields the minimized loss and the corresponding model parameter values. When generating an adversarial sample, applying the perturbation to the data along the gradient direction makes the perturbation most effective, so that data causing the neural network to err can be found efficiently.
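The patent does not name its generation methods, so the sketch below swaps in a well-known technique, an iterative gradient-sign step, purely to show what "perturbing along the gradient direction" can mean. The linear softmax model, its weights, the step size, and the iteration count are all hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()            # numerical stability
    e = np.exp(z)
    return e / e.sum()

def loss_gradient_wrt_input(x, label, W, b):
    """Gradient of the cross-entropy loss of a linear softmax classifier w.r.t. x."""
    p = softmax(W @ x + b)
    p[label] -= 1.0            # p - one_hot(label)
    return W.T @ p

def iterative_gradient_sign(x, label, W, b, step=0.02, n_iter=10):
    """Repeatedly move x a small step in the direction that increases the loss."""
    x_adv = x.copy()
    for _ in range(n_iter):
        grad = loss_gradient_wrt_input(x_adv, label, W, b)
        x_adv = x_adv + step * np.sign(grad)
    return x_adv

# Hypothetical 2-class model on 2-D feature vectors.
W = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])
b = np.zeros(2)
x = np.array([0.6, 0.4])       # originally classified as class 0
label = 0

x_adv = iterative_gradient_sign(x, label, W, b)
pred_before = int(np.argmax(W @ x + b))
pred_after = int(np.argmax(W @ x_adv + b))
```

Here the gradient is taken with respect to the input rather than the model parameters, and the step increases the loss instead of minimizing it, which is what pushes the sample across the decision boundary.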
5. Attack capability assessment
In the invention, the output distribution that the neural network produces for the adversarial sample is calculated: the adversarial sample generated by perturbation is input into the original neural network to obtain its output probability distribution. The probability distribution is the set of probabilities the neural network assigns to each class for the sample. From this distribution the attack capability of the adversarial sample can be calculated effectively, and the adversarial sample is evaluated accordingly.
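A per-sample attack capability score derived from the output probability distribution could be sketched as below. The scoring rule (highest probability given to any class other than a reference class), the threshold of 0.5, and the sample names are assumptions for illustration; the patent does not specify a formula.

```python
import numpy as np

def softmax_rows(logits):
    """Turn each row of raw network outputs into a probability distribution."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def attack_capability(probs, reference):
    """Per-sample score: highest probability given to any class other than the reference class."""
    others = probs.copy()
    others[np.arange(len(reference)), reference] = -np.inf
    return others.max(axis=1)

def keep_strong_samples(samples, scores, min_score):
    """Drop adversarial samples whose score falls below the (hypothetical) threshold."""
    return [s for s, score in zip(samples, scores) if score >= min_score]

logits = np.array([[2.0, 0.0, 0.0],     # still confidently the reference class
                   [0.0, 3.0, 0.0]])    # confidently a different class
reference = np.array([0, 0])            # class of the original data for each sample
probs = softmax_rows(logits)
scores = attack_capability(probs, reference)
kept = keep_strong_samples(["adv_a", "adv_b"], scores, min_score=0.5)
```

A score close to 1 means the network is confidently wrong on that sample, while a score near 0 means the perturbation barely moved the output.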
In this example, the deep neural network of the feature extraction part converts each input picture into a data vector. The neural network model restoration part uses a publicly available restoration method to generate the neural network on which adversarial sample generation is computed. The list of adversarial samples used in adversarial sample filtering is generated by automatically perturbing our data set; there are five generation methods, each of which generates several adversarial samples. Finally, the adversarial sample evaluation part uses the attack capability evaluation method: it calculates the output distribution obtained when the generated adversarial samples are input into the neural network and eliminates adversarial samples with poor attack capability.
Claims (4)
1. An adversarial sample generation method based on attack capability, characterized in that: a data set and neural network model processing module completes the conversion of the data set; the relevant attributes of sample generation are configured in an adversarial sample generation module; and the generated result is evaluated according to the distribution of the output information.
2. The conversion of the data set by the data set and neural network model processing module as claimed in claim 1, wherein: first, the data set and neural network model file supplied by the user are obtained; the structural information, number of neurons, weight of each neuron, and activation functions of the neural network model are read; and the model is restored so that the user's neural network model is regenerated locally. The input data set is then decoded and converted into Python numpy data for numerical calculation in the project.
3. The configuration of the relevant attributes of sample generation in the adversarial sample generation module as claimed in claim 1, wherein: the system first randomly selects a subset of the data set, generates adversarial samples with each of the different adversarial sample generation methods, and measures the average time each method takes to generate a single adversarial sample. Any generation method whose average time exceeds the threshold is eliminated, because that method may be too costly for this data set and would raise the cost of the subsequent overall generation process. For each item in the data set, adversarial samples are generated with the remaining methods, and computing the gradient at each iteration ensures that every iteration applies an effective adversarial perturbation.
4. The evaluation of the generated result according to the distribution of the output information as claimed in claim 1, wherein: the generated adversarial samples are first input into the neural network, and the resulting output information is analyzed. The output information is processed differently depending on whether the original data has a label. If the original data has a label, the label of the adversarial sample should be consistent with that of the original data, so the distribution of the output information is analyzed along two dimensions: the number of misclassifications and the highest misclassification probability. If the original data has no label, its correct label cannot be known, and validity can be calculated only from the number of output classes and the classification probability distribution.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010712768.7A CN111858343A (en) | 2020-07-23 | 2020-07-23 | Countermeasure sample generation method based on attack capability |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111858343A (en) | 2020-10-30 |
Family
ID=72949292
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010712768.7A Pending CN111858343A (en) | 2020-07-23 | 2020-07-23 | Countermeasure sample generation method based on attack capability |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111858343A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110334808A (en) * | 2019-06-12 | 2019-10-15 | 武汉大学 | A kind of confrontation attack defense method based on confrontation sample training |
CN110991549A (en) * | 2019-12-13 | 2020-04-10 | 成都网域复兴科技有限公司 | Countermeasure sample generation method and system for image data |
CN111325324A (en) * | 2020-02-20 | 2020-06-23 | 浙江科技学院 | Deep learning confrontation sample generation method based on second-order method |
CN111291828A (en) * | 2020-03-03 | 2020-06-16 | 广州大学 | HRRP (high resolution ratio) counterattack method for sample black box based on deep learning |
Non-Patent Citations (1)
Title |
---|
AI前线 (AI Frontline): "Fooling neural networks: creating your own adversarial samples", Zhihu: HTTPS://ZHUANLAN.ZHIHU.COM/P/34038758 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113537494A (en) * | 2021-07-23 | 2021-10-22 | 江南大学 | Image countermeasure sample generation method based on black box scene |
CN113537494B (en) * | 2021-07-23 | 2022-11-11 | 江南大学 | Image countermeasure sample generation method based on black box scene |
CN114821227A (en) * | 2022-04-12 | 2022-07-29 | 重庆邮电大学 | Deep neural network confrontation sample scoring method |
CN114821227B (en) * | 2022-04-12 | 2024-03-22 | 重庆邮电大学 | Deep neural network countermeasures sample scoring method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109408389B (en) | Code defect detection method and device based on deep learning | |
CN109447099B (en) | PCA (principal component analysis) dimension reduction-based multi-classifier fusion method | |
Wu et al. | Applications of deep learning for smart water networks | |
CN112036513B (en) | Image anomaly detection method based on memory-enhanced potential spatial autoregression | |
CN112765896A (en) | LSTM-based water treatment time sequence data anomaly detection method | |
CN113569243A (en) | Deep semi-supervised learning network intrusion detection method based on self-supervised variation LSTM | |
Wu et al. | Optimized deep learning framework for water distribution data-driven modeling | |
CN111858343A (en) | Countermeasure sample generation method based on attack capability | |
CN110851654A (en) | Industrial equipment fault detection and classification method based on tensor data dimension reduction | |
CN114363195A (en) | Network flow prediction early warning method for time and spectrum residual convolution network | |
CN114067915A (en) | scRNA-seq data dimension reduction method based on deep antithetical variational self-encoder | |
CN115051864B (en) | PCA-MF-WNN-based network security situation element extraction method and system | |
CN111222689A (en) | LSTM load prediction method, medium, and electronic device based on multi-scale temporal features | |
CN115324843A (en) | Wind generating set fault diagnosis system and method based on monitoring data | |
CN114500004A (en) | Anomaly detection method based on conditional diffusion probability generation model | |
Nguyen et al. | InfoCNF: An efficient conditional continuous normalizing flow with adaptive solvers | |
CN116702090A (en) | Multi-mode data fusion and uncertain estimation water level prediction method and system | |
Zhang et al. | An intrusion detection method based on stacked sparse autoencoder and improved gaussian mixture model | |
Gopali et al. | A comparative study of detecting anomalies in time series data using LSTM and TCN models | |
CN117110446A (en) | Method for identifying axle fatigue crack acoustic emission signal | |
CN116680639A (en) | Deep-learning-based anomaly detection method for sensor data of deep-sea submersible | |
CN116205135A (en) | SO2 emission prediction method and system based on data decomposition and neural network | |
Guan et al. | GAMA: A multi-graph-based anomaly detection framework for business processes via graph neural networks | |
Xiang et al. | An improved multiple imputation method based on chained equations for distributed photovoltaic systems | |
CN111881034A (en) | Confrontation sample generation method based on distance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||