CN114049537A - Convolutional neural network-based adversarial sample defense method - Google Patents

Convolutional neural network-based adversarial sample defense method

Info

Publication number
CN114049537A
CN114049537A
Authority
CN
China
Prior art keywords: data, image, module, model, neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111390854.1A
Other languages
Chinese (zh)
Other versions
CN114049537B (en)
Inventor
朱琎
孙凯
郭诚刚
王超
李大一
马国军
吴昊
周雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University of Science and Technology
Original Assignee
Jiangsu University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University of Science and Technology
Priority to CN202111390854.1A
Publication of CN114049537A
Application granted
Publication of CN114049537B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods


Abstract

The invention discloses an adversarial sample defense method based on a convolutional neural network, which defends the network model from two directions: generating adversarial samples and protecting sensitive parameters in an image. The method comprises the following steps: locating and selecting sensitive regions in the image; selecting the amount of key sample data within the sensitive region; selecting the most critical data positions in the chosen region with a differential evolution algorithm; and setting a perturbation value and a perturbation step size so as to form adversarial samples and provide a degree of elastic protection for the data set samples. Through training on the adversarial samples and the elastically protected images, the parameters of the model are updated and the generalization ability of the network is improved, thereby strengthening the robustness of the network's defense against adversarial attacks.

Description

Convolutional neural network-based adversarial sample defense method
Technical Field
The invention provides a method for defending the overall structure of a convolutional neural network used for image recognition, and in particular relates to a method for generating and defending against adversarial samples.
Background
In recent years, deep learning techniques have proved highly effective in the field of image recognition, with object-recognition accuracy exceeding human level. On the other hand, an image can be transformed by perturbations invisible to the human eye: the perturbation in an adversarial sample is deliberately subtle and indistinguishable to a human observer. Yet when a deep network processes such a sample, it exhibits striking fragility and may identify the image as an entirely different label category.
Defense against adversarial samples is one of the major directions of current research. Existing adversarial attacks mainly include the FGSM method and its variants, the C&W attack, substitute-model black-box attacks, the hyperplane-classification-based DeepFool attack, the one-pixel attack, the AdvGAN attack, universal adversarial perturbations, the backward pass differentiable approximation method, and others. Matching these attack modes, current defenses against adversarial samples divide mainly into passive and active approaches. Passive defense mainly takes adversarial samples generated by a known generation method and adds them to the training data set, improving the defensive capability of the model. Active defense generally filters or denoises the data set before training to convert it into clean samples, transforms the clean samples so that the data structure is difficult to discover, or quantizes and discretizes them to eliminate the influence of adversarial perturbations. However, defense at the present stage relies mainly on passive methods, and active defense methods remain scarce, so a method for actively defending against adversarial samples is needed.
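To make the simplest listed attack concrete, an FGSM-style perturbation can be sketched as follows. This is an illustrative sketch only, not part of the patent; the function name is invented, and the input gradient is assumed to have been computed elsewhere (e.g. by backpropagation through the classifier):

```python
def fgsm_perturb(image, grad, epsilon=0.03):
    """FGSM sketch: shift each pixel by epsilon * sign(gradient of the loss
    w.r.t. that pixel), then clip to the valid [0, 1] range.
    `image` and `grad` are 2-D lists of floats; the gradient is assumed
    to be precomputed (this sketch does not include a network)."""
    def sign(g):
        return 1.0 if g > 0 else (-1.0 if g < 0 else 0.0)
    return [[min(1.0, max(0.0, p + epsilon * sign(g)))
             for p, g in zip(prow, grow)]
            for prow, grow in zip(image, grad)]
```

The point of the example is that the perturbation magnitude is bounded by epsilon per pixel, which is what keeps it imperceptible to the human eye.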
Disclosure of Invention
The invention aims to provide an adversarial sample defense method based on a convolutional neural network. With the existing defense methods, adversarial attacks in real application scenarios can at times create serious security problems. In particular, for applications touching personal privacy and safety such as face recognition, autonomous driving, face-based payment and medical auxiliary diagnosis, an attack on the recognition system can endanger users' lives and property. Taking the detection of the novel coronavirus in medical auxiliary diagnosis as an example, most existing recognition and detection models apply only simple preprocessing to the data set before building the model. If the samples of the recognition model are contaminated or attacked, the final judgment may be wrong, with serious consequences for the patient. Therefore, the accuracy and robustness of convolutional neural networks against adversarial samples must be guaranteed and improved. Aiming at the shortcomings of existing adversarial sample defenses, a method for actively defending against adversarial samples is provided.
The purpose of the invention is realized by the following technical scheme:
a confrontation sample defense method based on a convolutional neural network improves the robustness of an image recognition model by an active defense method for a data set, and comprises the following steps:
selecting a classified recognition image as an original image input data set according to the characteristics presented by the data set applied by image recognition classification, and taking the selected data set as an initial training set; a preprocessing module: filtering and denoising the most initial training set data; the area frame selection module: the image which is processed by the preprocessing module in the earlier stage is trained through the training model in the earlier stage to select a characteristic area which is sensitive to result discrimination, and a preliminary area data positioning operation is carried out, wherein 10 to 20 percent of the area in the whole image is generally selected; a data protection module: the characteristic diagram framed by the region framing module is used for screening data of the characteristic diagram by adopting an evolutionary algorithm, the data value of 10% of pixel points in the screening framed region is used as key protection data for identifying and training the whole picture, the process is carried out by using the evolutionary algorithm, and the process of the evolutionary algorithm comprises the following steps:
1) In the framed region covering 10% to 20% of the whole picture, the first pixel datum counted from the lower left corner is defined as the coordinate origin, and the other data points are ordered in sequence;
2) A random function is called to randomly select the data values of 10% of the pixels in the framed region as the initial data group;
3) The importance of the data group to the final model decision, i.e. the fitness of the data group, is calculated; a fitness of 100% fully meets the requirement. The data in the framed region influence the final judgment, and the larger a data value, the more important it is for the final decision. With the data values in the image normalized, values between 0.5 and 1 are defined as pixel data meeting the requirement. The fitness is also related to the distance between pixels: the closer the qualifying data values lie to one another, the greater their influence. The Euclidean distance is used as the measure; in two-dimensional space it is the straight-line distance between two points, and for pixel p(x, y) and pixel q(s, t) it is defined as follows:
De(p, q) = √((x − s)² + (y − t)²)
The metric distance De(p, q) is defined as 4; data values meeting the requirement must satisfy De(p, q) ≤ 4. Taking a qualifying pixel as the center, if more than 10 pixels satisfy the condition, they form a protection group;
4) The fitness threshold is set at 90% and the selected data group is judged; if the data meet the threshold condition, the selection of the data group ends;
5) If the threshold condition is not met, steps 2), 3) and 4) are carried out again;
Through repeated selection and judgment, the data values of 10% of the pixels are found and used as the key protection data of the whole picture;
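The distance rule used in step 3) can be sketched as follows. This is an illustrative sketch only; the function name and the coordinate convention are assumptions, not part of the claims:

```python
import math

def is_protection_group(center, qualifying_pixels, r=4.0, min_points=10):
    """Step 3) clustering rule: a qualifying pixel is the centre of a
    protection group when more than `min_points` other qualifying pixels
    lie within Euclidean distance r of it (r = 4 in the claim).
    `center` and `qualifying_pixels` are (x, y) coordinate pairs."""
    cx, cy = center
    close = sum(1 for (x, y) in qualifying_pixels
                if (x, y) != (cx, cy) and math.hypot(x - cx, y - cy) <= r)
    return close > min_points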
An adversarial sample generation module: for the selected data group, a perturbation value and a perturbation step size are set to bound the range over which the data may vary, generating adversarial samples and providing elastic protection;
For the finally selected data group, which carries decisive weight in the judgment of the whole image, the original data values can be deliberately varied within a certain range (by the perturbation value and perturbation step size) to form new images and data set samples;
A convolution module: the data set processed by the adversarial sample generation module is sent into the subsequent feature-extraction convolutional neural network for training, and the overall parameters of the model are updated. A fully connected module: the data trained by the convolution module are connected to the fully connected layer, and the final result is judged.
In the foregoing adversarial sample defense method based on a convolutional neural network, in the preprocessing module the noise-reduction filtering of an image can proceed from the image itself, mainly in the following manner:
Working from the image itself, the image is preprocessed with a spatial-domain pixel-feature noise reduction algorithm: within a window of a certain size, a new central pixel value is obtained by analysing the direct relation in gray-scale space between the central pixel and its neighbouring pixels. Common spatial-domain image noise reduction algorithms include neighbourhood averaging, median filtering and low-pass filtering.
In the foregoing adversarial sample defense method based on a convolutional neural network, in the preprocessing module the noise-reduction filtering of an image can also proceed from the transform domain, mainly in the following manner:
Working from the transform domain, the image is preprocessed with a transform-domain noise reduction algorithm. The method is based on the frequency domain and mainly uses a filter to separate useful signals from interference signals, via e.g. the Fourier transform, the cosine transform, the K-L transform, or the wavelet transform. The Fourier transform is the most common transform-domain method; the specific formula is as follows:
F(u, v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) · e^{−j2π(ux/M + vy/N)}
where M and N are the image dimensions; u = 0, 1, 2, ..., M − 1; v = 0, 1, 2, ..., N − 1.
For these two image preprocessing methods, either one or both can be chosen according to the actual characteristics of the data set.
In the adversarial sample defense method based on a convolutional neural network, the region framing module screens out the regions of an image that play an important role in the final decision, mainly in the following way:
When each image of the data set contains only a single recognition category, or the recognition category occupies a large proportion of the whole image, the sensitive region of each image is delimited by the gradient-weighted class activation mapping algorithm.
In the adversarial sample defense method based on a convolutional neural network, the region framing module can alternatively screen out the regions that play an important role in the final decision in the following way:
When one image in the recognition data set contains several instances of the same category, or the recognized category occupies only a small proportion of the image, the score-weighted class activation mapping algorithm is used to delimit each image of the data set.
For these two region-framing methods, one of the two schemes is chosen to frame the important regions of the images according to the specific characteristics of the images in the data set.
In the adversarial sample defense method based on a convolutional neural network, the convolution module extracts the features of the image, reduces the amount of input data and retains the data important for decision-making, processing the data in the following way:
When the final decision depends on a large amount of information whose useful parts are scattered across the data set, a DenseNet model or a residual neural network model is selected: their overall structure is more complex, the network is deeper, and the extraction of detailed information is more stable.
In the adversarial sample defense method based on a convolutional neural network, the convolution module can alternatively process the data in the following way:
When a single image in the classification data set carries relatively little information and the data set is small, a VGG model or an AlexNet model with a relatively simple overall network structure and a shallow network is selected.
For these two convolution model schemes, the choice is made according to the size of a single image in the data set and the proportion of recognition information it contains, and one of the two is selected as the backbone of the convolution module in application. In general, the choice of convolutional neural network model is designed around the needs of the image classification data set of the recognition model.
Compared with the prior art, the beneficial effects of the invention are: protective preprocessing of the images in front of the backbone of the image recognition network. The data points critical for recognition are marked and protected, increasing their weight in the network structure. Training in this way improves, on the one hand, the generalization ability of the whole network; on the other hand, the data protection module expands part of the training data set, alleviating the scarcity of data in some model training processes. On this basis, compared with recognition models of the same type, the model converges faster, reducing training time and improving efficiency. More importantly, the security of the model is improved: adversarial attacks are effectively resisted, and the robustness of the model is strongly guaranteed.
Drawings
FIG. 1 is a flow chart of the overall structure of the adversarial sample defense method based on a convolutional neural network;
FIG. 2 is a flow chart of the implementation of the data protection module.
Detailed Description
The invention is further described with reference to the following figures and specific examples.
Fig. 1 shows a flowchart of the overall structure of the adversarial sample defense method based on a convolutional neural network.
Step 1: according to the characteristics of the data set used for image recognition and classification, classified recognition images are selected as the original image input data set, and the selected data set is taken as the initial training set;
the defense network is used in a recognition classification network model, so that the selection of a data set needs to meet the requirement that only one classification category is arranged in a graph and the basic characteristics of data classification are met. The medical CT image is selected as an original image input data set according to the characteristics of the novel coronavirus in the medical CT image, and the selected data set is used as an initial training set, so that the description is also made to embody the problem of selecting classified images. For the size of the initial image, a pixel size of 224 × 224 is selected in the description, and in the process of being applied to practice, adjustment can be actually made. Particularly for medical images, the requirements for detailed information of the images are more complex, and the adjustment can be carried out according to actual conditions.
Step 2: a preprocessing module filters and denoises the initial training set data;
Simple adversarial samples are created by adding noise invisible to the human eye to the original image. Early filtering and noise reduction of the image therefore already provides a certain defensive effect against simple adversarial samples. Denoising the data set at this early stage also safeguards the accuracy of the preliminary training model and provides a better basis for the subsequent framing of sensitive regions.
In this module, the early filtering and denoising of the image data mainly uses two approaches: a spatial-domain pixel-feature noise reduction algorithm and a transform-domain noise reduction algorithm. The choice of filtering method can be made according to the characteristics of the data set; one method can be used alone, or the two can be combined according to the characteristics of the images.
With the spatial-domain pixel-feature noise reduction algorithm, a new central pixel value is obtained by analysing, within a window of a certain size, the direct relation in gray-scale space between the central pixel and its neighbouring pixels. Common spatial-domain image noise reduction algorithms include neighbourhood averaging, median filtering, low-pass filtering, and so on.
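A minimal sketch of one of these spatial-domain methods, a 3 × 3 median filter, might look as follows. This is illustrative only; the border handling is an assumption of the sketch, not specified in the description:

```python
def median_filter(img, k=3):
    """Replace each interior pixel with the median of its k x k window.
    `img` is a 2-D list of numbers; border pixels are left unchanged
    (an illustrative choice)."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [row[:] for row in img]
    for i in range(r, h - r):
        for j in range(r, w - r):
            window = sorted(img[i + di][j + dj]
                            for di in range(-r, r + 1)
                            for dj in range(-r, r + 1))
            out[i][j] = window[len(window) // 2]  # middle of sorted window
    return out
```

Applied to an image corrupted by impulse ("salt-and-pepper") noise, the filter removes isolated outliers while preserving edges better than simple neighbourhood averaging.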
The filtering and noise reduction of the image can also be handled with a transform-domain noise reduction algorithm. This is a frequency-domain method that mainly uses a filter to separate useful signals from interference signals. The basic idea is to apply a transformation that takes the image from the spatial domain into a transform domain, divide the noise by frequency into high-, medium- and low-frequency components so that noise of different frequencies can be separated, and then apply the inverse transformation to bring the image back into the spatial domain, finally removing the image noise. There are many transformations from the spatial domain to a transform domain, such as the Fourier transform, the cosine transform, the K-L transform, and the wavelet transform. The Fourier transform is a common transformation for image noise reduction; the specific formula is as follows:
F(u, v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) · e^{−j2π(ux/M + vy/N)}
where M and N are the image dimensions; u = 0, 1, 2, ..., M − 1; v = 0, 1, 2, ..., N − 1.
Step 3: the region framing module: the images that have passed the preprocessing module are run through the training model to frame the feature regions sensitive to the recognition result, carrying out the preliminary region data positioning operation;
After the convolutional neural network image classification model is trained, the judged result is output through the output layer, the number of outputs corresponding one-to-one with the classification categories. This module mainly screens out, from one image, the region that plays an important role in the final decision, i.e. the sensitive region. For the selection of the sensitive region, this embodiment offers two schemes, chosen according to the data set.
The gradient-weighted class activation mapping algorithm can be chosen when there is only a single recognition category per image, or the recognition category occupies a large share of the whole image. For recognition data sets in which one image contains several instances of the same category, or the recognized category occupies a small share of the image, the score-weighted class activation mapping algorithm can be chosen. Both algorithms select, via the final visualized heat map, the regions to which the recognition system is most sensitive, so the choice can follow the actual situation; for some complex data sets the two algorithms can even be superimposed.
First scheme, the gradient-weighted class activation mapping algorithm: after training the convolutional neural network, we present it with a new image and it returns a class for that image. We take the output feature map of the last convolutional layer, weight each channel by the gradient of the output class with respect to that channel, and map the processed result back onto the original image, obtaining a heat map that indicates which parts of the input image are most active for that class. By processing the training images in the convolutional neural network with this algorithm, a preliminary framing of the sensitive regions of an image can be performed, and the region most sensitive to the final decision can be selected.
The method above selects the sensitive regions in the image by gradient weighting; the algorithm depends on gradients and measures sensitivity locally. The alternative, score-weighted class activation mapping algorithm: the weight of each activation map is obtained from its forward-pass score on the target class, and the final result is the linear combination of weights and activation maps. Instead of processing the feature map of the last convolutional layer, this algorithm generates several heat maps by up-sampling after training. The generated heat maps are each mapped onto and used to activate the original image, and the activated images are fed back into the training model. Finally, the heat maps produced by up-sampling are combined with the final recognition result to form the sensitive region that is decisive for the final decision.
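The channel-weighting step of the gradient-weighted scheme can be sketched as follows. This is a sketch only: it assumes the last-layer activations and their gradients have already been extracted from the network, and the function name is illustrative:

```python
import numpy as np

def grad_cam_heatmap(activations, gradients):
    """Core step of gradient-weighted class activation mapping.
    `activations`: (C, H, W) feature maps of the last conv layer.
    `gradients`: (C, H, W) gradient of the class score w.r.t. those maps.
    Returns an H x W heat map."""
    weights = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted channel sum
    return np.maximum(cam, 0)                         # ReLU: keep positive evidence
```

The resulting map is then up-sampled to the input resolution and thresholded to frame the sensitive region described below.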
After the sensitive region has been selected by one of the two methods above, the mapping result shows that the richer the colour of a region in the original image, the more decisive it is for the final recognition. Therefore the best-scoring portion of the whole picture is selected. With the image sizes unified to 224 × 224 according to step 1, a region occupying roughly 10% to 20% of the original image is preliminarily delimited; that is, a 35 × 35 pixel area is defined as the recognition-sensitive region, completing the preliminary positioning operation.
The division into a 35 × 35 pixel region is explained further here. The division is determined mainly by the size of the sensitive region formed by the region framing module; if the sensitive region in an image is smaller or larger, the area can be adjusted to the actual situation, ensuring on the whole a pixel region of 10% to 20% of the original image.
Step 4: the data protection module: the region selected by the region framing module is screened with an evolutionary algorithm, and the data values of about 10% of the pixels in the framed region, being the most representative data, are selected as the key protection data of the whole image;
The position framed in step 3 is the critical, decisive area for the whole image recognition system, and its data must be marked and protected. Relative to the data volume of the whole picture, however, the amount must be reduced, and the data values of about 10% of the most representative pixels are selected from it. In this example, the data values of 125 pixels are selected according to the size of the region framed in step 3. This is done with an evolutionary algorithm; Fig. 2 shows its detailed flow:
1) Within the delimited 35 × 35 pixel region, the first data point counted from the lower left corner is defined as the coordinate origin, and the other data points are ordered in sequence. Specifically, with a 35 × 35 framed region, the number of data points is 1225;
2) A random function is called to pick 125 values from the indices 0 to 1224, i.e. 125 data points are selected at random, and this group is taken as the initial data group;
3) The importance of the data group to the final model decision, i.e. its fitness, is calculated; a fitness of 100% fully meets the final requirement. The data in the framed region influence the final judgment, and the larger a data value, the more important it is for the final decision. With the data values in the image normalized, and the data set here being medical images, data between 0.5 and 1 are defined as the important protection data points meeting the requirement.
The fitness of a data group in an image is related not only to the size of the individual data values but also to their mutual positions: the closer qualifying data points lie to one another, the greater their influence. The Euclidean distance, also known as the Euclidean metric, is used here as the measure; it is the true distance between two points in m-dimensional space, and in two dimensions it is the length of the straight line segment between two points. For example, the Euclidean distance between pixel p(x, y) and pixel q(s, t) is defined as follows:
De(p, q) = √((x − s)² + (y − t)²)
De(p, q) is a distance measure: the pixels whose distance from point p is at most some value r form a disc of radius r centered at p. Here, with the framed region 35 × 35 in size, the metric distance r is defined as 4. Two pixels with qualifying data values satisfy the condition if their distance is at most 4; taking such a pixel as the center, if more than 10 data points satisfy the condition, they constitute a protection group. Finally, 125 data values satisfying the conditions are found. The fitness selection for the data group must consider both of these aspects to reach the final result.
4) The fitness threshold is set at 90% and the selected data group is judged; if the data meet the threshold condition, the selection of the data group ends;
5) If the threshold condition is not met, steps 2), 3) and 4) are carried out again;
Through repeated selection and judgment, 125 suitable pieces of key protection data are found and used as the key data of the whole picture.
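The iterative selection loop of steps 2) to 5) can be sketched as follows. This is a hedged sketch under stated simplifications: the function name and parameter defaults are illustrative, and the distance-based protection-group check of step 3) is omitted for brevity:

```python
import random

def select_key_data(values, n_select=125, value_thresh=0.5,
                    fitness_thresh=0.90, max_iters=10000, seed=0):
    """Repeatedly draw n_select indices at random (step 2) and accept the
    first draw whose share of qualifying values (>= value_thresh, per
    step 3) reaches fitness_thresh (step 4); otherwise redraw (step 5).
    `values` holds the normalised pixel data of the framed region."""
    rng = random.Random(seed)
    for _ in range(max_iters):
        idx = rng.sample(range(len(values)), n_select)
        fitness = sum(values[i] >= value_thresh for i in idx) / n_select
        if fitness >= fitness_thresh:
            return sorted(idx)
    return None  # no qualifying group within the iteration budget
```

A full implementation would also apply the Euclidean-distance clustering rule before accepting a group, so that the selected points form protection groups rather than isolated pixels.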
And 5: a confrontation sample generation module: for the selected 125 key protection data, setting a disturbance value and a disturbance value increase step length of data change to ensure the variable range of the data, generating a countermeasure sample and performing elastic protection;
the original data are changed within a certain range so that they differ somewhat from the initial values; the increment of each change is the perturbation step size, and the changed amount is the perturbation value. The 125 finally selected key protection data serve as the data that play an important role in the decision judgment of the whole image. If these data are changed, the probability that the final image-class decision is misled is much greater than for changes to other data points. Therefore, for these data, the initial values can be changed randomly within a certain range according to the set perturbation value and perturbation step size, forming new images and data set samples. The new samples are combined with the initial data set and fed into the subsequent training network, updating the weight parameters of the network model.
On the one hand, countermeasure samples can be generated on the basis of the selected data group, enriching the data set; on the other hand, dynamically varying the data over a reasonable interval provides better resilience when others launch pixel attacks against the data.
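A minimal sketch of this perturbation step, assuming normalized pixel values in [0, 1]. The function name and the concrete perturbation bound and step size are illustrative only; the description leaves them as tunable settings:

```python
import random

def generate_countermeasure_samples(image, key_points, max_perturbation=0.1,
                                    step=0.02, n_samples=5):
    """For each new sample, shift every key-protection data value by a
    random multiple of `step`, bounded by `max_perturbation` and clamped
    to [0, 1]; all other pixels are left unchanged.  The bound 0.1 and
    step 0.02 are assumed values for illustration."""
    n_steps = round(max_perturbation / step)
    samples = []
    for _ in range(n_samples):
        new_image = dict(image)  # copy; only key points are modified
        for p in key_points:
            delta = step * random.randint(-n_steps, n_steps)
            new_image[p] = min(1.0, max(0.0, image[p] + delta))
        samples.append(new_image)
    return samples
```

The resulting samples would then be merged with the initial data set before training, as the description states.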
Step 6: convolution module: send the data set processed by the countermeasure sample generation module into the subsequent feature extraction convolutional neural network for training, and update the weight parameters of the model;
after the operation of the preceding countermeasure sample generation module, the newly formed data set is sent to the convolutional neural network for training. The main function of this step is to extract features from the initial image, reduce the amount of input data, and retain the data that is important for decision making; it can be regarded as a filter in the overall model training. For the whole model, the key data protection module added before the input of the convolutional neural network module makes the accuracy of the final judgment more stable. Finally, the convolutional layer module updates the parameters of the convolution kernels according to the training process of the convolutional neural network, improving the robustness of the model.
For this module, the part critical for recognition is the training model itself. Different recognition models can be selected according to the characteristics of different data sets; classical candidates include the residual neural network, the VGG model, the AlexNet model, the GoogLeNet model and the DenseNet model.
The image recognition classification model is selected mainly according to the characteristics of the data set. In one situation, for the aforementioned medical image data set, since the amount of information in the data set is large and the useful information for the final decision is scattered, a model with a complex overall structure, a deeper network, and more stable extraction of detail information is needed, such as the DenseNet model or a residual neural network model. The other situation is a data set in which a single image carries relatively little information and the data set is a small sample; here a model with a relatively simple overall network structure and a shallow network, such as the VGG model or the AlexNet model, can be selected.
In general, the convolutional neural network model need not be an existing one; a model can also be designed according to the category of the data set in the image classification task. Since existing image classification models were designed and constructed for specific data sets, most network models are best constructed in combination with the data set at hand.
Step 7: full connection module: connect the data result trained by the convolution module to the fully connected layer, and judge the category of the data set picture according to the result.
The full connection module mainly plays the role of a classifier in the whole model. If the convolutional layer module maps the raw data to the hidden-layer feature space, then the fully connected layer maps the learned distributed feature representation to the sample label space. The final result is compared with the label data, the loss computed by the loss function is transmitted back to the convolution module by back propagation, and the weight parameters of the module are updated.
The model loss function is selected according to the specific role of the whole network model; for most classification models, the cross entropy loss function is used. Let z = [z0, z1, ..., zC-1] denote the pre-softmax output for a sample over C classes, and let c denote the label of the sample; the loss function is described as follows:
loss(z, c) = -log( exp(zc) / Σj exp(zj) ) = -zc + log Σj exp(zj)
a weight parameter can also be set in the loss function; if weights are set, the formula becomes:
loss(z, c) = wc · ( -zc + log Σj exp(zj) )
where wc = weight[c] · 1{c ≠ ignore_index}.
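The weighted cross entropy formula can be checked with a small pure-Python implementation; the handling of the indicator term follows the PyTorch-style ignore_index convention, which the weight formula appears to describe:

```python
import math

def cross_entropy(z, c, weight=None, ignore_index=-100):
    """loss(z, c) = w_c * (-z_c + log(sum_j exp(z_j))),
    with w_c = weight[c] * 1{c != ignore_index}.  A per-sample sketch;
    batching and normalization are omitted for clarity."""
    if c == ignore_index:
        return 0.0  # the indicator zeroes the loss for ignored labels
    w = 1.0 if weight is None else weight[c]
    log_sum_exp = math.log(sum(math.exp(zj) for zj in z))
    return w * (-z[c] + log_sum_exp)
```

For a two-class output z = [0, 0] with label 0, both formulas reduce to log 2, which makes a convenient sanity check.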
The data included in the implementation steps mainly serve to make the description easier to understand. In a specific implementation, a series of conditions such as the training model of the network, the type of the data set, the amount of data, and the hardware conditions need to be adjusted accordingly. In addition to the above embodiments, the present invention may have other embodiments, and any technical solution formed by equivalent substitution or equivalent transformation falls within the scope of the claims of the present invention.

Claims (7)

1. A countermeasure sample defense method based on a convolutional neural network, characterized in that the robustness of an image recognition model is improved by an active defense method applied to the data set, the method comprising the following steps:
selecting classified recognition images as the original image input data set according to the characteristics of the data set used for image recognition classification, and taking the selected data set as the initial training set; a preprocessing module: filtering and denoising the initial training set data; a region framing module: passing the image processed by the preprocessing module through a pre-trained model to frame a feature region sensitive to the result discrimination and perform a preliminary region data positioning operation, wherein 10%-20% of the area of the whole image is generally framed; a data protection module: screening the data of the feature map framed by the region framing module by an evolutionary algorithm, and taking the data values of about 10% of the pixel points in the framed region as key protection data for the recognition training of the whole picture, the evolutionary algorithm comprising the following steps:
1) in the divided region of 10%-20% of the whole picture, defining the first pixel point data from the lower left corner of the region as the origin of coordinates, and arranging the other data points in order;
2) calling a random function to select the data values of 10% of the pixel points in the framed region as the initial data group;
3) calculating the importance of the data group to the final model decision, namely the fitness of the data group, where full satisfaction corresponds to a fitness of 100%; the data in the framed region contribute to the final result judgment, and the larger a data value, the more important it is to the decision; with the data values in the image normalized, data between 0.5 and 1 are defined as pixel point data meeting the requirement; on the other hand, the fitness is also related to the distance between the pixel points, and the closer the qualifying data values are to each other, the greater the influence; the Euclidean distance is used here as the measure, which in two-dimensional space is the straight-line distance between two points; the Euclidean distance between pixel point p(x, y) and pixel point q(s, t) is defined as follows:
De(p, q) = [(x - s)^2 + (y - t)^2]^(1/2)
defining the measurement distance r as 4; qualifying data values whose distance De(p, q) is less than or equal to 4 satisfy the condition, and when, taking a certain pixel point as the center, more than 10 pixel points satisfy the condition, those points form a protection group;
4) setting the fitness threshold to 90% and evaluating the selected data group; if the data group meets the threshold condition, ending the selection operation;
5) if the threshold condition is not met, repeating the operations of steps 2), 3) and 4);
through repeated selection and judgment, the data values of 10% of the pixel points are found and used as the key protection data of the whole picture;
a countermeasure sample generation module: for the selected key protection data group, setting the perturbation value and the perturbation step size of the data change to bound the variable range of the data, generating countermeasure samples and providing elastic protection; the finally selected data group serves as the data that plays an important role in the decision judgment of the whole image, and the original data values can be changed within a certain range to form new images and data set samples;
a convolution module: sending the data set processed by the countermeasure sample generation module into the subsequent feature extraction convolutional neural network for training, and updating the overall parameters of the model; a full connection module: connecting the data result trained by the convolution module to the fully connected layer, and judging the final result.
2. The countermeasure sample defense method based on a convolutional neural network according to claim 1, characterized in that, in the preprocessing module, the noise reduction filtering of the image is performed on the image itself, mainly by:
preprocessing the image with a spatial pixel characteristic noise reduction algorithm, which acquires a new central pixel value by analyzing the direct gray-scale relation between the central pixel and the other adjacent pixels within a window of a certain size; common spatial-domain image noise reduction algorithms include the neighborhood averaging method, the median filtering method and the low-pass filtering method.
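As an illustration of the spatial-domain methods listed in claim 2, a minimal median filter sketch in pure Python; border handling (here, clamping the window to the image) is an implementation choice not fixed by the claim:

```python
def median_filter(image, size=3):
    """Replace each pixel with the median of the size x size window
    around it, clamping the window at image borders.  `image` is a
    list of rows of numeric pixel values."""
    h, w = len(image), len(image[0])
    half = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - half), min(h, y + half + 1))
                      for i in range(max(0, x - half), min(w, x + half + 1))]
            window.sort()
            out[y][x] = window[len(window) // 2]  # (upper) median
    return out
```

A single impulse-noise spike is removed because the median of its neighborhood ignores the outlier, which is exactly the property that makes median filtering attractive for this preprocessing step.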
3. The countermeasure sample defense method based on a convolutional neural network according to claim 1, characterized in that, in the preprocessing module, the noise reduction filtering of the image is performed in the transform domain, mainly by:
processing based on the frequency domain, mainly separating the useful signal from the interference signal with a filter, for example by Fourier transform, cosine transform, K-L transform or wavelet transform; the Fourier transform is the most common transform-domain method, with the specific formula:
F(u, v) = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x, y) e^{-j2π(ux/M + vy/N)}
where M, N are the image dimensions; u = 0, 1, 2, ..., M-1; v = 0, 1, 2, ..., N-1.
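The 2-D discrete Fourier transform of claim 3 can be evaluated directly as below. This is an O(M²N²) illustration only, suitable for tiny images; any practical implementation would use an FFT:

```python
import cmath

def dft2(f):
    """Direct evaluation of F(u, v) = sum_x sum_y f(x, y)
    * exp(-j*2*pi*(u*x/M + v*y/N)) for an M x N image given as a list
    of rows.  Returns the full complex spectrum F[u][v]."""
    M, N = len(f), len(f[0])
    return [[sum(f[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N))
             for v in range(N)]
            for u in range(M)]
```

For a constant image, all energy concentrates in the DC coefficient F(0, 0), which is the basis for low-pass filtering in this transform domain.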
4. The countermeasure sample defense method based on a convolutional neural network according to claim 1, characterized in that the region framing module screens out the regions in an image that play an important role in the final decision judgment, processed by the following method:
in the case that each image of the data set has only a single recognition category, or the recognition category occupies a large proportion of the whole image, the sensitive region of each image is delimited by the gradient-weighted class activation mapping algorithm.
5. The countermeasure sample defense method based on a convolutional neural network according to claim 1, characterized in that the region framing module screens out the regions in an image that play an important role in the final decision judgment, processed by the following method:
in the case that one image in the image recognition data set contains several instances of the same category, or the recognized categories occupy a small proportion of the image, each image of the data set is delimited by the score-weighted class activation mapping algorithm.
6. The countermeasure sample defense method based on a convolutional neural network according to claim 1, characterized in that the convolution module is used to perform feature extraction on the image, reduce the amount of input data, and retain data that plays an important role in decision making, processed by the following method:
when the amount of information in the data set is large and the useful information for the final decision judgment is scattered, a DenseNet model or a residual neural network model, which has a complex overall structure, a deep network, and more stable extraction of detail information, is selected.
7. The countermeasure sample defense method based on a convolutional neural network according to claim 1, characterized in that the convolution module is used to perform feature extraction on the image, reduce the amount of input data, and retain data that plays an important role in decision making, processed by the following method:
when the amount of information in a single image of the image recognition classification data set is relatively small and the data set is a small sample, a VGG model or an AlexNet model, which has a relatively simple overall network structure and a shallow network, is selected.
CN202111390854.1A 2021-11-19 2021-11-19 Countermeasure sample defense method based on convolutional neural network Active CN114049537B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111390854.1A CN114049537B (en) 2021-11-19 2021-11-19 Countermeasure sample defense method based on convolutional neural network


Publications (2)

Publication Number Publication Date
CN114049537A true CN114049537A (en) 2022-02-15
CN114049537B CN114049537B (en) 2024-05-28

Family

ID=80211121

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111390854.1A Active CN114049537B (en) 2021-11-19 2021-11-19 Countermeasure sample defense method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN114049537B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3757867A1 (en) * 2019-06-24 2020-12-30 Aptiv Technologies Limited Method and system for determining whether input data to be classified is manipulated input data
CN112464245A (en) * 2020-11-26 2021-03-09 重庆邮电大学 Generalized security evaluation method for deep learning image classification model
US10984272B1 (en) * 2018-01-19 2021-04-20 Apple Inc. Defense against adversarial attacks on neural networks
CN113554089A (en) * 2021-07-22 2021-10-26 西安电子科技大学 Image classification countermeasure sample defense method and system and data processing terminal


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WEI JIANG等: "Attack-Aware Detection and Defense to Resist Adversarial Examples", 《IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS》, 26 October 2020 (2020-10-26) *
范宇豪;张铭凯;夏仕冰;: "基于插值法的对抗攻击防御算法", 网络空间安全, no. 04, 25 April 2020 (2020-04-25) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023188409A1 (en) * 2022-03-31 2023-10-05 日本電気株式会社 Information processing device, information processing method, and recording medium
CN114841983A (en) * 2022-05-17 2022-08-02 中国信息通信研究院 Image countermeasure sample detection method and system based on decision score
CN114841983B (en) * 2022-05-17 2022-12-06 中国信息通信研究院 Image countermeasure sample detection method and system based on decision score



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant