CN113222960A - Deep neural network adversarial defense method, system, storage medium and device based on feature denoising - Google Patents

Deep neural network adversarial defense method, system, storage medium and device based on feature denoising

Info

Publication number
CN113222960A
CN113222960A (application CN202110584110.7A); granted publication CN113222960B
Authority
CN
China
Prior art keywords: denoising, model, sample, feature, adversarial
Legal status: Granted
Application number: CN202110584110.7A
Other languages: Chinese (zh)
Other versions: CN113222960B (en)
Inventors: 董宇欣 (Dong Yuxin), 贾龙飞 (Jia Longfei), 陈福坤 (Chen Fukun), 韩爽 (Han Shuang), 闫鹏超 (Yan Pengchao), 刘皓 (Liu Hao), 梁泉 (Liang Quan), 叶润泽 (Ye Runze)
Current Assignee: Harbin Oceanwide Technology Development Co.,Ltd.
Original Assignee: Harbin Engineering University
Priority date: 2021-05-27
Filing date: 2021-05-27
Publication date: 2021-08-06
Application filed by Harbin Engineering University
Priority to CN202110584110.7A
Publication of CN113222960A: 2021-08-06
Application granted; publication of CN113222960B: 2022-06-03
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems


Abstract

A deep neural network adversarial defense method, system, storage medium and device based on feature denoising, belonging to the field of defense against image-based deep learning adversarial examples. The method addresses the problem that existing adversarial-example defenses which denoise features with spatial-domain filtering alone denoise incompletely, and therefore defend poorly. The invention designs a neural network model containing at least one feature denoising module, each comprising a 1x1 convolution, a residual connection and a denoising operation unit. The denoising operation first applies a discrete wavelet transform to an intermediate-layer feature map of the model to separate useful information from noise, then denoises the high-frequency components, which carry the noise, with a combination of frequency-domain and spatial-domain filtering, and finally reconstructs the feature map. Under adversarial training, the method significantly improves robustness against adversarial-example attacks. It is mainly intended for defending image-oriented deep neural networks.

Description

Deep neural network adversarial defense method, system, storage medium and device based on feature denoising
Technical Field
The invention belongs to the field of defense against image-based deep learning adversarial examples, and in particular relates to a feature-denoising-based deep neural network adversarial defense method, system, storage medium and device.
Background
In recent years, with the continued development of artificial intelligence, deep learning has become a research hotspot in the field by virtue of its automatic feature extraction, strong model expressiveness and other characteristics, and is widely applied in computer vision, speech recognition, natural language processing and many other areas. However, deep learning and similar artificial intelligence techniques are a double-edged sword: alongside their excellent performance they carry inherent weaknesses. In image-oriented deep learning, an attacker can add a tiny adversarial perturbation, imperceptible to the human eye, to an original sample and produce an adversarial example that causes the target neural network model to misclassify with high confidence. The existence of adversarial examples raises concerns about the safety of deep learning systems; for systems with high safety requirements, such as autonomous driving or disease diagnosis, a successful adversarial attack could have severe consequences. This both exposes the vulnerability of deep learning in practical applications and reflects a clear cognitive gap between neural networks and human vision. Effectively defending against adversarial-example attacks is therefore a very challenging problem.
In the field of defense against deep learning adversarial examples, processing the model's intermediate-layer feature maps with traditional spatial-domain filters such as non-local mean, bilateral, median and mean filtering, combined with adversarial training, has been shown to markedly improve a model's classification accuracy on adversarial examples, demonstrating the effectiveness of feature denoising as a defense. However, traditional spatial-domain filtering cannot effectively separate useful information from noise, so denoising also degrades the useful content of the image; the denoising is incomplete and the defense effect suffers. More thorough denoising of the model's intermediate-layer feature maps is therefore of real significance and value in the feature-denoising direction.
Disclosure of Invention
The invention aims to solve the problem that existing adversarial-example defenses which perform feature denoising with spatial-domain filtering alone denoise incompletely, and consequently defend poorly against adversarial examples.
A deep neural network adversarial defense method based on feature denoising comprises the following steps:
An image sample is classified with an adversarial-example defense model, namely a convolutional neural network model containing at least one feature denoising module (FSDCNN, Frequency and Space Denoising Convolutional Neural Network); each feature denoising module comprises a 1x1 convolution unit, a residual connection unit and a denoising operation unit;
Denoising operation in the feature denoising module: first apply a discrete wavelet transform to the model's intermediate-layer feature map; then apply frequency-domain and spatial-domain filtering to the high-frequency components, the frequency-domain filtering using a fixed wavelet threshold; finally reconstruct the feature map with the inverse discrete wavelet transform.
Further, the spatial-domain filtering is one of non-local mean filtering, bilateral filtering, median filtering and mean filtering.
Further, the adversarial-example defense model is FSDResNet34, which uses ResNet34 as the backbone network architecture; adding two feature denoising modules to the ResNet34 backbone yields the FSDResNet34 model.
Further, the two feature denoising modules in FSDResNet34 are located after the third and the seventh residual blocks of ResNet34, respectively.
Further, the training process of FSDResNet34 comprises the following steps:
S1, prepare a mixed training set:
Select a number of images from a sample dataset as a clean sample set, comprising a clean training set and a clean test set;
Attack the clean sample set with the PGD (Projected Gradient Descent) adversarial-example generation algorithm to produce an equally sized adversarial sample set, comprising an adversarial training set and an adversarial test set, where the clean training set corresponds to the adversarial training set and the clean test set to the adversarial test set; the clean training set and the adversarial training set together form the mixed training set;
S2, adversarially train FSDResNet34 on the mixed training set from S1, with the optimization objective:

$$\min_{\theta}\ \mathbb{E}_{(x,y)\sim D}\Big[\max_{\delta\in S} L(\theta,\ x+\delta,\ y)\Big]$$

where θ is the model weight parameter, x an original clean sample, δ the adversarial perturbation, y the label of the original clean sample, S the allowed perturbation range, L(·) the loss function (usually cross entropy), D the joint distribution of samples and labels, and E the expected (average) loss; the formula consists of an inner maximization and an outer minimization; for the inner maximization, the PGD adversarial-example generation algorithm is used to obtain the perturbation that maximizes the target loss function; for the outer minimization, feature denoising is applied to the model's intermediate-layer feature maps;
The optimization algorithm used for network training is SGD with momentum;
The adversarial training strategy is to first train the original ResNet34 on the clean training set; once the model has converged and reached high accuracy on the clean test set, the trained weights of each network layer are frozen and copied into the FSDResNet34 model, which is then adversarially trained on the mixed training set, i.e. only the feature denoising modules are trained.
Further, during the training of FSDResNet34, the classification performance of the FSDResNet34 model is tested on the clean test set and the adversarial test set, and the feature denoising effect is shown through feature map visualization.
Further, Matplotlib is used to visualize the model's intermediate-layer feature maps when showing the feature denoising effect through feature map visualization.
A deep neural network adversarial defense system based on feature denoising, configured to execute the deep neural network adversarial defense method based on feature denoising.
A storage medium storing at least one instruction that is loaded and executed by a processor to implement the feature-denoising-based deep neural network adversarial defense method.
An apparatus comprising a processor and a memory, the memory storing at least one instruction that is loaded and executed by the processor to implement the feature-denoising-based deep neural network adversarial defense method.
The invention has the following beneficial effects:
The disclosed feature-denoising-based deep neural network adversarial defense method combines frequency-domain and spatial-domain filtering: the feature denoising module designed by the invention is added to a model to denoise its intermediate-layer feature maps. Through end-to-end adversarial training, it effectively removes the influence of adversarial perturbations on the model's classification performance, preserves classification accuracy on the clean test set, and significantly improves the model's adversarial robustness under adversarial-example attack, thereby achieving a good defense for deep neural networks.
Drawings
Fig. 1 is the overall flow chart of the invention.
Fig. 2 shows the structure of the feature denoising module.
Fig. 3 is the flow chart of FSDCNN feature denoising.
Fig. 4 shows the FSDResNet34 network model architecture.
Fig. 5 is the flow chart of the denoising operation in the feature denoising module.
Fig. 6 is the DWT flow chart for a two-dimensional image.
Fig. 7 is the IDWT flow chart for a two-dimensional image.
Fig. 8 is a process visualization of non-local mean filtering.
Fig. 9 shows the structure of the Non-Local module (Embedded Gaussian).
Fig. 10 shows the classification accuracy of each model on the clean test set under adversarial training (PGD).
Fig. 11 shows the classification accuracy of each model on the adversarial test set under adversarial training (PGD).
Fig. 12 shows the classification accuracy of ResNet34 and FSDResNet34 on adversarial test sets generated with different numbers of PGD attack iterations, under adversarial training.
Fig. 13 shows the classification accuracy of the remaining models on adversarial test sets generated with different numbers of PGD attack iterations, under adversarial training.
Fig. 14 shows the classification accuracy of each model under adversarial training against different adversarial-example attacks.
Fig. 15 is an FGSM attack sample visualization.
Fig. 16 is a BIM attack sample visualization.
Fig. 17 is a PGD attack sample visualization.
Fig. 18 is a feature map visualization of a clean sample in ResNet34.
Fig. 19 is a feature map visualization of a clean sample in FSDResNet34+Mean.
Fig. 20 is an adversarial example (PGD) generated from a clean sample against ResNet34.
Fig. 21 is a feature map visualization of the adversarial example in ResNet34.
Fig. 22 is an adversarial example (PGD) generated from a clean sample against FSDResNet34+Mean.
Fig. 23 is a feature map visualization of the adversarial example in FSDResNet34+Mean.
Detailed Description
The first embodiment is as follows:
the embodiment is a deep Neural network countermeasure defense method based on feature Denoising, which combines frequency domain filtering and Space domain filtering, designs and realizes a brand-new Convolutional Neural network FSDCNN (frequency and Space Denoising connecting Neural network) to defend countermeasure sample attack, not only ensures the classification accuracy of a model to a clean test set, but also obviously improves the countermeasure robustness of the model when facing the countermeasure sample attack.
The method for defending the deep neural network confrontation based on the characteristic denoising comprises the following steps:
1) Preparing the mixed training set
1a) 50,000 images of size 32x32 from the CIFAR10 dataset are selected as the clean sample set, which comprises a clean training set and a clean test set.
1b) The clean sample set from 1a) is attacked with the PGD (Projected Gradient Descent) adversarial-example generation algorithm to produce an equally sized adversarial sample set, comprising an adversarial training set and an adversarial test set, where the clean training set corresponds to the adversarial training set and the clean test set to the adversarial test set; the clean training set and the adversarial training set together form the mixed training set.
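A minimal sketch of step 1b), assuming PyTorch/torchvision and a pgd_attack helper like the one sketched in the PGD discussion further below; the dataset path, batch size and attack hyperparameters here are illustrative, not the patent's reference values:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, ConcatDataset
from torchvision import datasets, transforms

def make_mixed_trainset(model, device, eps=8/255, alpha=2/255, steps=10):
    """Build the mixed training set: clean CIFAR10 plus PGD adversarial copies."""
    clean = datasets.CIFAR10("data", train=True, download=True,
                             transform=transforms.ToTensor())
    loader = DataLoader(clean, batch_size=256, shuffle=False)
    adv_x, adv_y = [], []
    for x, y in loader:  # attack every clean batch -> equal-sized adversarial set
        x, y = x.to(device), y.to(device)
        adv_x.append(pgd_attack(model, x, y, eps, alpha, steps).cpu())
        adv_y.append(y.cpu())
    adv = TensorDataset(torch.cat(adv_x), torch.cat(adv_y))
    return ConcatDataset([clean, adv])  # clean training set + adversarial training set
```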
2) Building the feature-denoising-based deep neural network adversarial defense model FSDResNet34
2a) ResNet34 is used as the backbone network architecture and as the reference comparison model.
2b) Two Feature Denoising Blocks are added to the ResNet34 backbone, after the third and the seventh residual blocks respectively, yielding the feature-denoising-based FSDResNet34 model; each feature denoising module comprises a 1x1 convolution unit, a residual connection unit and a denoising operation unit. When the method is applied to other models (beyond ResNet34) and other datasets (beyond CIFAR10), more than two feature denoising modules may be inserted, i.e. any number of modules may be used, with their number and positions chosen to suit the model and the dataset.
2c) The denoising operation in the feature denoising module combines frequency-domain and spatial-domain filtering: a Discrete Wavelet Transform (DWT) is first applied to the model's intermediate-layer feature map, then frequency-domain and spatial-domain filtering are applied to the high-frequency components, the frequency-domain filtering using fixed-threshold wavelet denoising (VisuShrink); finally the feature map is reconstructed with the Inverse Discrete Wavelet Transform (IDWT). Different choices of spatial-domain filter yield different FSDResNet34 variants.
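A rough PyTorch sketch of the module in 2b)/2c) follows; the denoise_fn callable stands for a DWT/threshold/IDWT operation like the one sketched after the Fig. 5 discussion below, and the exact wiring order is one plausible reading of the module, not taken verbatim from the patent's figures:

```python
import torch
from torch import nn

class FeatureDenoisingBlock(nn.Module):
    """Feature denoising module: denoising operation + 1x1 conv + residual link."""
    def __init__(self, channels, denoise_fn):
        super().__init__()
        self.denoise_fn = denoise_fn          # e.g. the wavelet_denoise sketch below
        # 1x1 convolution: fuses channels and adds learnable parameters so the
        # model can self-adjust between useful signal and adversarial perturbation
        self.conv1x1 = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        denoised = self.denoise_fn(x)         # DWT -> threshold/filter -> IDWT
        return x + self.conv1x1(denoised)     # residual connection preserves signal
```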
3) Adversarial training with the mixed training set from 1) and the FSDResNet34 model from 2)
3a) The optimization objective used for network training is:

$$\min_{\theta}\ \mathbb{E}_{(x,y)\sim D}\Big[\max_{\delta\in S} L(\theta,\ x+\delta,\ y)\Big]$$

where θ is the model weight parameter, x an original clean sample, δ the adversarial perturbation, y the label of the original clean sample, S the allowed perturbation range, L(·) the loss function (usually cross entropy), D the joint distribution of samples and labels, and E the expected (average) loss. The formula consists of an inner maximization and an outer minimization. For the inner maximization, the PGD adversarial-example generation algorithm is used to obtain the perturbation that maximizes the target loss function; for the outer minimization, feature denoising is applied to the model's intermediate-layer feature maps.
3b) The optimization algorithm used for network training is SGD with momentum, with an initial learning rate of 0.1 and a multi-step learning-rate decay strategy (MultiStepLR), i.e. the learning rate is reduced at fixed epoch intervals.
3c) The adversarial training strategy is to first train the original ResNet34 on the clean training set from 1a); once the model has converged and reached high accuracy on the clean test set from 1a), the trained weights of each network layer are frozen and copied into the FSDResNet34 model, which is then adversarially trained on the mixed training set from 1b), i.e. only its feature denoising modules are trained.
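A hedged sketch of this training strategy, assuming the denoising modules' parameter names contain "denoise" (an illustrative naming convention) and illustrative epoch counts and decay milestones:

```python
import torch
from torch import nn, optim

def adversarial_train(model, mixed_loader, device, epochs=50):
    """Train only the feature denoising modules on the mixed (clean+PGD) set."""
    for name, p in model.named_parameters():
        p.requires_grad = "denoise" in name      # freeze the pretrained ResNet34 layers
    opt = optim.SGD([p for p in model.parameters() if p.requires_grad],
                    lr=0.1, momentum=0.9)        # SGD with momentum, initial lr 0.1
    sched = optim.lr_scheduler.MultiStepLR(opt, milestones=[25, 40], gamma=0.1)
    loss_fn = nn.CrossEntropyLoss()
    model.to(device)
    for _ in range(epochs):
        model.train()
        for x, y in mixed_loader:                # outer minimization over theta
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        sched.step()                             # multi-step learning-rate decay
```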
4) Testing the model's classification performance with the clean test set from 1a) and the adversarial test set from 1b), and showing the feature denoising effect through feature map visualization
4a) 10,000 images of size 32x32 from the CIFAR10 dataset are selected as the clean test set; testing the FSDResNet34 model on it yields its classification accuracy on clean data.
4b) Using the current weights of the FSDResNet34 model, the clean test set from 4a) is attacked with the PGD adversarial-example generation algorithm to produce an equally sized adversarial test set, on which FSDResNet34 is tested to obtain its classification accuracy under attack.
4c) The clean test set is attacked with different numbers of PGD iterations to produce several adversarial test sets, on which FSDResNet34 is tested to measure the model's robustness against PGD attacks of different strengths.
4d) The clean test set is attacked with three adversarial-example generation algorithms, FGSM, BIM and PGD, producing three adversarial test sets on which FSDResNet34 is tested to measure the model's robustness against different adversarial-example attacks.
4e) The model's intermediate-layer feature maps are visualized with Matplotlib; comparing the reference model ResNet34 with the FSDResNet34+Mean model shows the defense performance of the invention clearly.
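A small Matplotlib sketch of 4e); registering a forward hook is one common way to grab an intermediate-layer feature map, and the layer choice, grid size and file name here are illustrative:

```python
import matplotlib.pyplot as plt
import torch

def show_feature_maps(model, layer, x, n=16, path="fmap.png"):
    """Save a grid of the first n channels of `layer`'s output for batch x."""
    feats = {}
    hook = layer.register_forward_hook(
        lambda mod, inp, out: feats.setdefault("out", out.detach().cpu()))
    with torch.no_grad():
        model(x)
    hook.remove()
    fmap = feats["out"][0]                       # feature maps of the first sample
    fig, axes = plt.subplots(4, n // 4, figsize=(8, 8))
    for ax, ch in zip(axes.flat, fmap[:n]):
        ax.imshow(ch.numpy(), cmap="viridis")
        ax.axis("off")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
```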
The invention is described in further detail below with reference to the accompanying drawings:
Fig. 1 is the overall flow chart of the invention, divided into four steps: first, prepare a mixed training set containing clean samples and adversarial examples; second, build the feature-denoising-based deep neural network adversarial defense model; third, choose a suitable loss function and optimization algorithm and adversarially train with the mixed training set and the model from the first two steps; fourth, test the classification performance of the trained model on the clean and adversarial test sets.
Fig. 2 shows the structure of the feature denoising module, which consists of three parts: the 1x1 convolution, the residual connection and the denoising operation. The 1x1 convolution serves two purposes: it adjusts the channel count of the denoised feature map for feature fusion in the residual connection, and it introduces learnable parameters so that the model can balance the useful information in the feature map against the adversarial perturbation during end-to-end training. The residual connection prevents the denoising operation from damaging the useful information in the feature map while the adversarial perturbation is being removed. The denoising operation, combining frequency-domain and spatial-domain filtering, denoises the model's intermediate-layer feature map.
Fig. 3 is the flow chart of FSDCNN feature denoising: the feature denoising module designed in Fig. 2 is embedded into a convolutional neural network, yielding a new convolutional neural network, FSDCNN, for defending against adversarial-example attacks.
Fig. 4 shows the FSDResNet34 network model structure: two of the feature denoising modules designed in Fig. 2 are embedded into the ResNet34 residual network, after the third and the seventh residual blocks respectively.
Fig. 5 is the flow chart of the denoising operation in the feature denoising module: a Discrete Wavelet Transform (DWT) is applied to the model's intermediate-layer feature map, the resulting high-frequency components are denoised with combined frequency-domain and spatial-domain filtering, and the feature map is finally reconstructed with the Inverse Discrete Wavelet Transform (IDWT), realizing feature denoising of the intermediate-layer feature map. The invention uses a level-1 two-dimensional discrete wavelet transform based on the Haar wavelet; Fig. 6 shows the DWT flow for a two-dimensional image, and Fig. 7 the IDWT flow.
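A minimal sketch of this denoising operation on a single 2-D feature map, assuming PyWavelets and SciPy; the median filter stands in for the spatial-domain step (the patent equally allows non-local mean, bilateral or mean filtering), and the threshold follows the VisuShrink expressions given next:

```python
import numpy as np
import pywt
from scipy.ndimage import median_filter

def wavelet_denoise(fmap):
    """Level-1 Haar DWT -> VisuShrink soft threshold plus spatial filter on the
    high-frequency subbands -> IDWT reconstruction of the feature map."""
    LL, (LH, HL, HH) = pywt.dwt2(fmap, "haar")
    high = np.concatenate([LH.ravel(), HL.ravel(), HH.ravel()])
    sigma = np.median(np.abs(high)) / 0.6745                 # noise std from HL/LH/HH
    thresh = sigma * np.sqrt(2.0 * np.log(fmap.shape[-1]))   # N = feature-map width
    LH, HL, HH = (pywt.threshold(b, thresh, mode="soft") for b in (LH, HL, HH))
    LH, HL, HH = (median_filter(b, size=3) for b in (LH, HL, HH))
    return pywt.idwt2((LL, (LH, HL, HH)), "haar")
```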
Before spatial-domain filtering is applied to the high-frequency components, VisuShrink wavelet threshold denoising is first applied to them. Its expressions are:

$$T = \delta\sqrt{2\ln N}$$

$$\hat{\delta} = \frac{\mathrm{Median}(|Y_{i,j}|)}{0.6745}$$

where δ denotes the noise standard deviation and N the signal length (image size); since the images in the CIFAR-10 dataset used by the invention are 32x32, N is set to 32 in the experiments. Y_{i,j} denotes the wavelet coefficients produced by the wavelet transform; because the noise is mainly concentrated in the high-frequency components, the experiments estimate the noise standard deviation from the wavelet coefficients of the three high-frequency subbands HL, LH and HH.
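For instance, with N = 32 as used here, √(2 ln 32) = √6.93 ≈ 2.63, so the threshold is T ≈ 2.63·δ̂: wavelet coefficients whose magnitude falls below roughly 2.63 times the estimated noise standard deviation are treated as noise and shrunk toward zero.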
Fig. 8 visualizes the non-local mean filtering process, and Fig. 9 shows the structure of the Non-Local module. Unlike local spatial-domain filters such as bilateral and median filtering, non-local mean filtering uses all pixels in the whole image: it removes noise effectively while preserving the detail features of the image as far as possible. Its basic idea is that the estimate of the pixel at the center of the filtering template is a weighted average of pixels with similar neighborhood structure across the image, where each pixel's weight depends on the Gaussian-weighted Euclidean distance between the filtering template and the image patch centered at that pixel.
The general formula of the Non-Local operation is:

$$y_i = \frac{1}{C(x)}\sum_{\forall j} f(x_i, x_j)\, g(x_j)$$

where x is the input signal, typically an image in computer vision; y is the output signal, the same size as the input x; x_i is a vector whose dimension equals the number of channels of x; f(x_i, x_j) computes the similarity between positions x_i and x_j in the feature map; g(x_j) represents the value of the input signal x at position j; and C(x) is a normalization factor. The summation over all positions j (∀j) takes all position information in the image into account, which is what makes the operation non-local; by comparison, convolution performs a weighted sum over a local region, e.g. with a 3x3 kernel the range of j is [i-1, i+1].
For simplicity, this embodiment considers only the linear case of g(x_j):

$$g(x_j) = W_g x_j$$

where W_g is a weight matrix to be learned, implemented by a 1x1 spatial convolution.
f(x_i, x_j) computes the similarity between positions x_i and x_j in the feature map. The invention extends the Gaussian function by computing similarity in an embedding space:

$$f(x_i, x_j) = e^{\theta(x_i)^{T}\phi(x_j)}$$

$$C(x) = \sum_{\forall j} f(x_i, x_j)$$

where θ(x_i) = W_θ x_i and φ(x_j) = W_φ x_j are the two embeddings, and W_θ and W_φ are weight matrices to be learned, again implemented by 1x1 spatial convolutions.
In addition, to reduce the amount of computation, the invention adopts two efficiency strategies:
(1) Channel halving: W_θ, W_φ and W_g halve the number of feature-map channels, and W_z finally restores the channel count so that dimensions match in the residual connection:

$$z_i = W_z y_i + x_i$$

This design is similar to the bottleneck structure in ResNet and reduces both the parameter count and roughly half of the computation.
(2) Subsampling: a downsampling operation is added after W_φ and W_g to halve the feature-map size, e.g. from 32x32 to 16x16, which reduces the pairwise computation to about a quarter and makes the computation sparser.
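A compact PyTorch sketch of the Embedded-Gaussian Non-Local module with both strategies; the softmax implements the exponential together with the C(x) normalization, and the pooling factor and all names are illustrative:

```python
import torch
from torch import nn

class NonLocalBlock(nn.Module):
    """Embedded-Gaussian non-local block with channel halving and subsampling."""
    def __init__(self, c):
        super().__init__()
        self.inter = c // 2                        # (1) halve the channels
        self.theta = nn.Conv2d(c, self.inter, 1)   # theta(x_i) = W_theta x_i
        self.phi   = nn.Conv2d(c, self.inter, 1)   # phi(x_j)   = W_phi x_j
        self.g     = nn.Conv2d(c, self.inter, 1)   # g(x_j)     = W_g x_j
        self.w_z   = nn.Conv2d(self.inter, c, 1)   # restore channels for residual
        self.pool  = nn.MaxPool2d(2)               # (2) subsample phi and g

    def forward(self, x):
        b, c, h, w = x.shape
        theta = self.theta(x).flatten(2).transpose(1, 2)       # (b, hw, c')
        phi   = self.pool(self.phi(x)).flatten(2)              # (b, c', hw/4)
        g     = self.pool(self.g(x)).flatten(2).transpose(1, 2)
        attn  = torch.softmax(theta @ phi, dim=-1)             # f(x_i,x_j)/C(x)
        y = (attn @ g).transpose(1, 2).reshape(b, self.inter, h, w)
        return self.w_z(y) + x                                 # z_i = W_z y_i + x_i
```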
Besides non-local mean filtering, other embodiments of the spatial-domain filtering in the invention may use bilateral filtering, median filtering or mean filtering.
Bilateral filtering is a nonlinear local filter. It considers both the Euclidean distance between pixels in the image, i.e. spatial proximity, and the intensity difference between pixels, i.e. gray-level similarity, combining the two nonlinearly into a new filtering-template weight:

$$w_s(i,j;k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\delta_s^2}\right)$$

$$w_r(i,j;k,l) = \exp\left(-\frac{\big(g(i,j) - g(k,l)\big)^2}{2\delta_r^2}\right)$$

$$w(i,j;k,l) = w_s(i,j;k,l) \cdot w_r(i,j;k,l)$$

$$\hat{g}(i,j) = \frac{\sum_{(k,l)\in\Omega} g(k,l)\, w(i,j;k,l)}{\sum_{(k,l)\in\Omega} w(i,j;k,l)}$$

where w_s is the spatial proximity weight, w_r the gray-level similarity weight, w the product of the two, ĝ(i,j) the result after bilateral filtering, g(i,j) the pixel value at the template center (i,j), Ω the neighborhood of size (2R+1)x(2R+1) around the template center (i,j), and δ_s and δ_r the standard deviations in the Gaussian functions, whose magnitudes directly affect the bilateral filtering effect.
Median filtering is a nonlinear local filter. Its basic idea is to sort the pixel values of all points in the filtering template in ascending or descending order and replace the pixel value at the template center with the median of that sequence:

$$\hat{f}(x,y) = \underset{(s,t)\in S_{xy}}{\mathrm{Median}}\{g(s,t)\}$$

where S_{xy} denotes the filtering template centered at (x, y) and Median denotes the median function, i.e. the pixel value at (x, y) is replaced by the median of the pixel values of all points in the template centered at (x, y).
Mean filtering is the simplest linear filter among the spatial-domain methods. Its basic idea is to replace the pixel value at the template center with the average of the pixel values of all points in the filtering template:

$$\hat{f}(x,y) = \frac{1}{M}\sum_{(s,t)\in S_{xy}} g(s,t)$$

where S_{xy} denotes the filtering template centered at (x, y) and M is the total number of pixels in the template, including the pixel being filtered.
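For reference, the four candidate spatial-domain filters can be tried with OpenCV on a single-channel map; the parameter values are illustrative, OpenCV's non-local means works on 8-bit images, and the learned Non-Local module sketched above is what the patent actually embeds rather than the pixel-domain variant:

```python
import cv2
import numpy as np

fmap = np.random.rand(32, 32).astype(np.float32)      # stand-in feature map
fmap8 = (fmap * 255).astype(np.uint8)                 # 8-bit copy where required

mean_f      = cv2.blur(fmap, (3, 3))                  # mean filter
median_f    = cv2.medianBlur(fmap, 3)                 # median filter (float ok at ksize 3)
bilateral_f = cv2.bilateralFilter(fmap, 5, 0.1, 1.5)  # d, sigmaColor, sigmaSpace
nlm_f       = cv2.fastNlMeansDenoising(fmap8, None, 10, 7, 21)  # non-local means
```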
Fig. 10 shows each model's classification accuracy on the clean test set under adversarial training (PGD), and Fig. 11 on the adversarial test set. The comparison covers the reference model ResNet34 and models using only spatial-domain filtering. The figures show clearly that models trained with the invention achieve higher classification accuracy on both the clean and the adversarial test sets, i.e. under adversarial training the invention significantly improves a deep neural network's robustness to adversarial-example attack.
Since the main experimental data of the invention are based on 10-iteration PGD attacks (both the adversarial examples used in adversarial training and the adversarial test sets are generated with 10 PGD iterations), the experiments also compare each model's classification accuracy on adversarial test sets generated with different numbers of PGD iterations, to demonstrate the adversarial defense performance more fully. Fig. 12 shows the accuracy of ResNet34 and FSDResNet34, and Fig. 13 that of the remaining models, on adversarial test sets under different PGD iteration counts, all under adversarial training. The figures show that models trained with the invention outperform the comparison models at every iteration count, i.e. the invention is more robust against PGD attacks of different strengths.
In addition, to test robustness against different adversarial-example attacks, the experiments compare each model's adversarial robustness under three attacks: FGSM, BIM and PGD. Fig. 14 shows each model's classification accuracy under adversarial training for the different attacks; the results show that a model adversarially trained with the PGD method also defends well against other first-order attacks such as FGSM and BIM.
FGSM is an adversarial attack method that generates adversarial examples quickly and efficiently. It back-propagates the loss function to the input sample to compute a gradient, then adds a small perturbation to the input in the direction of increasing gradient, so that the loss grows and the attack succeeds. The adversarial perturbation is computed as:

$$\eta = \varepsilon \cdot \mathrm{sign}\big(\nabla_x J(\theta, x, y)\big)$$

where ε is a coefficient controlling the perturbation norm, sign(·) is the sign function giving the gradient direction, J(θ, x, y) is the loss function used to train the target network model, θ the weight parameters, x the input sample, y the label of the input sample, and ∇_x J(θ, x, y) the gradient of the loss with respect to the input. An FGSM attack sample visualization is shown in Fig. 15.
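A minimal PyTorch sketch of FGSM under these definitions; ε and the [0, 1] pixel range are illustrative:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8/255):
    """One-step FGSM: eta = eps * sign(grad_x J(theta, x, y))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)          # J(theta, x, y)
    grad = torch.autograd.grad(loss, x)[0]       # gradient w.r.t. the input
    return (x + eps * grad.sign()).clamp(0, 1).detach()
```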
BIM is an iterative adversarial attack proposed on the basis of FGSM. It generates a more precise adversarial perturbation over multiple iterations; at each iteration, to avoid large pixel changes, the pixel values are clipped to a specified neighborhood of the input sample. The adversarial examples are generated as:

$$x^{adv}_{N+1} = \mathrm{Clip}_{x,\varepsilon}\Big\{x^{adv}_{N} + \alpha \cdot \mathrm{sign}\big(\nabla_x J(x^{adv}_{N}, y_{true})\big)\Big\}$$

where N is the iteration count, x^{adv}_{N} the perturbed image at the N-th iteration, α the per-iteration step size, Clip(·) a clipping function that keeps the clipped image within the ε-neighborhood of the original image, and y_{true} the true label of the original sample. A BIM attack sample visualization is shown in Fig. 16.
PGD is another iterative adversarial attack proposed on the basis of BIM. Its basic idea is to initialize randomly within the range allowed around the original sample and then attack over multiple iterations, projecting the perturbation back into the specified constraint set at each iteration. The adversarial examples are generated as:

$$x^{adv}_{N+1} = \Pi_{x+S}\Big(x^{adv}_{N} + \alpha \cdot \mathrm{sign}\big(\nabla_x J(\theta, x^{adv}_{N}, y)\big)\Big)$$

where Π_{x+S} projects onto the allowed perturbation set around x. PGD differs from BIM mainly in adding the random initialization and in using more iterations. A PGD attack sample visualization is shown in Fig. 17.
To observe the adversarial defense performance more intuitively, the invention also visually compares the intermediate-layer feature maps of the reference model ResNet34 and the FSDResNet34+Mean model. Here clean denotes a clean sample, adv an adversarial example generated with the PGD attack, layer1_2 the output feature map of the third residual block, layer2_0 the output feature map of the fourth residual block, and dwt_space1 the output feature map of the first feature denoising module. Because the adversarial training method used by the invention freezes all weights except those of the feature denoising modules, the two compared models produce identical feature maps at layer1_2; the difference is whether the disclosed feature denoising module is added.
Fig. 18 shows the feature map visualization of a clean sample in ResNet34, and Fig. 19 in FSDResNet34+Mean. Comparison shows that both models classify correctly, but layer2_0 of ResNet34 retains only some contour information of the bird, whereas in dwt_space1 of the FSDResNet34+Mean model the bird's main feature information is more distinct.
Fig. 20 shows the adversarial example (PGD) generated from a clean sample against ResNet34, Fig. 21 the feature map visualization of that adversarial example in ResNet34, Fig. 22 the adversarial example (PGD) generated against FSDResNet34+Mean, and Fig. 23 its feature map visualization in FSDResNet34+Mean. Comparison shows that ResNet34 misclassifies the adversarial example as a deer, while FSDResNet34+Mean still classifies it correctly. Moreover, comparing the feature maps at layer2_0 and dwt_space1 shows clearly that layer2_0 contains noise-like activations in regions without semantic information, and the bird's feature information is unrecognizable to the human eye: the adversarial perturbation severely damages the basic structure of the image, and this damage is amplified as network depth grows, finally causing the model to misclassify. In dwt_space1, by contrast, the bird's main feature information is still visible, indicating that the feature denoising module designed by the invention both removes the adversarial perturbation and protects the main structural information. Visualizing the intermediate-layer feature maps thus not only reveals how the model learns feature representations but also shows intuitively why the invention significantly improves the model's robustness to adversarial-example attack.
Embodiment 2:
This embodiment is a deep neural network adversarial defense system based on feature denoising, configured to execute the deep neural network adversarial defense method based on feature denoising.
Embodiment 3:
This embodiment is a storage medium storing at least one instruction that is loaded and executed by a processor to implement the feature-denoising-based deep neural network adversarial defense method.
Embodiment 4:
This embodiment is an apparatus comprising a processor and a memory, the memory storing at least one instruction that is loaded and executed by the processor to implement the feature-denoising-based deep neural network adversarial defense method.
The present invention is capable of other embodiments and its several details are capable of modifications in various obvious respects, all without departing from the spirit and scope of the present invention.

Claims (10)

1. A deep neural network adversarial defense method based on feature denoising, characterized by comprising the following steps:
classifying an image sample with an adversarial-example defense model, wherein the adversarial-example defense model is a convolutional neural network model containing at least one feature denoising module, and the feature denoising module comprises a 1x1 convolution unit, a residual connection unit and a denoising operation unit;
the denoising operation in the feature denoising module: first performing a discrete wavelet transform on the model's intermediate-layer feature map; then performing frequency-domain and spatial-domain filtering on the high-frequency components, the frequency-domain filtering using fixed wavelet-threshold denoising; and finally reconstructing the feature map with the inverse discrete wavelet transform.
2. The method of claim 1, characterized in that the spatial-domain filtering is one of non-local mean filtering, bilateral filtering, median filtering and mean filtering.
3. The method of claim 1 or 2, characterized in that the adversarial-example defense model is FSDResNet34, which uses ResNet34 as the backbone network architecture; adding two feature denoising modules to the ResNet34 backbone yields the FSDResNet34 model.
4. The method of claim 3, characterized in that the two feature denoising modules in FSDResNet34 are located after the third and the seventh residual blocks of ResNet34, respectively.
5. The method of claim 4, characterized in that the training process of FSDResNet34 comprises the following steps:
S1, preparing a mixed training set:
selecting a number of images from a sample dataset as a clean sample set comprising a clean training set and a clean test set;
attacking the clean sample set with the PGD (Projected Gradient Descent) adversarial-example generation algorithm to produce an equally sized adversarial sample set comprising an adversarial training set and an adversarial test set, the clean training set corresponding to the adversarial training set and the clean test set to the adversarial test set; the clean training set and the adversarial training set together forming the mixed training set;
S2, adversarially training FSDResNet34 on the mixed training set from S1 with the optimization objective:

$$\min_{\theta}\ \mathbb{E}_{(x,y)\sim D}\Big[\max_{\delta\in S} L(\theta,\ x+\delta,\ y)\Big]$$

wherein θ is the model weight parameter, x an original clean sample, δ the adversarial perturbation, y the label of the original clean sample, S the allowed perturbation range, L(·) the loss function, usually cross entropy, D the joint distribution of samples and labels, and E the expected loss; the formula consists of an inner maximization and an outer minimization; for the inner maximization, the PGD adversarial-example generation algorithm is used to obtain the perturbation that maximizes the target loss function; for the outer minimization, feature denoising is applied to the model's intermediate-layer feature maps;
the optimization algorithm used for network training being SGD with momentum;
the adversarial training strategy being to first train the original ResNet34 with the clean training set, then, after the model converges and reaches high accuracy on the clean test set, freeze the trained weights of each network layer into the FSDResNet34 model and adversarially train FSDResNet34 with the mixed training set, i.e. train only its feature denoising modules.
6. The method of claim 5, characterized in that during the training of FSDResNet34 the classification performance of the FSDResNet34 model is tested with the clean test set and the adversarial test set, and the feature denoising effect is shown through feature map visualization.
7. The method of claim 6, characterized in that Matplotlib is used to visualize the model's intermediate-layer feature maps when showing the feature denoising effect through feature map visualization.
8. A deep neural network adversarial defense system based on feature denoising, configured to execute a deep neural network adversarial defense method based on feature denoising according to any one of claims 1 to 7.
9. A storage medium storing at least one instruction that is loaded and executed by a processor to implement a feature-denoising-based deep neural network adversarial defense method according to any one of claims 1 to 7.
10. An apparatus comprising a processor and a memory, characterized in that the memory stores at least one instruction that is loaded and executed by the processor to implement a feature-denoising-based deep neural network adversarial defense method according to any one of claims 1 to 7.
CN202110584110.7A 2021-05-27 2021-05-27 Deep neural network adversarial defense method, system, storage medium and device based on feature denoising Active CN113222960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110584110.7A CN113222960B (en) 2021-05-27 2021-05-27 Deep neural network adversarial defense method, system, storage medium and device based on feature denoising


Publications (2)

Publication Number Publication Date
CN113222960A 2021-08-06
CN113222960B 2022-06-03

Family

ID=77098737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110584110.7A Active CN113222960B (en) 2021-05-27 2021-05-27 Deep neural network adversarial defense method, system, storage medium and device based on feature denoising

Country Status (1)

Country Link
CN (1) CN113222960B (en)


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150105631A1 (en) * 2006-05-12 2015-04-16 Bao Tran Health monitoring appliance
CN106056555A (en) * 2016-06-03 2016-10-26 宁波大红鹰学院 Image denoising method
CN106204467A (en) * 2016-06-27 2016-12-07 深圳市未来媒体技术研究院 A kind of image de-noising method based on cascade residual error neutral net
WO2019014487A1 (en) * 2017-07-12 2019-01-17 The Regents Of The University Of California Detection and prevention of adversarial deep learning
CN108537271A (en) * 2018-04-04 2018-09-14 重庆大学 A method of resisting sample is attacked based on convolution denoising self-editing ink recorder defence
KR20200068607A (en) * 2018-12-05 2020-06-15 건국대학교 산학협력단 Method of correcting an image for denoising image manipulation attack and apparatuses performing the same
KR20200080424A (en) * 2018-12-19 2020-07-07 네이버 주식회사 System and method for evading adversarial attacks on deep network
CN110473154A (en) * 2019-07-31 2019-11-19 西安理工大学 A kind of image de-noising method based on generation confrontation network
CN110717522A (en) * 2019-09-18 2020-01-21 平安科技(深圳)有限公司 Countermeasure defense method of image classification network and related device
CN111462737A (en) * 2020-03-26 2020-07-28 中国科学院计算技术研究所 Method for training grouping model for voice grouping and voice noise reduction method
CN112035834A (en) * 2020-08-28 2020-12-04 北京推想科技有限公司 Countermeasure training method and device, and application method and device of neural network model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Aleksander Madry et al., "Towards Deep Learning Models Resistant to Adversarial Attacks", arXiv *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113780363A (en) * 2021-08-17 2021-12-10 广州大学 Countermeasure sample defense method, system, computer and medium
CN113780363B (en) * 2021-08-17 2023-08-08 广州大学 Method, system, computer and medium for defending countermeasures
CN113822357A (en) * 2021-09-18 2021-12-21 广东工业大学 Training method and classification method of classification model and related devices
CN113822357B (en) * 2021-09-18 2024-01-05 广东工业大学 Classification model training method, classification method and related device
CN114648674A (en) * 2022-03-03 2022-06-21 北京国腾创新科技有限公司 Filtering method and device for resisting sample image, electronic equipment and medium
CN115860112A (en) * 2023-01-17 2023-03-28 武汉大学 Countermeasure sample defense method and equipment based on model inversion method
CN116824695A (en) * 2023-06-07 2023-09-29 南通大学 Pedestrian re-identification non-local defense method based on feature denoising
CN116883780A (en) * 2023-06-29 2023-10-13 北华航天工业学院 Adaptive position constraint sparse countermeasure sample generation method based on domain transformation
CN116883780B (en) * 2023-06-29 2023-12-08 北华航天工业学院 Adaptive position constraint sparse countermeasure sample generation method based on domain transformation
CN116805156A (en) * 2023-07-04 2023-09-26 河北大学 Denoising method of convolutional neural network for prediction

Also Published As

Publication number Publication date
CN113222960B (en) 2022-06-03

Similar Documents

Publication Publication Date Title
CN113222960B (en) Deep neural network adversarial defense method, system, storage medium and device based on feature denoising
Valsesia et al. Deep graph-convolutional image denoising
Hou et al. Robust histopathology image analysis: To label or to synthesize?
CN107633486B (en) Structural magnetic resonance image denoising method based on three-dimensional full-convolution neural network
Kang et al. Shakeout: A new regularized deep neural network training scheme
Hou et al. NLH: A blind pixel-level non-local method for real-world image denoising
Ye et al. Deep learning hierarchical representations for image steganalysis
CN110276377B (en) Confrontation sample generation method based on Bayesian optimization
Urolagin et al. Generalization capability of artificial neural network incorporated with pruning method
CN113379618B (en) Optical remote sensing image cloud removing method based on residual dense connection and feature fusion
CN111539916A (en) Image significance detection method and system for resisting robustness
CN111598787A (en) Biological radar image denoising method and device, electronic equipment and storage medium thereof
CN113111963A (en) Method for re-identifying pedestrian by black box attack
US20230325982A1 (en) Methods, systems and computer programs for processing image data for generating a filter
CN114049537B (en) Countermeasure sample defense method based on convolutional neural network
CN115984979A (en) Unknown-countermeasure-attack-oriented face counterfeiting identification method and device
CN113221388B (en) Method for generating confrontation sample of black box depth model constrained by visual perception disturbance
CN114429151A (en) Magnetotelluric signal identification and reconstruction method and system based on depth residual error network
Khan et al. Efficient blind image deconvolution using spectral non-Gaussianity
Shin et al. Deep orthogonal transform feature for image denoising
CN112766401B (en) Countermeasure sample defense method based on significance countermeasure training
CN112346056B (en) Resolution characteristic fusion extraction method and identification method of multi-pulse radar signals
Ekmekci et al. What does your computational imaging algorithm not know?: A Plug-and-Play model quantifying model uncertainty
CN114842242A (en) Robust countermeasure sample generation method based on generative model
CN114359653A (en) Attack resisting method, defense method and device based on reinforced universal patch

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 2022-06-13

Address after: 150001 No. 1, floor 4, unit 3, building 7, Huanghe Lvyuan community, Nangang concentration area, economic development zone, Harbin, Heilongjiang Province

Patentee after: Harbin Oceanwide Technology Development Co.,Ltd.

Address before: 150001 No. 145, Nantong Avenue, Nangang District, Heilongjiang, Harbin

Patentee before: HARBIN ENGINEERING UNIVERSITY

TR01 Transfer of patent right