CN111598805A

CN111598805A - Adversarial example defense method and system based on VAE-GAN

Info

Publication number: CN111598805A
Application number: CN202010402772.3A
Authority: CN
Inventors: 何永庆; 王海卫; 王荣耀; 王珂
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2020-05-13
Filing date: 2020-05-13
Publication date: 2020-08-28

Abstract

The invention belongs to the technical field of adversarial example defense and discloses a VAE-GAN-based adversarial example defense method and system. The method is a form of input preprocessing and can be transferred between different classification models; the original classification network does not need to be retrained, so the training cost is low; the classification accuracy on original noise-free samples is hardly affected; no external adversarial examples are needed for training; the defense is also effective against adversarial examples with small perturbations; and preprocessing is fast, with output image quality close to that of the original noise-free image.

Description

Adversarial example defense method and system based on VAE-GAN

Technical Field

The invention belongs to the technical field of adversarial example defense, and particularly relates to an adversarial example defense method and system based on VAE-GAN.

Background

Deep neural networks currently perform very well on many problems that are difficult for traditional machine learning. As deep neural network models continue to improve, more and more deep learning solutions are entering daily life, for example pattern recognition, face recognition, autonomous driving, and voice command recognition. Although deep neural networks perform well in many fields, Szegedy et al. showed that modern deep neural networks are highly vulnerable to adversarial examples: adding only a slight perturbation (imperceptible to human vision) to an original image can cause a deep neural network model to misclassify it (as shown in Fig. 5). Adversarial attack methods against deep neural networks keep multiplying, and the perturbation they require keeps shrinking, so traditional image denoising and techniques that reduce overfitting of the deep neural network cannot defend against adversarial examples. In addition, existing defense schemes suffer from high training cost and poor transferability.

Existing defense solutions fall into three directions: input preprocessing, improving the neural network model, and detection only (identifying whether an input is an adversarial example without processing it). The current defense schemes are as follows:

(1) Input preprocessing: compressing and reconstructing the image, rescaling the image, reducing the image resolution, and denoising the image.

(2) Improving the neural network model: limiting the output of neurons, adding an immutable part to the neural network model, reducing overfitting of the neural network, and adding adversarial examples to the training set to improve the robustness of the model.

(3) Detection only: using an SVM to determine whether the input data is an adversarial example, or using a capsule network to determine whether the input data is an adversarial example.

However, among the existing defense schemes, input preprocessing degrades the quality of the input image and thus lowers classification accuracy on original noise-free images. Such schemes also mostly work well only against adversarial examples with large perturbations; the smaller the perturbation, the worse the defense.

Improving the neural network model lowers the classification accuracy of the model to some extent. Its main drawbacks are that the network model must be retrained, that the defense works only for the current model and cannot be transferred to other network models, and that the defense must be upgraded whenever adversarial attacks are upgraded (otherwise newer attacks cannot be defended against), so the network training cost of the defense is extremely high.

Detection only: the adversarial examples themselves still cannot be recognized (classified), and input data affected by slight random noise is sometimes misidentified.

Most current defense strategies are trained using adversarial examples, which incurs additional training cost.

In summary, the problems of the prior art are as follows: (1) Traditional image denoising and techniques that reduce overfitting of the deep neural network cannot defend against adversarial examples, and existing defense schemes suffer from high training cost and poor transferability.

(2) Among the existing defense schemes, input preprocessing degrades the quality of the input image and lowers classification accuracy on original noise-free images. Such schemes also mostly work well only against adversarial examples with large perturbations; the smaller the perturbation, the worse the defense.

(3) Improving the neural network model requires retraining the network model and is effective only for the current model; the defense cannot be transferred to other network models, and it must be upgraded whenever adversarial attacks are upgraded (otherwise newer attacks cannot be defended against), so the network training cost of the defense is extremely high.

(4) Detection only: the adversarial examples themselves still cannot be recognized (classified), and input data affected by slight random noise is sometimes misidentified. Most existing defense strategies are trained using adversarial examples, which makes the training cost too high.

The difficulty of solving the above technical problems is as follows. Algorithms for generating adversarial examples are continually improving, which leads to:

(1) The cost of generating adversarial examples keeps falling, making them easier to produce for attacks.

(2) The attack capability of adversarial examples keeps increasing.

(3) The attack mode is also shifting from pure white-box attacks to black-box attacks.

The significance of solving the above technical problems is as follows. Many deep learning solutions have already entered daily life, such as face recognition, autonomous driving, and video detection, and the existence of adversarial examples poses a serious risk to their use. For example, in a face recognition system, an attacker could use an adversarial example to impersonate another person, break into internal government or company systems, and steal confidential information. Or, in autonomous driving, an adversarial example could be used to cover a real road sign so that the vehicle's driving system cannot make the correct decision about the sign, causing a serious traffic accident. Therefore, when designing a deep learning solution with stringent security requirements, defending against adversarial example attacks must be considered. The existence of adversarial examples greatly limits the use of deep learning solutions, so research on how to defend effectively against adversarial example attacks has great practical significance.

Disclosure of Invention

In view of the problems in the prior art, the invention provides a VAE-GAN-based adversarial example defense method, which aims to solve the problems that existing adversarial example defense schemes have high training cost and poor generality, and may reduce the classification accuracy of the original classifier.

The VAE-GAN-based adversarial example defense method denoises adversarial examples using a variational autoencoder (VAE) and a generative adversarial network (GAN): the VAE serves as a preprocessing model for the classifier and denoises the adversarial example, and the GAN assists the training of the VAE so that the image output by the VAE is closer to the original noise-free image.

Further, the VAE-GAN-based adversarial example defense method comprises the following steps:

Step one, selecting a suitable network structure to construct the VAE-GAN training network.

Step two, training the VAE-GAN network.

Step three, using the VAE-GAN module as a preprocessing module for the classifier to defend against adversarial example attacks.

A VAE-GAN model that fuses a VAE and a GAN is used to defend against adversarial example attacks. As shown in Fig. 6, a VAE-GAN denoising model dedicated to adversarial examples is trained in step two and serves as a preprocessing block for the classifier. Every sample input to the classifier is first denoised by the VAE-GAN model and then fed to the classifier for classification, so that the classifier can ultimately classify adversarial examples correctly.
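The deployment described above amounts to composing a denoiser with an unchanged classifier. The following minimal sketch illustrates only that composition; `defended_predict`, `toy_denoise`, and `toy_classify` are hypothetical stand-ins introduced for illustration and are not the trained VAE-GAN or the classifiers of the embodiment.

```python
import numpy as np

def defended_predict(denoise, classify, x):
    """Denoise every input first, then pass it to the unchanged classifier."""
    return classify(denoise(x))

def toy_denoise(x):
    # Hypothetical stand-in for the trained VAE-GAN: snap pixels to {0, 1}.
    return np.round(np.clip(x, 0.0, 1.0))

def toy_classify(x):
    # Hypothetical stand-in classifier: class 1 if the mean pixel exceeds 0.5.
    return (x.reshape(len(x), -1).mean(axis=1) > 0.5).astype(int)

x_clean = np.array([[0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]])
# A small perturbation on the dark pixels flips the undefended prediction.
x_adv = x_clean + np.array([0, 0, 0, 0.2, 0.2, 0.2, 0.2, 0.2])

print(toy_classify(x_clean), toy_classify(x_adv))
print(defended_predict(toy_denoise, toy_classify, x_adv))
```

Because the classifier is only called, never modified, the same denoiser can in principle be placed in front of a different classifier without retraining either one.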

Further, in step one, the method for selecting a suitable network structure to construct the VAE-GAN training network is as follows:

A suitable network structure is selected to construct the VAE-GAN training network. The neural network structures chosen for the VAE and GAN differ depending on the classifier: for small datasets, a DNN structure is chosen for the VAE and GAN; for large datasets, a deeper neural network or a convolutional neural network is chosen.

Further, in step one, the VAE-GAN comprises two parts: a VAE part and a GAN part. The main function of the VAE is to denoise the input image; the GAN then pushes the distribution of the VAE-reconstructed images as close as possible to the distribution of the original images, which ensures that the quality of the denoised image is close to that of the original noise-free image.

The VAE can be further divided into an Encoder part and a Decoder part. The Encoder maps an input image sample to two n-dimensional vectors (a mean vector and a standard deviation vector). The Decoder restores these two n-dimensional vectors, after noise is added, to the original image sample. The invention improves the quality of the images generated by the variational autoencoder (VAE) by using a GAN to assist the training of the VAE's Decoder. The GAN comprises two parts: the Generator converts the one-dimensional vector z into an image, and the Discriminator identifies whether the input image is a real image or one produced by the Generator. The decoding network of the VAE and the generator network of the GAN share their network parameters; for convenience, they are collectively referred to as the decoding network below.
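The step between the Encoder and the Decoder can be sketched as follows, assuming that "adding noise" refers to the usual VAE reparameterization z = μ + σ·ε with ε drawn from a standard normal distribution. The function name `sample_latent` is hypothetical.

```python
import numpy as np

def sample_latent(mu, sigma, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization)."""
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps

rng = np.random.default_rng(0)
mu = np.array([[1.0, -2.0]])      # mean vector from the Encoder (made-up values)
sigma = np.array([[0.5, 0.5]])    # standard deviation vector (made-up values)
z = sample_latent(mu, sigma, rng) # latent vector passed to the Decoder
print(z.shape)
```

Writing the sample this way keeps the randomness in ε, so gradients can flow to μ and σ when an automatic differentiation framework is used.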

Further, in step two, the method for training the VAE-GAN network is as follows:

After the VAE-GAN network has been constructed, the optimization objective of the whole network must be determined. The VAE-GAN defense model comprises three networks (an encoding network, a decoding network, and a discriminator network), and the optimization objectives of the three networks differ.

The model global optimization objective function is defined as follows:

where μ and σ denote the mean and variance of the posterior distribution of the latent variable z, x denotes the original image (not necessarily the input image), [symbol omitted] denotes the output of the decoding network, D is the discriminator network, G is the generator network (decoding network), and γ is an introduced hyperparameter that is generally set to 0.2.

The corresponding model global loss function is:

After the objective function is defined, the model can be trained directly with stochastic gradient descent (SGD) or another optimization algorithm.
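The formulas themselves are not reproduced in this text. For orientation only, a commonly used VAE-GAN formulation that is consistent with the symbols named above might be written as follows. This is an assumption, not the formula of the invention: the symbol x̃ for the decoder output, the squared-error reconstruction term, the treatment of σ as a standard deviation, and the placement of γ on the reconstruction term are all illustrative choices.

```latex
\begin{align}
\mathcal{L}_{\mathrm{KL}}  &= -\tfrac{1}{2}\sum_{i=1}^{n}\left(1 + \log\sigma_i^{2} - \mu_i^{2} - \sigma_i^{2}\right) \\
\mathcal{L}_{\mathrm{rec}} &= \lVert x - \tilde{x} \rVert_2^{2}, \qquad \tilde{x} = G(z) \\
\mathcal{L}_{\mathrm{enc}} &= \mathcal{L}_{\mathrm{KL}} + \mathcal{L}_{\mathrm{rec}} \\
\mathcal{L}_{\mathrm{dec}} &= \gamma\,\mathcal{L}_{\mathrm{rec}} - \log D\big(G(z)\big) \\
\mathcal{L}_{\mathrm{dis}} &= -\log D(x) - \log\big(1 - D(G(z))\big)
\end{align}
```

In this sketch the reconstruction target x is the clean image even when the encoder input is perturbed, which matches the remark that x is not necessarily the input image.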

Another object of the present invention is to provide a VAE-GAN-based adversarial example defense control system implementing the VAE-GAN-based adversarial example defense method.

Another object of the present invention is to provide a computer program product stored on a computer-readable medium, comprising a computer-readable program that, when executed on an electronic device, provides a user input interface to implement the VAE-GAN-based adversarial example defense method.

Another object of the present invention is to provide a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the VAE-GAN-based adversarial example defense method.

In summary, the advantages and positive effects of the invention are as follows. The VAE-GAN-based adversarial example defense method is a form of input preprocessing and can be transferred between different classification models; the original classification network does not need to be retrained, so the training cost is low; the classification accuracy on original noise-free samples is hardly affected; no external adversarial examples are needed for training; the defense is also effective against adversarial examples with small perturbations; and preprocessing is fast, with output image quality close to that of the original noise-free image. Experiments show that the classification accuracy on ordinary samples denoised by the VAE-GAN model drops by only 1-4%, indicating that the defense model hardly affects the classifier's performance on ordinary samples. Under both black-box and white-box conditions, the classification accuracy on adversarial examples denoised by the VAE-GAN model is 2-24% lower than on the original samples, which shows that the invention can indeed defend against adversarial example attacks and has little impact on the classification accuracy of the classifier.

Drawings

Fig. 1 is a flow chart of the VAE-GAN-based adversarial example defense method provided by an embodiment of the invention.

Fig. 2 is a diagram of the VAE-GAN structure provided by an embodiment of the invention.

Fig. 3 is a schematic diagram of defending against adversarial example attacks by using the VAE as a preprocessing module of the classifier, provided by an embodiment of the invention.

Fig. 4 shows the denoising effect of the VAE after 209 rounds of training, provided by an embodiment of the invention.

Fig. 5 is a schematic diagram of an adversarial attack on a deep neural network, provided by an embodiment of the invention.

Fig. 6 is a diagram of the model that defends against adversarial example attacks using a VAE-GAN model fusing a VAE and a GAN, provided by an embodiment of the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Traditional image denoising and techniques that reduce overfitting of the deep neural network cannot defend against adversarial examples, and existing defense schemes suffer from high training cost and poor transferability.

Among the existing defense schemes, input preprocessing degrades the quality of the input image and lowers classification accuracy on original noise-free images. Such schemes also mostly work well only against adversarial examples with large perturbations; the smaller the perturbation, the worse the defense.

Improving the neural network model requires retraining the network model and is effective only for the current model; the defense cannot be transferred to other network models, and it must be upgraded whenever adversarial attacks are upgraded (otherwise newer attacks cannot be defended against), so the network training cost of the defense is extremely high.

Detection only: the adversarial examples themselves still cannot be recognized (classified), and input data affected by slight random noise is sometimes misidentified. Most current defense strategies are trained using adversarial examples, which incurs additional training cost.

In view of the problems in the prior art, the invention provides a VAE-GAN-based adversarial example defense method and system, which are described in detail below with reference to the accompanying drawings.

The VAE-GAN-based adversarial example defense method provided by the embodiment of the invention denoises adversarial examples using a variational autoencoder (VAE) and a generative adversarial network (GAN). The VAE-GAN module serves as a preprocessing model for the classifier and denoises the adversarial example, and the GAN assists the training of the VAE so that the image output by the VAE is closer to the original noise-free image.

As shown in Fig. 1, the VAE-GAN-based adversarial example defense method provided by the embodiment of the invention comprises the following steps:

S101, selecting a suitable network structure to construct the VAE-GAN training network.

S102, training the VAE-GAN network.

S103, using the trained VAE-GAN model as a preprocessing module for the classifier to defend against adversarial example attacks.

In step S103, a VAE-GAN model that fuses a VAE and a GAN is used to defend against adversarial example attacks. As shown in Fig. 6, a VAE-GAN denoising model dedicated to adversarial examples is trained in step S102 and serves as a preprocessing block for the classifier. Every sample input to the classifier is first denoised by the VAE-GAN model and then fed to the classifier for classification, so that the classifier can ultimately classify adversarial examples correctly.

The invention is further described below with reference to specific analysis.

Adversarial example: adversarial examples were proposed by Christian Szegedy et al. and refer to input samples formed by deliberately adding subtle perturbations to data from the dataset, causing the model to give an erroneous output with high confidence.

Generative adversarial network (GAN): a generative adversarial network (GAN) is a deep learning model and one of the most promising methods of recent years for unsupervised learning on complex distributions. The framework contains (at least) two modules, a generative model and a discriminative model, which produce reasonably good output by learning through a game played against each other.

Variational autoencoder (VAE): a variational autoencoder consists of a pair of connected neural networks, in which the input-side network is the encoder and the output-side network is the decoder. The encoder converts the input data into two n-dimensional vectors: a mean vector μ and a standard deviation vector σ. The decoder restores the two n-dimensional vectors to the original input data.

The invention is further described with reference to specific examples.

Existing adversarial example defense solutions have three major problems: the defense training cost is high, the defense scheme generalizes poorly, and the defense scheme may reduce the classification accuracy of the original classifier. In view of these problems, the invention proposes an adversarial example defense scheme that denoises adversarial examples using a variational autoencoder (VAE) and a generative adversarial network (GAN). The VAE serves as a preprocessing model for the classifier and denoises the adversarial example, and the GAN assists the training of the VAE so that the image output by the VAE is closer to the original noise-free image. Because the classifiers to be defended and their datasets differ, the neural network structures used by the VAE and GAN also differ (a suitable network structure is selected according to the requirements of the classifier, which reduces the training cost). The defense scheme has three steps: selecting a suitable network structure to construct the VAE-GAN training network, training the VAE-GAN network, and using the VAE module as a preprocessing module for the classifier to defend against adversarial example attacks.

(1) A suitable network structure is selected to construct the VAE-GAN training network. The neural network structures chosen for the VAE and GAN differ for different classifiers: for small datasets, a DNN structure can be chosen for the VAE and GAN; for large datasets, a deeper neural network or a convolutional neural network is needed. The constructed VAE-GAN structure is shown in Fig. 2.

The VAE-GAN shown in Fig. 2 contains two parts: a VAE part and a GAN part. The main function of the VAE is to denoise the input image; the GAN then pushes the distribution of the VAE-reconstructed images as close as possible to the distribution of the original images, which ensures that the quality of the denoised image is close to that of the original noise-free image. The VAE can be further divided into an Encoder part and a Decoder part. The Encoder maps an input image sample to two n-dimensional vectors (a mean vector and a standard deviation vector). The Decoder restores these two n-dimensional vectors, after noise is added, to the original image sample. Images generated by a traditional VAE are blurry, and the quality of an image denoised by a VAE alone is far lower than that of the original image. The invention improves the quality of the images generated by the variational autoencoder (VAE) by using a GAN to assist the training of the VAE's Decoder. The GAN comprises two parts: the Generator converts the one-dimensional vector z into an image, and the Discriminator identifies whether the input image is a real image or one produced by the Generator.

(2) Training the VAE-GAN model. After the VAE-GAN network has been constructed, the optimization objective of the network must first be determined. The VAE-GAN in the invention has three optimization objectives, corresponding to the three networks contained in the VAE-GAN defense model. The optimization objective function is defined as follows:

The optimization objectives of the three networks in the VAE-GAN defense model (the encoder network, the decoder network, and the discriminator network) are different.

For the encoder, because the application scenario has changed, the input may be either an original image or a tampered image containing attack information (noise). The optimization objective has the same form as that of an ordinary VAE encoder, but the meaning of each variable changes significantly. The encoder optimization objective defined by the invention is as follows:

where [symbol omitted] represents the output of the decoder. The corresponding encoder loss function can be expressed as:

For the decoder, the optimization objective consists of two parts: the VAE reconstruction loss and the loss of the GAN generator (decoder). The decoder optimization objective defined by the invention is as follows:

where D denotes the discriminator and G denotes the generator (decoder). At the beginning of training, the reconstruction error of the decoder network is much larger than the GAN generator loss, so the decoder network cannot learn the distribution characteristics provided by the GAN. A hyperparameter γ is therefore introduced to weight the two losses, and the decoder optimization objective after introducing it is as follows:

The hyperparameter γ is used only when updating the decoder network and is generally set to 0.2. The corresponding decoder loss function can be expressed as:

For the discriminator network, the optimization objective is the same as that of an ordinary GAN discriminator network:

The loss function of the corresponding discriminator network can be expressed as:

The overall optimization objective of the model is as follows:

The corresponding overall loss function of the model is:

After the objective function is defined, the model can be trained directly with stochastic gradient descent (SGD) or another optimization algorithm. The training process is as follows:

The training process comprises the following steps: select a batch of image samples x_r from the training set; randomly add adversarial perturbations to the image samples x_r to obtain x^*; compute the mean μ and variance σ of the posterior distribution of the latent variable z with the encoder network; sample the posterior distribution to obtain the latent variable z; restore the latent variable z to a noise-free image with the decoding network; compute the loss function of the VAE encoder network; compute the loss function of the decoder network; compute the loss function of the discriminator network; and update the parameters of the three networks in sequence.

(3) After training of the VAE-GAN model is complete, the VAE-GAN module can be used to defend against adversarial examples; the defense deployment is shown in Fig. 3.

The invention is further described below in connection with specific experiments.

The experimental hardware environment is as follows: an Intel Xeon E5-2678 v3 processor, 64 GB of memory, and an NVIDIA RTX 2080 GPU with 8 GB of video memory. The software environment used for the experiment is: the Ubuntu 16.04 operating system, the Python 3 programming language, the PyCharm development environment, and the TensorFlow 1.15.1 machine learning framework. The embodiment uses the MNIST dataset. To verify the effectiveness of the VAE-GAN-based adversarial example defense strategy under white-box attacks, two classifiers, MNIST_A and MNIST_B, are trained on the MNIST dataset.

As shown in Table 1-1, MNIST_A has 6 layers in total. The first three layers are convolutional layers with 64, 128, and 128 convolution kernels respectively [kernel size omitted]; the stride of the first two convolutional layers is 2 and the stride of the third is 1; the activation functions are all ReLU. The fourth layer is a flattening layer, which converts the multidimensional input into a one-dimensional output. The fifth layer is a fully connected layer with 100 neurons and a ReLU activation function. The last layer is a Softmax layer that outputs the classification result.

Table 1-1 Network architecture of the MNIST dataset classifier model MNIST_A

Table 1-2 Network architecture of the MNIST dataset classifier model MNIST_B

As shown in Table 1-2, MNIST_B has 10 layers. The first and second layers are convolutional layers with 64 convolution kernels each, a kernel size of 3×3, a stride of 1, and SAME padding. The third layer is a max-pooling layer with a 2×2 pooling kernel. The fourth and fifth layers are convolutional layers with 128 convolution kernels each, a kernel size of 3×3, a stride of 1, and VALID padding. The seventh layer is a max-pooling layer [pooling kernel size omitted]. The eighth and ninth layers are fully connected layers with 100 and 100 neurons respectively, both with ReLU activation functions. The last layer is a Softmax layer that outputs the classification result.
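The effect of the SAME and VALID padding modes on feature-map size can be checked with a few lines of arithmetic. The sketch below traces the first five MNIST_B layers for a 28×28 MNIST input using the TensorFlow-style output-size rules; the pooling stride of 2 and its VALID padding are assumptions, since the text gives only the 2×2 pooling kernel, and `out_size` is a hypothetical helper name.

```python
import math

def out_size(n, k, stride, padding):
    """Spatial output size of a conv/pool layer under TensorFlow-style padding."""
    if padding == "SAME":
        return math.ceil(n / stride)
    if padding == "VALID":
        return math.ceil((n - k + 1) / stride)
    raise ValueError(f"unknown padding: {padding}")

layers = [
    ("conv1", 3, 1, "SAME"),
    ("conv2", 3, 1, "SAME"),
    ("pool3", 2, 2, "VALID"),   # stride 2 and VALID padding are assumptions
    ("conv4", 3, 1, "VALID"),
    ("conv5", 3, 1, "VALID"),
]
sizes = [28]                     # MNIST images are 28x28
for _, k, s, p in layers:
    sizes.append(out_size(sizes[-1], k, s, p))
print(sizes)                     # [28, 28, 28, 14, 12, 10]
```

SAME padding with stride 1 preserves the spatial size, while each 3×3 VALID convolution shrinks it by 2.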

The classifier training parameters are set as follows: for the MNIST_A and MNIST_B models, the learning rate is 0.001, the batch size is 100 images, and the number of training epochs is 20 and 25, respectively.

Adversarial examples for attacking the deep learning system are then generated. The FGSM and BIM algorithms are used to carry out random targeted attacks and non-targeted attacks on the 4 attack/defense target models, generating 10,000 adversarial examples at each of the L_∞ norm distances 0.03, 0.05, and 0.1 (10,000 per distance per model) as the attack adversarial examples of the FGSM and BIM algorithms. The DeepFool algorithm is used to carry out non-targeted attacks on the 4 target models, generating 10,000 adversarial examples at each of the L₂ norm distances 0.03, 0.05, and 0.1 as the attack adversarial examples of the DeepFool algorithm. The C&W algorithm is used to carry out random targeted attacks and non-targeted attacks on the 4 target models, generating 10,000 adversarial examples at each of the L_∞ norm distances 0.03, 0.05, and 0.1 as the attack adversarial examples of the C&W algorithm. The adversarial examples generated above are used to train the defense models of this embodiment. Training on the MNIST dataset yields two defense models, VGMA and VGMB, where VGMA denotes the defense model obtained from adversarial examples generated against the MNIST_A model and VGMB denotes the defense model obtained from adversarial examples generated against the MNIST_B model.
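To make the L_∞ budgets above concrete, the following sketch implements non-targeted FGSM and its iterative variant BIM against a linear softmax classifier, where the input gradient has a closed form. The linear model, the random data, and the BIM step size and iteration count are assumptions made for illustration; the embodiment attacks the MNIST_A and MNIST_B networks, whose gradients would come from the deep learning framework.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(x, y, W, b):
    probs = softmax(x @ W + b)
    return float(-np.mean(np.log(probs[np.arange(len(y)), y])))

def fgsm(x, y, W, b, eps):
    """Non-targeted FGSM: one signed-gradient step of size eps (L_inf budget)."""
    probs = softmax(x @ W + b)
    onehot = np.eye(W.shape[1])[y]
    grad_x = (probs - onehot) @ W.T          # d(cross-entropy)/dx for a linear model
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

def bim(x, y, W, b, eps, alpha, steps):
    """BIM: repeated small FGSM steps, projected back into the eps-ball around x."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = fgsm(x_adv, y, W, b, alpha)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, (5, 20))               # toy inputs with pixels in [0, 1]
y = rng.integers(0, 3, 5)                    # toy labels
W, b = rng.normal(size=(20, 3)), np.zeros(3)
for eps in (0.03, 0.05, 0.1):
    x_adv = fgsm(x, y, W, b, eps)
    print(eps, np.max(np.abs(x_adv - x)), cross_entropy(x_adv, y, W, b))
```

The printed maximum absolute difference never exceeds the chosen ε, which is exactly what an L_∞ norm distance of 0.03, 0.05, or 0.1 constrains.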

Experimental results for the white-box attack were then obtained. To test and objectively evaluate how well the defense model provided by the invention resists adversarial example attacks, this embodiment selects 4 currently effective defense methods as references and compares them experimentally with the proposed strategy. The four defense methods are: an adversarial example defense method based on FGSM adversarial training (adaptive FGSM), an adversarial example defense method based on BIM adversarial training (adaptive BIM), a distillation defense method (Distillation Defense), and an adversarial example defense method based on image compression and reconstruction (ComDefend). The results of the defense models against white-box attacks on the MNIST dataset are shown in Tables 1-5 and 1-6, where each row represents a defense model and each column represents an attack algorithm. Because an attack algorithm can generate adversarial examples of different magnitudes, each entry contains three sub-items giving the classification accuracy when the perturbation magnitude of the attack algorithm (measured in L_∞ for FGSM and BIM, and in L₂ for DeepFool and C&W) is 0.03, 0.05, and 0.1, respectively.

Table 1-5 shows the test results for defense against white-box attacks on the MNIST_A model. "Clear" means that no processing is applied to the input picture; it is included to compare the influence of each defense model on the original classification performance. It can be seen that the classification accuracy of the MNIST_A classifier is 94%, and that the classification accuracy is reduced after a defense strategy is introduced, but the VGMA and VGMB models of the present invention are less affected and still reach an accuracy of 93%. This shows that the model of the present invention hardly affects the performance of the classifier while ensuring security. Further observation shows that, for the FGSM attack, the three models with the best defense effect are, in order, adaptive FGSM, VGMA and ComDefend; for the BIM attack, they are VGMA, VGMB and adaptive BIM; for the DeepFool attack, they are VGMA, ComDefend and VGMB; and for the C&W attack, they are VGMB, VGMA and ComDefend. The VGM models defend effectively against all of these attack methods, with a worst-case classification accuracy of 71%. The adaptive FGSM, adaptive BIM and Distillation Defense models each have obvious weaknesses. ComDefend has no obvious weakness, but its performance is slightly inferior to that of the VGMA and VGMB models of the present invention. VGMB was trained using the MNIST_B classifier yet still performs well here, which shows that the model of the present invention has a certain transfer capability.

Table 1-5 Classification accuracy (%) against white-box attacks on the MNIST_A model

Table 1-6 shows the results when the MNIST_B model is used for classification. Compared with MNIST_A, MNIST_B classifies the data set better, with a classification accuracy of 98%. In the absence of attacks, the VGMA and VGMB models of the present invention still reach a classification accuracy of 95%. Similarly, the adaptive FGSM, adaptive BIM, Distillation Defense and ComDefend defense models have obvious weaknesses, while the worst-case classification accuracy of the model of the present invention is 75%.

Table 1-6 Classification accuracy (%) against white-box attacks on the MNIST_B model

MNIST_A and MNIST_B were tested on the same data set with different classifiers and led to similar conclusions. These experiments show that the VAE-GAN based confrontation sample defense method has little influence on the classification of normal samples under different classifiers, defends well against different white-box attack methods, and has no obvious defensive weakness.

After 209 training passes, the denoising effect of the VAE is shown in FIG. 4. As can be seen from the figure, the perturbation introduced by the confrontation sample causes a certain patterned "noise" to appear on the original image, and the picture recovered by the denoising of the VAE-GAN model is similar to the original image. This shows that the VAE-GAN model can restore the confrontation sample well to the original sample, which is an important reason why the VAE-GAN model can defend against confrontation samples.
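The use of the denoiser as an input preprocessing step can be sketched as follows. This is a minimal illustration, not the network of the embodiment: the linear `encoder` and `decoder`, the fixed standard deviation and the toy `classifier` are hypothetical stand-ins, and sampling z (rather than using the mean) at inference time is an assumption.

```python
import numpy as np

def reparameterize(mu, sigma, rng):
    """Sample the hidden variable z = mu + sigma * eps with eps ~ N(0, I)."""
    return mu + sigma * rng.standard_normal(mu.shape)

def defend_and_classify(x, encoder, decoder, classifier, rng):
    """Denoise x with the VAE, then pass the reconstruction to the
    unchanged classifier (input preprocessing, no retraining)."""
    mu, sigma = encoder(x)                 # mean and standard deviation vectors
    z = reparameterize(mu, sigma, rng)     # add noise in the latent space
    x_rec = np.clip(decoder(z), 0.0, 1.0)  # reconstructed (denoised) image
    return classifier(x_rec), x_rec

# --- hypothetical stand-ins so that the sketch runs on its own ---
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((64, 8)))
W = q.T                                    # 8-dimensional latent space, 64 pixels

def encoder(x):
    """Toy linear encoder with a small, fixed standard deviation."""
    return W @ x, np.full(8, 0.01)

def decoder(z):
    """Toy linear decoder (transpose of the encoder projection)."""
    return W.T @ z

def classifier(img):
    """Toy classifier: class 1 if the mean pixel value exceeds 0.5."""
    return int(img.mean() > 0.5)

x = rng.uniform(0.0, 1.0, size=64)         # stand-in for an input image
pred, x_rec = defend_and_classify(x, encoder, decoder, classifier, rng)
print(pred, x_rec.shape)
```

Because the classifier is only called on the reconstruction, the same preprocessing function can be placed in front of different classification models without retraining them.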

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims

1. A VAE-GAN based confrontation sample defense method, characterized in that the confrontation sample is denoised by using a variational autoencoder (VAE) and a generative adversarial network (GAN), and the GAN assists the training of the VAE so that the image output by the VAE is close to the original noise-free image.

2. The VAE-GAN based confrontation sample defense method according to claim 1, wherein the VAE-GAN based confrontation sample defense method comprises the following steps:

step one, selecting a suitable network structure to construct a VAE-GAN training network;

step two, training the VAE-GAN network;

3. The VAE-GAN based confrontation sample defense method according to claim 1, wherein in step one, the method for selecting the suitable network structure to construct the VAE-GAN training network is as follows:

4. The VAE-GAN based confrontation sample defense method according to claim 1, wherein in step one, the VAE-GAN comprises two parts: a VAE part and a GAN part; the main function of the VAE is to denoise the input image, and the GAN is then used to make the distribution of the images reconstructed by the VAE as close as possible to the distribution of the original images;

the VAE is divided into an Encoder (encoding part) and a Decoder (decoding part); the Encoder maps an input image sample into two groups of n-dimensional vectors, namely a mean vector and a standard deviation vector; the Decoder restores the two groups of n-dimensional vectors, after noise is added, into the original image sample, and the Decoder of the variational autoencoder is trained with the assistance of the GAN; the GAN comprises two parts, a Generator and a Discriminator: the Generator converts the one-dimensional vector z into an image, the Discriminator identifies whether an input image is a real image or an image generated by the Generator, and the decoding network of the VAE and the generation network of the GAN share network parameters.

5. The VAE-GAN based confrontation sample defense method according to claim 1, wherein in step two, the training method for the VAE-GAN network comprises the following steps:

after the network construction of the VAE-GAN is completed, the optimization objective of the whole network needs to be determined, and the model overall optimization objective function is as follows:

where μ and σ represent the mean and variance of the posterior distribution of the hidden variable z, x represents the original image,

represents the output of the decoding network, D is the discrimination network, G is the generation network, and γ is an introduced hyper-parameter with a value of 0.2;

the corresponding model global loss function is:

after the objective function is defined, model training is performed through gradient descent or other optimization algorithms.

6. A VAE-GAN based confrontation sample defense control system for implementing the VAE-GAN based confrontation sample defense method according to any one of claims 1 to 5.

7. A computer program product stored on a computer readable medium, comprising a computer readable program which, when executed on an electronic device, provides a user input interface to implement the VAE-GAN based confrontation sample defense method according to any one of claims 1 to 5.

8. A computer readable storage medium storing instructions which, when executed on a computer, cause the computer to perform the VAE-GAN based confrontation sample defense method according to any one of claims 1 to 5.