CN111310802A

CN111310802A - Anti-attack defense training method based on generation of anti-network

Info

Publication number: CN111310802A
Application number: CN202010064965.2A
Authority: CN
Inventors: 孔锐; 黄钢; 曹后杰
Original assignee: Xh Smart Tech China Co ltd
Current assignee: Xh Smart Tech China Co ltd
Priority date: 2020-01-20
Filing date: 2020-01-20
Publication date: 2020-06-19
Anticipated expiration: 2040-01-20
Also published as: CN111310802B

Abstract

The invention provides a generation type countermeasure network-based countermeasure attack defense training method, which comprises S1, real sample image data x_realAnd performing standardization processing; s2, establishing a defense training framework; s3, generating random noise Z and generating random condition vector C_fake(ii) a S4, random noise Z and random condition vector C are combined_fakeA generator input into a defense training framework; s5, standardizing the real sample image data after being processed and the category c thereof_realInputting the data into an attack algorithm library; s6, performing defense training on the defense training frame, and storing parameters of the trained defense training frame; and S7, finishing training, abandoning the generator and the attack algorithm library, and reserving the discriminator. The method provided by the invention overcomes the defects of the traditional method for defending against attacks and training such as using an additional netThe method has the defect of heavy workload, and the method provided by the invention has higher robustness.

Description

Anti-attack defense training method based on generation of anti-network

Technical Field

The invention relates to the technical field of security defense of deep learning counterattack, in particular to a counterattack defense training method based on a generated counternetwork.

Background

Currently, deep learning is occupying the core position of the rapidly developing fields of machine learning and artificial intelligence, and achieves excellent performance in various visual and speech recognition tasks. However, modern visual Deep Neural Networks (DNNs) are vulnerable to attacks from challenge samples designed according to some specific blind spots, due to the non-intuitive nature and inexplicability of the model. Compared with a noise sample, the offensive countermeasure sample is well designed and is not easy to perceive, can cause wrong prediction and classification of a target network, has transferability and can directly execute black box attack. In other words, the attacker can find an alternative network similar to the target network and thereby train out an attack sample to apply it to the target network. Therefore, it is very important and urgent to design a defense training method capable of effectively defending against the sample from the black box attack.

Generative confrontation network theory is based on a game theory scenario in which the generator network learns to transform the distribution from some simple input distribution (usually a standard multivariate normal distribution or uniform distribution) to image space by competing with the opponent; as an adversary, the discriminator then attempts to distinguish between samples taken from the training data and samples generated from the generator.

A classifier model with a good decision boundary not only can correctly classify real samples, but also can ignore interference characteristics and pay attention to key characteristics of the samples when an attack sample is faced, and then correctly classify the attack sample. In the existing schemes, the defense modes against attacks can be mainly divided into the following categories:

(1) detection based on statistical tests: the method is direct, has poor effect, and is based on the statistical conclusion of a large number of confrontation samples, so that a large number of confrontation samples are required to mine the statistical rules, and the method is not suitable for detecting a single confrontation sample during detection.

(2) Modifying the training process or modifying the data during the model training process: carrying out supervised training by taking the confrontation sample and the original sample as training sets; compressing input data; carrying out image enhancement in the processes of introducing random rescaling, random padding and training on input data;

(3) modifying the neural network model, such as adding network layers, adding sub-networks, modifying loss functions and activation functions, etc.;

(4) when samples which are not found are classified, the external model is used as an additional network, namely a separately trained network is added to the original model, so that a method for immunizing against the samples without adjusting coefficients is achieved, and defense against general disturbance is completed.

In summary, for different types of countermeasures, some extra work is required to ensure the robustness of the classifier to the newly added attack means. In view of effects and cost, the existing two methods for modifying data and using additional networks are more used, because the two methods can not directly modify a target network model and can be directly used for a plurality of network models with similar functions, resources are greatly saved in engineering, but the workload is increased to a certain extent by modifying data and using additional networks, and training samples have limitations, so that the network boundary of defense training and a real decision boundary have differences.

Disclosure of Invention

In order to overcome the defect that the workload is increased when the traditional method for the defense training against the attacks is used for using an additional network, and the training samples have limitations, so that the network boundary of the defense training and the real decision boundary have differences, the invention provides the defense training against the attacks based on the generation of the network, and the method does not need an additional network and improves the robustness of the network defense sample against the attacks.

The present invention aims to solve the above technical problem at least to some extent. In order to achieve the technical effects, the technical scheme of the invention is as follows:

a counterattack defense training method based on a generative counternetwork comprises the following steps:

s1, real sample image data x_realIs defined as c_realAnd performing z-sco on real sample image datare standardization processing;

s2, establishing a defense training framework, wherein the defense training framework comprises a generator, an attack algorithm library, a discriminator and a target network;

s3, generating random noise Z and generating random condition vector C based on defined real sample image data_fake；

S4, random noise Z and random condition vector C are combined_fakeA generator input into a defense training framework;

s5, standardizing the z-score to obtain real sample image data and the category c thereof_realInputting the data into an attack algorithm library; inputting the output of the generator and the output of the attack algorithm library to a discriminator in a defense training framework;

s6, performing defense training on the defense training frame, and storing parameters of the trained defense training frame;

and S7, finishing training, abandoning the generator and the attack algorithm library, and reserving the discriminator.

Preferably, the real sample image data x of step S1_realObeying a discrete normal distribution P_realThe total number of types of the real sample image data is n_classesTrue sample image data x_realThe formula for performing the z-score normalization process is:

wherein the content of the first and second substances,

data representing the image of a real sample after a z-score normalization process, x_realData representing the true sample image before z-score normalization processing, mean representing the mean of the true sample image data, std representing the variance of the true sample image data.

Preferably, the defense training framework of step S2 includes a generator G, an attack algorithm library Ω for generating attack samples_attackA discriminator D and a target network F; the generator G is based on neural networks VGG, ResNet and Google Net, an up-sampling convolution neural network designed by one basic convolution unit in AlexNet; the attack algorithm library omega_attackThe algorithm in (1) includes Gradient attack algorithm, including but not limited to Fast Gradient Signal Method, Iterative least-likelyslass Method, Basic Iterative Methods; the discriminator D is a down-sampling convolution neural network designed based on one basic convolution unit of neural networks VGG, ResNet, GoogleNet and AlexNet; the target network F is composed of a convolutional neural network, and comprises one or any combination of VGG, ResNet, GoogleNet and AlexNet.

Preferably, the random noise Z in step S3 is a discrete normal distribution P with a mean value mean of 0 and a standard deviation of 1_zObtaining randomly; random condition vector C_fakeFrom uniform distribution to P_c＝[0,n_classes) Randomly in an integer between.

Preferably, the process of performing defense training on the defense training framework in step S6 is as follows:

s601, random noise Z and random condition vector C_fakeGenerating a false sample image x with a generator G as input to the generator G in a training framework_fake；

S602, standardizing the real sample image data after z-score processing

And its class c_realInput to the attack algorithm library omega_attack；

S603, with the target network F as an attack target, randomly selecting an attack algorithm library omega_attackAttack algorithm in on real sample image data

Attack is carried out, and attack samples are output

And their classes

S604, false sample image x_fakeAnd attack samples

Inputting the false sample images x into a discriminator D together to obtain a false sample image x of the discriminator D_fakeIs judged to be true or false and is lost L_tf(G) Class loss L_cls(G) And loss of true and false determination

And attack sample

True and false sample determination loss of

And a classification loss L_cls(D)。

Here, the target network F is through real sample data

During the defense training process of the defense training framework, the convolution neural network obtained after the training obtains false sample data which can prevent the target network F from working normally through an attack algorithm library, wherein the false sample data is true sample data

The modification is carried out on the basis, and the modification action is executed by a certain attack algorithm in the attack algorithm library.

Preferably, the generator G and the discriminator D of the training defense framework are trained in an Epoch round together, and the generator G and the discriminator D are alternately trained:

1) parameter θ of fixed discriminator D_DAnd (4) training a generator G unchanged, wherein the steps are as follows:

step one, discrete normal distribution P with mean value mean of 0 and standard deviation of 1_zObtaining M sample data randomly to form random noise Z; from uniform distribution to P_c＝[0,n_classes) In betweenRandomly obtaining M sample data in integer to form random condition vector c_fakeRandom noise Z and random condition vector c_fakeTransmitted to a generator G for generating M false sample images x_fake；

Step two, false sample image x_fakeTransmitting to a discriminator D to obtain a false sample image x of the discriminator D_fakeIs judged to be true or false and is lost L_tf(G) Comprises the following steps:

categorical prediction of loss L_cls(G) Is composed of

Wherein the formula represents a calculation function of the loss value;

and

all represent solutions to expected values; the subscript letter parameters serve as identifiers and have no practical significance to the formula itself.

Step three, back propagation and updating of parameter theta of generator G by using optimization function_GThe overall loss function L (G) is formulated as:

L(G)＝L_cls(G)+L_tf(G)

wherein the parameter θ of the generator G is updated_GThe optimization function of (a) is one of Adam, SGD, RMSProp and Momentum;

2) fixing the parameter θ of the generator G_GAnd (3) training a discriminator D:

step 1, from distribution to discrete normal distribution P_realRandomly selecting M image data from the image to form a real sample image x_realNormalizing the z-score processed real sample image data

And its class c_realInput to the attack algorithm library omega_attack；

Step 2, using the attack algorithm library omega_attackWith the target network F as an attack target, randomly selecting an attack algorithm library omega_attackAttack algorithm in on real sample image data

Attack is carried out, and attack samples are output

And their classes

Step 3, false sample image x_fakeAnd attack samples

And attack sample

True and false sample determination loss of

And a classification loss L_cls(D) I.e. by

Wherein the formula represents a calculation function of the loss value;

and

all the values are expected values; the subscript letter parameters declare the distribution of the data, and have no practical significance to the formula for the identification function;

step 4, from the discrete normal distribution P with mean value mean of 0 and standard deviation of 1_zRandomly acquiring M sample data to form random noise Z, random noise Z and random condition vector c_fakeTransmitted to a generator G for generating M false sample images x_fakeFalse sample image x_fakeTransmitting to a discriminator D to obtain the true and false determination loss

The total loss function L (D) is:

step 5, back propagation and updating the parameter theta of the generator D by using the optimization function_D(ii) a The optimization function is one of Adam, SGD, RMSProp and Momentum;

preferably, the defense training of the defense training framework is performed by batch training, wherein the size of each batch is M, namely M samples are input into the defense training framework in each training, and random noise Z and a random condition vector C are input into the defense training framework_fakeTrue sample image data

And its class c_realThe number of the N-substituted aryl groups is M; the attack algorithm has the following selection rules:

A. random selection of attack algorithm library omega by random selector_attackAn attack algorithm f in_attack；

B. Using the target network F as an attack target to carry out real sample image data

And its class c_realCarrying out attack on each sample in each training cycle, wherein an attack algorithm base omega is randomly selected for each attack_attackAn attack algorithm f in_attackGenerating an attack sample

C. Judging whether the target network F is successfully attacked, if so, keeping an attack sample; otherwise, abandoning the attack sample;

D. judging attack sample

If the batch size is equal to the batch size M, outputting an attack sample if the batch size is equal to the batch size M; otherwise, returning to B to execute again.

According to the batch size M, a target network F is taken as an attack target, each sample in each training cycle is attacked, and an attack algorithm base omega is randomly selected for each attack_attackAn attack algorithm f in_attackGenerating an attack sample

In a batch, the random selector will randomly select the attack algorithm library omega_attackIn the attack algorithm, M times, randomly selecting one attack algorithm and attack sample each time

In total of MCorresponding to attack sample class

The number of (2) is also M.

Preferably, the criterion for determining the attack failure of the target network F is: attack sample

Still recognized as correct class by the target network F

Preferably, the indexes of the training completion are as follows: classification prediction accuracy of discriminator D on real sample image

And on attack sample

Classification prediction accuracy of

At the same time, and storing the parameter theta of the discriminator D of the round_D。

Preferably, the classifier D predicts the accuracy of classification prediction of the real sample image

The calculation formula of (2) is as follows:

wherein the content of the first and second substances,

representing the number of real sample images; n is_accThe number of prediction classes representing a sample equal to its true class;

to attack sample

Classification prediction accuracy of

Comprises the following steps:

wherein the content of the first and second substances,

the number of attack samples; n is_accThe prediction class representing a sample is equal to the number of its real classes.

Compared with the prior art, the technical scheme of the invention has the beneficial effects that:

(1) the anti-attack defense training method based on the generated anti-network provided by the invention directly trains the defense training framework by modifying the training process and the training data, overcomes the defect that the traditional method for anti-attack defense training has heavy workload if an additional network is used, and has higher robustness.

(2) The training sample of the method is randomly generated based on the image data of the real sample, has no limitation, and avoids the phenomenon that the network boundary and the real decision boundary of the existing defense training are different.

Drawings

Fig. 1 is a flowchart of an anti-attack defense training method based on generation of an anti-network according to the present invention.

Fig. 2 is a training framework for defending against attacks based on generation of a countermeasure network according to the present invention.

Fig. 3 is a schematic diagram of the target attack performed by the attack algorithm provided by the present invention.

Fig. 4 is a schematic diagram of a specific process of the attack algorithm for target attack.

Fig. 5 is a schematic diagram of a network structure of the generator G according to the present invention.

Fig. 6 is a schematic diagram of a network structure of the discriminator D according to the present invention.

Fig. 7 is a diagram illustrating the classification accuracy of the target network F on the attack sample.

Fig. 8 is a diagram illustrating the classification accuracy of the target network F on the real sample.

Detailed Description

The drawings are for illustrative purposes only and are not to be construed as limiting the patent;

it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.

The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.

Example 1

FIG. 1 is a flow chart of an anti-attack defense training method based on a generation countermeasure network, and FIG. 2 is a flow chart of an anti-attack defense training framework based on a generation countermeasure network, which comprises a generator G, a discriminator D, a target network F and an attack algorithm library omega_attack。

Wherein, in this example, the generator G upsamples the tensor using the basic residual module of ResNet as a deconvolution neural network, a random noise z and a random condition vector c_fakeAs input to the generator G, a false sample image x is obtained after upsampling via a deconvolution network_fake(ii) a The discriminator D uses ResNet as a network structure and receives the information from the attack algorithm library omega_attackProcessing the attack sample after the target network F is taken as the attack target

And generating a sample x_fake(ii) a The target network F uses VGG as a network structure, and finally retains parameters of the discriminator through training of the generator G and the discriminator D, and as a final classifier, the experimental environment of the embodiment is as follows: the server processor is 32Intel (R) Xeon (R) CPU E5-2620 v4@2.10GHz, 64GB running memory (RAM), two NVIDIA Tesla P4 GPUs, PyTorch framework.

The training step comprises:

t1 handwriting data set to be written using MNISTAdopting a batch training method, setting the batch size to be M-64, carrying out z-score standardization processing on each sample of the MNIST data set, wherein the value range of the sample data is [ -1,1]By using

Representing that the shape of each batch of sample tensors is 64 × 1 × 28 × 28; to pair

Performing classification marking and designing the marking as a condition vector by c_realIt is shown that, in a batch, the shape of the condition vector is 64 × 1, and the MNIST data set selected in this embodiment is a collection of handwritten digital images related to handwritten numbers 0 to 9, so in the process of setting the condition vector: the data sets are classified into 10 classes with specific numbers from 0 to 9, i.e., n_classes＝10；

T2, random condition vector c_fakeGenerating, as input to generator G, specifying from distribution to P_cRandom sampling in integers between [0,10 ], in one batch, c_fakeHas a tensor shape of 64 × 1;

t3, random noise z, used as input to the generator G, and generating random noise vectors using the built-in function of PyTorch. Discrete normal distribution P with mean value of 0 and standard deviation of 1_z128 samples, and the tensor shape of z in one batch is 64 x 128.

In this embodiment, the target network F uses VGG11 as a model framework, and the accuracy of classification and prediction of the pre-training model for recognizing the MNIST handwriting data set is more than 99%. Set attack method library omega_attackThe Method is characterized by comprising three common Gradient Attack algorithms of Fast Gradient Method Attack (FGSM), Basic Iterative Method Attack (BIM) and moment Iterative Attack (MIFGSM), wherein a target network F is taken as an Attack target, a schematic diagram of the Attack algorithm for attacking is shown in FIG. 3, a specific process is shown in FIG. 4, and one batch of Gradient Attack algorithms are input

And c_realObtaining an attack sample by taking the target network F as an attack target

Used as input of the discriminator D, correct class information corresponding to the attack sample

Wherein

Has a tensor shape of 64 x 1 x 28,

is 64 × 1. The structural design sequence of the generator G is as follows: the full connection layer, the first upsampled residual block, the second upsampled residual block, the first residual block, the second residual block, the convolutional layer, and the Tanh active layer, in this embodiment, specific generator network construction details are shown in fig. 5. The structural design sequence of the discriminator D is as follows: the first downsampled residual block, the second downsampled residual block, the third downsampled residual block, the fourth downsampled residual block, the ReLU layer, and the full connection layer, and in this embodiment, specific generator network construction details are as shown in fig. 6.

T3, setting the iteration number of the discriminator D and the generator G to be 2:1

The loss function of training generator G is:

L(G)＝L_cls(G)+L_tf(G)

the loss function for training arbiter D is:

t4. the parameters of generator G and discriminator D are updated using Adam optimization function, and the learning rate is set to 0.0002, the exponential decay rate of Adam first order moment estimate is 0.0, the exponential decay rate of Adam second order moment estimate is 0.9, and the total training set iteration round number Epoch is 10 rounds.

T5, after training is completed, retaining the simultaneous attack sample

Recognition rate and for true samples x_realThe parameters of the discriminator D with the highest recognition rate are used as the result model of the present training. In the present embodiment, after the training of the above steps T1-T10, the result is obtained, and after 10 rounds of training, FIG. 7 shows the sample of the attack of the discriminator D

The classification prediction accuracy of (1), and fig. 8 shows the true sample x of the discriminator D_realThe classification prediction accuracy of (1). The result shows that for the attack sample, the accuracy rate is substantially equal to 100% at the abscissa of 3k, and for the real sample, the accuracy rate is increased and then decreased, and it can be seen from the curve that the accuracy rate is at most 96% at the abscissa of 3-5k, so in this embodiment, the parameters of the discriminator D stored in the interval of 3-5k with the abscissa are selected as the final result network.

The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent;

it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims

1. An anti-attack defense training method based on a generative type countermeasure network is characterized by comprising the following steps:

s1, real sample image data x_realIs defined as c_realAnd carrying out z-score standardization processing on the real sample image data;

s5, standardizing the z-score to obtain real sample image data and the category c thereof_realInputting the output of the generator and the output of the attack algorithm library into a discriminator in a defense training frame;

2. The method for training defense against attack based on generative countermeasure network as claimed in claim 1, wherein step S1 is executed by using the real sample image data x_realObeying a discrete normal distribution P_realThe total number of types of the real sample image data is n_classesTrue sample image data x_realThe formula for performing the z-score normalization process is:

wherein the content of the first and second substances,

3. The method for training defense attack against adversarial attack based on generative adversarial network as claimed in claim 2, wherein the defense training framework of step S2 comprises a generator G, an attack algorithm library Ω for generating attack samples_attackA discriminator D and a target network F; the generator G is an up-sampling convolution neural network designed based on one basic convolution unit of neural networks VGG, ResNet, GoogleNet and AlexNet; the attack algorithm library omega_attackThe algorithm in the Method comprises a Gradient attack algorithm, which comprises one or any combination of Fast Gradient Signal Method, iterative least-likely class Method and basic iterative Methods; the discriminator D is a down-sampling convolution neural network designed based on one basic convolution unit of neural networks VGG, ResNet, GoogleNet and AlexNet; the target network F is composed of a convolutional neural network, and comprises one or any combination of VGG, ResNet, GoogleNet and AlexNet.

4. The method as claimed in claim 3, wherein the random noise Z in step S3 is a discrete normal distribution P with mean value mean of 0 and standard deviation of 1_zObtaining randomly; random condition vector C_fakeFrom uniform distribution P_c＝[0,n_classes) BetweenIs randomly obtained from the integer of (1).

5. The method for training defense attack based on generative countermeasure network according to claim 4, wherein the step S6 is to train defense training framework by:

S602, standardizing the real sample image data after z-score processing

And its class c_realInput to the attack algorithm library omega_attack；

Attack is carried out, and attack samples are output

And their classes

S604, false sample image x_fakeAnd attack samples

And attack sample

True and false sample determination loss of

And a classification loss L_cls(D)。

6. The generation-based countermeasure attack defense training method of the countermeasure network according to claim 5, wherein the generator G and the discriminator D of the training defense framework are trained in an Epoch round together, and the generator G and the discriminator D are alternately trained:

step one, discrete normal distribution P with mean value mean of 0 and standard deviation of 1_zObtaining M sample data randomly to form random noise Z; from uniform distribution to P_c＝[0,n_classes) Randomly obtaining M sample data from integers to form a random condition vector c_fakeRandom noise Z and random condition vector c_fakeTransmitted to a generator G for generating M false sample images x_fake；

categorical prediction of loss L_cls(G) Is composed of

Wherein the formula represents a calculation function of the loss value;

and

L(G)＝L_cls(G)+L_tf(G)

And its class c_realInput to the attack algorithm library omega_attack；

Attack is carried out, and attack samples are output

And their classes

Step 3, false sample image x_fakeAnd attack samples

And attack sample

True and false sample determination loss of

And a classification loss L_cls(D) I.e. by

Wherein the formula represents a calculation function of the loss value;

and

all represent solutions to expected values; subscript letter parameters are used for identification and have no practical significance on the formula;

step 4, from the discrete normal distribution P with mean value mean of 0 and standard deviation of 1_zRandomly acquiring M sample data to form random noise Z, random noise Z and random condition vector c_fakeTransmitted to a generator G for generating M false sample images x_fakeFalse sample imagex_fakeTransmitting to a discriminator D to obtain the true and false determination loss

The total loss function is:

step 5, back propagation and updating the parameter theta of the generator D by using the optimization function_D(ii) a The optimization function is one of Adam, SGD, RMSProp and Momentum.

7. The method as claimed in claim 6, wherein the defense training frame is trained by using batches, each batch is M in size, that is, M samples are input into the defense training frame for each training, and the random noise Z and the random condition vector C are input into the defense training frame_fakeTrue sample image data

And its class c_realCarrying out attack on each sample in each training cycle, wherein an attack algorithm base omega is randomly selected for each attack_attackAn attack algorithm f in_attackGenerating an attack sampleBook (I)

D. judging attack sample

If the number of the attack samples is equal to the batch size M, outputting attack samples; otherwise, returning to B to execute again.

8. The method for training defense against attack based on generative countermeasure network as claimed in claim 7, wherein the criterion for determining failure of target network F attack is: attack sample

Still recognized as correct class by the target network F

9. The method for training defending against attacks based on the generative countermeasure network as claimed in claim 8, wherein the training is completed by the following indexes: classification prediction accuracy acc of discriminator D on real sample image_(xreal)And on attack sample

Classification prediction accuracy of

10. The generative countermeasure network-based defense against attacks training method of claim 9, whichCharacterized in that the classifier D predicts the accuracy of classification of the real sample image

The calculation formula of (2) is as follows:

wherein the content of the first and second substances,

to attack sample

Classification prediction accuracy of

Comprises the following steps:

wherein the content of the first and second substances,