CN114120028A

CN114120028A - Adversarial sample generation method based on a two-layer generative adversarial network

Info

Publication number: CN114120028A
Application number: CN202111249871.3A
Authority: CN
Inventors: 贺二路; 焦利彬; 贾哲; 赵阳阳; 吴巍
Original assignee: CETC 54 Research Institute
Current assignee: CETC 54 Research Institute
Priority date: 2021-10-26
Filing date: 2021-10-26
Publication date: 2022-03-01

Abstract

The invention provides an adversarial sample generation method based on a two-layer generative adversarial network and relates to the field of artificial intelligence security. The method uses a first-layer conditional generative adversarial network, a feature extractor, a second-layer generative adversarial network and a target network. The conditional generative adversarial network generates new samples, and its discriminator judges both the authenticity and the category of each generated sample. The feature extractor extracts hidden-layer features of the original sample, which are used to generate a perturbation with an adversarial prior. The second-layer generative adversarial network generates the adversarial perturbation, and its discriminator assesses the authenticity of the adversarial sample and its similarity to the sample generated by the conditional generative adversarial network. The target network is used to verify the attack success rate of the adversarial samples. The invention uses two layers of neural networks to generate class-specific samples and adversarial perturbations respectively, so that attacks and adversarial training can be carried out with samples of a specific category, which effectively improves the attack success rate and the efficiency of adversarial training.

Description

Adversarial sample generation method based on a two-layer generative adversarial network

Technical Field

The invention relates to the field of artificial intelligence security, and in particular to an adversarial sample generation method based on a two-layer generative adversarial network.

Background

With the rapid development of artificial intelligence technology, deep learning in particular has made major breakthroughs in fields such as image recognition, image classification and natural language processing. These new technologies are now applied in many engineering fields, but while deep learning brings convenience, its models and algorithms also carry serious security risks. Researchers have shown experimentally that deep convolutional neural networks lack robustness, and an attacker can design attack methods according to the characteristics of different models to degrade their performance.

At present, the main form of attack faced by deep neural networks is the adversarial sample: a perturbed sample obtained by adding a perturbation that is hard to detect with the naked eye to clean input data, which can cause a model to output an incorrect result with high confidence. Researchers have proposed various remedies for this vulnerability of deep models. Among them, adversarial training means deliberately adding adversarial samples during the training stage so that the model learns their characteristics, which improves its robustness and generalization capability. Training a deep neural network model in this way requires a large number of adversarial samples and, because training tasks and attack tasks differ, generating specific adversarial samples for adversarial training or for an attack task has become an urgent problem.

Disclosure of Invention

To make up for the shortage of adversarial samples during adversarial training and to solve the problem that a generative model cannot generate images of a specified class, the invention provides an adversarial sample generation method based on a two-layer generative adversarial network.

To achieve this purpose, the invention adopts the following technical solution:

An adversarial sample generation method based on a two-layer generative adversarial network, wherein the two-layer generative adversarial network comprises a first-layer conditional generative adversarial network, a second-layer generative adversarial network, a feature extractor F and a target network C; the first-layer conditional generative adversarial network comprises a generator G₁ and a discriminator D₁, and the second-layer generative adversarial network comprises a generator G₂ and a discriminator D₂, wherein the generators and the discriminators are all multilayer perceptrons (MLPs);

the method comprises the following steps:

(1) inputting an original sample x, random noise z and a class label c into the generator G₁, which fits the random noise z to a new image sample x_c according to the original sample and the class label;

(2) inputting the sample x_c into the discriminator D₁ to discriminate its authenticity and its category;

(3) extracting hidden-layer features F(x) of the original sample with the feature extractor F;

(4) inputting the hidden-layer features F(x) into the generator G₂ to generate an adversarial perturbation G₂(F(x)) with an adversarial prior;

(5) fusing the sample x_c with the adversarial perturbation G₂(F(x)) to obtain the adversarial sample x_c + G₂(F(x));

(6) inputting the adversarial sample into the discriminator D₂ to discriminate its authenticity and its similarity to x_c;

(7) inputting the adversarial sample into the target network C for classification, verifying the attack success rate against the target network C, and storing the adversarial samples that attack successfully.
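The seven steps above can be read as a single data flow. The following Python sketch is only an illustration under stated assumptions, not the patented implementation: every network (`g1`, `d1`, `feature_extractor`, `g2`, `d2`, `target_net`) is a hypothetical callable supplied by the caller, samples are flat lists of floats, and fusion is assumed to be element-wise addition, as suggested by the argument x_c + G₂(F(x)) of the loss L_C.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence

Vector = List[float]


def generate_adversarial_sample(
    x: Sequence[float],
    label_c: int,
    target_t: int,
    g1: Callable[[Sequence[float], Sequence[float], int], Vector],
    d1: Callable[[Sequence[float], int], float],
    feature_extractor: Callable[[Sequence[float]], Vector],
    g2: Callable[[Sequence[float]], Vector],
    d2: Callable[[Sequence[float], Sequence[float]], float],
    target_net: Callable[[Sequence[float]], int],
    noise_dim: int = 4,
    rng: Optional[random.Random] = None,
) -> Dict[str, object]:
    """Run steps (1)-(7) once for a single original sample x.

    All callables are hypothetical stand-ins for trained networks.
    """
    rng = rng or random.Random(0)
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]  # random noise z

    x_c = g1(x, z, label_c)              # (1) class-conditional sample
    d1_score = d1(x_c, label_c)          # (2) authenticity/category score
    features = feature_extractor(x)      # (3) hidden-layer features F(x)
    perturbation = g2(features)          # (4) perturbation G2(F(x))
    if len(perturbation) != len(x_c):
        raise ValueError("perturbation and sample must have the same size")
    # (5) fusion, assumed here to be element-wise addition
    x_adv = [a + b for a, b in zip(x_c, perturbation)]
    d2_score = d2(x_adv, x_c)            # (6) authenticity/similarity score
    predicted = target_net(x_adv)        # (7) classification by C
    return {
        "x_c": x_c,
        "x_adv": x_adv,
        "d1_score": d1_score,
        "d2_score": d2_score,
        "predicted": predicted,
        # the attack counts as successful when C outputs the target class t
        "success": predicted == target_t,
    }
```

A caller would keep only the results whose `success` flag is true, which corresponds to storing the successfully attacking samples in step (7).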

Further, the feature extractor F is a VGG model, and the target network C is a ResNet model.

Further, the loss function of the first-layer conditional generative adversarial network is:

L₁ = E_x[log D₁(x|c)] + E_z[log(1 - D₁(G₁(z|c)))]

where x|c and z|c denote joint inputs, i.e., c is input together with x or z; c is the category information specifying what is to be generated; and E denotes the expectation over the data distribution.

Further, the loss function of the second-layer generative adversarial network is:

[formula missing from the source text]

where one term [symbol missing from the source text] ensures the similarity between the sample x_c and the adversarial sample x_c + G₂(F(x)); L_C = E_x l_c(x_c + G₂(F(x)), t) is the attack target-class loss, where t is the designated attack class and l_c is the cross-entropy function, and it ensures the attack success rate of the adversarial sample; E denotes the expectation over the data distribution.

Compared with the prior art, the invention has the following beneficial effects:

1. The method guides the model with the class label c of the conditional generative adversarial network to generate an image of a specific class, and then adds a perturbation so that a specific adversarial sample attacks the target model.

2. The method takes the hidden-layer features of the sample as input when generating the perturbation. The feature extractor yields hidden-layer features that represent the sample's characteristics better, and a perturbation generated from them carries a stronger adversarial prior, so the classifier is more sensitive to it and adversarial samples carrying this perturbation markedly improve the attack success rate.

Drawings

FIG. 1 is a flow chart of an adversarial sample generation method according to an embodiment of the present invention.

Detailed Description

The invention will be further explained with reference to the drawings.

An adversarial sample generation method based on a two-layer generative adversarial network uses two generative adversarial networks to generate class-specific samples and adversarial perturbations respectively. Specifically, a conditional generative adversarial network is used to generate images of a specified class. A conventional generative adversarial network cannot control the data it generates: when the training data contain two or more classes, its generator cannot be directed to produce an image of a particular class, and its discriminator only judges whether a generated image is real without classifying it. The method therefore controls the generated image category through a conditional generative adversarial network. Auxiliary information c, the class label to be generated, is input to the generator and guides its data generation. The loss function of the conditional generative adversarial network is as follows:

L₁ = E_x[log D₁(x|c)] + E_z[log(1 - D₁(G₁(z|c)))]

where x is the input sample, c is the class label, and z is the input random noise. When the discriminator D₁ is trained, a generated sample must not only look like real data but also belong to the class specified by c; when the generator G₁ is trained, the class label likewise guides it to generate the specified image.
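As a numerical illustration of L₁, the sketch below estimates the two expectations by batch means. It assumes that the discriminator outputs a probability in (0, 1) for each conditioned input; the function name `conditional_gan_loss` and the clamping constant `eps` are illustrative choices and do not come from the patent.

```python
import math
from typing import Sequence


def conditional_gan_loss(
    d_real: Sequence[float],
    d_fake: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Batch estimate of L1 = E_x[log D1(x|c)] + E_z[log(1 - D1(G1(z|c)))].

    d_real: D1 outputs for real samples x paired with their labels c.
    d_fake: D1 outputs for generated samples G1(z|c) paired with c.
    eps:    lower clamp that avoids log(0); an implementation assumption.
    """
    real_term = sum(math.log(max(p, eps)) for p in d_real) / len(d_real)
    fake_term = sum(math.log(max(1.0 - p, eps)) for p in d_fake) / len(d_fake)
    return real_term + fake_term
```

In the usual formulation of generative adversarial training, the discriminator is updated to increase this value and the generator to decrease it; a discriminator that scores real pairs near 1 and generated pairs near 0 drives the value toward 0, its maximum.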

A deep neural network classification model comprises an input layer, hidden layers and an output layer. After a sample enters through the input layer, features are extracted by the hidden layers. Each hidden-layer neuron has its own bias and its own weights for the different input-layer neurons, and these weights determine how sensitive the neuron is to the input information: hidden-layer neurons form a bias toward particular recognition patterns through their weights, and the output-layer neurons in turn weight the hidden-layer neurons, which shifts the output result. The output layer produces its result from these hidden-layer weights and biases. The classifier's decision therefore depends on its analysis of the image features and is ultimately made from the hidden-layer vector, so when the generative adversarial network generates the adversarial perturbation, inputting the hidden-layer vector instead of the image makes it easier to produce an adversarial prior.
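The dependence of the decision on the hidden-layer vector can be seen in a toy forward pass. The sketch below is an illustration only: a single hidden layer with ReLU activation is an assumption made for brevity, and `dense` and `classify_with_hidden` are hypothetical helper names. The function returns the hidden vector alongside the prediction, which is the quantity a feature extractor would hand to the perturbation generator.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def dense(inputs: Sequence[float], weights: Matrix,
          biases: Sequence[float]) -> List[float]:
    """One fully connected layer; weights[j] holds the input weights of neuron j."""
    return [
        sum(w * v for w, v in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def classify_with_hidden(
    x: Sequence[float],
    w_hidden: Matrix,
    b_hidden: Sequence[float],
    w_out: Matrix,
    b_out: Sequence[float],
) -> Tuple[List[float], int]:
    """Return the hidden-layer vector and the predicted class index."""
    # hidden layer with ReLU activation (an assumption of this sketch)
    hidden = [max(0.0, a) for a in dense(x, w_hidden, b_hidden)]
    # the output layer sees only the hidden vector, never the raw input
    logits = dense(hidden, w_out, b_out)
    predicted = max(range(len(logits)), key=logits.__getitem__)
    return hidden, predicted
```

Because the output layer reads only `hidden`, any change to the input affects the decision solely through its effect on that vector.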

The original sample x is input into the feature extractor F to obtain the hidden-layer features F(x). F(x) is input into the generator G₂ to generate the adversarial perturbation G₂(F(x)). The generated sample x_c is fused with the adversarial perturbation G₂(F(x)) to obtain the adversarial sample x_c + G₂(F(x)).

After the adversarial sample is obtained, it is input into both the discriminator D₂ and the target network C. To make the adversarial samples effective, the whole training process is constrained by a loss function. From the attacker's point of view, an adversarial sample must satisfy: (1) it achieves a high attack success rate when input into the target network; (2) the added perturbation cannot be perceived by the human eye; (3) the samples before and after adding the perturbation are as similar as possible.

In view of the above requirements, the method designs the following loss function:

[formula missing from the source text]

This is the loss function of the second-layer generative adversarial network; the generator G₂ and the discriminator D₂ are trained with it to ensure that effective adversarial samples are generated.

One term [symbol missing from the source text] ensures the similarity between the sample x_c and the adversarial sample x_c + G₂(F(x)); L_C = E_x l_c(x_c + G₂(F(x)), t) is the attack target-class loss, which ensures the attack success rate of the adversarial sample, where l_c is the cross entropy and t is the designated attack class: if the target network classifies the adversarial sample as t the loss decreases, otherwise it increases.
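The behaviour of the attack target-class loss L_C can be checked with a small numerical sketch. It assumes that the target network C returns one raw score (logit) per class for each adversarial sample and that l_c is the softmax cross entropy against the target class t; the function names are illustrative and not taken from the patent.

```python
import math
from typing import Sequence


def targeted_cross_entropy(logits: Sequence[float], target_t: int) -> float:
    """Cross entropy l_c between softmax(logits) and the target class t."""
    peak = max(logits)  # subtracting the maximum keeps exp() stable
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[target_t]


def attack_target_loss(
    batch_logits: Sequence[Sequence[float]], target_t: int
) -> float:
    """Batch mean, used as an estimate of L_C = E_x l_c(x_c + G2(F(x)), t)."""
    total = sum(targeted_cross_entropy(row, target_t) for row in batch_logits)
    return total / len(batch_logits)
```

When C assigns most of its score to class t the loss is close to 0, and it grows as the score shifts to other classes, which matches the decrease and increase described above.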

The method addresses the case where specific samples are needed to attack a target network or to supply adversarial training: by adding a class constraint through the conditional generative adversarial network, it can generate images of a specified class. In addition, to increase the classification prior when generating the adversarial perturbation, the feature extractor F extracts hidden-layer features from the original sample, and these features are then input into the generative adversarial network to generate the perturbation. Adversarial samples built from this perturbation mislead the target classifier more effectively, serving the purpose of attack or adversarial training. Finally, loss functions are provided for the attack success rate, for the similarity between the adversarial sample and the original sample, and for constraining the performance of the two generative adversarial networks, which ensures the effectiveness and realism of the adversarial samples.

The following is a more specific example:

An adversarial sample generation method based on a two-layer generative adversarial network, as shown in FIG. 1, uses the generator G₁ and discriminator D₁ of the first-layer conditional generative adversarial network, a feature extractor F, the generator G₂ and discriminator D₂ of the second-layer generative adversarial network, and a target network C. The method comprises the following steps:

(1) the conditional generative adversarial network generates a sample x_c of a specific class according to the class label c;

(2) the sample x_c is input into the discriminator D₁ to discriminate its authenticity and its class;

(3) the feature extractor F extracts the hidden-layer features F(x) of the original sample;

(4) the hidden-layer features F(x) are input into the generator G₂ to generate the adversarial perturbation G₂(F(x));

(5) the sample x_c is fused with the adversarial perturbation G₂(F(x)) to obtain the adversarial sample x_c + G₂(F(x));

(6) the adversarial sample is input into the discriminator D₂ to discriminate its authenticity and its similarity to x_c.

In the method, the conditional generative adversarial network is used to generate new samples. A conventional generative adversarial network can approximate the real data by sampling from the data distribution, but this is too unconstrained, so a condition variable is added to restrict it: the method introduces a class label to guide the generative adversarial network to generate images of a specific class. The input of the generator G₁ comprises the original sample x, the noise z and the class label c; G₁ gradually fits the noise z into a new image sample according to the original sample and the class label. The discriminator D₁ must not only judge the authenticity of the generated sample x_c but also determine its category, i.e., whether x_c belongs to class c.
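The patent does not state how the class label is joined with the other inputs. One common choice in conditional generative adversarial networks, shown here purely as an assumption, is to append a one-hot encoding of c to the noise vector before it enters the MLP generator; `one_hot` and `generator_input` are hypothetical helper names.

```python
from typing import List, Sequence


def one_hot(label: int, num_classes: int) -> List[float]:
    """Encode class label c as a vector with a single 1.0 entry."""
    if not 0 <= label < num_classes:
        raise ValueError("label must lie in [0, num_classes)")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def generator_input(z: Sequence[float], label: int, num_classes: int) -> List[float]:
    """Joint input z|c: the noise vector followed by the one-hot label."""
    return list(z) + one_hot(label, num_classes)
```

The same concatenation can be applied to a sample and its label to form the joint input x|c of the discriminator D₁.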

The feature extractor F extracts the hidden-layer features of the original sample. A convolutional neural network is chosen as F, so that each hidden layer yields one level of image features. As the layers deepen, the receptive field of the convolution kernels grows and they attend more to global, abstract features. These features help image classification, so the hidden-layer features can be used to strengthen the adversarial prior of the perturbation.

The second-layer generative adversarial network generates the adversarial perturbation. The hidden-layer features extracted by the feature extractor F are input into the generator G₂ to obtain the adversarial perturbation G₂(F(x)) with an adversarial prior, which is then fused with the sample x_c generated by G₁ to obtain the adversarial sample x_c + G₂(F(x)). The discriminator D₂ assesses the authenticity of the adversarial sample and its similarity to x_c.

The target network is the network attacked by the adversarial samples: adversarial samples of a specific class are input into the target network so that it misclassifies them as class t.
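Verifying the attack and keeping the successful samples amounts to a simple count. The sketch below assumes that the target network's predictions are already available as class indices and that an attack succeeds when the prediction equals the designated class t; `attack_success_rate` is an illustrative name, not one used in the patent.

```python
from typing import List, Sequence, Tuple, TypeVar

Sample = TypeVar("Sample")


def attack_success_rate(
    samples: Sequence[Sample],
    predictions: Sequence[int],
    target_t: int,
) -> Tuple[float, List[Sample]]:
    """Return the fraction of samples classified as t and those samples."""
    if len(samples) != len(predictions):
        raise ValueError("one prediction is required per sample")
    kept = [s for s, p in zip(samples, predictions) if p == target_t]
    rate = len(kept) / len(predictions) if predictions else 0.0
    return rate, kept
```

The returned list is what would be stored for later attacks or for adversarial training.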

In summary, the method uses a first-layer conditional generative adversarial network, a feature extractor, a second-layer generative adversarial network and a target network. The conditional generative adversarial network generates new samples, and its discriminator judges both the authenticity and the category of each generated sample. The feature extractor extracts hidden-layer features of the original sample, which are used to generate a perturbation with an adversarial prior. The second-layer generative adversarial network generates the adversarial perturbation, and its discriminator assesses the authenticity of the adversarial sample and its similarity to the sample generated by the conditional generative adversarial network. The target network is used to verify the attack success rate of the adversarial samples. By using two layers of neural networks to generate class-specific samples and adversarial perturbations respectively, the method makes it possible to carry out attacks and adversarial training with samples of a specific category, effectively improves the attack success rate and the efficiency of adversarial training, and has broad application prospects.

Claims

1. An adversarial sample generation method based on a two-layer generative adversarial network, characterized in that the two-layer generative adversarial network comprises a first-layer conditional generative adversarial network, a second-layer generative adversarial network, a feature extractor F and a target network C; the first-layer conditional generative adversarial network comprises a generator G₁ and a discriminator D₁, and the second-layer generative adversarial network comprises a generator G₂ and a discriminator D₂, wherein the generators and the discriminators are all multilayer perceptrons (MLPs);

the method comprises the following steps:

(5) fusing the sample x_c with the adversarial perturbation G₂(F(x)) to obtain an adversarial sample;

2. The adversarial sample generation method based on a two-layer generative adversarial network of claim 1, wherein the feature extractor F is a VGG model and the target network C is a ResNet model.

3. The method of claim 2, wherein the loss function of the first-layer conditional generative adversarial network is:

L₁ = E_x[log D₁(x|c)] + E_z[log(1 - D₁(G₁(z|c)))]

4. The adversarial sample generation method based on a two-layer generative adversarial network of claim 2, wherein the loss function of the second-layer generative adversarial network is:

[formula missing from the source text]

wherein a term [symbol missing from the source text] ensures the similarity between the sample x_c and the adversarial sample