CN112085069A - Multi-target countermeasure patch generation method and device based on integrated attention mechanism


Info

Publication number
CN112085069A
CN112085069A
Authority
CN
China
Prior art keywords
patch
target
countermeasure
countermeasure patch
attention mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010830728.2A
Other languages
Chinese (zh)
Other versions
CN112085069B (en)
Inventor
陈健
谢鹏飞
乔凯
梁宁宁
王林元
张子飞
罗旭
魏月纳
闫镔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force filed Critical Information Engineering University of PLA Strategic Support Force
Priority to CN202010830728.2A
Publication of CN112085069A
Application granted
Publication of CN112085069B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of artificial intelligence security, and particularly relates to a multi-target countermeasure patch generation method and device based on an integrated attention mechanism. The method comprises the steps of constructing an image classification data set; constructing a multi-target countermeasure patch generation framework based on an integrated attention mechanism; defining a loss function and training the multi-target countermeasure patch generation framework; and testing the attack effect of the generated countermeasure patch. The method locates the key classification area of the input image through the integrated attention mechanism, so that the countermeasure patch achieves better attack performance and transferability; the input of the generator makes full use of the original image information, so that the countermeasure patch generated by the generator has a better visual effect; the input of the generator also fuses multi-target category information, so that any specified category of the target model can be attacked, realizing multi-target-category attacks; and the input of the discriminator is cropped, so that the discriminator learns more context information and the visual effect of the countermeasure patch is improved.

Description

Multi-target countermeasure patch generation method and device based on integrated attention mechanism
Technical Field
The invention belongs to the technical field of artificial intelligence security, and particularly relates to a multi-target countermeasure patch generation method and device based on an integrated attention mechanism.
Background
Deep neural networks have enjoyed great success in the fields of image classification, image detection, text processing, speech recognition, and the like. As a revolutionary technology, they bring great social and economic benefits while also raising concerns about artificial intelligence security. It has been shown that adding carefully designed small perturbations to an original sample can cause a deep neural network to make a false decision.
The existence of countermeasure samples presents a major challenge to applications of artificial intelligence such as automatic driving and face recognition, which has prompted researchers to continuously study attack and defense algorithms for countermeasure samples; the two sides contend with and promote each other, so that the security of artificial intelligence is continuously improved. Therefore, research on countermeasure sample attack and defense algorithms has important value for the development of the artificial intelligence security field.
At present, there are three kinds of methods that directly generate adversarial perturbations on the whole image: optimization-based, gradient-based and generation-network-based methods. Among optimization-based methods, Szegedy et al. first proposed using box-constrained L-BFGS optimization to make a model misclassify by adding perturbations that are imperceptible to the human eye. Carlini and Wagner proposed the C&W attack algorithm by modifying the objective function against the defense algorithms of the time. Among gradient-based methods, Goodfellow et al. proposed the Fast Gradient Sign Method (FGSM), which exploits the linear nature of deep neural network models in high-dimensional space. Kurakin et al. proposed the Basic Iterative Method (BIM), which further optimizes FGSM. Dong et al. introduced momentum into BIM and proposed the Momentum Iterative attack method (MI-FGSM). Although optimization-based and gradient-based methods have achieved some success in attack performance, they require many iterations, which is very time-consuming. Zhao et al. proposed a generation-network-based method that produces countermeasure samples through a generative adversarial network, making the generated countermeasure samples more natural.
The above methods all design adversarial perturbations for the whole image, which contains both the target object and the background. In practice, if the attack is to be extended to the physical world, it is difficult for an attacker to change the whole image; instead, a specific area in the image can be selected and the perturbation added only to that area, so that the deep neural network makes a misjudgment.
Researchers have carried out related work in this direction. Brown et al. first proposed the concept of the countermeasure patch, i.e., adding a perturbation to a specific region of the whole image to make the deep neural network misjudge; compared with traditional adversarial perturbations, it has advantages such as being independent of the scene. Karmon et al. generate countermeasure patches by optimizing a modified loss function. Evtimov et al. use conventional perturbation generation techniques to generate black and white stickers that are attached to traffic signs, causing them to be misclassified. Although countermeasure patch generation techniques have made good progress in attack performance and flexibility, most research on countermeasure patches ignores the visual effect and fails to produce patches that are both consistent with the background of the attacked target and aggressive, which often leads to unstable attack results. Liu et al. proposed the Perceptual-Sensitive GAN (PS-GAN) to generate perceptually sensitive countermeasure patches, which effectively improves the visual effect and attack capability of the countermeasure patch, but it cannot generate countermeasure patches for specified categories.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a multi-target countermeasure patch generation method and device based on an integrated attention mechanism, which can significantly reduce the training cost and the model storage space and, by combining the integrated attention mechanism with adversarial generation training, give the countermeasure patch a good visual effect and targeted attack performance.
In order to solve the technical problems, the invention adopts the following technical scheme:
the invention provides a multi-target countermeasure patch generation method based on an integrated attention mechanism, which comprises the following steps of:
constructing an image classification data set;
constructing a multi-target countermeasure patch generation framework based on an integrated attention mechanism;
defining a loss function and training a multi-target countermeasure patch generation framework;
and testing the attack effect of the generated countermeasure patch.
Further, the image classification data set comprises a training set and a test set, the training set and the test set comprise k classes of images and their class labels, the training set is used for training the multi-target countermeasure patch generation framework, and the test set is used for testing the attack effect of the countermeasure patch.
Further, the constructing of the multi-target countermeasure patch generation framework based on the integrated attention mechanism includes:
constructing an integrated attention mechanism module to obtain the position where the countermeasure patch needs to be placed;
constructing a feature extractor module to obtain an embedded vector of an original image attention area;
constructing a generator module to generate countermeasure patches for multiple target categories;
constructing a discriminator module and cropping its input;
and constructing a target attack model.
Further, the constructing of the integrated attention mechanism module to obtain the position where the countermeasure patch needs to be placed includes:
using Resnet50 and VGG19 as pre-training networks and taking the training set and the test set as input, respectively obtaining key classification areas A_Resnet50(x) and A_VGG19(x) of the original image through gradient-weighted class activation mapping; obtaining the key classification area of common interest A_common(x); finding the maximum peak of the weights; and determining the position A_patch(x) of the countermeasure patch according to the size of the countermeasure patch, centered on the maximum weight peak.
Further, the constructing of the feature extractor module to obtain an embedded vector of the original image attention area includes:
according to the position A_patch(x) of the countermeasure patch, taking out the original image attention area x_replaced that is to be replaced by the countermeasure patch and using it as the input of a feature extractor, which maps the original image attention area x_replaced to an embedded vector x_embedding in a low-dimensional space.
Further, the constructing of the generator module to generate countermeasure patches for multiple target categories includes:
taking the embedded vector x_embedding obtained by the feature extractor as the input of the generator G, while also inputting a randomly generated category t other than the original category, where t is a discrete variable rather than a constant; generating the countermeasure patch for the target category t as G(x_embedding, t); and then placing the countermeasure patch at the position A_patch(x) of the countermeasure patch in the original image to form the countermeasure sample x_adv.
Further, the constructing of the discriminator module and cropping of its input includes:
cropping the original image and the countermeasure sample centered on the position A_patch(x) of the countermeasure patch in the image, and using the two cropped results as the input of a discriminator, which ensures the similarity of the countermeasure sample x_adv to the original image x.
Further, the constructing of the target attack model includes: the target attack model F is a deep neural network; the countermeasure sample x_adv is input into the target attack model F and the returned class label value is obtained;
the defining of the loss function and the training of the multi-target countermeasure patch generation framework comprise the following steps:
inputting the training set and the position A_patch(x) of the countermeasure patch obtained by the integrated attention mechanism module, and training the multi-target countermeasure patch generation framework; the loss function includes a generative adversarial loss function L_GAN, an adversarial attack loss function L_adv and a countermeasure patch similarity function L_patch:

L_GAN = E_x[log D(x)] + E_{x,t}[log(1 - D(x_adv))]

L_adv = E_{x,t}[-log F_t(x_adv)]

L_patch = E_{x,t}[||G(x_embedding, t) - x_replaced||_2]

the countermeasure sample x_adv is input into the target attack model F, the class label value returned by the target attack model F is obtained, and through adversarial generation training the adversarial attack loss function L_adv drives the countermeasure sample x_adv to be misclassified by the target attack model F into the target category t;

in the above three formulas, x_adv is the countermeasure sample, x is the original image, t is the target category, G(x_embedding, t) is the countermeasure patch, k is the original category, A_patch(x) is the position of the countermeasure patch, F is the target attack model and F_t its predicted probability for category t, D is the discriminator, G is the generator, and x_replaced is the attention area of the original image;

finally, the generative adversarial loss function L_GAN, the adversarial attack loss function L_adv and the countermeasure patch similarity function L_patch are combined as follows:

L = L_GAN + λL_adv + γL_patch

where λ > 0 and γ > 0 balance the weight of each loss; optimizing the above problem drives the multi-target countermeasure patch generation framework to find a near-optimal generator.
Further, the testing of the attack effect of the generated countermeasure patch includes:
in the testing stage, inputting the test set and the position A_patch(x) of the countermeasure patch obtained by the integrated attention mechanism module; obtaining the embedded vector x_embedding through the feature extractor; after the randomly generated category and the embedded vector x_embedding are passed through the generator G to generate the countermeasure patch, adding it to the position A_patch(x) of the countermeasure patch in the original image to form the countermeasure sample x_adv; and finally judging whether the target attack model classifies the countermeasure sample into the target category, where a higher classification accuracy indicates a stronger attack capability.
The invention also provides a multi-target countermeasure patch generating device based on the integrated attention mechanism, which is characterized by comprising the following components:
the image classification data set construction module is used for constructing an image classification data set;
the multi-target countermeasure patch generation framework construction module is used for constructing a multi-target countermeasure patch generation framework based on an integrated attention mechanism;
the training module is used for defining a loss function and training a multi-target countermeasure patch generation framework;
and the testing module is used for testing the attack effect of the generated countermeasure patch.
Compared with the prior art, the invention has the following advantages:
the invention discloses a multi-target countermeasure patch generation method based on an integrated attention mechanism, which is characterized in that a key area required to be placed by an countermeasure patch is obtained through the integrated attention mechanism, so that an countermeasure sample has better attack performance and mobility, a generator of the countermeasure patch capable of generating good visual effect and attack performance by countermeasure generation training is adopted, multi-target category information and original image information are fused into the input of the generator, the category information is fused into a frame, the generator can have the attack capability of multiple target categories, a single generator module can be trained to generate the countermeasure patches of multiple target categories, in addition, the original image information is fully utilized, the key information of an original image attention area is extracted by a feature extractor to serve as the input of the generator, and the visual effect of the countermeasure patch generated by the generator is better. And the input of the discriminator is cut to ensure that the discriminator learns more context information and the visual effect of the anti-patching can be improved. The invention can attack the image classification network quickly and effectively, and can obviously reduce the training cost and the storage amount of the model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a multi-objective countermeasure patch generation method based on an integrated attention mechanism according to an embodiment of the invention;
FIG. 2 is a diagram of a multi-objective countermeasure patch generation framework based on an integrated attention mechanism according to an embodiment of the invention;
FIG. 3 is a flow chart of an integrated attention mechanism of an embodiment of the present invention;
fig. 4 is a flowchart of the test generator generating the attack effect against the patch according to the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer and more complete, the technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention, and based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative efforts belong to the scope of the present invention.
Recently, deep neural networks have attracted a lot of attention because they are vulnerable to attacks by countermeasure samples; among these attacks, the countermeasure patch is a method that can be extended to the physical world, so its attack effect is more threatening.
As shown in fig. 1, the method for generating a multi-objective countermeasure patch based on an integrated attention mechanism of the embodiment includes the following steps:
step S101, constructing an image classification data set;
step S102, constructing a multi-target countermeasure patch generation framework based on an integrated attention mechanism;
step S103, defining a loss function and training a multi-target countermeasure patch generation framework;
step S104, testing the attack effect of the countermeasure patch generated by the generator.
In order to complete the training and testing of the multi-target countermeasure patch generation framework based on the integrated attention mechanism, an image classification data set needs to be constructed first. The image classification data set comprises a training set and a test set, each containing k classes of images and their class labels; the training set is used for training the multi-target countermeasure patch generation framework, and the test set is used for testing the attack effect of the countermeasure patch generated by the generator.
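By way of illustration only (this sketch is not part of the original disclosure), such a data set can be assembled with torchvision's ImageFolder, assuming the k classes are stored in class-named sub-folders; the directory paths below are placeholders:

import torch
from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),   # size expected by the pre-training networks
    transforms.ToTensor(),
])
# "data/train" and "data/test" are placeholder paths; each contains one folder per class
train_set = datasets.ImageFolder("data/train", transform=preprocess)
test_set = datasets.ImageFolder("data/test", transform=preprocess)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=32, shuffle=False)
k = len(train_set.classes)           # number of image classes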
As shown in fig. 2, step S102 constructs a multi-objective countermeasure patch generation framework based on the integrated attention mechanism, where the framework specifically includes:
and step S1021, constructing an integrated attention mechanism module to obtain the position where the countercheck patch needs to be placed.
The traditional single attention model limits the attack performance; in particular, in the face of a transfer attack, the key area of concern to a single attention model is not necessarily the key area of the attacked model. Therefore, this example proposes an attention mechanism integration method, with the aim of finding the common key classification area of concern to each model. Specifically, as shown in fig. 3, the training set and the test set are used as inputs of the attention mechanism module A, which may use Resnet50, VGG19 or other classification networks as pre-training networks, to obtain the key classification areas A_Resnet50(x) and A_VGG19(x) of the original image through gradient-weighted class activation mapping (Grad-CAM); the key classification area of common interest A_common(x) is obtained; the maximum peak of the weights of the common key area is found; and the position A_patch(x) of the countermeasure patch is determined according to the size of the countermeasure patch, centered on the maximum weight peak. This position is used as input information for the multi-target countermeasure patch generation framework during training and testing.
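By way of illustration only, a minimal PyTorch sketch of this integrated attention step is given below; it is not the patent's reference implementation. Grad-CAM is computed with forward hooks on the last convolutional stage of a pre-trained Resnet50 and VGG19, the two maps are simply averaged as a stand-in for A_common(x), and A_patch(x) is the window centred on the maximum weight peak. The layer choices, the averaging rule, the patch_size value and the omission of input normalization are assumptions; torchvision 0.13 or later is assumed for the weights API.

import torch.nn.functional as F
from torchvision import models

def grad_cam(model, target_layer, image, class_idx=None):
    """Gradient-weighted class activation map for a single image tensor (1xCxHxW)."""
    feats, grads = [], []
    def hook(module, inputs, output):
        feats.append(output)
        output.register_hook(lambda g: grads.append(g))      # capture the gradient of this activation map
    handle = target_layer.register_forward_hook(hook)
    logits = model(image)
    if class_idx is None:
        class_idx = int(logits.argmax(dim=1))                # explain the predicted class
    model.zero_grad()
    logits[0, class_idx].backward()
    handle.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)        # global-average-pooled gradients
    cam = F.relu((weights * feats[0]).sum(dim=1, keepdim=True)).detach()
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

def patch_position(image, patch_size=50):
    """Return the top-left corner of the A_patch(x) window for one image."""
    resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
    vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).eval()
    cam_r = grad_cam(resnet, resnet.layer4, image)            # A_Resnet50(x)
    cam_v = grad_cam(vgg, vgg.features, image)                # A_VGG19(x)
    common = (cam_r + cam_v) / 2                              # stand-in for A_common(x)
    peak = (common[0, 0] == common[0, 0].max()).nonzero()[0]  # maximum weight peak (y, x)
    h, w = image.shape[2:]
    top = max(0, min(int(peak[0]) - patch_size // 2, h - patch_size))
    left = max(0, min(int(peak[1]) - patch_size // 2, w - patch_size))
    return top, left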
Step S1022, a feature extractor module is constructed to obtain an embedded vector of the attention area of the original image.
According to the position A_patch(x) of the countermeasure patch obtained in step S1021, the original image attention area x_replaced that is to be replaced by the countermeasure patch is taken out and used as the input of a feature extractor, which maps the original image attention area x_replaced to an embedded vector x_embedding in a low-dimensional space; the prior image information contained in the training data allows the generator to be trained better. The purpose of the feature extractor in this example is to extract the useful information of the original image attention area.
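By way of illustration only, the feature extractor can be a small convolutional encoder that maps the cropped attention area x_replaced to the embedded vector x_embedding; the patent does not fix a particular architecture, so every layer size below is an assumption.

import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Maps the cropped attention area x_replaced to a low-dimensional embedding x_embedding."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # global pooling keeps the module crop-size agnostic
        )
        self.fc = nn.Linear(128, embed_dim)

    def forward(self, x_replaced):
        return self.fc(self.conv(x_replaced).flatten(1))   # x_embedding, shape (batch, embed_dim)

# usage: crop the attention area at A_patch(x), then embed it
# top, left = patch_position(image)
# x_embedding = FeatureExtractor()(image[:, :, top:top + 50, left:left + 50])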
Step S1023, constructing a generator module to generate countermeasure patches for multiple target categories.
In order to better use the information of the input sample and improve the visual fidelity of the countermeasure patch, the embedded vector x_embedding obtained by the feature extractor in step S1022 is used as the input of the generator G, and a randomly generated category t other than the original category is input at the same time; the countermeasure patch for the target category t is generated as G(x_embedding, t). Unlike a single-target attack, the target category t is treated as a discrete variable rather than a constant, and its range is all categories other than the original category. The countermeasure patch is then placed at the position A_patch(x) of the countermeasure patch in the original image to form the countermeasure sample x_adv.
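By way of illustration only, one possible conditional generator is sketched below: the embedded vector x_embedding is concatenated with a one-hot encoding of the randomly drawn target category t, decoded into a patch, and pasted at A_patch(x) to form the countermeasure sample; the layer sizes and the helper name apply_patch are assumptions.

import torch
import torch.nn as nn

class PatchGenerator(nn.Module):
    """Generates a countermeasure patch G(x_embedding, t) for a chosen target category t."""
    def __init__(self, embed_dim=128, num_classes=1000, patch_size=50):
        super().__init__()
        self.patch_size = patch_size
        self.fc = nn.Linear(embed_dim + num_classes, 128 * 7 * 7)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # 7 -> 14
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),    # 14 -> 28
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),  # 28 -> 56, pixels in [0, 1]
        )

    def forward(self, x_embedding, t_onehot):
        z = torch.cat([x_embedding, t_onehot], dim=1)       # fuse image information and target category
        h = self.fc(z).view(-1, 128, 7, 7)
        patch = self.deconv(h)
        return nn.functional.interpolate(patch, size=(self.patch_size, self.patch_size),
                                         mode="bilinear", align_corners=False)

def apply_patch(image, patch, top, left):
    """Paste the patch into the original image at A_patch(x) to form the countermeasure sample."""
    adv = image.clone()
    ps = patch.shape[-1]
    adv[:, :, top:top + ps, left:left + ps] = patch
    return adv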
Step S1024, constructing a discriminator module and cropping its input.
The discriminator guarantees perceptual similarity. In the conventional GAN-based countermeasure sample generation method, the input of the discriminator D is the whole original image and the whole countermeasure sample, while the proportion of the whole image occupied by the countermeasure patch is limited. In this example, therefore, the input of the discriminator D is cropped, which increases the proportion occupied by the countermeasure patch, so that the discriminator can learn more context information and improve the visual effect of the countermeasure patch. Specifically, the original image and the countermeasure sample are cropped centered on the position A_patch(x) of the countermeasure patch in the image, and the two cropped results are used as the input of the discriminator, which keeps the countermeasure sample x_adv as similar as possible to the original image x, thereby ensuring a good visual effect and high perceptual relevance.
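By way of illustration only, a cropped-input discriminator and the cropping helper are sketched below; the architecture and the context margin are assumed values, not taken from the disclosure.

import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Scores a cropped region as coming from a real image or a countermeasure sample."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 1),                     # real / fake logit
        )

    def forward(self, crop):
        return self.net(crop)

def crop_around_patch(image, top, left, patch_size, margin=25):
    """Crop a window centred on A_patch(x), enlarged by a context margin."""
    h, w = image.shape[2:]
    y0, x0 = max(0, top - margin), max(0, left - margin)
    y1, x1 = min(h, top + patch_size + margin), min(w, left + patch_size + margin)
    return image[:, :, y0:y1, x0:x1]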
Step S1025, constructing a target attack model.
The target attack model F may be any given deep neural network; the countermeasure sample x_adv is input into the target attack model F and the returned class label value is obtained.
Step S103, defining the loss function and training the multi-target countermeasure patch generation framework, specifically includes:
inputting the training set and the position A_patch(x) of the countermeasure patch obtained in step S1021, and training the multi-target countermeasure patch generation framework. The loss function includes a generative adversarial loss function L_GAN, an adversarial attack loss function L_adv and a countermeasure patch similarity function L_patch.

L_GAN = E_x[log D(x)] + E_{x,t}[log(1 - D(x_adv))]

The generative adversarial loss function L_GAN ensures a good visual effect: through adversarial generation training, images close to the target domain can be generated, so this loss helps the generator produce countermeasure patches with a good visual effect.

L_adv = E_{x,t}[-log F_t(x_adv)]

The adversarial attack loss function L_adv ensures a good targeted attack effect: the countermeasure sample x_adv is input into the target attack model F, the class label value returned by the target attack model F is obtained, and through adversarial generation training the adversarial attack loss function L_adv drives the countermeasure sample x_adv to be misclassified by the target attack model F into the target category t.

L_patch = E_{x,t}[||G(x_embedding, t) - x_replaced||_2]

The countermeasure patch similarity function L_patch keeps the countermeasure patch as similar as possible to the original image attention area x_replaced.

In the above three formulas, x_adv is the countermeasure sample, x is the original image, t is the target category, G(x_embedding, t) is the countermeasure patch, k is the original category, A_patch(x) is the position of the countermeasure patch, F is the target attack model and F_t its predicted probability for category t, D is the discriminator, G is the generator, and x_replaced is the attention area of the original image.

Finally, the generative adversarial loss function L_GAN, the adversarial attack loss function L_adv and the countermeasure patch similarity function L_patch are combined as follows:

L = L_GAN + λL_adv + γL_patch

where λ > 0 and γ > 0 balance the weight of each loss. Under the guidance of the integrated attention mechanism module, optimizing the above problem drives the multi-target countermeasure patch generation framework to find a near-optimal generator, thereby generating multi-target countermeasure patches with strong targeted attack capability and a good visual effect.
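By way of illustration only, a single training step tying the above modules together is sketched below. The binary cross-entropy GAN loss, the targeted cross-entropy form of L_adv and the L2 form of L_patch are common choices assumed here and may differ from the patent's exact formulas; apply_patch and crop_around_patch refer to the earlier sketches, and the target model is assumed frozen.

import torch
import torch.nn.functional as F

def train_step(image, top, left, patch_size, extractor, G, D, target_model,
               opt_G, opt_D, num_classes, lam=1.0, gamma=1.0):
    # x_replaced: the attention area that the countermeasure patch will replace
    region = image[:, :, top:top + patch_size, left:left + patch_size]
    x_embedding = extractor(region)
    # randomly drawn target categories t (ideally excluding each image's original category)
    t = torch.randint(0, num_classes, (image.size(0),), device=image.device)
    t_onehot = F.one_hot(t, num_classes).float()
    patch = G(x_embedding, t_onehot)
    adv = apply_patch(image, patch, top, left)             # countermeasure sample x_adv

    real_crop = crop_around_patch(image, top, left, patch_size)
    fake_crop = crop_around_patch(adv, top, left, patch_size)

    # discriminator step: real cropped region vs. cropped countermeasure sample
    d_real, d_fake = D(real_crop), D(fake_crop.detach())
    loss_D = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # generator step: fool D (L_GAN), hit the target category (L_adv), stay close to x_replaced (L_patch)
    d_fake = D(fake_crop)
    loss_gan = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    loss_adv = F.cross_entropy(target_model(adv), t)        # target_model is frozen (eval, no grad updates)
    loss_patch = (patch - region).flatten(1).norm(dim=1).mean()
    loss_G = loss_gan + lam * loss_adv + gamma * loss_patch # L = L_GAN + lambda*L_adv + gamma*L_patch
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()      # opt_G may also cover the extractor's parameters
    return loss_D.item(), loss_G.item()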
Step S104, testing the attack effect of the countermeasure patch generated by the generator, specifically includes:
as shown in fig. 4, in the testing stage, the test set and the position A_patch(x) of the countermeasure patch obtained in step S1021 are input; the embedded vector x_embedding is obtained through the feature extractor; after the randomly generated category and the embedded vector x_embedding are passed through the generator G to generate the countermeasure patch, it is added to the position A_patch(x) of the countermeasure patch in the original image to form the countermeasure sample x_adv; finally, it is judged whether the target attack model classifies the countermeasure sample into the target category, where a higher classification accuracy indicates a stronger attack capability. Here, the white box means that the attacked model in the testing stage is consistent with the attacked model in the training stage, and the black box means that the attacked model in the testing stage is different from the attacked model in the training stage.
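By way of illustration only, the test-stage measurement can be realized as the targeted attack success rate, i.e. the fraction of countermeasure samples that the attacked model (the training-stage model in the white-box setting, a different model in the black-box setting) classifies into the requested target category; position_fn stands for the integrated attention module's computation of A_patch(x), and apply_patch refers to the earlier sketch.

import torch
import torch.nn.functional as F

@torch.no_grad()
def targeted_success_rate(loader, extractor, G, attacked_model, num_classes,
                          patch_size, position_fn):
    hits, total = 0, 0
    for image, _ in loader:
        top, left = position_fn(image)                       # A_patch(x) from the integrated attention module
        region = image[:, :, top:top + patch_size, left:left + patch_size]
        t = torch.randint(0, num_classes, (image.size(0),))
        patch = G(extractor(region), F.one_hot(t, num_classes).float())
        adv = apply_patch(image, patch, top, left)
        pred = attacked_model(adv).argmax(dim=1)
        hits += (pred == t).sum().item()
        total += image.size(0)
    return hits / total                                      # higher means stronger targeted attack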
Corresponding to the multi-target countermeasure patch generation method based on the integrated attention mechanism, the embodiment also provides a multi-target countermeasure patch generation device based on the integrated attention mechanism, which comprises an image classification data set construction module, a multi-target countermeasure patch generation framework construction module, a training module and a testing module.
The image classification data set construction module is used for constructing an image classification data set;
the multi-target countermeasure patch generation framework construction module is used for constructing a multi-target countermeasure patch generation framework based on an integrated attention mechanism;
the training module is used for defining a loss function and training a multi-target countermeasure patch generation framework;
and the testing module is used for testing the attack effect of the generated countermeasure patch.
The implementation means of the device has been described in detail in the generation method, and is not described herein again.
In conclusion, the key classification area of the input image is located through the integrated attention mechanism, so that the countermeasure patch achieves better attack performance and transferability; the input of the generator makes full use of the original image information, with the feature extractor extracting the key information of the original image attention area as the input of the generator, so that the countermeasure patch generated by the generator has a better visual effect; the input of the generator also fuses multi-target category information, so that any specified category of the target model can be attacked, realizing multi-target-category attacks; and the input of the discriminator is cropped, so that the discriminator learns more context information and the visual effect of the countermeasure patch is improved.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it is to be noted that: the above description is only a preferred embodiment of the present invention, and is only used to illustrate the technical solutions of the present invention, and not to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A multi-target countermeasure patch generation method based on an integrated attention mechanism is characterized by comprising the following steps:
constructing an image classification data set;
constructing a multi-target countermeasure patch generation framework based on an integrated attention mechanism;
defining a loss function and training a multi-target countermeasure patch generation framework;
and testing the attack effect of the generated countermeasure patch.
2. The integrated attention mechanism-based multi-objective countermeasure patch generation method of claim 1, wherein the image classification data set comprises a training set and a test set, the training set and the test set comprise k types of images and class labels thereof, the training set is used for training the multi-objective countermeasure patch generation framework, and the test set is used for testing effectiveness of countermeasure patch attacks.
3. The integrated attention mechanism-based multi-objective countermeasure patch generation method according to claim 2, wherein the constructing of the integrated attention mechanism-based multi-objective countermeasure patch generation framework includes:
constructing an integrated attention mechanism module to obtain the position where the countermeasure patch needs to be placed;
constructing a feature extractor module to obtain an embedded vector of an original image attention area;
constructing a generator module to generate countermeasure patches for multiple target categories;
constructing a discriminator module and cropping its input;
and constructing a target attack model.
4. The method for generating a multi-objective countermeasure patch based on an integrated attention mechanism as claimed in claim 3, wherein the constructing of the integrated attention mechanism module to obtain the position where the countermeasure patch needs to be placed comprises:
respectively obtaining key classification areas A_Resnet50(x) and A_VGG19(x) of the original image through gradient-weighted class activation mapping, by using the training set and the test set as input and using Resnet50 and VGG19 as pre-training networks; obtaining the key classification area of common interest A_common(x); finding the maximum peak of the weights; and determining the position A_patch(x) of the countermeasure patch according to the size of the countermeasure patch, centered on the maximum weight peak.
5. The integrated attention mechanism-based multi-target countermeasure patch generation method according to claim 4, wherein the constructing a feature extractor module to obtain an embedded vector of an attention area of an original image comprises:
according to the position A_patch(x) of the countermeasure patch, taking out the original image attention area x_replaced to be replaced by the countermeasure patch and using it as the input of a feature extractor, which maps the original image attention area x_replaced to an embedded vector x_embedding in a low-dimensional space.
6. The integrated attention mechanism-based multi-target countermeasure patch generation method of claim 5, wherein the constructing of the generator module to generate countermeasure patches for multiple target categories comprises:
taking the embedded vector x_embedding obtained by the feature extractor as the input of the generator G, while also inputting a randomly generated category t other than the original category, where t is a discrete variable rather than a constant; generating the countermeasure patch for the target category t as G(x_embedding, t); and then placing the countermeasure patch at the position A_patch(x) of the countermeasure patch in the original image to form the countermeasure sample x_adv.
7. The integrated attention mechanism-based multi-target countermeasure patch generation method of claim 6, wherein the constructing of the discriminator module and cropping of its input comprises:
cropping the original image and the countermeasure sample centered on the position A_patch(x) of the countermeasure patch in the image, and using the two cropped results as the input of a discriminator, which ensures the similarity of the countermeasure sample x_adv to the original image x.
8. The integrated attention mechanism-based multi-target countermeasure patch generation method according to claim 7, wherein the constructing of the target attack model comprises: the target attack model F is a deep neural network; the countermeasure sample x_adv is input into the target attack model F and the returned class label value is obtained;
the defining of the loss function and the training of the multi-target countermeasure patch generation framework comprise the following steps:
inputting the training set and the position A_patch(x) of the countermeasure patch obtained by the integrated attention mechanism module, and training the multi-target countermeasure patch generation framework; the loss function includes a generative adversarial loss function L_GAN, an adversarial attack loss function L_adv and a countermeasure patch similarity function L_patch:

L_GAN = E_x[log D(x)] + E_{x,t}[log(1 - D(x_adv))]

L_adv = E_{x,t}[-log F_t(x_adv)]

L_patch = E_{x,t}[||G(x_embedding, t) - x_replaced||_2]

the countermeasure sample x_adv is input into the target attack model F, the class label value returned by the target attack model F is obtained, and through adversarial generation training the adversarial attack loss function L_adv drives the countermeasure sample x_adv to be misclassified by the target attack model F into the target category t;
in the above three formulas, x_adv is the countermeasure sample, x is the original image, t is the target category, G(x_embedding, t) is the countermeasure patch, k is the original category, A_patch(x) is the position of the countermeasure patch, F is the target attack model and F_t its predicted probability for category t, D is the discriminator, G is the generator, and x_replaced is the attention area of the original image;
finally, the generative adversarial loss function L_GAN, the adversarial attack loss function L_adv and the countermeasure patch similarity function L_patch are combined as follows:

L = L_GAN + λL_adv + γL_patch

where λ > 0 and γ > 0 balance the weight of each loss; optimizing the above problem drives the multi-target countermeasure patch generation framework to find a near-optimal generator.
9. The integrated attention mechanism-based multi-target countermeasure patch generation method according to claim 8, wherein the testing of the attack effect of the generated countermeasure patch comprises:
in the testing stage, inputting the test set and the position A_patch(x) of the countermeasure patch obtained by the integrated attention mechanism module; obtaining the embedded vector x_embedding through the feature extractor; after the randomly generated category and the embedded vector x_embedding are passed through the generator G to generate the countermeasure patch, adding it to the position A_patch(x) of the countermeasure patch in the original image to form the countermeasure sample x_adv; and finally judging whether the target attack model classifies the countermeasure sample into the target category, where a higher classification accuracy indicates a stronger attack capability.
10. A multi-objective countermeasure patch generation apparatus based on an integrated attention mechanism, comprising:
the image classification data set construction module is used for constructing an image classification data set;
the multi-target countermeasure patch generation framework construction module is used for constructing a multi-target countermeasure patch generation framework based on an integrated attention mechanism;
the training module is used for defining a loss function and training a multi-target countermeasure patch generation framework;
and the testing module is used for testing the attack effect of the generated countermeasure patch.
CN202010830728.2A 2020-08-18 2020-08-18 Multi-target countermeasure patch generation method and device based on integrated attention mechanism Active CN112085069B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010830728.2A CN112085069B (en) 2020-08-18 2020-08-18 Multi-target countermeasure patch generation method and device based on integrated attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010830728.2A CN112085069B (en) 2020-08-18 2020-08-18 Multi-target countermeasure patch generation method and device based on integrated attention mechanism

Publications (2)

Publication Number Publication Date
CN112085069A true CN112085069A (en) 2020-12-15
CN112085069B CN112085069B (en) 2023-06-20

Family

ID=73729086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010830728.2A Active CN112085069B (en) 2020-08-18 2020-08-18 Multi-target countermeasure patch generation method and device based on integrated attention mechanism

Country Status (1)

Country Link
CN (1) CN112085069B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241790A (en) * 2020-12-16 2021-01-19 北京智源人工智能研究院 Small countermeasure patch generation method and device
CN112612714A (en) * 2020-12-30 2021-04-06 清华大学 Safety testing method and device for infrared target detector
CN112686249A (en) * 2020-12-22 2021-04-20 中国人民解放军战略支援部队信息工程大学 Grad-CAM attack method based on anti-patch
CN113052167A (en) * 2021-03-09 2021-06-29 中国地质大学(武汉) Grid map data protection method based on countercheck patch
CN113240028A (en) * 2021-05-24 2021-08-10 浙江大学 Anti-sample block attack detection method based on class activation graph
CN113255816A (en) * 2021-06-10 2021-08-13 北京邮电大学 Directional attack countermeasure patch generation method and device
CN113674140A (en) * 2021-08-20 2021-11-19 燕山大学 Physical countermeasure sample generation method and system
CN114266344A (en) * 2022-01-06 2022-04-01 北京墨云科技有限公司 Method and apparatus for neural network vision recognition system using anti-patch attack
CN114742170A (en) * 2022-04-22 2022-07-12 马上消费金融股份有限公司 Countermeasure sample generation method, model training method, image recognition method and device
CN115083001A (en) * 2022-07-22 2022-09-20 北京航空航天大学 Anti-patch generation method and device based on image sensitive position positioning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103890771A (en) * 2011-10-18 2014-06-25 迈克菲股份有限公司 User-defined countermeasures
CN109948658A (en) * 2019-02-25 2019-06-28 浙江工业大学 The confrontation attack defense method of Feature Oriented figure attention mechanism and application
CN111046962A (en) * 2019-12-16 2020-04-21 中国人民解放军战略支援部队信息工程大学 Sparse attention-based feature visualization method and system for convolutional neural network model
US20200151505A1 (en) * 2018-11-12 2020-05-14 Sap Se Platform for preventing adversarial attacks on image-based machine learning models
CN111275115A (en) * 2020-01-20 2020-06-12 星汉智能科技股份有限公司 Method for generating counterattack sample based on generation counternetwork
CN111291670A (en) * 2020-01-23 2020-06-16 天津大学 Small target facial expression recognition method based on attention mechanism and network integration

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103890771A (en) * 2011-10-18 2014-06-25 迈克菲股份有限公司 User-defined countermeasures
US20200151505A1 (en) * 2018-11-12 2020-05-14 Sap Se Platform for preventing adversarial attacks on image-based machine learning models
CN109948658A (en) * 2019-02-25 2019-06-28 浙江工业大学 The confrontation attack defense method of Feature Oriented figure attention mechanism and application
CN111046962A (en) * 2019-12-16 2020-04-21 中国人民解放军战略支援部队信息工程大学 Sparse attention-based feature visualization method and system for convolutional neural network model
CN111275115A (en) * 2020-01-20 2020-06-12 星汉智能科技股份有限公司 Method for generating counterattack sample based on generation counternetwork
CN111291670A (en) * 2020-01-23 2020-06-16 天津大学 Small target facial expression recognition method based on attention mechanism and network integration

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
LIN TIAN ET.AL: "A Generative Adversarial Gated Recurrent Unit Model for Precipitation Nowcasting", 《IEEE GEOSCIENCE AND REMOTE SENSING LETTERS》, 30 April 2020 (2020-04-30), pages 601-605, XP011780526, DOI: 10.1109/LGRS.2019.2926776 *
PAN LU ET.AL: "Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering", 《ARXIV:1711.06794V1 [CS.CV]》, 30 November 2017 (2017-11-30), pages 1-6 *
刘恒等 (Liu Heng et al.): "基于生成式对抗网络的通用性对抗扰动生成方法" (A universal adversarial perturbation generation method based on generative adversarial networks), 《信息网络安全》 (Netinfo Security), no. 05, 10 May 2020 (2020-05-10), pages 63-70 *
蒋凌云 (Jiang Lingyun): "基于生成对抗网络的图像对抗样本攻防算法研究" (Research on attack and defense algorithms for image adversarial examples based on generative adversarial networks), 《中国优秀硕士学位论文全文数据库 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology), 15 February 2020 (2020-02-15), pages 35-43 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241790B (en) * 2020-12-16 2021-03-30 北京智源人工智能研究院 Small countermeasure patch generation method and device
CN112241790A (en) * 2020-12-16 2021-01-19 北京智源人工智能研究院 Small countermeasure patch generation method and device
CN112686249B (en) * 2020-12-22 2022-01-25 中国人民解放军战略支援部队信息工程大学 Grad-CAM attack method based on anti-patch
CN112686249A (en) * 2020-12-22 2021-04-20 中国人民解放军战略支援部队信息工程大学 Grad-CAM attack method based on anti-patch
CN112612714A (en) * 2020-12-30 2021-04-06 清华大学 Safety testing method and device for infrared target detector
CN113052167A (en) * 2021-03-09 2021-06-29 中国地质大学(武汉) Grid map data protection method based on countercheck patch
CN113240028A (en) * 2021-05-24 2021-08-10 浙江大学 Anti-sample block attack detection method based on class activation graph
CN113255816A (en) * 2021-06-10 2021-08-13 北京邮电大学 Directional attack countermeasure patch generation method and device
CN113674140A (en) * 2021-08-20 2021-11-19 燕山大学 Physical countermeasure sample generation method and system
CN113674140B (en) * 2021-08-20 2023-09-26 燕山大学 Physical countermeasure sample generation method and system
CN114266344A (en) * 2022-01-06 2022-04-01 北京墨云科技有限公司 Method and apparatus for neural network vision recognition system using anti-patch attack
CN114742170A (en) * 2022-04-22 2022-07-12 马上消费金融股份有限公司 Countermeasure sample generation method, model training method, image recognition method and device
CN114742170B (en) * 2022-04-22 2023-07-25 马上消费金融股份有限公司 Countermeasure sample generation method, model training method, image recognition method and device
CN115083001A (en) * 2022-07-22 2022-09-20 北京航空航天大学 Anti-patch generation method and device based on image sensitive position positioning

Also Published As

Publication number Publication date
CN112085069B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
CN112085069A (en) Multi-target countermeasure patch generation method and device based on integrated attention mechanism
Yu et al. CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples.
CN110659485B (en) Method and apparatus for detecting fight attacks through decoy training
CN111401407B (en) Countermeasure sample defense method based on feature remapping and application
Song et al. Constructing unrestricted adversarial examples with generative models
CN108111489B (en) URL attack detection method and device and electronic equipment
CN107577945B (en) URL attack detection method and device and electronic equipment
Su et al. Optimized hyperspectral band selection using particle swarm optimization
Yang et al. Design and interpretation of universal adversarial patches in face detection
Wei et al. Adversarial examples in deep learning: Characterization and divergence
Yang et al. Targeted attention attack on deep learning models in road sign recognition
CN113111963B (en) Method for re-identifying pedestrian by black box attack
Song et al. Generative adversarial examples
Xiao et al. A multitarget backdooring attack on deep neural networks with random location trigger
Zhao et al. SAGE: steering the adversarial generation of examples with accelerations
Li et al. Move: Effective and harmless ownership verification via embedded external features
CN113486736A (en) Black box anti-attack method based on active subspace and low-rank evolution strategy
Liu et al. Saliency Map-Based Local White-Box Adversarial Attack Against Deep Neural Networks
CN112507912A (en) Method and device for identifying illegal picture
Asha et al. Evaluation of adversarial machine learning tools for securing AI systems
Freestone et al. Semi-supervised learning while controlling the FDR with an application to tandem mass spectrometry analysis
CN117454187B (en) Integrated model training method based on frequency domain limiting target attack
Nakka et al. Universal, transferable adversarial perturbations for visual object trackers
CN114169443B (en) Word-level text countermeasure sample detection method
Kuribayashi Adversarial Attacks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant